Saha et al. (WebSci 2025) modeled 150 Russian ASNs individually and found TSPU-driven RTT acceleration synchronized with policy events. That's the kind of granular forecasting Voidly wants.
We have ASN-tagged evidence (16,712 rows across 168 ASNs in 42 countries), and a prototype training pipeline + endpoint are built. Honest result: per-ASN forecasting is not viable today.
Three blockers
- Source monoculture: 100% of our ASN-tagged evidence is from CensoredPlanet (CP). CP probes known-blocked domains, so the positive class dominates and there's no "normal day" signal to contrast against.
- Sparse irregular sampling: median ~17 measurement days per ASN over 106 calendar days. Sliding 7-day windows leave 10-20 training rows per ASN, below any reasonable threshold for binary classification.
- No ASN-resolved outage labels: the country-level forecast uses incidents as labels, but none of those incidents are resolved to an ASN.
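To illustrate the sparse-sampling blocker, here is a minimal sketch of how few 7-day windows survive when measurement days are bursty. It assumes a window becomes a training row only if it contains a minimum number of measurement days. The function name and the `min_days_in_window` threshold are hypothetical, and the real logic in scripts/build-per-asn-forecast-dataset.py may differ.

```python
from datetime import date, timedelta

def count_window_rows(measure_days, window=7, min_days_in_window=3):
    """Count sliding windows that hold enough measurement days to form a row.

    measure_days: iterable of datetime.date values with at least one measurement.
    min_days_in_window is an assumed threshold, not taken from the pipeline.
    """
    day_set = set(measure_days)
    if not day_set:
        return 0
    first, last = min(day_set), max(day_set)
    last_start = last - timedelta(days=window - 1)
    rows = 0
    start = first
    while start <= last_start:
        # Number of calendar days in this window that actually have data.
        covered = sum((start + timedelta(days=i)) in day_set for i in range(window))
        if covered >= min_days_in_window:
            rows += 1
        start += timedelta(days=1)
    return rows

# Weekly bursts: every window sees only one measurement day, so no rows survive.
weekly = [date(2025, 1, 1) + timedelta(days=7 * k) for k in range(3)]
print(count_window_rows(weekly))
```

Dense daily coverage yields one row per window start, while the same number of days spread thinly can yield none.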
Density audit
| Threshold | ASNs qualifying |
|---|---|
| ≥100 rows | 52 |
| ≥100 rows AND ≥20 measurement days | 24 |
| ≥100 rows AND ≥30 measurement days | 6 |
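The audit above boils down to two counts per ASN: total rows and distinct measurement days. A minimal sketch, assuming evidence is available as `(asn, measurement_day)` pairs; the function name and input shape are hypothetical, not the actual audit script.

```python
from collections import defaultdict

def density_audit(rows, min_rows=100, min_days=30):
    """Return the sorted ASNs meeting both the row and measurement-day thresholds.

    rows: iterable of (asn, measurement_day) pairs (assumed input shape).
    """
    row_count = defaultdict(int)
    days_seen = defaultdict(set)
    for asn, day in rows:
        row_count[asn] += 1
        days_seen[asn].add(day)
    return sorted(
        asn for asn in row_count
        if row_count[asn] >= min_rows and len(days_seen[asn]) >= min_days
    )
```

Setting `min_days=0` reproduces the rows-only threshold, and `min_days=20` or `min_days=30` reproduces the two stricter tiers.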
The 6 "tier-1" ASNs (≥100 rows and ≥30 measurement days): SA AS8895, CN AS146812, ID AS135473, IQ AS215597 (EarthLink), RU AS47541 (ER-Telecom), RU AS43727.
Training result
Of the 6 tier-1 ASNs, only 1 trained (RU AS47541); the other 5 had single-class folds. The trained model got AUC=1.0 on n_test=6, which is statistically meaningless.
No ASN can be forecast reliably today.
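Two quick checks make this concrete: whether a fold contains both classes at all, and how often a random ranking would score AUC=1.0 on a test set this small. This is a sketch only. The class split of the 6 test rows is not recorded here, so the splits in the example are assumptions, and both function names are hypothetical.

```python
from math import comb

def has_both_classes(labels):
    """A fold supports binary training and AUC only if both classes appear."""
    return len(set(labels)) == 2

def chance_of_perfect_auc(n_pos, n_neg):
    """Probability that a random, tie-free ranking puts every positive above
    every negative, i.e. scores AUC = 1.0 by luck alone."""
    if n_pos < 1 or n_neg < 1:
        raise ValueError("AUC is undefined without both classes")
    # Exactly one of the C(n, n_pos) equally likely placements is perfect.
    return 1 / comb(n_pos + n_neg, n_pos)

# Assumed splits for n_test = 6 (the real split is not stated in this note).
print(chance_of_perfect_auc(3, 3))
print(chance_of_perfect_auc(5, 1))
```

Under an assumed 3/3 split a coin-flip model hits AUC=1.0 one time in 20; under 5/1 it is one time in 6. Either way a single perfect score on 6 rows cannot be told apart from luck.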
What we'd need to make this work
- Probe network ASN coverage: Voidly's own probes must record the probe's ASN; right now only CensoredPlanet supplies it. Target: ≥10 distinct ASNs per priority country with daily coverage, which works out to ~5× the current row count (80K+ ASN-tagged rows).
- ASN-resolved incident labels: extend create-voidly-incidents.py to emit an ASN field when an evidence cluster is single-ASN. Without ASN ground truth there's no honest outage target.
- Negative-class enrichment: pull OONI measurements (already country-tagged) and back-join probe ASN at measurement time to give the classifier "normal" days.
- Re-evaluate at ~6 months of clean data. Saha et al. used years of MetricsLab pings, not 3 months of CP bursts.
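The single-ASN labeling rule from the list above could look roughly like this. It is a sketch under assumptions: evidence rows are taken to be dicts with an optional `asn` key, and `cluster_asn` is a hypothetical helper, not an existing function in create-voidly-incidents.py.

```python
def cluster_asn(evidence_rows):
    """Return the cluster's ASN only when every row carries the same ASN.

    evidence_rows: list of dicts with an optional "asn" key (assumed shape).
    Returns None for mixed-ASN clusters, empty clusters, or rows with no ASN.
    """
    asns = {row.get("asn") for row in evidence_rows}
    if len(asns) == 1:
        (asn,) = asns
        return asn  # still None if the only value seen was a missing ASN
    return None
```

Returning None for anything ambiguous keeps multi-ASN clusters as country-level incidents rather than guessing a label.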
What we left in place
Prototype runs as a SEPARATE Flask app on port 5012 (NOT patched into the production api_v3, so exploratory work doesn't leak). Endpoints return experimental: true and a clear disclaimer about per-ASN forecast unreliability.
Files: scripts/build-per-asn-forecast-dataset.py, scripts/train-per-asn-forecast.py, scripts/patch-per-asn-endpoint.py.
Filed as roadmap
"Expand probe network ASN coverage" is now a probe-network priority. Revisit per-ASN forecasting in Q4 2026 once ASN-tagged rows hit 80K+ from voidly-owned probes (not just CP imports).