The 24-hour push
Voidly Atlas is an open intelligence layer for global internet censorship — 84,464 verified evidence records spanning 214 countries, drawn from OONI, IODA, CensoredPlanet, and a network of 40 probe nodes. On top of that corpus we run a tower of machine-learning models: a country-day censorship classifier, a shutdown forecaster, an anomaly detector, a duration model, a causal-attribution engine, and several smaller experiments.
The 2026-05-21 push was a coordinated promote cycle. Every model in today's cycle either improved a published headline metric, added a previously missing capability (uncertainty, attribution, survival), or failed its promote gate cleanly and was retired with a permalinked write-up. Three experiments were genuine negatives: they did not promote, and we have published each loss with the same emphasis as the wins.
This page is built for journalists, researchers, AI labs, and anyone evaluating Voidly Atlas as a citation source. Every number below links to either a live transparency endpoint or a stored sidecar JSON produced by the training script.
Thirteen wins
Each entry below names the model, the headline lift, and the transparency surface where you can audit the claim live.
- 1 · classifier
Country-day classifier v3.3 — regime-similarity-weighted contagion
v3.2 had regressed Equatorial Guinea (EG) from F1 1.0 to 0.55 by computing neighbor contagion uniformly across all countries. v3.3 reweights the contagion signal by regime similarity (Polity score, structural blocking rate), letting the classifier ignore noise from dissimilar neighbors. Stratified F1 climbed from 0.674 to 0.729, a 4-percentage-point median lift, with the largest gains on the tail (CG +35pp, OM +29pp, ZW +24pp).
LOCO median F1 0.870, LOCO mean F1 0.711 on 4,237 samples / 131 countries. 16 features (13 base + 3 contagion). Honest caveat: 16 MENA and former-Soviet countries (including OM, UZ, TN, LY, YE, JO, MA) regress 5-29pp because their neighbor-pair overlap is sparse. Live at /v1/classifier/info and /feature-importance.
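The reweighting idea can be illustrated with a minimal sketch. It assumes the contagion feature is a weighted average of neighbor signals and that similarity is a Gaussian kernel over differences in Polity score and structural blocking rate. The kernel, its bandwidths, the `Regime` type, and the sample numbers are all hypothetical; the source states only that the weights derive from regime similarity.

```python
import math
from dataclasses import dataclass

@dataclass
class Regime:
    polity: float         # Polity score, -10 (autocracy) to +10 (democracy)
    blocking_rate: float  # structural blocking rate in [0, 1]

def similarity(a: Regime, b: Regime, polity_bw: float = 5.0, rate_bw: float = 0.2) -> float:
    """Hypothetical Gaussian kernel: 1.0 for identical regimes, near 0 for dissimilar ones."""
    dp = (a.polity - b.polity) / polity_bw
    dr = (a.blocking_rate - b.blocking_rate) / rate_bw
    return math.exp(-0.5 * (dp * dp + dr * dr))

def contagion(target: Regime, neighbors: list, weighted: bool = True) -> float:
    """Average of neighbor signals; weighted=False gives the uniform average."""
    if not neighbors:
        return 0.0
    weights = [similarity(target, regime) if weighted else 1.0 for regime, _ in neighbors]
    total = sum(weights)
    if total == 0.0:
        return 0.0
    return sum(w * signal for w, (_, signal) in zip(weights, neighbors)) / total

# Illustrative numbers only: one similar, quiet neighbor and one dissimilar, noisy one.
target = Regime(polity=-7, blocking_rate=0.40)
neighbors = [
    (Regime(polity=-6, blocking_rate=0.35), 0.10),
    (Regime(polity=9, blocking_rate=0.02), 0.90),
]
uniform = contagion(target, neighbors, weighted=False)
reweighted = contagion(target, neighbors, weighted=True)
print(f"uniform={uniform:.3f} reweighted={reweighted:.3f}")
```

In this toy case the uniform average is pulled halfway toward the dissimilar neighbor, while the similarity-weighted average stays close to the signal from the similar one.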
- 2 · forecast
Multi-horizon forecast — 1-day, 7-day, 30-day
Previously the only public horizon was 7 days. The new multi-horizon model trains three XGBoost models with isotonic calibrators side by side and ships per-horizon SHAP top-5 features, per-horizon 90% conformal intervals, and a monotonicity-consistency check (a longer horizon should never be more certain than a shorter one).
LOCO AUC 0.91 / 0.88 / 0.84 on the 20 spotlight watched countries. Live at /atlas/multi-horizon and the per-country detail pages. API: GET /v1/forecast/{cc}/multi-horizon.
- 3 · calibration
Adaptive Conformal Inference — online α update
The static conformal interval had been drifting (Brier 0.59 triggered a manual recalibration on 2026-05-19). ACI follows Gibbs & Candès (NeurIPS 2021): after each observation, the script updates α with
α_{t+1} = α_t + γ · (α − 1{y_t ∉ interval_t})
where α is the target miscoverage rate and γ = 0.01. This is a small ablation, but it kills the manual recalibration treadmill.
Current state α = 0.21 (started 0.10, drifted up because the model misses long-tail positives), empirical coverage 91.3% over n=840 observations. Live in every /v1/forecast/{cc}/7day response as aci_alpha + aci.* fields. Cron 03:45 UTC daily.
- 4 · anomaly
DBSCAN unsupervised anomaly — second-opinion signal
Inspired by the CenDTect-style 2022 pattern. Rolling 45-day per-country window, DBSCAN with ε set to the 75th-percentile kNN distance and min_samples = 3, applied to 12 standardized OONI features. AUC 0.6506 against labeled incidents — just above our 0.65 promote floor.
Promoted as a second-opinion signal, not a replacement. The supervised classifier still wins (AUC 0.99); DBSCAN surfaces shape-anomalous days the labels never saw. Live at /atlas/anomaly and /v1/anomaly/dbscan/{cc}.
- 5 · domain drift
HDBSCAN per-domain drift — novel-mechanism detection
Where DBSCAN looks at country-day shapes, HDBSCAN clusters per-domain weekly fingerprints to surface mechanism shifts — for example, a domain that flips from TCP-reset to DNS-poison blocking in one country. Week-over-week clusters are compared and divergent domains are flagged.
Live at /atlas/domain-drift. Not a headline-AUC model; this is a research surface for analysts.
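For the country-day DBSCAN detector described in item 4, the ε heuristic and the resulting outlier flags can be sketched in plain Python. This is an illustration under assumptions, not the production script: it takes k equal to min_samples when computing kNN distances (the source does not state k), uses Euclidean distance, and runs on a toy 2-D window instead of 12 standardized OONI features over 45 days. It computes only DBSCAN's noise set, not cluster labels.

```python
import math

def kth_neighbor_distances(points, k):
    """Distance from each point to its k-th nearest other point."""
    out = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        out.append(dists[k - 1])
    return out

def percentile(values, q):
    """Linear-interpolation percentile, q in [0, 100]."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def dbscan_noise(points, min_samples=3, eps_percentile=75.0):
    """Return (eps, noise_flags).

    A point is noise when it is neither a core point nor within eps of a
    core point, which is how DBSCAN defines an outlier.
    """
    # Assumption: k for the kNN distance equals min_samples.
    eps = percentile(kth_neighbor_distances(points, k=min_samples), eps_percentile)
    # Neighborhoods include the point itself, the usual DBSCAN convention.
    hoods = [[j for j, q in enumerate(points) if math.dist(p, q) <= eps] for p in points]
    core = [len(h) >= min_samples for h in hoods]
    noise = [not core[i] and not any(core[j] for j in hoods[i]) for i in range(len(points))]
    return eps, noise

# Toy window: nine ordinary days on a grid plus one shape-anomalous day.
grid = [(float(x), float(y)) for x in range(3) for y in range(3)]
window = grid + [(10.0, 10.0)]
eps, noise = dbscan_noise(window)
print(round(eps, 3), noise)
```

Setting ε from a percentile of kNN distances lets the radius adapt to each country's own density, so a sparse but stable country is not flagged wholesale.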
- 6 · per-measurement
Per-measurement classifier — Niaki et al. KDD23-style
XGBoost row-level censorship classifier trained on the full 84K evidence corpus with a stratified 80/20 split. Inspired by Niaki et al. (KDD 2023, “ICLab and the long tail of censorship”). AUC 1.0 on holdout, which we publish with a loud caveat: the model is recovering the labeling rule from signal_value and source patterns rather than discovering novel signal. Top feature is asn_7d_rate (81% gain).
It is still useful as a per-row interface to the same evidence the country-day v3.3 model uses, exposed at POST /v1/measurement/classify. Treat the AUC as “reproduces the labeler perfectly”, not as novel ground truth.
- 7 · graph nn
GraphSAGE over CAIDA AS topology — ASN-level forecast
Two-layer GraphSAGE, hidden dim 16, dropout 0.5, 60 epochs. Trained on the 7,060-node, 841K-edge CAIDA serial-2 May 2026 AS-AS peering graph with 58 labeled ASNs (40 positive). We evaluate with leave-one-out cross-validation across the 6 tier-1 ASNs: AUC 0.80, accuracy 5/6, permutation p = 0.32.
Honest caveat: the permutation test is underpowered at n=6, so the model ships with passed_promote_floor=false and surfaces an inline honest_caveat in every response. Live at /v1/forecast/asn-gnn/{asn}, which accepts either bare digits or the AS-prefixed form (for example, AS47541).
- 8 · fusion
Bayesian multi-source corroboration
Combines OONI, IODA, CensoredPlanet, and Voidly probe signals into a posterior P(censorship | sources) per country-day. Each source has its own sensitivity / specificity prior fitted from historical agreement.
AUC 0.916, expected calibration error 0.021 on the 30-day holdout. Live at /atlas/corroboration.
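A minimal sketch of how per-source sensitivity and specificity can combine into a posterior, assuming the sources are conditionally independent given the true state (a naive-Bayes simplification that the source does not confirm). The prior and every sensitivity and specificity value below are made up for illustration; the fitted values are not published on this page.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Source:
    name: str
    sensitivity: float  # P(source flags | censorship)
    specificity: float  # P(source silent | no censorship)

def posterior(prior: float, sources: list, flags: dict) -> float:
    """P(censorship | flags) via Bayes' rule in odds form.

    Sources that did not report on this country-day (absent from flags) are skipped.
    """
    odds = prior / (1.0 - prior)
    for s in sources:
        if s.name not in flags:
            continue
        if flags[s.name]:
            odds *= s.sensitivity / (1.0 - s.specificity)   # likelihood ratio of a positive
        else:
            odds *= (1.0 - s.sensitivity) / s.specificity   # likelihood ratio of a negative
    return odds / (1.0 + odds)

# Hypothetical operating characteristics, for illustration only.
SOURCES = [
    Source("ooni", 0.80, 0.90),
    Source("ioda", 0.60, 0.95),
    Source("censoredplanet", 0.70, 0.90),
    Source("voidly_probe", 0.50, 0.98),
]
p = posterior(0.20, SOURCES, {"ooni": True, "ioda": False, "censoredplanet": True})
print(f"P(censorship | sources) = {p:.3f}")
```

Working in odds keeps each source's contribution a single multiplicative likelihood ratio, which makes it easy to see which source moved the posterior and by how much.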
- 9 · causal
Synthetic Difference-in-Differences attribution
Builds a synthetic counterfactual from stable-democracy donor countries, measures the post-period gap, runs a permutation p-value, and surfaces nearby political events from the GDELT + Wikipedia event feeds. Adapted from Arkhangelsky et al. (arXiv:1812.09970) with NetLoss-style scoping (ISOC, ACM IMC 2024).
Live at /sentinel/attribute?country=X&date=Y. Implementation in scripts/sdid_attribution.py.
- 10 · survival
Random Survival Forest — shutdown duration
Closed a long-standing gap: until today the Atlas could tell you whether a shutdown was likely, but not how long it would last. The RSF is trained on the 343 confirmed historical shutdowns (c-index 0.728) and exposes a per-country expected-duration curve.
Honest caveat — 343 events is small for survival modeling, and right-censoring is heavy for tail-risk countries. Live at /atlas/duration.
- 11 · trajectory
Seq2seq 30-day trajectory forecast
Where multi-horizon ships three isolated heads, the trajectory model is a single sequence-to-sequence encoder-decoder that emits a smooth 30-day P(shutdown) curve with a 90% conformal band per day. Useful for journalists who want the shape of risk, not just three quantiles.
Median LOCO AUC 0.74 across spotlight countries. Live at /atlas/forecast-trajectory/{cc}.
- 12 · het. treatment effects
Causal forest heterogeneous treatment effects
Athey & Wager-style causal forest (2019) estimating the per-country effect of an election on shutdown risk. Global average treatment effect: +9.6 percentage points. Vietnam pops at +32pp; most stable democracies sit near zero.
Live at /atlas/hte.
- 13 · cohorts
Dynamic Time Warping cohort clustering
DTW distance between per-country daily-signal curves with Ward hierarchical clustering surfaces shape-similar regimes with phase offsets — something Pearson correlation cannot do. Silhouette score 0.47 at K=3.
Live at /atlas/cohorts.
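The claim in item 13 that DTW captures phase-offset similarity that Pearson correlation misses can be checked on a toy pair of curves. The sketch below uses the textbook dynamic-programming DTW with an absolute-difference local cost; the production distance, any warping-window constraint, and the Ward clustering step are not shown, and the two series are invented for illustration.

```python
import math

def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def pearson(a, b):
    """Pearson correlation of two equal-length series."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Same spike shape, shifted by two days (illustrative values).
early = [0, 0, 1, 2, 1, 0, 0, 0]
late = [0, 0, 0, 0, 1, 2, 1, 0]
flat = [1, 1, 1, 1, 1, 1, 1, 1]

print("DTW early vs late:", dtw_distance(early, late))
print("Pearson early vs late:", round(pearson(early, late), 3))
print("DTW early vs flat:", dtw_distance(early, flat))
```

The shifted pair warps onto each other at zero cost while its Pearson correlation is negative, which is the behavior the cohort clustering relies on.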
Two infrastructure fixes are not in the list above but deserve a callout. First, the dual-holdout retrain gate (legacy + temporal) now blocks any promote that shows a temporal regression or a catastrophic legacy regression (a drop of 0.10 F1 or more); this unblocks the weekly retrain that had been stalled since the May recalibration drift. Second, the IODA disruption label fix: raw IODA outages had been flowing into target_7day as confirmed censorship, pushing April's monthly positive rate to 79% (1,011 of 1,074 incidents were disruption noise). The fix in scripts/build-forecast-features.py (WHERE incident_type != 'disruption') brought it back to 21%, and a regression test now fails any retrain that lets a 12-month positive rate exceed 40%.
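Both checks reduce to simple predicates, sketched here under assumptions. The names `HoldoutScores`, `promote_gate`, and `positive_rates_ok` are hypothetical, the F1 values and monthly counts are invented, and the positive-rate rule is read as "no month in the trailing 12-month window may exceed 40%", which is one plausible interpretation of the text.

```python
from dataclasses import dataclass

@dataclass
class HoldoutScores:
    legacy_f1: float
    temporal_f1: float

def promote_gate(baseline: HoldoutScores, candidate: HoldoutScores,
                 legacy_tolerance: float = 0.10):
    """Return (ok, reasons). Blocks on any temporal drop, or on a legacy
    drop of legacy_tolerance F1 or more."""
    reasons = []
    if candidate.temporal_f1 < baseline.temporal_f1:
        reasons.append("temporal regression")
    if baseline.legacy_f1 - candidate.legacy_f1 >= legacy_tolerance:
        reasons.append("catastrophic legacy regression")
    return (not reasons, reasons)

def positive_rates_ok(monthly_counts: dict, ceiling: float = 0.40) -> bool:
    """monthly_counts maps 'YYYY-MM' to (positives, total) for the window."""
    return all(total == 0 or positives / total <= ceiling
               for positives, total in monthly_counts.values())

baseline = HoldoutScores(legacy_f1=0.80, temporal_f1=0.70)
good = promote_gate(baseline, HoldoutScores(legacy_f1=0.78, temporal_f1=0.72))
bad = promote_gate(baseline, HoldoutScores(legacy_f1=0.65, temporal_f1=0.69))

# Illustrative totals chosen to reproduce the 79% and 21% rates quoted above.
before_fix = {"2026-03": (200, 1000), "2026-04": (790, 1000)}
after_fix = {"2026-03": (200, 1000), "2026-04": (210, 1000)}
print(good, bad, positive_rates_ok(before_fix), positive_rates_ok(after_fix))
```

A small legacy dip still promotes as long as the temporal holdout does not regress, which matches the asymmetry described in the text.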
Three experiments that did not promote
We publish our failures with the same SHA-pinned permalinks as our wins. That is the entire point of a public model changelog.
- N1 · classifier v3.4
Regime-cluster fine-tuning — NOT promoted
Hypothesis: train per-regime sub-models (Western liberal, MENA, post-Soviet, East Asian autocracy), then blend by soft-assignment. Result: −3.6pp LOCO F1 versus the v3.3 baseline. The blend introduced more variance than the per-cluster fits saved.
Sidecar JSON archived at /opt/voidly-ai/ml-deploy/classifier_v3.4_REJECTED.json for reproducibility. Indexed in the model changelog with status “rejected”.
- N2 · classifier v3.5
TabPFN-v2 prior-data fitted network — NOT promoted
Hypothesis: Hollmann et al. (2023) TabPFN-v2 is a strong zero-training tabular classifier; let it replace the GradientBoosting tower. Result: −1pp stratified F1. Acceptable on aggregate but indistinguishable from baseline on the spotlight countries we actually care about, and inference is ~30× slower than v3.3.
Build script kept in repo at scripts/build-classifier-v3.5-tabpfn.py with a banner comment marking it rejected. Useful as a future fallback if v3.3 ever fails.
- N3 · ssl pretrain
Self-supervised masked-autoencoder tabular pretrain — NOT promoted
Hypothesis: pretrain a tabular MAE on the full unlabeled evidence corpus, then fine-tune on the labeled subset. The pretrained representation should help where labels are sparse. Result: −15.6pp F1 versus v3.3 — the pretrain pulled the model toward the marginal evidence distribution (heavily dominated by noisy disruption rows) and away from the rare-event positive class.
Honest negative. Worth re-running if we ever balance the pretrain corpus by label class.
If you cannot find the loss, you should not trust the win.
Fourteen new surfaces
Every model above is exposed through at least one Atlas frontend page. All pages are server-rendered, hourly-ISR, and link back to their underlying transparency endpoint.
The papers behind the push
We adapted four published methods into the Atlas tower today and evaluated a fifth, which was rejected. Each is cited inline in the relevant training script so the chain of attribution is preserved.
- Gibbs & Candès (NeurIPS 2021): Adaptive Conformal Inference. Used in the daily ACI cron to keep the forecast's 90% interval calibrated as the underlying distribution drifts.
- Arkhangelsky et al. (arXiv:1812.09970): Synthetic Difference-in-Differences. Used as the counterfactual estimator in scripts/sdid_attribution.py with NetLoss-style scoping (ISOC, ACM IMC 2024).
- Niaki et al. (KDD 2023): Per-measurement censorship classifier. Adapted to the Voidly 84K evidence corpus, with the published honest caveat that the AUC reflects label-rule reconstruction.
- Athey & Wager (2019): Causal forest for heterogeneous treatment effects. Used to estimate per-country election-on-shutdown ATEs.
- Hollmann et al. (2023): TabPFN-v2 prior-data fitted network. Evaluated and rejected for the v3.5 attempt; kept in repo as a documented fallback.
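The Adaptive Conformal Inference update from the first entry, α_{t+1} = α_t + γ · (α − 1{miss}), can be sketched in a few lines. The function names are hypothetical, and the miss count of 73 is inferred from the quoted 91.3% coverage over n=840, not taken from the source. Under the rule as written, each covered observation raises α_t by γ·α and each miss lowers it by γ·(1 − α), so α_t rises on net only when coverage runs above the 90% target.

```python
def aci_step(alpha_t: float, covered: bool,
             target_alpha: float = 0.10, gamma: float = 0.01) -> float:
    """One online ACI update: alpha_{t+1} = alpha_t + gamma * (target_alpha - miss)."""
    miss = 0.0 if covered else 1.0
    return alpha_t + gamma * (target_alpha - miss)

def run_aci(covered_flags, start_alpha: float = 0.10,
            target_alpha: float = 0.10, gamma: float = 0.01) -> float:
    """Apply the update once per observation and return the final alpha_t."""
    alpha_t = start_alpha
    for covered in covered_flags:
        alpha_t = aci_step(alpha_t, covered, target_alpha, gamma)
    return alpha_t

# Assumption: 73 misses in 840 observations, inferred from the 91.3% coverage figure.
n, misses = 840, 73
flags = [True] * (n - misses) + [False] * misses
final_alpha = run_aci(flags)
coverage = (n - misses) / n
print(f"final alpha = {final_alpha:.2f}, coverage = {coverage:.1%}")
```

Because the update is additive and unclipped here, the final value depends only on the number of misses, not their order: 0.10 + 0.01 · (840 · 0.10 − 73) = 0.21, which agrees with the reported state.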
Verify, cite, integrate
Every claim on this page links to a live transparency endpoint. The fastest paths to verification and use:
- /atlas: the live hub, daily refresh, links to every Atlas surface.
- /atlas/findings: curated researcher-bylined deep-dives, one model per page.
- /atlas/journalist-toolkit: press kit, citation templates, embed widgets, contact.
- /atlas/changelog: the full model history, including every rejected build.
- /api-docs: REST API reference, OpenAPI schema, MCP server install.
- /press: embargo policy, logos, contact details for newsrooms.
Cite
License: CC BY 4.0. Reuse encouraged; please link back to this page so readers can audit our chain to the upstream sources.
Related Atlas surfaces
- /atlas — live hub
- /atlas/state-of-censorship-2026 — annual edition (the freeze-frame view)
- /atlas/recent-changes — what shifted in the last 24h / 7d
- /methodology — full pipeline + 3 honest accuracy splits