Voidly Research · CC BY 4.0 · +14 incidents · +2,070 evidence today
Atlas.
Real-time, machine-readable censorship intelligence — built for journalists, researchers, and AI agents.
Coverage 2016–2026
Voidly Atlas is the censorship-intelligence data layer underneath Voidly Research — a continuously-updated, machine-readable record of where, when, and how the open internet is censored. It is published under CC BY 4.0 and consumed today by journalists, peer researchers, AI agents, and civil-society threat-intel teams.
Atlas triangulates five independent measurement networks — OONI, IODA, CensoredPlanet, Citizen Lab, and the Voidly probe fleet — into 2,837 citable incidents and 102,824 evidence permalinks across 130 countries. Every evidence record carries a stable URL back to the underlying upstream measurement, which is what makes a Voidly incident citable in a paper or a news article.
Two production models sit on top of the corpus. The censorship classifier (v3.3) labels likely-censorship events with LOCO median F1 0.87 (honest, leave-country-out across 127 countries). The shutdown_risk_v9 predictor forecasts 7-day country-level shutdown risk with cross-country AUC 0.90 and a within-country median of 0.73, validated against Access Now KeepItOn shutdown logs. Eight things we tried that did not beat the production baselines are published as honest negatives.
01 · Live now
The three most recent confirmed incidents.
Click any one for evidence permalinks back to the raw OONI / IODA / CensoredPlanet measurements.
03 · Sources
Every record links back to a raw measurement.
Five independent measurement networks, triangulated. Counts below are live from the ingest pipeline.
Velocity
Probes 5-min · Upstreams 6-h
+14 incidents · +2,070 evidence in last 24 hours
+61 incidents · +6,991 evidence in last 7 days
04 · Models
Two production models. Honest holdout numbers inline.
Both models ship their own validation in every API response. No marketing math.
Censorship classifier
GradientBoosting v3.3
Labels likely-censorship events from OONI / IODA / CensoredPlanet signals. Trained on 4,237 labeled samples (1,116 positive, 131 countries).
- LOCO median F1: 0.87 (127 countries, honest)
- Stratified F1: 0.73
- Top feature: anomaly_rate (0.22)
- Retraining: weekly + active-learning
v2 (99.8% F1) was retired 2026-05-21 — country_risk_tier was carrying 85% of the score.
Model registry →
Shutdown predictor
shutdown_risk_v9
7-day country-level shutdown risk, validated against Access Now KeepItOn journalist-verified shutdown logs.
- Cross-country AUC: 0.90
- Within-country median AUC: 0.73
- Features: KeepItOn history · CF Radar · v5 ensemble
- Distribution: Webhook · RSS · JSON
Subscribe anonymously — no account, no API key. Paste a webhook URL, done.
Subscribe to shutdown-risk alerts →
05 · Findings
73+ editorial deep-dives at permanent URLs.
Incident write-ups, model audits, eight honest negatives, and audits that caught leakage in our own models.
- 2026-05-29
The censorship classifier generalizes across countries but degrades across time — a forward-temporal audit
The production censorship classifier v3.3 (GradientBoosting, 16 country-day features) reports stratified 5-fold F1 0.729 / AUC 0.899 and leave-one-country-out (LOCO) median F1 0.870. We reproduced the stratified number exactly (AUC 0.895 / F1 0.725 with a fresh GB on the same features), then re-split the 4,237 samples by TIME — train on the oldest 70% of distinct days, test on the newest 30% (strict past→future). Forward-temporal skill drops materially: AUC 0.669 (−0.226), F1 0.474, precision 0.34 / recall 0.80 at threshold 0.5, PR-AUC 0.52 at a 0.27 base rate (1.9x lift). So the classifier is NOT broken — forward in time it still beats chance ~2x and recovers 80% of incidents — but the random-split AUC overstates forward-deployment accuracy. The three splits answer different questions: stratified 5-fold (shuffled rows, near-duplicates in both folds) is the easiest; LOCO (hold out whole countries) is a genuinely hard cross-country test that v3.3 passes well and remains the honest headline for country generalization; forward-temporal (hold out the future) is the deployment question, and there v3.3 degrades because what an incident looks like drifts over time. Honest one-liner: v3.3 generalizes across COUNTRIES but degrades across TIME — which is exactly why it is retrained weekly (the cadence is load-bearing, not hygiene). Milder than the same-week shutdown-risk forward audit (whose within-country 7-day forecast fell BELOW chance at ~0.36) because this is same-day detection, not forecasting. No model change: v3.3 stays live and its LOCO F1 0.87 is real. What changed is disclosure — the live /v1/classifier/info evaluation now carries the forward-temporal block + a note that the random-split number is not forward-deployment accuracy. Reproduce with scripts/audit-classifier-v3.3-temporal.py.
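The three splits named in this audit differ only in how rows are assigned to train and test. A minimal sketch of the two harder ones, assuming each sample is a dict with `country` and `day` keys (those field names, the helper names, and the toy rows are illustrative assumptions, not the contents of scripts/audit-classifier-v3.3-temporal.py):

```python
from collections import defaultdict


def forward_temporal_split(rows, train_frac=0.7):
    """Train on the oldest share of distinct days, test on the strictly later days."""
    days = sorted({r["day"] for r in rows})  # ISO dates sort chronologically
    cut = round(len(days) * train_frac)
    train_days = set(days[:cut])
    train = [r for r in rows if r["day"] in train_days]
    test = [r for r in rows if r["day"] not in train_days]
    return train, test


def loco_folds(rows):
    """Yield (held_out_country, train, test), removing one whole country per fold."""
    by_country = defaultdict(list)
    for r in rows:
        by_country[r["country"]].append(r)
    for country in sorted(by_country):
        train = [r for r in rows if r["country"] != country]
        yield country, train, by_country[country]


# Toy country-day rows: 3 countries x 10 days (illustrative, not Voidly data).
rows = [
    {"country": c, "day": f"2026-01-{d:02d}", "label": (d + i) % 2}
    for i, c in enumerate(["IR", "CN", "RU"])
    for d in range(1, 11)
]

train, test = forward_temporal_split(rows)
print(len(train), len(test))
for country, fold_train, fold_test in loco_folds(rows):
    print(country, len(fold_train), len(fold_test))
```

Splitting on distinct days rather than on rows keeps every row of a given day on the same side of the cutoff, which is what makes the split strictly past to future.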
- 2026-05-29
Shutdown-risk predicts WHICH country, not WHICH day — a forward-validated honest scoping (v9)
Voidly's 7-day shutdown predictor (/v1/shutdown-risk) reported two AUCs: cross-country full-panel ~0.88-0.90 and within-country median ~0.73. This week we shipped v9 (a logit-blend combiner that replaced v5's multiplicative product — the product collapsed when the OONI trajectory was near-zero, e.g. Oman within-AUC 0.498→0.040; alpha=0.60 chosen by leave-one-country-out, all 23 folds agreed; within-country median 0.7291→0.7386, full-panel 0.8848→0.8979). Then we ran the test that actually matters: a strict past→future temporal holdout. The within-country day-ranking AUC is ~0.36 forward — BELOW chance — for both v5 and v9. The published ~0.73 is cross-sectional (isotonic fit on the whole panel, all dates mixed); it measures day-ranking once the model has seen the period, not forward skill, and shutdowns cluster in time so a country quiet in train then active in test inverts the structural signal. BUT the cross-country ranking — the real product — holds forward: full-panel AUC 0.8653 out-of-time (vs 0.8912 in-sample), country-ranking AUC 0.9015 across 60 countries, 7.84x PR-lift over base rate. Verdict: v9 stays live (better at the cross-country ranking that works); the within-country claim was overstated and is now corrected. The live /v1/shutdown-risk/info reports all four numbers (cross-sectional within 0.74, forward within 0.36, forward cross-country 0.87, forward country-ranking 0.90) with a lead caveat; the public page now frames it as a forward-validated country risk ranking, not a within-country calendar oracle. Use it to know WHICH countries to watch — that signal is real and holds going forward — not WHICH day.
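The v9 entry describes replacing a multiplicative product with a logit blend at alpha=0.60. A minimal sketch of the difference, assuming two component probabilities; the names `p_structural` and `p_trajectory`, the clipping constant, and the choice of which component receives alpha are illustrative assumptions rather than Voidly's actual combiner:

```python
import math


def _logit(p, eps=1e-6):
    p = min(max(p, eps), 1.0 - eps)  # clip so log() stays finite
    return math.log(p / (1.0 - p))


def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def product_combine(p_structural, p_trajectory):
    """v5-style combiner as described: a plain product of the components."""
    return p_structural * p_trajectory


def logit_blend(p_structural, p_trajectory, alpha=0.60):
    """Weighted average in logit space, mapped back to a probability."""
    z = alpha * _logit(p_structural) + (1.0 - alpha) * _logit(p_trajectory)
    return _sigmoid(z)


# A near-zero trajectory component drags the product to almost nothing,
# while the blend still reflects the other component.
print(product_combine(0.8, 0.001))
print(logit_blend(0.8, 0.001))
```

When both inputs agree the blend returns that same probability, so the combiner only changes behaviour where the components disagree.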
- 2026-05-22
The multi-source Bayesian corroboration classifier rides circular features — an honest audit
Voidly's multi-source Bayesian corroboration classifier (corroboration_v1) was reported at ROC AUC 0.92. It fuses four sensor networks — OONI, IODA, CensoredPlanet and the Voidly probe network — into one naive-Bayes posterior, and it feeds the auto-incident-watchdog as the "does an independent source agree?" gate. A platform-wide audit this week caught several Voidly models inflating metrics via shuffled train/test splits that leak temporal autocorrelation; the corroboration model had never been individually audited. This finding is that audit. The split is honest: scripts/train-bayesian-corroboration.py uses a forward-temporal split (train oldest 60d, test newest 30d, never shuffled). We reproduced its number exactly — temporal test AUC 0.9157 — and a leave-country-out CV also held at median AUC 0.866. So the leakage is NOT autocorrelated rows bleeding across folds. The leakage is circular features. The label is_censorship is taken from the incidents table, and an incident in Voidly is minted FROM anomalous evidence: a join through incident_evidence shows 343 of 344 confirmed censorship/mixed incidents have a linked elevated/warning/critical evidence row on the exact same country-day as first_seen. The model's features ("did source X emit an anomalous evidence row on this country-day") are computed from those same rows — the feature partially IS the label. 265 of 343 censorship incidents are sourced from CensoredPlanet alone, so censoredplanet_present — a single raw binary feature, no model — scores AUC 0.8997, almost matching the full 4-source model. The Bayesian fusion adds only +1.6pp AUC; a 2,000-sample bootstrap puts the 95% CI on that lift at [0.0pp, 3.3pp], all but touching zero. At threshold 0.5 the model's F1/precision/recall are all 0.0 — the posterior never crosses 0.5 even on true positives, so it is a ranker of "did CensoredPlanet flag this day", not a classifier; AUC is the one metric the circular feature inflates. 
On a leakage-free target — predict NEXT-day censorship from TODAY's source presence, forward-temporal split — AUC falls to 0.7348, and even that is generous because a CensoredPlanet block today often recurs tomorrow and mints another CP-sourced incident. Verdict: honest negative. The headline 0.92 is real arithmetic but near-tautological and is not a measure of censorship-detection skill. There is no model change to promote — any "improvement" measured against a circular label would be just as fake as the 0.92. The fix is disclosure plus a pipeline change: the metrics JSON now carries a full leakage_audit block and seven rewritten honest_caveats with promoted=false (served live at /v1/classifier/corroborate/info), the model registry gains a corroboration-v1-bayesian-reeval entry, and the auto-incident-watchdog docstring is corrected — its corroboration gate is a conservative near-veto, not the independent confirmation it was presented as. Two correlated signals derived from the same evidence are not corroboration. Real corroboration would require source-held-out labels, a genuine time gap, or independent editorial ground truth.
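The point that a model can post a strong AUC while scoring F1 0.0 at threshold 0.5 is easy to reproduce on toy data. The sketch below uses made-up posteriors that rank perfectly but never cross 0.5; the helper names and numbers are hypothetical and are not taken from corroboration_v1:

```python
def auc(labels, scores):
    """Probability that a random positive outranks a random negative (ties = 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_at(labels, scores, threshold=0.5):
    """F1 after thresholding scores into hard predictions."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for p, y in zip(pred, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(pred, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(pred, labels) if p == 0 and y == 1)
    return 0.0 if tp == 0 else 2 * tp / (2 * tp + fp + fn)


labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.31, 0.28, 0.22, 0.20, 0.09, 0.05, 0.02]  # toy posteriors, all below 0.5

print(auc(labels, scores))           # ranking quality
print(f1_at(labels, scores))         # classification quality at 0.5
print(f1_at(labels, scores, 0.21))   # same scores, lower threshold
```

AUC depends only on the ordering of scores, so it cannot reveal that the posterior is miscalibrated for use as a yes/no gate.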
- 2026-05-22 · IR · CN · RU
Shutdown duration is not predictable — an honest audit of the survival model
Voidly Atlas ships a shutdown-duration model — a Random Survival Forest (RSF) that answers "once a shutdown starts, how long will it last?" — served at POST /v1/forecast/duration. The model card reported a concordance index (c-index) of 0.728, above the 0.65 promote floor and flagged passed_promote_floor: true. A platform-wide ML audit had caught several Voidly models inflating metrics via shuffled train/test splits; the duration RSF had not been individually audited. This finding is that audit. How the 0.728 was computed: scripts/train-shutdown-duration-rsf.py evaluates the model with a single random 75/25 train_test_split (random_state=42), stratified only on the censoring flag — no temporal ordering, no country grouping. Three problems sink that number. (1) The point estimate is a lucky seed: only 74 of 343 incidents have an observed end (78.4% censored), so a 25% test fold holds ~19 events and ~286 comparable pairs — tiny and noisy. Re-running the identical random split across 20 seeds gives 0.728 ± 0.055, range 0.62–0.82; the published number is just where seed 42 landed. (2) The random split leaks calendar time: the top permutation feature is first_seen_year, and that is leakage not signal — the "event observed" label is status=confirmed, and confirmed status is an artifact of incident AGE, not of the shutdown ending. 2019–2023 incidents are ~100% confirmed; 2024–2026 incidents are 9–56% confirmed. A shuffle scatters old and new years across both folds, so the RSF learns "old year ⇒ resolved row" — a dataset-assembly pattern that cannot generalise to a live shutdown today. (3) The honest split shows no skill: a forward-temporal split (train on shutdowns whose outcome was known before a cutoff, test on strictly later ones, sorted by end date) yields c-index 0.609 / 0.563 / 0.571 / 0.437 across cutoffs 2022–2025 — mean ≈ 0.55, and the most-recent fold (the one matching real use) scores 0.44, below a coin flip. 
The promote gate requires beating a naive baseline — predict this country's mean past observed duration. On the identical temporal folds the naive baseline scores 0.509; the RSF's honest 0.55 is within noise of it and loses outright on two of four folds. Giving the model its best shot — an enriched feature set (incident confidence, measurement count, anomaly rate, source/domain/service/ASN counts, severity grade, blocking mechanism), dropping the leaky year — scored 0.495 on the forward-temporal split. More features did not help. Verdict: honest negative. No duration model beats a naive country-mean baseline on a leak-free split; nothing is promoted. With Voidly's current data, how long a shutdown lasts is not predictable — too few resolved events (~one per country), an "end" label contaminated by curation lag rather than measurement, and durations quantized to 24-hour ingestion multiples. The artifact and sidecar now carry the honest c-index, passed_promote_floor: false and an honest_caveats block; the endpoint surfaces the audit verdict. Publishing the honest negative — and not shipping a number we cannot stand behind — is the fix.
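The comparison against a naive country-mean baseline can be sketched with a small concordance-index function. This is a simplified Harrell-style c-index on predicted durations, with toy tuples of (country, duration in hours, end observed); the data, the helper names, and the global-mean fallback for unseen countries are assumptions for illustration, not the contents of scripts/train-shutdown-duration-rsf.py:

```python
def c_index(durations, observed, predicted):
    """Share of comparable pairs whose predicted durations are ordered correctly.

    A pair is comparable only when the shorter duration has an observed end;
    tied predictions count as half-concordant. Tied durations are skipped.
    """
    concordant, comparable = 0.0, 0
    n = len(durations)
    for i in range(n):
        if not observed[i]:
            continue  # a censored row cannot anchor a pair as the shorter one
        for j in range(n):
            if durations[i] < durations[j]:
                comparable += 1
                if predicted[i] < predicted[j]:
                    concordant += 1.0
                elif predicted[i] == predicted[j]:
                    concordant += 0.5
    return concordant / comparable


def country_mean_baseline(train, country, fallback):
    """Mean past observed duration for this country, else a fallback value."""
    past = [d for c, d, seen in train if c == country and seen]
    return sum(past) / len(past) if past else fallback


# Toy rows: (country, duration_hours, end_observed). Not Voidly data.
train = [("IR", 48, True), ("IR", 96, True), ("CN", 24, True), ("RU", 240, False)]
test = [("IR", 72, True), ("CN", 24, True), ("RU", 120, False)]

seen_durations = [d for _, d, seen in train if seen]
fallback = sum(seen_durations) / len(seen_durations)
preds = [country_mean_baseline(train, c, fallback) for c, _, _ in test]
score = c_index([d for _, d, _ in test], [s for _, _, s in test], preds)
print(preds, score)
```

With only a handful of observed events per fold the number of comparable pairs is small, which is one reason a single split can swing the score as widely as the audit reports.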
- 2026-05-22 · IR · CN · RU
The 7-day shutdown forecast does not beat persistence — an honest re-evaluation
Voidly’s production 7-day shutdown forecast (forecast v1, XGBoost + isotonic) reports ROC AUC 0.954. A weakness audit found that number is not real: it comes from a shuffled train_test_split in scripts/train-forecast.py, and the shuffle scatters rows of the same country across train and test folds. The target is time-autocorrelated, so the leakage hands the model a near-free score. This finding is the honest re-evaluation plus a genuine attempt to fix it. forecast v2 momentum keeps all 39 v1 features and adds 30 forward/change-oriented features across six families — momentum (block-rate deltas, rising/falling run length), acceleration (2nd difference), volatility (rolling std, coefficient of variation), event anticipation (days-until-election, days-since-incident, incident-anniversary proximity), cross-country leading indicators (corr-weighted neighbor momentum), and contagion-chain score. Everything is evaluated on a forward-temporal split ONLY: train up to a cutoff, test on the strictly-future 60-day window (1,260 rows, 198 positive), never shuffled. The comparison baseline is persistence (predict tomorrow = today’s label). The honest numbers: under the forward-temporal split, v1 scores AUC 0.589 / F1 0.132, and v2 scores AUC 0.685 / F1 0.404. v2 beats v1 by +9.6pp AUC — the new features genuinely help relative to v1 — but v2 still loses to the persistence baseline by −27.2pp AUC and −51.9pp F1. The promote gate required v2 to beat persistence by ≥ 8pp F1; it missed by ~60pp. v2 is NOT promoted and v1 stays in production unchanged. Why does persistence score 0.92? Because target_7day is a sliding 7-day window: adjacent days share 6 of 7 lookahead days, so the label is 98.9% autocorrelated day-to-day — only 172 transitions across 15,330 country-day rows. Predict-yesterday wins by construction, not by forecasting. The same autocorrelation is exactly what the shuffled split leaks into v1’s 0.954. 
The deepest honest cut: restricted to the 31 transition rows where the label actually moves, v2’s AUC is 0.328 — below a coin flip. On the days that matter (shutdown onset, block lift) the forecast has no skill, arguably negative skill. The honest conclusion: with the data Voidly currently has, the 7-day censorship target is persistence-dominated — what is blocked stays blocked, and the rare transitions are not anticipated by momentum, calendar proximity, or neighbor contagion. The production forecast’s value is calibration + explanation (SHAP drivers, conformal intervals), not predictive lift. The leaky 0.954 was a real bug; replacing it with an honest ~0.59 and publishing the no-promote is the fix.
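The autocorrelation figure follows from the counts given: 172 transitions across 15,330 country-day rows leaves roughly 1 − 172/15,330 ≈ 0.989 of adjacent labels unchanged. A minimal sketch of why a sliding 7-day target behaves this way, using a made-up 60-day event series for a single country (the helper names and the toy series are assumptions):

```python
def sliding_target(events, horizon=7):
    """Label day t as 1 if any event falls in days t+1 .. t+horizon."""
    n = len(events)
    return [int(any(events[t + 1 : t + 1 + horizon])) for t in range(n - horizon)]


def transitions(labels):
    """Number of adjacent days where the label changes."""
    return sum(a != b for a, b in zip(labels, labels[1:]))


def persistence_accuracy(labels):
    """Accuracy of predicting each day's label with the previous day's label."""
    pairs = list(zip(labels, labels[1:]))
    return sum(a == b for a, b in pairs) / len(pairs)


# Toy daily event flags: one 3-day shutdown and one 1-day shutdown in 60 days.
events = [0] * 60
for day in (20, 21, 22, 45):
    events[day] = 1

target = sliding_target(events)
print(transitions(target), persistence_accuracy(target))
```

Four isolated event days turn into two long runs of positive labels, so copying yesterday's label is wrong only at the four run boundaries.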
- 2026-05-22 · CN · RU · IR
The AS-topology GNN does beat chance — once you give it a real label and a real test
Voidly Atlas runs a 2-layer GraphSAGE GNN over the CAIDA AS-AS peering graph (7,060 nodes, 841K edges) to score per-ASN shutdown risk. It shipped 2026-05-21 with passed_promote_floor=false and an honest caveat: leave-one-out CV ran across only 6 ASNs, so although it reported AUC 0.80, a permutation test gave p=0.32 — n=6 cannot reject the null. This finding is the better-powered re-evaluation, and it surfaced a second flaw the n=6 caveat had hidden. The old label, had_shutdown_next_7d, was defined as "did any next-7-day measurement hit block_rate >= 0.5" — but every ASN-tagged evidence row in the database is a CensoredPlanet block, so block_rate is always 1.0 and the label collapsed to "does this ASN have >= 5 measurements on a post-cutoff day" — a measurement-density flag, not a censorship signal. Proof: the 40 old positives had a median of 24 post-cutoff rows, the 18 old negatives a median of 2. And the GNN node features included n_evidence_30d/180d, n_unique_dates and has_evidence, so the model could read its own label off its inputs — the AUC 0.80 was partly density predicting density. The fix builds a genuine censorship label. CensoredPlanet rows carry signal_value, a continuous block-intensity score (low = the ASN let the probe through, high = it blocked it). Every ASN with >= 20 measurements in the trailing 180 days is relabeled by the fraction of its measurements showing strong blocking (signal_value >= 0.5): >= 60% blocked is positive (censors, 62 ASNs), <= 25% is negative (clean, 35 ASNs), the 25-60% middle band is dropped. That is the directive definition exactly — positive = ASN with confirmed censorship evidence, negative = ASN with clean evidence — and it grows the labeled set from 58 to 97 ASNs across 30 countries. 
Because signal_value and signal_level are tightly coupled, all seven signal-derived features (block_rate_30d/180d and five pct_* buckets) are dropped to kill leakage; the GNN is retrained on density + topology features only (evidence counts, distinct days, domains, CAIDA degree, country risk tier). Evaluation is leave-AS-out only, never shuffled: leave-one-AS-out across all 97 ASNs, and leave-one-COUNTRY-out (30 country folds, every ASN of a held country removed together — the leakage-safe gate, since same-country ASNs share country features and are topological neighbors). Skill is pooled out-of-fold AUC plus a 5,000-permutation test that shuffles only the label vector. The honest verdict is SIGNIFICANT: leave-one-COUNTRY-out gives AUC 0.7751, permutation p=0.0002; leave-one-AS-out gives AUC 0.7645, p=0.0002 — the two protocols agree, so the result is not same-country leakage. The permutation null averaged AUC 0.4998 (exactly chance, 95th percentile 0.60); the observed 0.775 is far in the tail. Confusion at threshold 0.5 under country-out: precision 0.86, recall 0.68 (42 TP, 20 FN, 28 TN, 7 FP), mean score 0.68 for censoring ASNs vs 0.39 for clean ones. Both the 0.65 AUC floor and the p<0.05 bar are cleared by a wide margin under the leakage-safe protocol, so passed_promote_floor is flipped to true. The measured framing: AUC 0.775 is clearly-better-than-chance, not operationally decisive — recall 0.68 still misses a third of censoring ASNs, the 97-ASN label set is small, and the genuine label says whether an ASN censors, not when. But the GraphSAGE-over-AS-topology approach is validated: AS topology plus measurement density carry real, statistically significant signal for whether an ASN censors. The old AUC 0.80 / n=6 / p=0.32 headline is superseded; the endpoint now ships a defensible significance claim instead of a thin one. Both outcomes were acceptable under the directive — this one happens to be a real positive.
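The permutation test described here shuffles only the label vector and asks how often a shuffled labelling matches the observed AUC. A minimal sketch on toy scores, assuming the common add-one p-value estimator; under that estimator, 5,000 permutations with no shuffled AUC reaching the observed one would give (0 + 1) / (5,000 + 1) ≈ 0.0002, though whether Voidly uses this exact estimator is a guess:

```python
import random


def auc(labels, scores):
    """Probability that a random positive outranks a random negative (ties = 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_p_value(labels, scores, n_perm=5000, seed=0):
    """Return (observed AUC, p-value) from shuffling labels against fixed scores."""
    rng = random.Random(seed)
    observed = auc(labels, scores)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if auc(shuffled, scores) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)  # add-one estimator, never exactly 0


# Toy out-of-fold scores for 8 "censoring" and 6 "clean" ASNs (not Voidly data).
labels = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.75, 0.7, 0.65, 0.6, 0.55, 0.45, 0.5, 0.4, 0.35, 0.3, 0.2, 0.1]

observed, p = permutation_p_value(labels, scores, n_perm=2000)
print(observed, p)
```

Because the scores stay fixed and only labels move, the null distribution reflects the same score distribution and class balance as the real evaluation.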
06 · Use cases
Built for the people who measure the open internet.
Journalists
Real-time evidence for stories about shutdowns and platform blocks.
Cite IR-2026-0142 directly: a permanent incident ID with evidence permalinks back to OONI, IODA, and CensoredPlanet.
Browse latest incidents →
Researchers
Citable dataset, ten years of historical OONI archive, BibTeX + RIS export.
1.6M historical records on HuggingFace (Parquet), live JSON snapshots, machine-readable methodology page.
Open methodology →
AI engineers
Feed agents real censorship data via the MCP server or REST API.
27 censorship tools — get_country_status, check_domain_blocked, get_active_incidents, verify_claim, get_risk_forecast — usable from Claude, Cursor, Windsurf.
MCP install guide →
Threat intel teams
Per-country and per-domain accessibility checks for SaaS apps.
Single endpoint answers "can users in IR reach twitter.com?" with confidence + most-recent evidence. Webhook alerts on new incidents.
Accessibility API →
07 · Cite & API
Every incident has a permanent ID. Free, CC BY 4.0, no account.
Drop the human-readable ID (e.g. IR-2026-0142) straight into a paper or article. The full API needs no key for reads.
Citation export
REST · no auth needed
curl "https://api.voidly.ai/data/incidents?country=IR&limit=10"
curl https://api.voidly.ai/data/incidents/IR-2026-0142
curl https://api.voidly.ai/v1/shutdown-risk/IR
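The same three read endpoints can be called from Python with only the standard library. This is a minimal sketch: the URLs come from the curl examples above, while the helper names are hypothetical and the response is returned as parsed JSON without assuming any particular schema:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://api.voidly.ai"


def incidents_url(country, limit=10):
    """List endpoint, filtered by country code."""
    query = urlencode({"country": country, "limit": limit})
    return f"{BASE_URL}/data/incidents?{query}"


def incident_url(incident_id):
    """Single incident by its citable ID, e.g. IR-2026-0142."""
    return f"{BASE_URL}/data/incidents/{incident_id}"


def shutdown_risk_url(country):
    """7-day shutdown risk for one country."""
    return f"{BASE_URL}/v1/shutdown-risk/{country}"


def fetch_json(url, timeout=10):
    """GET a URL and parse the body as JSON (no API key, per the page)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

A call such as `fetch_json(incidents_url("IR"))` performs the first curl request; building the query with `urlencode` avoids the shell-quoting issue that a bare `&` causes on the command line.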
MCP server · 83 tools · Claude, Cursor, Windsurf
npx @voidly/mcp-server
27 censorship-intelligence tools (country status, domain blocking, incident lookup, risk forecasts) plus 56 agent-relay tools. Free.
MCP setup guide →
Get incidents the moment they land.
Six ways to receive Voidly Atlas data. All free, no sign-up required for any of them.
- Atom feed: Standard XML. Every new incident, every blocked-domain confirmation.
- RSS feed: Drop into Feedly, NetNewsWire, or any classic feed reader.
- JSON Feed: Modern JSON spec, easier to parse than RSS for AI agents.
- Webhook (HMAC-signed): POST per incident to your endpoint. Filter by country + severity.
- MCP server: AI agents in Claude / Cursor / Windsurf get 27 censorship tools.
- X / Twitter: Short-form alerts on new shutdowns + threshold crossings.
All channels publish the same canonical incident set. CC BY 4.0 — attribute Voidly.
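The webhook channel is described as HMAC-signed, but the page does not state the hash function, the signature encoding, or the header that carries it. A minimal receiver-side sketch, assuming HMAC-SHA256 over the raw request body with a hex digest (all three are assumptions, and the function names are hypothetical):

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Hex HMAC-SHA256 of the raw body (assumed scheme, not confirmed by Voidly)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Constant-time comparison of the expected and received signatures."""
    return hmac.compare_digest(sign(secret, body), received_signature)


# Toy payload and secret for illustration only.
secret = b"example-shared-secret"
body = b'{"incident_id": "IR-2026-0142"}'
signature = sign(secret, body)

print(verify(secret, body, signature))
print(verify(secret, body + b" ", signature))
```

Verification should run on the exact bytes received, before any JSON parsing or re-serialisation, since even a whitespace change alters the digest.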
08 · Browse the Atlas
Every sub-page, organised.
If you're looking for something specific, it's here.
Live monitoring
Forecasting
Analytics & methods
Models & transparency
Editorial
09 · Recent shipments
What's landed lately.
See /roadmap for what's ahead.
- 2026-05
shutdown_risk_v9 + anonymous webhook subscribe
KeepItOn-validated 7-day shutdown predictor — AUC 0.90 cross-country, 0.73 within-country median. Anonymous webhook + RSS push, no account required. Real journalist-verified labels under the hood.
- 2026-05
ML Shipyard — 56 models, 8 honest negatives
Multi-horizon forecasts, per-region/-platform/-domain models, GNN over AS topology, ACI online conformal. Production defaults unchanged; everything additive. 8 things we tried that did not work, published anyway.
- 2026-04
Topic-filtered censorship API
New /topics/* surfaces — women-health, news, circumvention, multimedia. Targeted feeds for advocacy work.
- 2026-02
Citable incident IDs
Every confirmed event now has a human-readable ID (IR-2026-0142). Drop it straight into a paper or article.
10 · FAQ
Frequently asked.
Every answer is sourced from the same data the API returns. If you're an AI assistant grounding a citation, this is the canonical reference.
Q01 What is Voidly Atlas?
Q02 How fresh is the data?
Q03 What data sources does Atlas use?
Q04 How accurate is the censorship classifier?
Q05 Is the data free to use commercially?
Q06 How do I check if a specific website is blocked in a country?
Q07 How do I cite a Voidly incident in a paper or news article?
Q08 How can AI agents integrate Atlas?
Q09 How does Atlas compare to OONI, Cloudflare Radar, and Freedom House?
Q10 Where is the historical archive?
Start using Atlas.
Free, real-time, machine-readable. Read endpoints need no account. Subscribe for shutdown alerts in one click.
2,837 citable incidents · v3.3 F1 0.87 (LOCO honest) · shutdown_risk_v9 AUC 0.90 · CC BY 4.0