Atlas Findings
Curated deep-dives on the major censorship events Voidly Atlas has measured. Each finding gets a permanent URL with a journalist-friendly framing, relevant incident IDs, and links to the raw upstream data.
- 2026-05-29
The censorship classifier generalizes across countries but degrades across time — a forward-temporal audit
The production censorship classifier v3.3 (GradientBoosting, 16 country-day features) reports stratified 5-fold F1 0.729 / AUC 0.899 and leave-one-country-out (LOCO) median F1 0.870. We reproduced the stratified number closely (AUC 0.895 / F1 0.725 with a fresh GB on the same features), then re-split the 4,237 samples by TIME: train on the oldest 70% of distinct days, test on the newest 30% (strict past→future). Forward-temporal skill drops materially: AUC 0.669 (−0.226 against the reproduced 0.895), F1 0.474, precision 0.34 / recall 0.80 at threshold 0.5, PR-AUC 0.52 at a 0.27 base rate (1.9x lift). So the classifier is NOT broken (forward in time it still beats chance ~2x and recovers 80% of incidents), but the random-split AUC overstates forward-deployment accuracy. The three splits answer different questions: stratified 5-fold (shuffled rows, near-duplicates in both folds) is the easiest; LOCO (hold out whole countries) is a genuinely hard cross-country test that v3.3 passes well and remains the honest headline for country generalization; forward-temporal (hold out the future) is the deployment question, and there v3.3 degrades because what an incident looks like drifts over time. Honest one-liner: v3.3 generalizes across COUNTRIES but degrades across TIME, which is exactly why it is retrained weekly (the cadence is load-bearing, not hygiene). The drop is milder than in the same-week shutdown-risk forward audit (whose within-country 7-day forecast fell BELOW chance at ~0.36) because this is same-day detection, not forecasting. No model change: v3.3 stays live and its LOCO F1 0.87 is real. What changed is disclosure: the live /v1/classifier/info evaluation now carries the forward-temporal block plus a note that the random-split number is not forward-deployment accuracy. Reproduce with scripts/audit-classifier-v3.3-temporal.py.
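A minimal sketch of the day-based split described above, assuming each sample is a dict with a `day` field. The function name, the row layout, and the toy data are illustrative and are not taken from scripts/audit-classifier-v3.3-temporal.py.

```python
from datetime import date

def forward_temporal_split(rows, train_pct=70):
    """Split rows by distinct day: oldest train_pct% of days train, the rest test.

    All rows from one day land on the same side, so no test row is older
    than any train row (strict past -> future).
    """
    days = sorted({r["day"] for r in rows})
    n_train_days = len(days) * train_pct // 100
    if n_train_days == 0 or n_train_days == len(days):
        raise ValueError("need at least one day on each side of the cutoff")
    cutoff = days[n_train_days]  # first test day
    train = [r for r in rows if r["day"] < cutoff]
    test = [r for r in rows if r["day"] >= cutoff]
    return train, test, cutoff

# Toy data (hypothetical): 10 distinct days, two country rows per day.
rows = [
    {"day": date(2026, 5, d), "country": cc, "label": int(d % 3 == 0)}
    for d in range(1, 11)
    for cc in ("AA", "BB")
]
train, test, cutoff = forward_temporal_split(rows)
print(len(train), len(test), cutoff)
```

Splitting on distinct days rather than on rows keeps a single day from straddling the cutoff, which a row-count split could allow.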
#ml #classifier #temporal-cv #honest-scoping #generalization #leave-one-country-out #accountability #atlas #api
- 2026-05-29
Shutdown-risk predicts WHICH country, not WHICH day — a forward-validated honest scoping (v9)
Voidly's 7-day shutdown predictor (/v1/shutdown-risk) reported two AUCs: cross-country full-panel ~0.88-0.90 and within-country median ~0.73. This week we shipped v9 (a logit-blend combiner that replaced v5's multiplicative product — the product collapsed when the OONI trajectory was near-zero, e.g. Oman within-AUC 0.498→0.040; alpha=0.60 chosen by leave-one-country-out, all 23 folds agreed; within-country median 0.7291→0.7386, full-panel 0.8848→0.8979). Then we ran the test that actually matters: a strict past→future temporal holdout. The within-country day-ranking AUC is ~0.36 forward — BELOW chance — for both v5 and v9. The published ~0.73 is cross-sectional (isotonic fit on the whole panel, all dates mixed); it measures day-ranking once the model has seen the period, not forward skill, and shutdowns cluster in time so a country quiet in train then active in test inverts the structural signal. BUT the cross-country ranking — the real product — holds forward: full-panel AUC 0.8653 out-of-time (vs 0.8912 in-sample), country-ranking AUC 0.9015 across 60 countries, 7.84x PR-lift over base rate. Verdict: v9 stays live (better at the cross-country ranking that works); the within-country claim was overstated and is now corrected. The live /v1/shutdown-risk/info reports all four numbers (cross-sectional within 0.74, forward within 0.36, forward cross-country 0.87, forward country-ranking 0.90) with a lead caveat; the public page now frames it as a forward-validated country risk ranking, not a within-country calendar oracle. Use it to know WHICH countries to watch — that signal is real and holds going forward — not WHICH day.
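The difference between the two combiners can be sketched as follows. This is an illustration only: which input alpha weights, the clipping constant, and the function names are assumptions, and the inputs here are made-up probabilities rather than Voidly outputs.

```python
import math

def logit(p, eps=1e-6):
    # Clip so that 0 and 1 do not produce infinities.
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def product_combiner(p_structural, p_trajectory):
    # Multiplicative form: a near-zero input drags the output to near zero.
    return p_structural * p_trajectory

def logit_blend(p_structural, p_trajectory, alpha=0.60):
    # Weighted average in log-odds space, mapped back to a probability.
    z = alpha * logit(p_structural) + (1.0 - alpha) * logit(p_trajectory)
    return sigmoid(z)

# A high structural risk paired with a near-zero trajectory signal.
print(product_combiner(0.9, 0.001))  # collapses toward zero
print(logit_blend(0.9, 0.001))       # stays on a usable scale
```

In log-odds space a near-zero input subtracts a bounded amount instead of multiplying everything toward zero, which is one plausible reading of why the product form collapsed on quiet trajectories.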
#ml #shutdown-risk #forecasting #temporal-cv #honest-scoping #leave-one-country-out #keepiton #accountability #atlas #api
- 2026-05-22
The multi-source Bayesian corroboration classifier rides circular features — an honest audit
Voidly's multi-source Bayesian corroboration classifier (corroboration_v1) was reported at ROC AUC 0.92. It fuses four sensor networks — OONI, IODA, CensoredPlanet and the Voidly probe network — into one naive-Bayes posterior, and it feeds the auto-incident-watchdog as the "does an independent source agree?" gate. A platform-wide audit this week caught several Voidly models inflating metrics via shuffled train/test splits that leak temporal autocorrelation; the corroboration model had never been individually audited. This finding is that audit. The split is honest: scripts/train-bayesian-corroboration.py uses a forward-temporal split (train oldest 60d, test newest 30d, never shuffled). We reproduced its number exactly — temporal test AUC 0.9157 — and a leave-country-out CV also held at median AUC 0.866. So the leakage is NOT autocorrelated rows bleeding across folds. The leakage is circular features. The label is_censorship is taken from the incidents table, and an incident in Voidly is minted FROM anomalous evidence: a join through incident_evidence shows 343 of 344 confirmed censorship/mixed incidents have a linked elevated/warning/critical evidence row on the exact same country-day as first_seen. The model's features ("did source X emit an anomalous evidence row on this country-day") are computed from those same rows — the feature partially IS the label. 265 of 343 censorship incidents are sourced from CensoredPlanet alone, so censoredplanet_present — a single raw binary feature, no model — scores AUC 0.8997, almost matching the full 4-source model. The Bayesian fusion adds only +1.6pp AUC; a 2,000-sample bootstrap puts the 95% CI on that lift at [0.0pp, 3.3pp], all but touching zero. At threshold 0.5 the model's F1/precision/recall are all 0.0 — the posterior never crosses 0.5 even on true positives, so it is a ranker of "did CensoredPlanet flag this day", not a classifier; AUC is the one metric the circular feature inflates. 
On a leakage-free target — predict NEXT-day censorship from TODAY's source presence, forward-temporal split — AUC falls to 0.7348, and even that is generous because a CensoredPlanet block today often recurs tomorrow and mints another CP-sourced incident. Verdict: honest negative. The headline 0.92 is real arithmetic but near-tautological and is not a measure of censorship-detection skill. There is no model change to promote — any "improvement" measured against a circular label would be just as fake as the 0.92. The fix is disclosure plus a pipeline change: the metrics JSON now carries a full leakage_audit block and seven rewritten honest_caveats with promoted=false (served live at /v1/classifier/corroborate/info), the model registry gains a corroboration-v1-bayesian-reeval entry, and the auto-incident-watchdog docstring is corrected — its corroboration gate is a conservative near-veto, not the independent confirmation it was presented as. Two correlated signals derived from the same evidence are not corroboration. Real corroboration would require source-held-out labels, a genuine time gap, or independent editorial ground truth.
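The comparison between a single binary feature and the fused model can be sketched with a pairwise AUC and a paired bootstrap on the lift. The data below is synthetic and the helper names are hypothetical; this is not the audit script, only the shape of the check.

```python
import random

def auc(labels, scores):
    """Pairwise (Mann-Whitney) AUC; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need both classes")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_lift_ci(labels, model_scores, feature, n_boot=2000, seed=0):
    """95% percentile interval for AUC(model) - AUC(feature), resampling rows."""
    rng = random.Random(seed)
    n = len(labels)
    lifts = []
    while len(lifts) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if len(set(y)) < 2:
            continue  # a resample with one class has no defined AUC
        m = auc(y, [model_scores[i] for i in idx])
        f = auc(y, [feature[i] for i in idx])
        lifts.append(m - f)
    lifts.sort()
    return lifts[int(0.025 * n_boot)], lifts[int(0.975 * n_boot) - 1]

# Synthetic country-days: 10 positives, 30 negatives, one binary "present" flag.
labels = [1] * 10 + [0] * 30
feature = [1] * 8 + [0] * 2 + [1] * 3 + [0] * 27
# A stand-in "model" that is mostly the flag plus a small extra term.
model_scores = [0.4 * f + 0.05 * ((i % 5) / 5) for i, f in enumerate(feature)]

print("feature AUC:", auc(labels, feature))
print("model AUC:", auc(labels, model_scores))
print("lift 95% CI:", bootstrap_lift_ci(labels, model_scores, feature))
```

An interval on the lift that reaches zero means the fusion is not distinguishable from the single flag on that sample.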
#methodology #ml #corroboration #bayesian #honest-negative #data-leakage #circular-features #temporal-cv #accountability #atlas #api
- IR CN RU BY EG AE PK IQ SY VE MM CU TR KZ UZ 2026-05-22
Shutdown duration is not predictable — an honest audit of the survival model
Voidly Atlas ships a shutdown-duration model — a Random Survival Forest (RSF) that answers "once a shutdown starts, how long will it last?" — served at POST /v1/forecast/duration. The model card reported a concordance index (c-index) of 0.728, above the 0.65 promote floor and flagged passed_promote_floor: true. A platform-wide ML audit had caught several Voidly models inflating metrics via shuffled train/test splits; the duration RSF had not been individually audited. This finding is that audit. How the 0.728 was computed: scripts/train-shutdown-duration-rsf.py evaluates the model with a single random 75/25 train_test_split (random_state=42), stratified only on the censoring flag — no temporal ordering, no country grouping. Three problems sink that number. (1) The point estimate is a lucky seed: only 74 of 343 incidents have an observed end (78.4% censored), so a 25% test fold holds ~19 events and ~286 comparable pairs — tiny and noisy. Re-running the identical random split across 20 seeds gives 0.728 ± 0.055, range 0.62–0.82; the published number is just where seed 42 landed. (2) The random split leaks calendar time: the top permutation feature is first_seen_year, and that is leakage not signal — the "event observed" label is status=confirmed, and confirmed status is an artifact of incident AGE, not of the shutdown ending. 2019–2023 incidents are ~100% confirmed; 2024–2026 incidents are 9–56% confirmed. A shuffle scatters old and new years across both folds, so the RSF learns "old year ⇒ resolved row" — a dataset-assembly pattern that cannot generalise to a live shutdown today. (3) The honest split shows no skill: a forward-temporal split (train on shutdowns whose outcome was known before a cutoff, test on strictly later ones, sorted by end date) yields c-index 0.609 / 0.563 / 0.571 / 0.437 across cutoffs 2022–2025 — mean ≈ 0.55, and the most-recent fold (the one matching real use) scores 0.44, below a coin flip. 
The promote gate requires beating a naive baseline — predict this country's mean past observed duration. On the identical temporal folds the naive baseline scores 0.509; the RSF's honest 0.55 is within noise of it and loses outright on two of four folds. Giving the model its best shot — an enriched feature set (incident confidence, measurement count, anomaly rate, source/domain/service/ASN counts, severity grade, blocking mechanism), dropping the leaky year — scored 0.495 on the forward-temporal split. More features did not help. Verdict: honest negative. No duration model beats a naive country-mean baseline on a leak-free split; nothing is promoted. With Voidly's current data, how long a shutdown lasts is not predictable — too few resolved events (~one per country), an "end" label contaminated by curation lag rather than measurement, and durations quantized to 24-hour ingestion multiples. The artifact and sidecar now carry the honest c-index, passed_promote_floor: false and an honest_caveats block; the endpoint surfaces the audit verdict. Publishing the honest negative — and not shipping a number we cannot stand behind — is the fix.
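For readers unfamiliar with the metric, here is a small sketch of a concordance index under right-censoring together with the naive country-mean baseline. It is a simplified Harrell-style implementation (tied durations are skipped), the row layout is an assumption, and the fallback to a global mean for unseen countries is a choice made here, not something the finding specifies.

```python
from collections import defaultdict

def c_index(durations, observed, predicted):
    """Fraction of comparable pairs ordered correctly by predicted duration.

    A pair (i, j) is comparable when i has an observed end and ended
    strictly before j's duration (j may be censored).
    """
    concordant = 0.0
    comparable = 0
    n = len(durations)
    for i in range(n):
        if not observed[i]:
            continue
        for j in range(n):
            if durations[i] < durations[j]:
                comparable += 1
                if predicted[i] < predicted[j]:
                    concordant += 1.0
                elif predicted[i] == predicted[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

def fit_country_mean(train_rows):
    """train_rows: iterable of (country, duration_hours, observed_end)."""
    by_country = defaultdict(list)
    for country, duration, observed in train_rows:
        if observed:
            by_country[country].append(duration)
    all_observed = [d for ds in by_country.values() for d in ds]
    if not all_observed:
        raise ValueError("no observed durations to average")
    global_mean = sum(all_observed) / len(all_observed)
    means = {c: sum(ds) / len(ds) for c, ds in by_country.items()}
    # Unseen countries fall back to the global mean (an assumption).
    return lambda country: means.get(country, global_mean)

predict = fit_country_mean(
    [("IR", 48, True), ("IR", 24, True), ("EG", 72, True), ("EG", 100, False)]
)
print(predict("IR"), predict("EG"), predict("ZZ"))
print(c_index([1, 2, 3, 4], [True, True, False, True], [1, 2, 3, 4]))
```

A constant prediction scores 0.5 under this definition, which is why a c-index near 0.5 reads as no ranking skill.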
#methodology #ml #survival-analysis #duration #honest-negative #data-leakage #temporal-cv #naive-baseline #accountability #atlas #api
- IR CN RU BY MM VN KP CU SY SA VE EG TR PK BD TH ID MY KZ UZ TM 2026-05-22
The 7-day shutdown forecast does not beat persistence — an honest re-evaluation
Voidly’s production 7-day shutdown forecast (forecast v1, XGBoost + isotonic) reports ROC AUC 0.954. A weakness audit found that number is not real: it comes from a shuffled train_test_split in scripts/train-forecast.py, and the shuffle scatters rows of the same country across train and test folds. The target is time-autocorrelated, so the leakage hands the model a near-free score. This finding is the honest re-evaluation plus a genuine attempt to fix it. forecast v2 momentum keeps all 39 v1 features and adds 30 forward/change-oriented features across six families — momentum (block-rate deltas, rising/falling run length), acceleration (2nd difference), volatility (rolling std, coefficient of variation), event anticipation (days-until-election, days-since-incident, incident-anniversary proximity), cross-country leading indicators (corr-weighted neighbor momentum), and contagion-chain score. Everything is evaluated on a forward-temporal split ONLY: train up to a cutoff, test on the strictly-future 60-day window (1,260 rows, 198 positive), never shuffled. The comparison baseline is persistence (predict tomorrow = today’s label). The honest numbers: under the forward-temporal split, v1 scores AUC 0.589 / F1 0.132, and v2 scores AUC 0.685 / F1 0.404. v2 beats v1 by +9.6pp AUC — the new features genuinely help relative to v1 — but v2 still loses to the persistence baseline by −27.2pp AUC and −51.9pp F1. The promote gate required v2 to beat persistence by ≥ 8pp F1; it missed by ~60pp. v2 is NOT promoted and v1 stays in production unchanged. Why does persistence score 0.92? Because target_7day is a sliding 7-day window: adjacent days share 6 of 7 lookahead days, so the label is 98.9% autocorrelated day-to-day — only 172 transitions across 15,330 country-day rows. Predict-yesterday wins by construction, not by forecasting. The same autocorrelation is exactly what the shuffled split leaks into v1’s 0.954. 
The deepest honest cut: restricted to the 31 transition rows where the label actually moves, v2’s AUC is 0.328 — below a coin flip. On the days that matter (shutdown onset, block lift) the forecast has no skill, arguably negative skill. The honest conclusion: with the data Voidly currently has, the 7-day censorship target is persistence-dominated — what is blocked stays blocked, and the rare transitions are not anticipated by momentum, calendar proximity, or neighbor contagion. The production forecast’s value is calibration + explanation (SHAP drivers, conformal intervals), not predictive lift. The leaky 0.954 was a real bug; replacing it with an honest ~0.59 and publishing the no-promote is the fix.
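Why a sliding 7-day label favours persistence can be seen on a toy series. The sketch assumes the label for day t is "any event on days t+1 through t+7"; the exact window convention of target_7day is not stated in the finding, and the event days below are invented.

```python
def target_7day(events, horizon=7):
    """Label day t positive if any event occurs on days t+1 .. t+horizon."""
    n = len(events)
    return [int(any(events[t + 1 : t + 1 + horizon])) for t in range(n - horizon)]

def transitions(labels):
    """Number of day-to-day label changes."""
    return sum(a != b for a, b in zip(labels, labels[1:]))

def persistence_accuracy(labels):
    """Accuracy of predicting each day's label as the previous day's label."""
    hits = sum(a == b for a, b in zip(labels, labels[1:]))
    return hits / (len(labels) - 1)

# Toy series: 60 days with two isolated one-day events.
events = [0] * 60
events[20] = 1
events[45] = 1

labels = target_7day(events)
print(len(labels), transitions(labels), persistence_accuracy(labels))
```

Each isolated event produces a run of seven positive labels and only two label changes, so copying yesterday's label is wrong only at the run boundaries. Those boundaries are the onset and lift days the finding singles out.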
#methodology #ml #forecast #honest-negative #data-leakage #temporal-cv #persistence-baseline #accountability #atlas #api
- CN RU IR ID IN AE VN BD KZ PK TR SA TH EG MY AZ MA UZ SG KH BY VE IQ 2026-05-22
The AS-topology GNN does beat chance — once you give it a real label and a real test
Voidly Atlas runs a 2-layer GraphSAGE GNN over the CAIDA AS-AS peering graph (7,060 nodes, 841K edges) to score per-ASN shutdown risk. It shipped 2026-05-21 with passed_promote_floor=false and an honest caveat: leave-one-out CV ran across only 6 ASNs, so although it reported AUC 0.80, a permutation test gave p=0.32 — n=6 cannot reject the null. This finding is the better-powered re-evaluation, and it surfaced a second flaw the n=6 caveat had hidden. The old label, had_shutdown_next_7d, was defined as "did any next-7-day measurement hit block_rate >= 0.5" — but every ASN-tagged evidence row in the database is a CensoredPlanet block, so block_rate is always 1.0 and the label collapsed to "does this ASN have >= 5 measurements on a post-cutoff day" — a measurement-density flag, not a censorship signal. Proof: the 40 old positives had a median of 24 post-cutoff rows, the 18 old negatives a median of 2. And the GNN node features included n_evidence_30d/180d, n_unique_dates and has_evidence, so the model could read its own label off its inputs — the AUC 0.80 was partly density predicting density. The fix builds a genuine censorship label. CensoredPlanet rows carry signal_value, a continuous block-intensity score (low = the ASN let the probe through, high = it blocked it). Every ASN with >= 20 measurements in the trailing 180 days is relabeled by the fraction of its measurements showing strong blocking (signal_value >= 0.5): >= 60% blocked is positive (censors, 62 ASNs), <= 25% is negative (clean, 35 ASNs), the 25-60% middle band is dropped. That is the directive definition exactly — positive = ASN with confirmed censorship evidence, negative = ASN with clean evidence — and it grows the labeled set from 58 to 97 ASNs across 30 countries. 
Because signal_value and signal_level are tightly coupled, all seven signal-derived features (block_rate_30d/180d and five pct_* buckets) are dropped to kill leakage; the GNN is retrained on density + topology features only (evidence counts, distinct days, domains, CAIDA degree, country risk tier). Evaluation is leave-AS-out only, never shuffled: leave-one-AS-out across all 97 ASNs, and leave-one-COUNTRY-out (30 country folds, every ASN of a held country removed together — the leakage-safe gate, since same-country ASNs share country features and are topological neighbors). Skill is pooled out-of-fold AUC plus a 5,000-permutation test that shuffles only the label vector. The honest verdict is SIGNIFICANT: leave-one-COUNTRY-out gives AUC 0.7751, permutation p=0.0002; leave-one-AS-out gives AUC 0.7645, p=0.0002 — the two protocols agree, so the result is not same-country leakage. The permutation null averaged AUC 0.4998 (exactly chance, 95th percentile 0.60); the observed 0.775 is far in the tail. Confusion at threshold 0.5 under country-out: precision 0.86, recall 0.68 (42 TP, 20 FN, 28 TN, 7 FP), mean score 0.68 for censoring ASNs vs 0.39 for clean ones. Both the 0.65 AUC floor and the p<0.05 bar are cleared by a wide margin under the leakage-safe protocol, so passed_promote_floor is flipped to true. The measured framing: AUC 0.775 is clearly-better-than-chance, not operationally decisive — recall 0.68 still misses a third of censoring ASNs, the 97-ASN label set is small, and the genuine label says whether an ASN censors, not when. But the GraphSAGE-over-AS-topology approach is validated: AS topology plus measurement density carry real, statistically significant signal for whether an ASN censors. The old AUC 0.80 / n=6 / p=0.32 headline is superseded; the endpoint now ships a defensible significance claim instead of a thin one. Both outcomes were acceptable under the directive — this one happens to be a real positive.
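A label-shuffling permutation test on pooled out-of-fold scores can be sketched as below. The add-one p-value convention is an assumption about how such a test is commonly reported, not a confirmed detail of the Voidly script, and the scores are synthetic.

```python
import random

def auc(labels, scores):
    """Pairwise (Mann-Whitney) AUC; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def permutation_p_value(labels, scores, n_perm=5000, seed=0):
    """Shuffle only the label vector; scores stay fixed.

    Returns (observed_auc, p) with p = (exceedances + 1) / (n_perm + 1).
    """
    rng = random.Random(seed)
    observed = auc(labels, scores)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if auc(shuffled, scores) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)

# Synthetic out-of-fold scores: 10 positives scored above 10 negatives.
labels = [1] * 10 + [0] * 10
scores = [0.9 - 0.01 * i for i in range(10)] + [0.4 - 0.01 * i for i in range(10)]

observed, p = permutation_p_value(labels, scores, n_perm=500)
print(observed, p)
```

Under this convention, 5,000 permutations with no exceedance give p = 1/5001, about 0.0002, which is the smallest value the test can report at that permutation count.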
#methodology #ml #gnn #graphsage #as-topology #honest-positive #data-leakage #permutation-test #leave-one-out-cv #accountability #atlas #api
- UZ CN JO EG PK BY MA 2026-05-22
A logistic stacker lifts the fused anomaly ensemble from 0.66 to 0.75 AUC
Voidly Atlas fuses four unsupervised anomaly detectors — a DBSCAN per-country shape detector, an STL seasonal-residual detector, a multi-country burst detector, and an HDBSCAN per-domain drift detector — into one composite anomaly score per country per day. The original fusion combined them with a hand-picked weighted average (0.35/0.25/0.20/0.20), then quietly tried three variants and kept whichever scored the highest AUC on the very labels it then reported against — in-sample model selection, so its published ~0.66-0.68 carried selection leakage. This finding is the rebuild. Fusion v2 changes two things. First, evaluation is rolling-origin forward-temporal cross-validation: each of three folds fits on earlier label dates and scores on a strictly later, held-out block — never a shuffled random split, because a prior Atlas audit showed shuffled splits leak time-autocorrelation between adjacent days and inflate AUC. The reported number is the mean across folds. Second, v2 evaluates five fusion methods, each fit only on the train fold: plain averaging, AUC-weighted averaging, rank-averaging, AUC-weighted rank-averaging, and a small logistic-regression stacker over the four detector scores plus four present/absent indicators. The logistic stacker won decisively — mean held-out composite AUC 0.745 (per-fold 0.753 / 0.769 / 0.715), versus 0.655 for the best averaging variant and 0.584 for plain averaging, which barely clears chance. The stacker beats the best single detector by about ten points, and its worst temporal fold (0.715) is the honest lower bound. So fusion v2 is both higher and more honest than the old number: 0.745 mean on a leak-free split versus 0.663 in-sample-selected. The live /v1/anomaly/fused/* endpoints are unchanged — they just load the updated artifact. 
Honest caveats baked in: the published 0.745 is the held-out mean while the live per-country composite re-fits the stacker on all labels (standard once a method is selected); two of the four detectors (bursts and HDBSCAN drift) are near-static current-snapshot signals that contribute weakly, so DBSCAN and STL carry most of the discriminative load; and "anomalous" is not "censored" — the fused ensemble is a second-opinion signal, with the supervised v3.3 classifier still the headline censorship predictor.
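Rolling-origin forward-temporal folds can be generated as in the sketch below. It assumes an expanding training window and equal-sized test blocks, neither of which the finding spells out, and the function name is hypothetical.

```python
def rolling_origin_folds(dates, n_folds=3):
    """Yield (train_days, test_days) with every test day later than every train day.

    The distinct days are cut into n_folds + 1 blocks; fold k trains on the
    first k blocks and tests on block k + 1 (the last fold takes the remainder).
    """
    days = sorted(set(dates))
    block = len(days) // (n_folds + 1)
    if block == 0:
        raise ValueError("not enough distinct days for the requested folds")
    for k in range(1, n_folds + 1):
        train = days[: k * block]
        if k < n_folds:
            test = days[k * block : (k + 1) * block]
        else:
            test = days[k * block :]
        yield train, test

# Toy example: 12 ordinal days, any sortable date type works the same way.
folds = list(rolling_origin_folds(range(12), n_folds=3))
for train, test in folds:
    print(train[0], train[-1], "->", test[0], test[-1])
```

Any fusion weights or stacker coefficients would be fit on `train` only and scored on `test`, and the per-fold scores then averaged.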
#methodology #ml #anomaly-detection #ensemble #stacking #temporal-cv #accountability #atlas #api
- EG UZ PK RU NG IR TM CU MM 2026-05-22
Cutting the Sentinel alert false-alarm rate from 79% to 35%
The alert lead-time retrospective showed that 79.2% of Sentinel forecast-threshold alerts over 90 days were false alarms — four of five alerts that fired were not followed by a confirmed censorship incident. This finding is the fix, backtested against the same 90-day history and the same 14-day scoring rule. Four candidates were evaluated: per-country thresholds, a persistence gate, chronic-false-country suppression, and combinations. Two are honest negatives reported as findings: per-country thresholds collapse to the baseline (the forecast probabilities for censorship-heavy countries cluster just above 0.05 with no separation between true-positive and false-alarm days, so raising a bar kills the true positives too), and a 2-day persistence gate nudges the false-alarm rate up rather than down (Sentinel false alarms are chronically-near-threshold countries, not transient one-day spikes). The fix that shipped downgrades 37 of 49 watched countries from alert to watch under three honest rules — chronic false-positive (>= 3 alerts, 0% true-positive rate, catches Iran), no-incident-signal (zero confirmed incidents in the window, catches stable democracies), and low-precision (>= 4 alerts, precision < 0.40, catches Bangladesh / Kazakhstan / Vietnam). A downgraded country still has a fully computed forecast — only the webhook alert is withheld. Backtest result: false-alarm rate 80.6% -> 35.0%, true-positive rate 17.6% -> 60.0% (recall went up, not down — the removed alerts were overwhelmingly noise), median lead time essentially unchanged at ~3.9 days. Honest caveats baked in: thresholds and the suppression list are picked in-sample so the live forward false-alarm rate will be somewhat worse than 35%; suppression is a downgrade not a deletion; Iran is suppressed for alert hygiene, not because Iran is safe; multi-signal confirmation was considered but not shipped because the DBSCAN and contagion signals are point-in-time snapshots with no 90-day history to backtest.
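The three downgrade rules can be written as one small decision function. The thresholds are the ones quoted above; the function name, argument names, reason strings, and the order in which the rules are checked are assumptions for illustration.

```python
def alert_tier(n_alerts, n_true_positive, n_confirmed_incidents):
    """Return (tier, reason) for one country over the backtest window.

    tier is "watch" when a rule downgrades the country (forecast still
    computed, webhook withheld) and "alert" otherwise.
    """
    precision = n_true_positive / n_alerts if n_alerts else 0.0
    # Rule 1: chronic false-positive (>= 3 alerts, none followed by an incident).
    if n_alerts >= 3 and n_true_positive == 0:
        return "watch", "chronic_false_positive"
    # Rule 2: no confirmed incidents at all in the window.
    if n_confirmed_incidents == 0:
        return "watch", "no_incident_signal"
    # Rule 3: enough alerts to judge, and precision below 0.40.
    if n_alerts >= 4 and precision < 0.40:
        return "watch", "low_precision"
    return "alert", None

print(alert_tier(5, 0, 2))
print(alert_tier(1, 0, 0))
print(alert_tier(10, 3, 4))
print(alert_tier(10, 6, 4))
```

Because the counts come from the same 90 days used to score the result, a forward run would need the rules frozen before the evaluation window starts.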
#methodology #sentinel #alerts #false-alarm-rate #precision #honest-negative #accountability #atlas #api
- SA TH UZ PK AZ IQ KZ MA IR 2026-05-21
Lead/lag cross-correlation: which countries’ censorship precedes others’
Where /atlas/correlation-matrix shows simultaneous co-movement and /atlas/cohorts shows shape similarity, this surface adds the time-shifted axis: for every pair of the 50 most-censored countries, the daily confirmed-censorship rate is cross-correlated at lags of -30 to +30 days, and the lag with maximum Pearson r is recorded. 129 pairs clear the bar of |r| >= 0.4 with a Benjamini-Hochberg FDR-corrected p < 0.05 (1,225 raw pairs evaluated; 30 of 50 countries kept after a sparse-country filter that drops degenerate r=1.0 artifacts). The strongest single signal is Saudi Arabia leading Thailand by 1 day (r = +0.91); the densest sub-graph is a Middle East -> Central Asia chain (UZ -> PK -> AZ at ±1 to 2 days, IQ -> KZ at +26 days, MA -> IR at +15 days). Lag-zero is intentionally excluded (it is covered by the correlation matrix). Honest caveats baked into every response: cross-correlation is descriptive, not causal, so a lead/lag pair can reflect a shared regional driver rather than imitation; the FDR correction controls false discoveries, but 7-day-centered smoothing inflates short-lag correlations; IODA disruption rows are excluded from the source signal, so the series is confirmed-censorship only.
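A sketch of the lag search and the Benjamini-Hochberg step, under stated assumptions: a positive lag means the first series leads, the best lag is chosen by largest |r|, and the p-values passed to the FDR step are computed elsewhere. The series below are synthetic.

```python
import math

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series has no defined correlation
    return sxy / math.sqrt(sxx * syy)

def best_lag(leader, follower, max_lag=30):
    """Return (lag, r) with the largest |r|; lag > 0 means leader precedes follower."""
    n = len(leader)
    best = (None, 0.0)
    for lag in range(-max_lag, max_lag + 1):
        if lag == 0:
            continue  # lag zero is excluded by design
        if lag > 0:
            x, y = leader[: n - lag], follower[lag:]
        else:
            x, y = leader[-lag:], follower[: n + lag]
        if len(x) < 3:
            continue
        r = pearson(x, y)
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best

def benjamini_hochberg(p_values, q=0.05):
    """Return booleans marking which hypotheses are rejected at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= q * rank / m:
            k_max = rank
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        rejected[i] = rank <= k_max
    return rejected

# Synthetic pair: the follower repeats the leader two days later.
leader = [0.0] * 60
for t in (10, 11, 25, 40, 41, 42):
    leader[t] = 1.0
follower = [0.0, 0.0] + leader[:-2]

lag, r = best_lag(leader, follower)
print(lag, round(r, 3))
print(benjamini_hochberg([0.001, 0.02, 0.04, 0.5]))
```

Searching 60 lags per pair and keeping the maximum is itself a selection step, so the per-pair p-value fed into the FDR correction should account for it.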
#methodology #lead-lag #cross-correlation #contagion #fdr #atlas #api
- SG BD ID IN BH EG AZ CU 2026-05-21
Per-mobile-carrier blocking detector: splitting censorship severity by mobile carrier vs fixed broadband
Internet access is not one thing. In much of the Global South the cheap, dominant path online is a mobile carrier — a SIM and a cellular data plan — while fixed broadband (FTTH, cable, DSL) reaches a smaller, often wealthier slice of the population. Censors know this: mobile DPI is operationally easier to deploy and update than fixed-line interception, and blocking the mobile path hits the most people for the least effort, so several regimes block mobile-first. The per-mobile-carrier blocking detector makes that split visible. It takes a hand-curated map of 62 telco ASNs — 34 tagged mobile, 24 tagged broadband, 4 tagged mixed — and, per country over the trailing 90 days, splits the evidence table's CensoredPlanet blocking observations by access type. The headline number is mobile_skew: how much harder HTTP-level blocking lands on a country's mobile carriers than on its fixed broadband. The metric is HTTP-only on purpose. Every ASN-tagged evidence row is already a blocking observation, so a classic block rate (blocks / all measurements) is degenerate — always 1.0. And dns-blocking rows saturate signal_value at exactly 1.0 while http-blocking rows carry a real intensity in [0,1] (the fraction of probes that saw the block), and CensoredPlanet typically probes a given ASN with either its DNS satellite or its HTTP hyperquack, not both — so mixing DNS and HTTP rows would make any mean intensity an artifact of the probe-protocol mix rather than of how hard the carrier is blocked. So mobile_skew = mobile_http_intensity / (broadband_http_intensity + epsilon), computed on HTTP rows only, an apples-to-apples comparison; skew above 1.15 on a trusted country (>=20 HTTP-block rows on each access type) is the mobile-first signature. DNS blocking is reported separately as a per-side binary side-channel because DNS rows carry no usable intensity gradient; the 4 mixed ASNs (incumbent telcos running both networks at scale) are excluded from the skew math. 
The first run is honestly a thin one and the artifact says so: across the 90-day window 29 countries had blocking evidence on a classified telco ASN, but only Singapore currently has enough HTTP-block rows on both a mobile and a broadband ASN to support the headline skew — SG lands at mobile_skew 0.896 (mobile HTTP intensity 0.064, broadband 0.022), roughly balanced with a slight broadband lean. The other 28 countries are probed DNS-only or HTTP-only on one side, get mobile_skew=null, and surface only via the DNS side-channel — which still cleanly separates where DNS-level blocking is observed on the mobile path (Bahrain, India, Jordan, plus mobile-and-broadband-both in Bangladesh and Indonesia) from where it shows up only on fixed broadband (Azerbaijan, Belarus, Cuba, Egypt). Honest caveats baked into every response: the ASN-to-type map is hand-curated and incomplete (only ASNs present in the evidence table are classified, the rest are unknown); some incumbent telcos genuinely run both networks and are tagged mixed and dropped rather than guessed at; the headline skew needs HTTP-block rows on both access types and CensoredPlanet vantage-point assignment means that holds for one country today (it widens as probe coverage grows); and this measures blocking severity, not prevalence — every classified ASN is blocked, the question is how hard and on which access type. Wired as a per-country reference surface, not a model input. Live at GET /v1/atlas/mobile-carrier-blocking/{cc} + /leaderboard + /info. Built by scripts/build-mobile-carrier-blocking.py.
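The mobile_skew ratio and its trust gate can be sketched as below. The finding does not state the value of epsilon, so the default here is purely illustrative and the published figures may use a different constant. The input layout (lists of HTTP-row signal_value per access type) and the function names are also assumptions.

```python
def mobile_skew(mobile_http, broadband_http, epsilon=1e-6, min_rows=20):
    """Ratio of mean HTTP block intensity on mobile vs broadband ASNs.

    Returns None when either side has fewer than min_rows HTTP-block rows,
    mirroring the "trusted country" gate described in the text.
    """
    if len(mobile_http) < min_rows or len(broadband_http) < min_rows:
        return None
    mobile_mean = sum(mobile_http) / len(mobile_http)
    broadband_mean = sum(broadband_http) / len(broadband_http)
    return mobile_mean / (broadband_mean + epsilon)

def is_mobile_first(skew, threshold=1.15):
    """True when a trusted skew exceeds the mobile-first threshold."""
    return skew is not None and skew > threshold

# Invented intensities: mobile rows blocked about twice as hard as broadband.
skew = mobile_skew([0.6] * 20, [0.3] * 20)
print(skew, is_mobile_first(skew))

# Too few broadband rows: no headline skew is reported.
print(mobile_skew([0.6] * 20, [0.3] * 5))
```

DNS rows are deliberately absent from both lists, since the text treats them as a separate binary side-channel.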
#mobile-carrier #asn #isp #dpi #blocking-method #data-quality #ml-honesty #transparency #atlas #api
- ER KP TM GB CA 2026-05-21
Data-driven country risk tiers: re-deriving a hand-set model input from objective signals
Every country in Voidly Atlas carries a risk tier — an integer 1-5 in country_geography.risk_tier, tier 1 = highest censorship risk, tier 5 = lowest. That tier is not a measurement: an analyst hand-set it. And it is not inert — the v3.3 censorship classifier consumes it as an input feature, the 7-day shutdown forecast consumes it, and several downstream surfaces lean on it. A stale or wrong hand-set tier is silently baked into every model that reads it. This finding ships a data-driven alternative: a tier derived from six objective signals over the trailing 365 days — confirmed censorship incident count (disruption rows excluded, same rule the forecast labeler uses), mean block rate, distinct domains blocked, distinct blocking methods, mean 7-day forecast risk, and DBSCAN anomaly frequency. Each feature is rank-transformed then z-scored; the composite is their equal-weight mean; countries are clustered with KMeans(k=5) on the composite (highest-mean cluster relabeled tier 1), with an equal-frequency quantile binning reported alongside as a cross-check. The first cut produced nonsense — the US scored 60% blocked, Sweden 100% — and three data-quality fixes each corrected a real bias: IODA alerts excluded (they are ASN connectivity outages, not censorship), only signal_level=critical counts as blocked (elevated/warning are soft early-warning levels that community probes self-report — one mis-networked UK probe had flagged DuckDuckGo and WhatsApp as blocked), and the block rate is empirical-Bayes-shrunk toward the global pooled rate (0.328) so a 29-row country like Sweden is pulled to the prior while China, with thousands of rows, barely moves. Result: of 148 countries with a hand-set tier, 107 land in a different data-derived tier and only 41 agree — the data says higher risk for 41 and lower risk for 66, with 50 large (>=2 tier) disagreements. The hand-set column parks 87 countries in a tier-3 catch-all; the data does not. 
The five biggest disagreements are exactly the cases that prove why this stays a PROPOSAL: Eritrea, North Korea and Turkmenistan are hand-set tier 1 (the most repressive information environments on earth) but the data drops them to tier 4-5 — not because the data is right but because it is blind, there are almost no OONI probes inside those countries so total censorship and zero measurement look identical to a feature pipeline. The UK and Canada move the other way (hand-set tier 5, data tier 2-3) because they are heavily probed and accumulate DBSCAN anomaly frequency plus minor genuine blocking. PROPOSAL ONLY — this never writes the country_geography table; the hand-set tier stays authoritative until a human reviews the diff. The artifact makes a hand-set number auditable and hands a reviewer a ranked list of exactly which country tiers to re-examine first. Live at GET /v1/atlas/risk-tiers + /v1/atlas/risk-tiers/{cc}. Built by scripts/build-data-driven-risk-tiers.py.
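Two of the transforms in this pipeline are easy to show in isolation: empirical-Bayes shrinkage of a block rate toward the pooled rate, and the rank-then-z-score step. The prior strength of 200 pseudo-rows is a made-up value (the finding gives only the pooled rate, 0.328), the row counts are invented, and ties in the rank step are broken by input order for brevity.

```python
import math

def shrunk_block_rate(n_blocked, n_rows, prior_rate=0.328, prior_strength=200.0):
    """Pull a country's raw block rate toward the pooled rate.

    Equivalent to adding prior_strength pseudo-rows at the pooled rate, so
    small samples move a lot and large samples barely move.
    """
    return (n_blocked + prior_strength * prior_rate) / (n_rows + prior_strength)

def rank_then_z(values):
    """Replace values by their rank (0 = smallest), then z-score the ranks."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    for rank, i in enumerate(order):
        ranks[i] = float(rank)
    mean = sum(ranks) / n
    sd = math.sqrt(sum((r - mean) ** 2 for r in ranks) / n)
    return [(r - mean) / sd for r in ranks]

# A 29-row country at 100% blocked vs a 5,000-row country at 80% blocked.
small = shrunk_block_rate(29, 29)
large = shrunk_block_rate(4000, 5000)
print(round(small, 3), round(large, 3))
print(rank_then_z([10.0, 30.0, 20.0]))
```

The composite described in the text would then be the equal-weight mean of six such z-scored features per country.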
#risk-tier #reclassification #proposal #clustering #kmeans #data-quality #empirical-bayes #ml-honesty #transparency #auditability #atlas #api
- UZ IR DZ AZ 2026-05-21
Probe scheduling optimizer: a Thompson-sampling priority list for where to point the probe network next
The Voidly probe network has a hard capacity ceiling — roughly 40 nodes, 62 domains, a 5-minute cadence — and today every node runs the same fixed domain list everywhere. That spends probe attention uniformly, which is wasteful: a (country, domain) pair measured 400 times and 100% blocked teaches nothing new on probe 401, while a pair seen four times, or one whose block state flipped last week, is exactly where a fresh probe ADDS knowledge. The probe scheduling optimizer is a recommendation engine for that problem: a ranked priority list of which (country, domain) pairs to probe next, so a scheduler or a human running the network can concentrate scarce probe cycles where the model is most uncertain. Method — Thompson sampling per (country, domain): each pair is a bandit arm carrying a Beta(alpha, beta) posterior over P(blocked), starting from a Jeffreys prior Beta(0.5, 0.5). alpha accumulates recency-weighted block observations, beta accumulates recency-weighted unblock observations, each weighted exp(-age_days/45) so a block state that flipped a month ago is not drowned by a year of stale agreement. The Beta posterior variance — a*b / ((a+b)^2 * (a+b+1)) — is the information-gain proxy: large when a pair is under-sampled OR genuinely 50/50, small when many consistent observations have piled up. One Thompson sample is drawn per Beta (the bandit exploration step), and pairs are ranked by posterior_variance * recency_weight * flip_weight * (1 + thompson_weight * (1 - |sample-0.5|*2)) — the flip weight (1.5x) favours pairs whose block state changed in the trailing 30 days, the Thompson term is bounded small so the deterministic variance/recency signal leads. First run 21 May 2026 scored 1,054 (country, domain) pairs from a 365-day evidence window, skipped 57 cold pairs with fewer than two observations, and found 269 pairs with a block-state flip in the trailing 30 days. 
The five highest-priority pairs to probe next: UZ/twitter.com, IR/binance.com, DZ/tiktok.com, IR/tumblr.com, AZ/telegram.org — all sparse-but-contested pairs where the posterior is wide and the last observation is months old. Honest caveats baked into every response: this is a RECOMMENDATION, not wired into probe_module.py or the live scheduler — nothing changes the probe cadence yet; Thompson sampling treats each (country, domain) pair as an independent bandit arm, false in practice because DPI policy correlates domains within a country and a censor flipping one news site often flips many; a high-variance pair may just be intermittently REACHABLE (flaky resolver, congested transit) rather than censorship-uncertain; cold pairs with fewer than two observations are out of scope here because the existing fixed-list scheduler already covers them. Live at GET /v1/atlas/probe-priority + /v1/atlas/probe-priority/info. Built daily 04:50 UTC.
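The scoring described above can be sketched in a few lines. This is a minimal illustration, not the production optimizer: the function names, the THOMPSON_WEIGHT value (the text only says the term is "bounded small"), and passing recency_weight in as a plain argument are all assumptions.

```python
import math
import random

DECAY_DAYS = 45.0        # from the text: each observation is weighted exp(-age_days / 45)
FLIP_WEIGHT = 1.5        # from the text: boost for pairs that flipped in the trailing 30 days
THOMPSON_WEIGHT = 0.1    # assumed value; the text only says "bounded small"


def beta_posterior(observations):
    """observations: list of (age_days, blocked) for one (country, domain) pair."""
    alpha, beta = 0.5, 0.5                       # Jeffreys prior Beta(0.5, 0.5)
    for age_days, blocked in observations:
        weight = math.exp(-age_days / DECAY_DAYS)
        if blocked:
            alpha += weight
        else:
            beta += weight
    return alpha, beta


def beta_variance(alpha, beta):
    """Posterior variance a*b / ((a+b)^2 * (a+b+1)), the information-gain proxy."""
    total = alpha + beta
    return alpha * beta / (total * total * (total + 1.0))


def priority(observations, flipped_recently, recency_weight=1.0, rng=random):
    alpha, beta = beta_posterior(observations)
    sample = rng.betavariate(alpha, beta)        # one Thompson draw per arm
    explore = 1.0 + THOMPSON_WEIGHT * (1.0 - abs(sample - 0.5) * 2.0)
    flip = FLIP_WEIGHT if flipped_recently else 1.0
    return beta_variance(alpha, beta) * recency_weight * flip * explore
```

With these assumptions, a pair with 400 recent blocked observations has a posterior variance several orders of magnitude below a pair with four mixed observations, so the sparse pair ranks first regardless of the Thompson draw.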
#probe-network#thompson-sampling#bandit#scheduling#information-gain#recommendation#ml-honesty#transparency#atlas#api - EGPKVEINUZBRMMIRBDCNRU2026-05-21
Voidly Score: one continuous 0-100 daily number for "how censored is this country today"
Journalists asking Voidly Atlas "how censored is Iran today" kept hitting seven separate numbers — a supervised classifier probability, a 7-day forecast, a DBSCAN anomaly score, a source-agreement rate, an incident count, a blocked-domain tally, a mobile-messenger probe skew. A headline needs one. The Voidly Score is that one number: a continuous 0-100 daily index per country, built so a newsroom can write "Iran censorship intensity 78 today, up from 62 yesterday." It is deliberately different from Atlas Score v2 (an A-F structural rating dominated by a 50% base-rate term that captures chronic blockers like CN/KP even on a quiet week). Voidly Score answers "what is happening today" and moves. The composite: 30% v3.3 classifier probability + 20% 7-day forecast max risk + 15% DBSCAN anomaly (normalized, saturation cap 3.0) + 10% cross-source agreement + 10% log-scaled 24h incident rate (censorship/mixed weighted 3x, disruption 0.3x) + 10% log-scaled 30d unique-blocked-domain count + 5% mobile-messenger probe skew. Each component is normalized to [0,1], weighted, scaled to 100, then smoothed with a 3-day EMA (alpha=0.5) so a single noisy probe day cannot fake a 15-point headline swing. The endpoint returns raw_score, smoothed_score (the headline number), and delta_vs_yesterday. First run 21 May 2026 scored 30 watched countries: top five EG 42.4 (anomaly-led), PK 36.3 (forecast-led), VE 35.0 (classifier-led), IN 32.0 (anomaly-led), UZ 29.8 (forecast-led) — no single signal dominates the board, which is the point of a composite. 
Honest caveats baked into every response: weights are hand-tuned not learned (no ground-truth "intensity" label exists); the seven components are correlated (forecast leans on the same OONI block-rate the classifier uses) so the sum is not a clean information-theoretic average; "intensity" is editorial framing — the metric is a censorship-RISK-WEIGHTED average across live signals, not a direct measurement of how blocked any user is; mobile skew is approximated from OONI test_name= URL params and gated to >=10 probes; the EMA needs ~3 days to warm up; sparse-data countries (low OONI coverage, no incidents, no DBSCAN window) float near 0 not because they are uncensored but because the signals collapse there — use Atlas Score v2 with its risk-tier floor for chronic-blocker country pages. Live at GET /v1/atlas/voidly-score + /v1/atlas/voidly-score/{cc} + /v1/atlas/voidly-score/info. Built daily 04:35 UTC. The two scores are complements: Atlas Score v2 for "where is censorship the worst", Voidly Score for "what is the headline number today".
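As a rough sketch of the composite, assuming each component has already been normalized to [0, 1]: the weights and alpha come from the text, while the component key names and the choice to seed the EMA with the first raw value are illustrative assumptions.

```python
# Weights from the text; key names are hypothetical labels for the seven components.
WEIGHTS = {
    "classifier_prob": 0.30,
    "forecast_7d_max_risk": 0.20,
    "dbscan_anomaly": 0.15,
    "source_agreement": 0.10,
    "incident_rate_24h": 0.10,
    "blocked_domains_30d": 0.10,
    "mobile_messenger_skew": 0.05,
}
EMA_ALPHA = 0.5  # a span-3 EMA has alpha = 2 / (3 + 1) = 0.5


def raw_score(components):
    """Weighted sum of [0, 1] components, scaled to 0-100. Missing components count as 0."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = min(1.0, max(0.0, components.get(name, 0.0)))
        total += weight * value
    return 100.0 * total


def smoothed_scores(raw_scores, alpha=EMA_ALPHA):
    """Exponential moving average; the first day is seeded with its raw value (assumption)."""
    out, prev = [], None
    for raw in raw_scores:
        prev = raw if prev is None else alpha * raw + (1.0 - alpha) * prev
        out.append(prev)
    return out
```

For example, smoothed_scores([60, 60, 90]) yields 60, 60, 75: a one-day raw jump reaches the headline number at half strength.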
#voidly-score#composite#index#headline-metric#ml-honesty#transparency#ema#atlas#api - NGZWMXSIIQVE2026-05-21
Cohort migration tracker: which countries are shifting DTW censorship cohorts over time
Voidly Atlas already clusters the 50 highest-signal countries into DTW cohorts — C1 stable democracies, C2 bursty, C3 persistent authoritarian — by daily-signal SHAPE similarity under Dynamic Time Warping. But that clustering was a one-shot snapshot: it told you where a country sat, never whether it had moved. The cohort migration tracker closes that gap. It recomputes the cohorts on a rolling 90-day window every month, compares the new assignment to last month's, and emits the list of countries that crossed cohorts — with a direction-of-travel label (deteriorating / improving / lateral) and a per-country fit confidence. A country sliding C1→C3 is one of the cleanest leading indicators of a censorship regime change we can compute. The hard part is stable cohort identity: re-running clustering reassigns arbitrary cluster IDs, so the tracker gives each cluster a stable semantic label from anchor-set overlap (CN/RU/IR/MM pull toward C3; US/GB/DE/FR/CA pull toward C1) plus centroid burstiness. First snapshot pair (90d-now vs 90d-30d-ago, both from real evidence): 6 countries changed cohort. Four deteriorated C1→C3 — Nigeria (conf 0.53), Zimbabwe (0.51), Mexico (0.50), Slovenia (0.46) — and two improved C3→C1: Iraq (0.68) and Venezuela (0.66). Honest caveats baked into every response: DTW silhouette is only ~0.38 (modest separation, cohorts overlap), a cohort shift can be data-driven rather than regime-driven (treat as a signal to investigate not a verdict), and the monthly cadence will miss faster transitions. Live at GET /v1/atlas/cohort-migration; per-country history at /v1/atlas/cohort-migration/{cc}; rebuilds monthly on the 1st at 06:00 UTC.
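The stable-label step and the month-over-month diff might look roughly like the sketch below. It uses anchor overlap only; the centroid-burstiness tie-break mentioned above is omitted, and treating "no anchor majority" as C2 and every move other than C1 to C3 or C3 to C1 as "lateral" are assumptions for illustration.

```python
C1_ANCHORS = {"US", "GB", "DE", "FR", "CA"}
C3_ANCHORS = {"CN", "RU", "IR", "MM"}
# Only these two directions are documented in the text; everything else is assumed lateral.
DIRECTION = {("C1", "C3"): "deteriorating", ("C3", "C1"): "improving"}


def semantic_label(members):
    """Map one raw cluster (a set of country codes) to a stable label by anchor overlap."""
    c1 = len(members & C1_ANCHORS) / len(C1_ANCHORS)
    c3 = len(members & C3_ANCHORS) / len(C3_ANCHORS)
    if c1 == c3:
        return "C2"  # assumption: no anchor majority; the source also uses burstiness here
    return "C1" if c1 > c3 else "C3"


def assign(clusters):
    """clusters: arbitrary cluster id -> set of country codes. Returns country -> label."""
    return {cc: semantic_label(members) for members in clusters.values() for cc in members}


def migrations(previous, current):
    """Countries whose label changed between two snapshots, with a direction label."""
    moved = []
    for cc in sorted(previous.keys() & current.keys()):
        if previous[cc] != current[cc]:
            pair = (previous[cc], current[cc])
            moved.append((cc, pair[0], pair[1], DIRECTION.get(pair, "lateral")))
    return moved
```

Because labels are derived from membership rather than from the clustering library's integer IDs, a re-run that renumbers clusters does not register as a migration.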
#cohort#dtw#clustering#regime-change#time-series#transparency#ml-honesty#api - 2026-05-21
Synthetic baseline benchmark: every Atlas ML model vs predict_yesterday, base-rate, and four other trivial baselines
It is easy to claim "F1 0.87" for a censorship-forecast model. It is much harder to answer "is the model adding value over predict_yesterday?" honestly. This finding ships the synthetic-baseline benchmark suite: every Voidly Atlas ML model evaluated against six trivial baselines (always_zero, always_one, base_rate_constant, predict_yesterday, country_base_rate, random_with_base_rate), with the lift surfaced inline as a single number per row. Live at GET /v1/atlas/baseline-benchmark; weekly cron Thursday 04:45 UTC. First run benchmarked 23 models. 7 flagged barely_beats_baseline=true (F1/AUC lift < 5pp vs predict_yesterday) — including forecast_7day (F1 lift +2.29pp), forecast_1d/7d/30d (multi-horizon, all F1-negative vs persistence), classifier_v3.3 (F1 lift -21pp on rolling holdout — different task framing, documented honestly), and trajectory_d7/d30 (AUC lift -21 to -23pp because persistence dominates on 30-day horizons). Best by AUC lift is per_domain_pornhub.com at +50.74pp, but every AUC=1.000 row gets an explicit "code smell" honest_caveat — the model is likely reconstructing the labeling rule from a leaked feature, not discovering signal. Worst is trajectory_d30 at -22.78pp AUC. Brier sign-flipped so positive=better. Filters: family, barely, min_lift_pp. Six baselines documented in /info. The point is accountability: a model that barely beats predict_yesterday is on a list the user can see, not buried under a nice-looking F1.
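To make the lift number concrete, here is a minimal sketch of the F1 comparison against predict_yesterday for a single country's daily label series. The helper names are hypothetical and the first day's persistence prediction defaulting to 0 is an assumption; the 5pp floor is the one quoted above.

```python
def f1(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def predict_yesterday(y_true):
    """Persistence baseline: today's prediction is yesterday's label (day 0 defaults to 0)."""
    return [0] + list(y_true[:-1])


def f1_lift_pp(y_true, y_model):
    """Model F1 minus persistence F1, in percentage points."""
    return 100.0 * (f1(y_true, y_model) - f1(y_true, predict_yesterday(y_true)))


def barely_beats_baseline(y_true, y_model, floor_pp=5.0):
    return f1_lift_pp(y_true, y_model) < floor_pp
```

A model whose predictions simply echo the previous day's label has a lift of exactly 0pp and lands on the flagged list, which is the behaviour the benchmark is designed to expose.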
#baseline#benchmark#accountability#ml-honesty#transparency#predict-yesterday#lift#api - MMNGSAAZEGRUPKUZBYTHBDKZVNIRSYUA2026-05-21
Pre-protest GDELT correlator: do news-mention spikes predict a shutdown 48h later?
Many internet shutdowns are reactive — they happen after a protest spike makes the news. We pulled daily GDELT counts of PROTEST + RIOT mentions for the 29 censorship-heavy countries already tracked by our event-ingest pipeline, computed a 30-day rolling z-score per country, and tested whether today's z-score predicts a confirmed-censorship/mixed shutdown in the next 48 hours. Across 16 countries with enough overlap (>=21 days of GDELT data, >=2 shutdowns in window), 2 cleared our promote-floor of Pearson r >= 0.30 + p < 0.05: Myanmar (r=+0.41, AUC=0.85, n=5 shutdowns) and Nigeria (r=+0.34, AUC=0.97, n=2). Saudi Arabia sits on the line (r=+0.30) but AUC is near-chance (0.48) so we don't count it. Russia and Egypt trend mildly positive (r=+0.10 each) but aren't individually significant on this 121-day window. Promote-floor required 5 significant countries — we hit 2 — so the model ships as promoted:false. We expose every country's number anyway, because the underlying signal is real for Myanmar and the methodology is transparent enough for downstream consumers to use the per-country significance flag. Pipeline already wired to a 04:15 UTC GDELT ingest + 04:30 UTC correlator rebuild cron. Live at GET /v1/sentinel/pre-protest-signal/{info,leaderboard,<cc>}. Honest caveats baked into every response: GDELT counts mentions not events, per-country baselines are narrow (30d), media-blackout countries have biased z-scores, correlation != causation.
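A minimal sketch of the per-country computation follows. It assumes daily integer mention counts, a z-score computed against the preceding 30 days (excluding today), and a label that is positive when a shutdown falls on either of the next two days; the source does not spell out these details, and the significance test (p < 0.05) is left out.

```python
import math
import statistics


def rolling_z(counts, window=30):
    """z-score of each day's count against the preceding `window` days (None until warm)."""
    z = []
    for i, count in enumerate(counts):
        history = counts[max(0, i - window):i]
        if len(history) < window:
            z.append(None)
            continue
        mean = statistics.fmean(history)
        sd = statistics.pstdev(history)
        z.append((count - mean) / sd if sd > 0 else 0.0)
    return z


def shutdown_in_next_48h(shutdown_days, n_days):
    """Label day i positive if a shutdown falls on day i+1 or i+2 (assumed reading of 48h)."""
    return [int(i + 1 in shutdown_days or i + 2 in shutdown_days) for i in range(n_days)]


def pearson(xs, ys):
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)
```

In use, days where rolling_z returns None would be dropped before correlating the remaining z-scores with the labels.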
#gdelt#protest#early-warning#sentinel#correlation#transparency#ml-honesty#not-promoted#api - IRNGMMBDKEVERUIQETSY2026-05-21
Pre-shutdown network signal detector: do BGP, TLS-reset and new-ASN precursors lead a blocking event?
A user-visible shutdown is the end of a process: by the time domains stop loading, the routing and filtering infrastructure has often already moved. This finding ships a per-country composite "pre-shutdown signal score" built from three technical precursors that TEND to precede user-visible blocking, computed daily over the trailing 90 days. bgp_signal: IODA BGP-datasource critical/warning alert counts (route withdrawals), normalized to that country's own 90-day baseline. tls_fail_spike: TLS / TCP-reset interference evidence rows (http-blocking-tcp-reset etc.) against a trailing 7-day mean; a DPI box being switched on shows here before the blockpage does. new_asn_signal: count of distinct ASNs lighting up with anomaly evidence today that were not seen the prior week, which indicates blocking infrastructure spreading. The composite is a 0.5/0.3/0.2 weighted sum, then z-scored within each country so every country is on a comparable scale. We back-tested the composite against 2,240 evaluable confirmed shutdowns (incident_type censorship/mixed/disruption, severity critical) in the window: for each shutdown, did the composite cross z>=1.5 in the 72 hours before the shutdown timestamp? 522 of 2,240 did: a 23.3% true-positive rate (recall), with a median lead time of 31.3 hours (mean 39.9h). The honest other side: of 585 country-days where the signal fired, 180 had no shutdown in the following 72h, so 30.8% of firings were false alarms (strictly a false-discovery rate, since the denominator is firings rather than shutdown-free country-days). So the signal is real but partial: it clears the promote-floor (>=3 historical shutdowns with >=1h lead; we got 522) and ships as promoted:true, but it catches under a quarter of shutdowns and is wrong about a third of the time it fires. The reason is structural and stated in every API response: many shutdowns are SUDDEN with no measurable precursor, IODA BGP has a ~6h ingest lag that hides any sub-6h lead, and an elevated composite reflects anomaly activity that frequently resolves without a shutdown. 
Use it as a supporting signal alongside the 7-day forecast, never as a standalone trigger. Live at GET /v1/sentinel/pre-shutdown-signal/{info,leaderboard,<cc>}. Built by scripts/build-pre-shutdown-signals.py.
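The back-test check and the two headline rates can be reproduced with a short sketch. The counts are the ones quoted above; the function name, the hourly sampling of the composite, and measuring lead time from the earliest crossing in the window are assumptions.

```python
def lead_time_hours(signal, shutdown_hour, threshold=1.5, lookback_h=72):
    """signal: list of (hour, z) samples for one country.

    Returns hours between the earliest threshold crossing inside the lookback
    window and the shutdown, or None if the composite never crossed.
    """
    crossings = [
        hour for hour, z in signal
        if shutdown_hour - lookback_h <= hour < shutdown_hour and z >= threshold
    ]
    return shutdown_hour - min(crossings) if crossings else None


# Counts reported in the finding.
caught, evaluable_shutdowns = 522, 2240
false_fires, total_fires = 180, 585

recall = caught / evaluable_shutdowns        # share of shutdowns preceded by a crossing
false_discovery = false_fires / total_fires  # share of firings with no shutdown within 72h

print(f"recall {100 * recall:.1f}%, false alarms among firings {100 * false_discovery:.1f}%")
```

Note the two rates use different denominators (shutdowns versus firings), so they do not sum to anything meaningful and should be read separately.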
#bgp#tls#asn#pre-shutdown#early-warning#sentinel#precursor#back-test#transparency#ml-honesty#promoted#api - IRPKCNRUTRVEIDMYINTHSASGKZIQDZVNAEAZBDBYEGJOMAMM2026-05-21
Circumvention recommendation engine: ranked try-first / fallback / avoid per country, evidence-based
Activists, journalists, and refugees routinely need to know which VPN / Tor / Lantern variant actually works in their country. The historical answer has been "ask in a forum, try a few, see what survives." This endpoint ranks per-country circumvention tools on top of the evasion-success-rate sidecar (per-tool probe success, last 30 days). Rank score = success_rate * sqrt(confidence) * recency_decay. Tier buckets: try_first (success >= 70% AND confidence >= 0.70), fallback (success >= 30% OR confidence < 0.50), avoid (success < 10% AND confidence >= 0.50). Conservative gate refuses to emit tier labels for any country with < 100 total probes — a missing tool is almost always an OONI-coverage gap, not a working tool. First run: 22/34 countries with viable recommendations, 13 with at least one try_first tool. Iran (IR) is correctly gated out at only 83 probes (Tor=72/72 at 54%, Psiphon=11/11 at 0%) — too small to rank. Pakistan (PK) is the worst case: every measured tool falls into avoid (Tor 0.3% over 1,495 probes, generic-VPN 0% over 16, ProtonVPN 0%, Psiphon 0%, Lantern 0%); the endpoint surfaces Tor as least-bad with an explicit coverage_note saying the headline is NOT a working recommendation and suggesting self-hosted WireGuard / Cloudflare WARP / Snowflake-only Tor. Try-first headlines: Venezuela Tor 96.7%, Indonesia Tor 96.4%, Turkey Tor 96.2%, Malaysia Tor 96.1%, Iraq Tor 94.3%, India Tor 89.1%, Thailand Tor 88.8%, Russia generic-VPN 87.0%. Honest caveats hammered on every response: probe success != real-user success (DPI may rate-limit real traffic but not probes), recommendations stale within hours, "try first" != safe, domain-fronted tools under-estimated, self-hosted invisible. Response carries do_not_share_card_without_caveats:true by design. Live at GET /v1/atlas/circumvention/{cc} + GET /v1/atlas/circumvention/info.
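The ranking and tier rules can be sketched as below. The thresholds and the 100-probe gate are from the text; the order in which the tier rules are evaluated, the input field names, and returning None for combinations that match none of the published rules are assumptions.

```python
import math

MIN_COUNTRY_PROBES = 100  # conservative gate from the text


def rank_score(success_rate, confidence, recency_decay):
    return success_rate * math.sqrt(confidence) * recency_decay


def tier(success_rate, confidence):
    """Apply the published thresholds; try_first is checked first (assumed order)."""
    if success_rate >= 0.70 and confidence >= 0.70:
        return "try_first"
    if success_rate < 0.10 and confidence >= 0.50:
        return "avoid"
    if success_rate >= 0.30 or confidence < 0.50:
        return "fallback"
    return None  # e.g. 10-30% success at confidence >= 0.50 matches no stated rule


def recommend(total_probes, tools):
    """tools: list of dicts with name, success_rate, confidence, recency_decay."""
    if total_probes < MIN_COUNTRY_PROBES:
        return None  # refuse to emit tier labels for thin coverage
    ranked = sorted(
        tools,
        key=lambda t: rank_score(t["success_rate"], t["confidence"], t["recency_decay"]),
        reverse=True,
    )
    return [(t["name"], tier(t["success_rate"], t["confidence"])) for t in ranked]
```

Under this reading, a country with 83 total probes returns None however good its per-tool numbers look, which matches the Iran case described above.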
#circumvention#tor#vpn#recommendation#evidence-based#ml-honesty#safety-disclaimer#transparency#api - IRRUBYTRPKEGUZAZVNBD2026-05-21
OONI test-type meta-classifier: per-country diagnostic ranking (which test type matters where)
Voidly Atlas runs eight OONI test types every six hours: web_connectivity, signal, whatsapp, telegram, facebook_messenger, tor, http_invalid_request_line, http_header_field_manipulation. Until today all eight were treated as equal contributors to country-day censorship labels. The OONI test-type meta-classifier asks: for each country, which test type is MOST diagnostic of actual censorship? Pipeline: bucket evidence at (country, day, test_type) by parsing test_name= out of OONI Explorer URLs in evidence.source_url, compute per-bucket anomaly rate from upstream_claim "N/M measurements anomalous" pattern (or fallback to fraction of non-ok signal_type), label country-days positive iff a confirmed censorship/mixed incident covers them +/- 1 day (343 incidents in scope), fit a per (country, test_type) logistic regression (class_weight=balanced, standardized input), require >=5 positive AND >=5 negative days, score AUC in-sample, rank. 30 countries cleared the 50-labeled-day floor; 13 show an AUC range > 0.10 between best and worst test type — both promote gates passed. Globally web_connectivity wins #1 most often (12 countries), then tor (5), http_invalid_request_line (4), the rest at 2 each. Spotlight: Iran top test = tor at AUC 0.838 (classic Tor-bridge interference), Russia top test = web_connectivity at AUC 0.977 (TSPU broad-protocol fingerprint), China not viable (only 4 incidents in the 2y window — too few positive days for AUC). Honest caveats: AUC is in-sample (no train/test split), test types have wildly different probe densities so AUC is not perfectly apples-to-apples, diagnostic != causal, the +/- 1 day label window can leak signal between adjacent days. Live at GET /v1/atlas/ooni-test-diagnostic + /info + /<cc>.
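One way to see what the in-sample AUC ranking measures: if the logistic regression has a single input (the per-bucket anomaly rate, which is an assumption about the feature set), its output is a monotone transform of that input, so its in-sample AUC equals the rank-based AUC of the raw anomaly rate whenever the fitted coefficient is positive. The sketch below ranks test types on that basis; the function names and input layout are hypothetical.

```python
def auc(scores, labels):
    """Probability that a positive day outscores a negative day; ties count half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def rank_test_types(days_by_test, min_pos=5, min_neg=5):
    """days_by_test: test_type -> list of (anomaly_rate, label) for one country.

    Test types with fewer than min_pos positive or min_neg negative days are skipped,
    mirroring the >=5 / >=5 requirement in the text.
    """
    ranked = []
    for test_type, days in days_by_test.items():
        labels = [label for _, label in days]
        n_pos = sum(labels)
        if n_pos < min_pos or len(labels) - n_pos < min_neg:
            continue
        ranked.append((test_type, auc([rate for rate, _ in days], labels)))
    return sorted(ranked, key=lambda item: item[1], reverse=True)
```

Because nothing is held out, these numbers describe how well each test type separates the labelled days already seen, not how well it would predict new ones.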
#ooni#test-types#diagnostic#per-country#feature-importance#transparency#ml-honesty#journalist-facing - EGJOMAUZTZINMMAZIR2026-05-21
Live next-24h contagion watchlist: which countries are most likely to block in the next 24-48h
The contagion-chain model shipped earlier this week is descriptive ("given country A blocked, P(B follows in 7d) = X"). This new endpoint is PROACTIVE: it consumes the triggers that ACTUALLY fired in the last 48h and ranks every candidate follower country by its risk of blocking in the next 24-48h. Pipeline: pull confirmed-censorship/mixed incidents in last 48h (dedup to country-day) → score every candidate follower at horizon=3d (closest available proxy to "tomorrow") → aggregate via noisy-OR across all active triggers (1 - prod(1-p_i)) → subtract base_rate_daily × 3 so countries that block every day anyway don't dominate → sort. First run (2026-05-21): 4 active triggers (EG, JO, MA, UZ all on 2026-05-20). Top-5 most-at-risk: Tanzania (score 0.94), India (0.89), Myanmar (0.87), Azerbaijan (0.82), Iran (0.81). Honest caveats baked into every response — inherits underlying classifier's 0.67 AUC, 24-48h is a TIGHT window (most contagion patterns are 3-7d), attribution is probabilistic not causal, noisy-OR overestimates when triggers share a driver, base-rate adjustment is a heuristic. Endpoints: GET /v1/atlas/contagion-watchlist (ranked top-K + active triggers + meta), /v1/atlas/contagion-watchlist/info (meta only), /v1/atlas/contagion-watchlist/{cc} (per-country attribution — which triggers are pushing this country up). Cron every 6h.
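The aggregation step is small enough to show directly. This sketch takes the per-trigger follow probabilities as given; the function names are hypothetical, and whether the production score is clamped at zero is not stated, so no clamping is applied here.

```python
def noisy_or(probabilities):
    """1 - prod(1 - p_i): chance that at least one active trigger is followed."""
    survive = 1.0
    for p in probabilities:
        survive *= 1.0 - p
    return 1.0 - survive


def watchlist_score(trigger_probs, base_rate_daily, horizon_days=3):
    """Noisy-OR over active triggers minus the expected baseline over the horizon."""
    return noisy_or(trigger_probs) - base_rate_daily * horizon_days
```

Two triggers at 0.5 each combine to 0.75, and a country that already blocks on 10% of days gives back 0.30 of that, for a score of 0.45. The noisy-OR treats triggers as independent, which is why it overestimates when several triggers share one underlying driver.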
#contagion#forecast#live#proactive#watchlist#next-24h#transparency#ml-honesty#api - 2026-05-21
Daily domain delta: which domains gained or lost blocking countries overnight
Most censorship dashboards answer "is X blocked in Y right now?" The Voidly Atlas daily domain delta answers the question journalists actually need at 8 AM: which domains gained or lost blocking countries between yesterday and today. Pipeline diffs the per-domain set of blocking countries day-over-day across the 85K-row evidence table, requires min_observations=3 per (domain, country, day) to drop intermittent noise, and ships top 25 gainers + top 25 losers + a 7-day per-domain history as a pre-rendered sidecar. First build (2026-05-21) surfaced 15 movers across 13 domains — chat.openai.com and chatgpt.com both lost CN (overnight probe coverage shift, not a policy shift), Skype and Copilot picked up 14 countries each (probe rotation expanding into countries not probed yesterday). The endpoint refuses to dress this up: four honest caveats are baked into every response — probe coverage is uneven, intermittent connectivity flips bits, 24h windows have <500 global block measurements, and "lost a block" can just mean the country was probed once today vs three times yesterday. Endpoints: GET /v1/atlas/domain-delta (top movers + filters), /v1/atlas/domain-delta/<domain> (7d per-domain history), /v1/atlas/domain-delta/info (methodology + freshness). Cron 0 4 * * *.
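The day-over-day diff reduces to set arithmetic per domain. In this sketch the row layout (domain, country, observation count, blocked flag) and the function names are assumptions; min_observations=3 is the threshold from the text.

```python
def blocking_sets(rows, min_observations=3):
    """rows: iterable of (domain, country, n_observations, blocked) for one day.

    Returns domain -> set of blocking countries, dropping thin (domain, country) cells.
    """
    blocked_by = {}
    for domain, country, n_observations, blocked in rows:
        if blocked and n_observations >= min_observations:
            blocked_by.setdefault(domain, set()).add(country)
    return blocked_by


def domain_delta(yesterday, today):
    """Per-domain countries gained and lost between two blocking_sets() outputs."""
    movers = {}
    for domain in yesterday.keys() | today.keys():
        before = yesterday.get(domain, set())
        after = today.get(domain, set())
        gained, lost = after - before, before - after
        if gained or lost:
            movers[domain] = {"gained": sorted(gained), "lost": sorted(lost)}
    return movers
```

The sketch also shows where the coverage caveat comes from: a country probed three times yesterday and once today drops out of the set and appears as "lost" even though nothing about the block changed.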
#daily#domain-tracking#delta#journalist#leading-indicator#transparency#ml-honesty#api - RUIRPKBDEGTRCN2026-05-21
Government statement scraper: pairing ministry press releases with Voidly shutdown incidents
Voidly Atlas previously saw shutdowns only from the network side (OONI/IODA/Voidly probes). This v1 ships a curated government-statement scraper over 7 ministries (Russia Roskomnadzor, Iran MICT, Pakistan PTA, Bangladesh BTRC, Egypt MCIT, Turkey BTK, China MIIT), an NLP entity extractor (domain mentions, ASN refs, language-aware keyword vocabularies in en/ru/fa/ar/tr/bn/zh), and a correlation pass that pairs each statement with Voidly incidents in the same country within ±72h. First run ingested 13 statements (BD 8 + TR 5) across the reachable sources and surfaced 13 (statement, incident) pairs; the top pair (confidence 0.545) is a Turkey BTK regulatory draft published ~48h after a critical IODA-confirmed connectivity disruption on 2026-04-28 — a retroactive timing flagged correctly. 2 sources (RU, IR) are unreachable from our Vultr egress; 2 (PK, CN) reached but JS-shell-only — all documented up-front rather than papered over. Cron every 6h. Endpoints: GET /v1/atlas/government-statements, /v1/atlas/government-statements/info, /v1/atlas/government-statements/correlations, /v1/atlas/government-statements/{stmt_id}.
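The correlation pass is essentially a windowed join. A minimal sketch, assuming statements and incidents are dicts with the hypothetical fields shown; the confidence scoring that produces values such as 0.545 is not described in the source and is not attempted here.

```python
from datetime import datetime, timedelta


def pair_statements(statements, incidents, window=timedelta(hours=72)):
    """Pair each statement with same-country incidents within +/- window.

    lag_hours > 0 means the statement was published after the incident started
    (a retroactive statement); lag_hours < 0 means it came first.
    """
    pairs = []
    for statement in statements:
        for incident in incidents:
            if statement["country"] != incident["country"]:
                continue
            lag = statement["published_at"] - incident["started_at"]
            if abs(lag) <= window:
                pairs.append((statement["id"], incident["id"], lag.total_seconds() / 3600.0))
    return pairs
```

Keeping the sign of the lag is what lets a pair be flagged as retroactive rather than as an announcement that preceded the disruption.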
#scraper#government#press-releases#cross-source#investigative#transparency#ml-honesty - IRCNRUTRPKEGSAAEIDBYVE2026-05-21
Auto-fact-check service for journalist claims: natural-language → verdict + evidence permalinks in milliseconds
Most censorship-research platforms force a journalist to manually query a country page, then a service page, then cross-reference probe rows by hand. The Voidly Atlas auto-fact-check service inverts that flow: a journalist types a natural-language claim ("Twitter is blocked in Iran") and gets a verdict, a confidence score, and five evidence permalinks in under a second. The endpoint POST /v1/atlas/fact-check ships today with 95% accuracy on a 20-claim benchmark (median latency 178ms public / 7ms upstream). Verdicts: confirmed_block (≥3 block rows from ≥2 independent sources), partial_block (some block evidence, below the floor), not_observed (zero rows — and explicitly flagged as ambiguous between "accessible" and "no probe coverage"), contradicted (claim disagrees with evidence), insufficient_data (unparseable claim). Built on the 85K-row evidence table (OONI + IODA + CensoredPlanet + Voidly probes). Claim parser is heuristic (122 country aliases, 36 service aliases + 9 OONI test-name mappings); honest caveats baked into every response. Corroboration sources, top-5 permalinks, last_observed_at, signal_types breakdown, and average block signal all returned inline. Free tier, cached 5 minutes on the Worker edge.
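The verdict thresholds can be read as a small decision function. This sketch covers only claims of the form "X is blocked in Y"; the row fields are hypothetical, and mapping "rows exist but none show a block" to contradicted is an assumption about how that verdict is reached.

```python
def verdict(rows, claim_parsed=True):
    """rows: evidence rows for the claimed (country, service), each with
    a boolean 'blocked' and a 'source' name."""
    if not claim_parsed:
        return "insufficient_data"
    if not rows:
        return "not_observed"  # ambiguous: accessible, or simply no probe coverage
    block_rows = [row for row in rows if row["blocked"]]
    sources = {row["source"] for row in block_rows}
    if len(block_rows) >= 3 and len(sources) >= 2:
        return "confirmed_block"
    if block_rows:
        return "partial_block"
    return "contradicted"  # assumption: measurements exist and none show a block
```

The two-source requirement matters: three block rows that all come from one upstream still return partial_block.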
#fact-check#journalism#claim-verification#evidence-based#transparency#ml-honesty#api - IRCNRUPKAZEGBYTRIQVEIDMYIN2026-05-21
Block-evasion success-rate index: which circumvention tools actually reach their bootstrap endpoint, per country
Activists and journalists routinely ask which circumvention tool actually works in a given country. The honest historical answer has been "try a few and see what survives." The Voidly evidence table records every measured block of a known bootstrap endpoint with probe counts embedded in the upstream_claim text ("11/11 probes anomalous for psiphon.ca"). We aggregate those rows into a per-country per-tool success rate over the last 30 days. Tools covered: Tor (rolled-up OONI tor test), Lantern, Psiphon, Snowflake, ProtonVPN, ExpressVPN, Mullvad, VPN Gate, plus a generic-VPN aggregate. First run: 17,144 candidate evidence rows, 131 (country, tool) pairs across 34 countries. Best-working combos are all Tor in countries with aggressive but non-GFW-grade filtering: Venezuela (96.7% n=153), Indonesia (96.4% n=584), Turkey (96.2% n=1,223), Malaysia (96.1% n=77), Iraq (94.3% n=176). Worst countries (even best tool fails): Azerbaijan (best=ProtonVPN at 0% over 70 probes), Pakistan (best=Tor at 0.3% over 1,495 probes — every commercial-VPN domain is near 100% blocked), Egypt, Belarus. Honest caveats inline in every API response: probe coverage uneven, "probe success" != "real user success" (DPI passes synthetic measurement traffic), domain-fronted tools (Snowflake, Lantern, Psiphon) are systematically under-estimated by static-domain probes, self-hosted WireGuard / OpenVPN on a private IP is invisible to OONI and cannot be measured. Tor rows aggregate every transport (vanilla / obfs4 / Snowflake / Meek); per-transport breakdown is out of scope for v1. Live at GET /v1/atlas/evasion/{cc} + GET /v1/atlas/evasion/leaderboard + GET /v1/atlas/evasion/info. Cron daily 06:00 UTC.
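The aggregation from claim text to a success rate might be sketched as follows. The regular expression matches the example string quoted above; treating every non-anomalous probe as a success, and the row layout, are assumptions.

```python
import re

# Matches strings like "11/11 probes anomalous for psiphon.ca".
CLAIM_RE = re.compile(r"(\d+)/(\d+) probes anomalous")


def success_rates(rows):
    """rows: iterable of (country, tool, upstream_claim).

    Returns (country, tool) -> (success_rate, total_probes), where success is
    assumed to be 1 - anomalous / probes pooled over all rows in the window.
    """
    totals = {}
    for country, tool, claim in rows:
        match = CLAIM_RE.search(claim)
        if not match:
            continue  # rows without a parseable N/M ratio are skipped
        anomalous, probes = int(match.group(1)), int(match.group(2))
        prev_anomalous, prev_probes = totals.get((country, tool), (0, 0))
        totals[(country, tool)] = (prev_anomalous + anomalous, prev_probes + probes)
    return {
        key: (1.0 - anomalous / probes, probes)
        for key, (anomalous, probes) in totals.items()
        if probes > 0
    }
```

Pooling counts before dividing, rather than averaging per-row rates, keeps a single 1/1 row from weighing as much as a 500/584 row.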
#evasion#tor#vpn#circumvention#evidence-based#transparency#ml-honesty#investigative - PKUZNISDERIRRUCNINTR2026-05-21
Per-day model uncertainty surfacer: which Voidly forecasts to question today
Voidly Atlas already surfaces model confidence universally (conformal intervals on the 7-day forecast, AUC/F1 on the classifier, online ACI alpha for drift). What was missing: a single SCORE for THIS day's prediction in THIS country, so a journalist asking "is the IR forecast trustworthy TODAY?" gets one number to anchor on. The new per-day uncertainty surfacer scores 30 watched countries across 5 production models (forecast-7day, multi-horizon-1d/7d/30d, classifier v3.3) as a linear composite of five components: conformal interval width (weight 0.30), data freshness (0.20), calibration drift (0.20), sample size (0.15), and cross-model disagreement (0.15). Today's most-uncertain country is Pakistan (mean 0.4443, 85.55pp calibration drift, cross-model agreement only 0.226 — forecast says 0.95 risk, classifier says 0.01). Today's least-uncertain: Turkey (mean 0.196, calibration in spec). Iran sits mid-pack at mean 0.2122 with cross-model agreement 0.6725. The score is explicitly a HEURISTIC composite, not a Bayesian posterior; weights are hand-tuned not learned; and "uncertain" does NOT mean "wrong" — it means lower confidence. Endpoint live at GET /v1/atlas/uncertainty/{cc}, GET /v1/atlas/uncertainty/most-uncertain, GET /v1/atlas/uncertainty/info.
#uncertainty#transparency#ml-honesty#journalist-facing#calibration#cross-model - CNRUIRMMPKTRBYAEBHSA2026-05-21
DPI fingerprint library: heuristic vendor attribution for 19,506 evidence rows across 10 device families
Voidly Atlas previously told you HOW a country blocks (DNS / TCP / TLS / blockpage) but not WHICH VENDOR. The new DPI Fingerprint Library v1 closes that gap with a curated, public, citation-backed library of 14 deep-packet-inspection devices: 7 state-deployed (Russia TSPU, China GFW, Iran ARIA DPI, Belarus Beltelecom, Turkey BTK, Myanmar Junta, Pakistan PTA) and 7 commercial appliances (FortiGate, Sangfor, Netsweeper, Blue Coat, Smartfilter, Cisco WSA, Palo Alto). Each fingerprint is a hand-curated rule with up to four components: country_prior (a HARD GATE), signal_type_in, blocking_method_in, and an optional upstream_claim_regex, plus a confidence floor. We emit at most one vendor match per evidence row, choosing the highest weighted-sum confidence above the floor; ties go to the rule with the most specific signal (regex > blocking_method > country_prior). Backfilled across 85,549 evidence rows: 19,506 matches (22.8%) spread across 10 of 14 vendors. Top vendors: China GFW (6,378), Russia TSPU (5,197), Iran ARIA DPI (1,870), Myanmar Junta DPI (1,709), Pakistan PTA WMS (1,449), Turkey BTK DPI (1,399), Belarus Beltelecom DPI (937), FortiGate (519), Blue Coat (30), Netsweeper (18). Honest caveats inline: heuristic matching not ML, public fingerprints lag vendor updates, an evidence row matching a vendor does NOT prove that vendor performed the block, only that the signal is consistent with that vendor's known behaviour pattern. The four vendors with zero matches (Sangfor, Cisco WSA, Palo Alto, Smartfilter) need blockpage HTML text in upstream_claim, which OONI does not always preserve; that gap is honest, not hidden. Live at GET /v1/atlas/dpi-fingerprints + GET /v1/atlas/dpi-fingerprints/{vendor_slug} + GET /v1/atlas/dpi-distribution.
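A rule matcher along these lines could look like the sketch below. The component weights are invented for illustration (the source says only "weighted-sum confidence"), the rule and row field names are hypothetical, and the example rules in the usage note are not the library's real fingerprints.

```python
import re

# Assumed component weights; not taken from the source.
WEIGHTS = {
    "country_prior": 0.4,
    "signal_type_in": 0.2,
    "blocking_method_in": 0.2,
    "upstream_claim_regex": 0.2,
}


def score(rule, row):
    """Return (confidence, specificity) for one rule on one evidence row, or None."""
    if "country_prior" in rule and row["country"] not in rule["country_prior"]:
        return None  # hard gate: wrong country, no match regardless of other signals
    confidence = WEIGHTS["country_prior"] if "country_prior" in rule else 0.0
    specificity = 1
    if row["signal_type"] in rule.get("signal_type_in", ()):
        confidence += WEIGHTS["signal_type_in"]
    if row["blocking_method"] in rule.get("blocking_method_in", ()):
        confidence += WEIGHTS["blocking_method_in"]
        specificity = 2
    pattern = rule.get("upstream_claim_regex")
    if pattern and re.search(pattern, row["upstream_claim"], re.IGNORECASE):
        confidence += WEIGHTS["upstream_claim_regex"]
        specificity = 3
    return round(confidence, 6), specificity


def best_vendor(row, rules):
    """At most one vendor per row: highest confidence above the rule's floor,
    ties broken by the most specific matched component."""
    candidates = []
    for rule in rules:
        scored = score(rule, row)
        if scored is not None and scored[0] >= rule["floor"]:
            candidates.append((scored, rule["vendor"]))
    return max(candidates)[1] if candidates else None
```

A commercial-appliance rule with no country_prior can only reach its floor through the regex component in this sketch, which is consistent with the note above that such rules depend on blockpage text surviving in upstream_claim.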
#dpi#vendor-attribution#fingerprints#investigative#transparency#ml-honesty - 2026-05-21
Multi-country anomaly burst detector: candidate coordinated censorship campaigns
Single-country anomaly detectors (DBSCAN, STL) catch local events. This burst detector catches CROSS-COUNTRY synchronized events — K>=3 countries flipping anomalous on the same day. Pipeline: 90d lookback, mirror the live DBSCAN scoring (45d rolling window, eps=75th-pct kNN, min_samples=3) over 3,718 (country, day) cells, group 820 flips by day, flag days with K>=3 distinct countries as candidate bursts, mine the underlying evidence to find the modal shared domain / blocking method / signal type as the hypothesized common factor. Significance: under independence per-country flip rates, p_any = 1 - (1 - product(rates))^N_days, Bonferroni-corrected over N_days. First run: 73 bursts in 90 days, 33 significant at p_adj < 0.05. Largest burst (2026-05-03, K=58, p_adj=0.0000) hypothesizes shared_domain:chat.openai.com — almost certainly OONI's coordinated probe sweep, not coordinated censorship (the exact failure mode we flag honestly). More-credible candidate bursts: protonvpn.com spread across 23-26 countries 2026-04-12 to 04-14 (anti-circumvention pattern), recurring twitter.com (K=22-24) and facebook.com (K=22-31) bursts. Sidecar at /opt/voidly-ai/ml-deploy/anomaly_bursts_v1.json. Live at GET /v1/atlas/anomaly-bursts + GET /v1/atlas/anomaly-bursts/{burst_id}. Cron daily 05:30 UTC. Honest caveats: co-occurrence != coordination (could be coincidence, shared infra failure, or OONI methodology change), 6h bucketing collapsed to day because upstream observed_at is day-granular, independence assumption is wrong (neighbors correlate), DBSCAN AUC is only 0.65 so the flip signal itself is noisy.
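The significance formula can be evaluated directly. This sketch follows the expression in the text, p_any = 1 - (1 - product(rates))^N_days, and applies Bonferroni as a multiplication by N_days capped at 1; the function name is hypothetical and the log-space evaluation is an implementation choice for numerical stability, not something the source specifies.

```python
import math


def burst_p_values(flip_rates, n_days):
    """flip_rates: per-country daily flip rates for the K countries in the burst.

    Returns (p_any, p_adj) under the independence assumption.
    """
    joint = math.prod(flip_rates)  # P(all K countries flip on one given day)
    if joint >= 1.0:
        return 1.0, 1.0
    # 1 - (1 - joint) ** n_days, computed in log space so tiny joints do not round to 0.
    p_any = -math.expm1(n_days * math.log1p(-joint))
    p_adj = min(1.0, p_any * n_days)
    return p_any, p_adj
```

With 58 countries in a burst the product of rates is vanishingly small, so any such day reports p_adj near zero; the statistic says the co-occurrence is not chance under independence, not that censorship was coordinated.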
#anomaly-detection#burst#coordination#cross-country#ml-honesty#transparency - 2026-05-21
Competitive benchmark: Voidly vs Cloudflare Radar / Access Now / NetBlocks (20 landmark events)
Hand-curated lead/lag comparison across 20 landmark shutdown events (2019-2026): Mahsa Amini, Bangladesh quota protests, Brazil X/Twitter, Venezuela election, Kenya finance bill, Sudan coup, Myanmar coup, Uganda election, Cuba 2021, Kazakhstan 2022, Kashmir 552-day blackout, Bloody November, and 8 more. Of 20 events: 1 voidly_led (Lebanon 2026 BGP, +7 days vs Cloudflare via forecast-threshold-cross — our only true live-detection claim), 0 lagged, 0 tied, 19 backfilled (Voidly DB row from historical OONI ingestion, NOT a live pipeline detection — these do NOT support a real-time lead claim). Honest caveats baked in: we measure publication-vs-publication only (not internal detection at competing orgs), curated list is intentionally biased toward landmark events (use /v1/atlas/prediction-track-record for the non-cherry-picked sample), Cloudflare Radar's public event browser doesn't cover pre-2021 events. Sidecar at /opt/voidly-ai/ml-deploy/competitive_benchmark_v1.json. Live at GET /v1/atlas/competitive-benchmark + drilldown /v1/atlas/competitive-benchmark/{event_id} + GET /v1/atlas/competitive-benchmark/info. Cron weekly Tue 04:15 UTC.
#transparency#ml-honesty#benchmark#journalism#sources - 2026-05-21
Cross-protocol classifier: per-port blocking probability (8 protocol groups)
Eight small XGBoost classifiers, one per protocol group (HTTP-80, HTTP-headers, web_connectivity, TLS-WhatsApp, TLS-Signal, TLS-Telegram, TLS-FB-Messenger, Tor). Given a measurement to a (host, port) on a country/day, each model returns the probability that port is blocked. Built on OONI evidence by parsing the N/M anomalous-measurements ratio out of upstream_claim text (necessary because non-web_connectivity tests only store rows when blocking occurs). All 8 cleared promote floor: LOCO pooled-OOF AUC 0.98 to 0.999, per-country median AUC 0.89 to 1.0. Honest caveat: the high AUCs come from strong port-level blocking persistence (history is the dominant feature), not novel signal — and web-connect labels partially overlap with TLS-app labels (both touch 443). Live at GET /v1/classifier/protocol/{proto}/{cc} and /v1/classifier/protocol/info.
#classifier#protocol#ml#per-port#ml-honesty - 2026-05-21
STL seasonal anomaly detector (complement to DBSCAN, orthogonal signal)
New per-country anomaly detector using STL (Seasonal-Trend decomposition via Loess; Cleveland 1990) that learns each country's own weekly rhythm and flags days that break it. ORTHOGONAL to DBSCAN at /v1/anomaly/dbscan/{cc} — DBSCAN catches shape-anomalous days against a 45-day rolling cloud; STL catches days that deviate from THIS country's own seasonal pattern. Egypt may always have high anomaly_rate on Fridays — DBSCAN sees that as normal, STL flags non-Friday spikes as anomalous. Implementation in scripts/build-stl-seasonal-anomaly.py: 90-day per-country daily anomaly_rate time series, statsmodels.tsa.seasonal.STL(period=7, robust=True), residual = observed - (trend + seasonal), z-score within country, flag if |z|>2.0. Cron daily at 04:45 UTC. Sidecar at /opt/voidly-ai/models/stl_seasonal_anomaly_v1.json, time-series at /opt/voidly-ai/data/stl_seasonal_anomaly.parquet. First run: 43 of 211 countries had ≥60 days of data for viable fit. Today's top |z|: BY +4.38 (Belarus, only flagged today). Last-7-day STL set (BY, NG, QA, SA) has ZERO overlap with today's DBSCAN flagged set (AU, BR, CA, DE, EG, ES, FR, GB, IN, IQ, JP, KR, MA, MX, NL, SG, US, ZA) — 4 unique STL signals + 18 unique DBSCAN signals, exactly the orthogonal complement promised. Honest caveats: STL is descriptive not causal (high z != censorship — could be holiday, fiber cut, measurement-coverage artifact), sensitive to data sparsity (168 of 211 countries skipped for <60d data), period=7 assumes weekly seasonality (constant censors get a small seasonal component but still get residuals).
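The residual-flagging step can be sketched as follows. The STL call mirrors the parameters quoted above and needs the third-party statsmodels and numpy packages, so it is imported lazily; the function names are hypothetical and using the population standard deviation for the z-score is an assumption.

```python
import statistics


def stl_residuals(daily_anomaly_rate, period=7):
    """Residual = observed - (trend + seasonal) for one country's daily series."""
    import numpy as np
    from statsmodels.tsa.seasonal import STL  # third-party dependency

    series = np.asarray(daily_anomaly_rate, dtype=float)
    fit = STL(series, period=period, robust=True).fit()
    return list(fit.resid)


def flag_days(residuals, z_threshold=2.0):
    """Return (day_index, z) for days whose within-country residual z-score exceeds the threshold."""
    mean = statistics.fmean(residuals)
    sd = statistics.pstdev(residuals)
    if sd == 0:
        return []
    flagged = []
    for day, residual in enumerate(residuals):
        z = (residual - mean) / sd
        if abs(z) > z_threshold:
            flagged.append((day, z))
    return flagged
```

A typical call would be flag_days(stl_residuals(series)) on a 90-day series; because the weekly component is removed first, a spike on a habitually busy weekday has to exceed that weekday's usual level to be flagged.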
#anomaly-detection#stl#seasonal#orthogonal-signal#ml-honesty#transparency - 2026-05-21
ML serving reliability dashboard: 30+ endpoint health in one curl
With 30+ ML endpoints (/v1/forecast/*, /v1/classifier/*, /v1/anomaly/*, /v1/measurement/*, /v1/sentinel/*) in production, a journalist or partner asking "is the model live and working?" used to need 30+ curls. The new daily dashboard at GET /v1/atlas/serving-reliability collapses that into one. Cron runs scripts/build-serving-reliability.py at 04:00 UTC: 10 probes per endpoint, HTTP availability, p50 + p95 latency, JSON schema fingerprint (top-level-keys superset check), model staleness (days since the underlying sidecar was last written or trained_at declared), honest_caveats presence, and calibration drift (forecast/sentinel only). Sidecar at /opt/voidly-ai/ml-deploy/serving_reliability.json. Promote criteria: >= 25 endpoints probed and honest about any 500s/timeouts. Honest caveats: 10-probe sample is small so p95 is noisy, probes hit localhost upstream so CF gateway issues are NOT detected, schema check is a fingerprint not full JSON-schema, MCP server endpoints excluded.
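To make the "10-probe p95 is noisy" caveat concrete, here is a small sketch of the per-endpoint summary and the top-level-keys superset check. The function names and the nearest-rank percentile are assumptions for illustration; the build script may compute these differently.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize_probes(latencies_ms, status_codes):
    """Availability plus p50/p95 latency for one endpoint's probes."""
    ok = sum(1 for code in status_codes if 200 <= code < 300)
    return {
        "availability": ok / len(status_codes),
        "p50_ms": percentile(latencies_ms, 50),
        "p95_ms": percentile(latencies_ms, 95),
    }


def schema_fingerprint_ok(expected_keys, response):
    """Superset check: the response may add top-level keys but must not drop expected ones."""
    return set(expected_keys) <= set(response)


# Toy probe run: nine fast responses and one slow 500.
summary = summarize_probes(
    [10, 12, 11, 13, 12, 14, 11, 12, 13, 250],
    [200] * 9 + [500],
)
```

With only 10 samples the nearest-rank p95 is simply the slowest probe, so one outlier sets the reported figure.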
#monitoring#reliability#ml-honesty#transparency#observability - 2026-05-21
Per-country forecast calibration drift monitor (auto-alert at +/-15pp)
New monitor at scripts/build-per-country-calibration-drift.py walks the top-50 most-active forecast countries daily at 05:00 UTC and computes mean predicted probability vs empirical positive rate (censorship/mixed only, IODA disruption excluded) over a trailing 30-day window. Any country crossing +/-15pp drift gets flagged and fires a calibration_drift event into the existing CenAlerts pipeline (24h dedup). Sidecar at /opt/voidly-ai/ml-deploy/calibration_drift_by_country.json. Live at GET /v1/sentinel/calibration-drift (full table) and /v1/sentinel/calibration-drift/{cc} (per-country). Why this exists: the post-refit global drift is ~0pp, but individual countries can still drift in opposite directions and cancel out in the aggregate. Honest caveats: 30-day windows are narrow, high-frequency-flip countries (VE, MM) produce noisy estimates, drift > 0 != broken model.
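The cancellation argument is easy to show with numbers. The sketch below assumes drift is defined as empirical positive rate minus mean predicted probability (so positive means under-prediction); the function names, the sign convention, and the placeholder country codes AA and BB are ours.

```python
def calibration_drift_pp(predicted, observed):
    """Empirical positive rate minus mean predicted probability, in percentage points."""
    mean_pred = sum(predicted) / len(predicted)
    positive_rate = sum(observed) / len(observed)
    return 100.0 * (positive_rate - mean_pred)


def flag_drifting(per_country, threshold_pp=15.0):
    """per_country maps country code -> (predicted probabilities, 0/1 outcomes)."""
    flagged = {}
    for cc, (predicted, observed) in per_country.items():
        drift = calibration_drift_pp(predicted, observed)
        if abs(drift) >= threshold_pp:
            flagged[cc] = drift
    return flagged


outcomes = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # 40% positive rate in both toy countries
per_country = {
    "AA": ([0.2] * 10, outcomes),  # under-predicts by about 20pp
    "BB": ([0.6] * 10, outcomes),  # over-predicts by about 20pp
}
flagged = flag_drifting(per_country)

# Pooled over both countries, the two errors cancel.
pooled = calibration_drift_pp([0.2] * 10 + [0.6] * 10, outcomes + outcomes)
```

Both toy countries are flagged while the pooled drift is approximately zero, which is the situation a global-only monitor would miss.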
#monitoring#calibration#ml-honesty#transparency#cenalerts - 2026-05-21
Auto-incident watchdog: DBSCAN + Bayesian corroboration draft generator
New watchdog at scripts/auto-incident-watchdog.py cross-runs the DBSCAN unsupervised anomaly model with the Bayesian corroboration model every 6 hours. When DBSCAN flips a country-day AND the Bayesian posterior is at least 0.5 AND no incident exists for that country within +/-7 days AND there is at least one confirmed-source evidence row (OONI / CensoredPlanet / Voidly probes — IODA disruption rows are excluded), it writes a DRAFT JSON to /opt/voidly-ai/data/auto_incidents_queue/ for editorial review. Drafts are exposed at GET /v1/atlas/auto-incidents-pending and are NEVER counted in the public 343-citable-censorship headline. The watchdog never writes to the incidents table — promotion is manual. First run: 0 drafts (the 4 DBSCAN flips today all failed the corroboration gate). Honest caveat: false-positive rate is uncontrolled.
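The four-part gate reads naturally as one predicate. This is a sketch under stated assumptions: the dataclass, field names, and source labels are hypothetical, and only the thresholds (posterior at least 0.5, a +/-7-day exclusion window, IODA excluded) come from the description above.

```python
from dataclasses import dataclass
from datetime import date
from typing import List

# Assumption: lowercase source labels; IODA is deliberately absent from this set.
CONFIRMED_SOURCES = {"ooni", "censoredplanet", "voidly"}


@dataclass
class Candidate:
    country: str
    day: date
    dbscan_flagged: bool
    bayes_posterior: float
    evidence_sources: List[str]


def should_draft(c: Candidate, incident_days: List[date]) -> bool:
    """incident_days: dates of existing incidents for the candidate's country."""
    if not c.dbscan_flagged or c.bayes_posterior < 0.5:
        return False
    # Skip if any existing incident falls within +/-7 days.
    if any(abs((c.day - d).days) <= 7 for d in incident_days):
        return False
    # Require at least one confirmed-source evidence row.
    return any(src in CONFIRMED_SOURCES for src in c.evidence_sources)


candidate = Candidate("XX", date(2026, 5, 21), True, 0.62, ["ooni", "ioda"])
draft = should_draft(candidate, incident_days=[])
```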
#watchdog#dbscan#corroboration#ml-honesty#editorial-queue - 2026-05-21
Forecast 7-day isotonic calibration refit (56pp calibration drift removed)
The /v1/atlas/prediction-track-record endpoint surfaced a +56.45pp under-prediction drift on forecast_7day (mean predicted 4.9% vs empirical positive rate 61.4%). Two upstream bugs: a stale isotonic mapping fit on disruption-inflated labels, and an outcome joiner that counts IODA disruption rows as positives. Refit the isotonic on the last 30 days of (raw probability, censorship-only observed) pairs from sentinel.db. Brier 0.380 → 0.120, ECE 0.498 → ~0, headline drift +56.45pp → 0.00pp. Promoted via Brier-no-worse-than-+0.05 AND ECE-tighter gate. Old calibrator preserved with .bak suffix.
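For readers who want to check the two metrics and the promotion rule, here is a minimal sketch. The equal-width 10-bin ECE and the function names are assumptions; only the gate logic (Brier no worse than +0.05 and a strictly tighter ECE) follows the description above.

```python
def brier(probs, outcomes):
    """Mean squared error between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def ece(probs, outcomes, n_bins=10):
    """Expected calibration error over equal-width probability bins."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    total = len(probs)
    err = 0.0
    for bucket in bins:
        if bucket:
            confidence = sum(p for p, _ in bucket) / len(bucket)
            accuracy = sum(y for _, y in bucket) / len(bucket)
            err += len(bucket) / total * abs(confidence - accuracy)
    return err


def passes_gate(old, new, brier_slack=0.05):
    """old and new are (brier, ece) tuples for the two calibrators."""
    return new[0] <= old[0] + brier_slack and new[1] < old[1]


# Toy under-prediction: a flat 5% forecast against a 60% positive rate.
toy_probs = [0.05] * 10
toy_outcomes = [1] * 6 + [0] * 4
toy_brier = brier(toy_probs, toy_outcomes)
toy_ece = ece(toy_probs, toy_outcomes)
```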
#forecast#calibration#ml-honesty#transparency - 2026-05-21
Per-individual-domain 7-day shutdown forecast (top 42 domains × 50 countries)
Per-individual-domain shutdown forecast at /v1/forecast/domain/{domain}/{cc}. Shared XGBoost across all (domain, country) pairs with domain one-hot — LOCO median AUC 0.999 across 28 evaluable domains, temporal-holdout AUC 0.983, Brier 0.061 → 0.041 after isotonic. Caveat: 75% positive base rate means the model is mostly predicting block-state persistence (~85% of pair-days), not novel state transitions.
#forecast#per-domain#ml#transparency - 2026-05-21
Live 30-day production track record across all forecast models
New endpoint /v1/atlas/prediction-track-record joins daily-logged forecasts against observed incidents. forecast_7day (v1) ships with empirical precision 0.69 / recall 0.39 over the last 720 predictions; the other 11 models surface training-time metrics with n_predictions=0 + an honest caveat until per-request prod logging lands.
#transparency#ml-honesty#forecast#calibration - VE 2026-05-20
Venezuela: 63 confirmed censorship incidents in 90 days
Venezuela leads the world in incident volume on the Voidly Atlas — 63 confirmed events in the last 90 days, the highest count of any country we track.
#shutdown#elections#latin-america#leading-indicator - IR 2026-05-20
Iran 2026 Presidential Election: 52% peak shutdown risk
Voidly's forecast model flags a 52% peak shutdown risk for Iran in the 7-day window leading into the 2026 presidential election, citing election-day as the primary driver.
#elections#middle-east#forecast#shutdown - CN IR RU TM 2026-05-20
Circumvention tools are universally targeted
Our probe network detects a 100% block rate on getlantern.org globally and block rates of 23% or more on Signal, Telegram, and WhatsApp: the same circumvention and secure-messaging toolkit is blocked in every restrictive regime.
#circumvention#global#media-freedom - 2026-05-21
Classifier v3: removed the 85% leakage feature, got 0.86 LOCO F1
The v2 classifier hit 99.8% F1, but country_risk_tier (a hardcoded feature that leaks the label) carried 85% of that signal. v3 drops it. Honest leave-one-country-out F1: 0.86 (Iran AUC 0.95).
#methodology#ml#classifier#transparency#no-leakage - 2026-05-20
How we fixed Sentinel's 15× miscalibration in one afternoon
The forecast was telling journalists "5% risk in Iran" when the actual incident rate was 65%. We refit isotonic regression on 810 live (predicted, observed) pairs from sentinel_outcomes. Brier dropped 0.59 → 0.22; Iran's forecast jumped from 0.15 to 0.74.
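Isotonic regression itself is short enough to sketch from scratch. The version below is a plain pool-adjacent-violators fit over (raw score, outcome) pairs with a step-function lookup; it is an illustration on made-up toy data, not the production calibrator, and the function names are hypothetical.

```python
def fit_isotonic(raw_scores, outcomes):
    """Pool-adjacent-violators fit. Returns [(score_upper_bound, calibrated_value), ...]."""
    blocks = []  # each block: [sum_of_outcomes, count, highest_score_in_block]
    for score, y in sorted(zip(raw_scores, outcomes)):
        blocks.append([float(y), 1, score])
        # Merge backwards while scores tie or the previous block's mean is higher.
        while len(blocks) > 1 and (
            blocks[-2][2] == blocks[-1][2]
            or blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            s, n, top = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
            blocks[-1][2] = top
    return [(top, s / n) for s, n, top in blocks]


def calibrate(blocks, score):
    """Step-function lookup; scores above the last block take the last value."""
    for top, value in blocks:
        if score <= top:
            return value
    return blocks[-1][1]


# Toy pairs: low raw scores that were followed by incidents more often than predicted.
toy_blocks = fit_isotonic(
    [0.05, 0.05, 0.10, 0.15, 0.15, 0.20],
    [0, 1, 1, 0, 1, 1],
)
```

The fitted mapping is monotone by construction, so it can stretch a compressed score range (raw 0.15 maps to about 0.67 on this toy data) without reordering countries.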
#methodology#ml#forecast#calibration#transparency - 2026-05-20
How we audited our own shutdown-forecast model and published the embarrassing numbers
Voidly Sentinel publishes three accuracy splits: stratified (an inflated 0.98 AUC), time-based (a chance-level 0.50), and LOCO median (an honest 0.91). We cite the honest number, not the impressive one.
#methodology#ml#transparency#forecast - 2026-05-21
Six new ML transparency surfaces shipped in one session
Every Sentinel forecast now ships with SHAP contributions + a conformal interval. The v3 classifier has public feature-importance and metadata endpoints. /sentinel/backtest renders the reliability diagram, /atlas/forecast lists every watched country, and /v1/sentinel/movers surfaces 7-day deltas.
#methodology#ml#transparency#shap#classifier#forecast - 2026-05-21
Classifier v3.1: trained on 13.5× more data, evaluated on 18× more countries
v3 was the leakage fix. v3.1 is the data fix. By mining the live incidents table for per-country-day labels, the training set jumps from 314 / 18 positive / 7 countries to 4,237 / 1,116 positive / 131 countries. LOCO median F1 is now an honest 0.82 across 127 countries.
#methodology#ml#classifier#training-data#honest-metrics - 2026-05-21
Cross-country contagion features: wins on the tail, regresses on EG. Held back.
We added 3 neighbor-risk features to v3.1 and retrained as v3.2. Stratified F1 jumped 0.673 → 0.712 and the targeted weak countries (PK, TH, SG) improved 3-9 points. But EG regressed 13.5 points and Western Europe got worse too. Net LOCO neutral. Holding v3.2 back; iterating the adjacency map.
#methodology#ml#classifier#contagion#experiment#honest-failure - 2026-05-21
Causal attribution for shutdowns: synthetic DiD applied to internet censorship
When a shutdown happens, we can now answer "what caused it?" with a defensible counterfactual. /v1/sentinel/attribute builds a synthetic control from weighted stable-democracy donors, measures the post-period gap, runs a permutation p-value, and surfaces nearby political events. Method: Arkhangelsky et al. (arXiv:1812.09970), adapted from Internet Society NetLoss (ACM JCSS 2024).
#methodology#attribution#causal-inference#sdid#novel - 2026-05-21
Classifier v3.3: regime-similarity contagion. Better on aggregate, MENA trade-off.
v3.2 weighted neighbors by geography (UN subregion) and the results were mixed. v3.3 weights neighbors by historical anomaly_rate correlation and wins clearly on aggregate. Stratified F1 0.673 → 0.729 (+5.6pp), LOCO median F1 0.818 → 0.870 (+5.2pp), EG recovered 0.548 → 0.726 (+18pp), and Western European democracies are back to F1 ~1.0. But 16 countries regress >5pp vs v3.1, mostly MENA and former Soviet states whose neighbor correlations fall below the overlap threshold and drop to 0. Promoted with honest caveats.
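The "drop to 0" behavior is easier to see in code. This sketch assumes the neighbor weight is a Pearson correlation over the days both countries have data for, zeroed when the overlap is too short or the correlation is not positive. The 30-day overlap value, the clamping at zero, and all names are assumptions for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def neighbor_weight(target, neighbor, min_overlap_days=30):
    """target and neighbor map a day key -> anomaly_rate for one country each."""
    shared = sorted(set(target) & set(neighbor))
    if len(shared) < min_overlap_days:
        return 0.0  # too little shared history: the neighbor contributes nothing
    r = pearson([target[d] for d in shared], [neighbor[d] for d in shared])
    return max(r, 0.0)  # assumption: negative correlation also contributes nothing


history = {f"day{i:03d}": (i % 7) / 10 for i in range(40)}
sparse = {k: v for k, v in list(history.items())[:10]}

w_full = neighbor_weight(history, dict(history))
w_sparse = neighbor_weight(history, sparse)
```

A country whose candidate neighbors all fall below the overlap threshold ends up with an all-zero contagion feature, which matches the regression pattern described above.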
#methodology#ml#classifier#contagion#regime-similarity#honest-trade-off - 2026-05-21
Classifier v3.4: regime-cluster fine-tuning didn't fix the tail. Held.
Tried per-regime-cluster fine-tuning heads (MENA, post-Soviet, East Asia, SE Asia, LATAM, Sub-Sahara) stacked on top of v3.3 to recover the 16 countries that regressed under v3.3. The stacking head learns to mostly ignore the cluster heads (base coef 9.8 vs cluster coefs in [-0.83, +0.64]). LOCO median F1 drops 0.870 to 0.833, only 1 of 16 regression countries improves by ≥3pp (UZ +7pp), and 2 countries regress further (GE -9pp, SY -5pp). Both promotion gates fail. v3.3 stays in production. Documented as a real negative result.
#methodology#ml#classifier#regime-cluster#fine-tuning#negative-result#honest-no-promote - 2026-05-21
Forecast v2 contagion: huge aggregate wins, IR regresses 27pp. Held back.
Applied the classifier v3.3 regime-weighted-contagion playbook to the XGBoost forecast model. Stratified F1 +4.9pp, LOCO median F1 +17.8pp (a bigger gain than the classifier got), and 15 of 19 countries improve. But Iran, a flagship country, regresses 27.4pp F1 because its neighbors have no positive correlation. Honest no-promote.
#methodology#ml#forecast#contagion#iran#honest-no-promote - 2026-05-21
Forecast hyperparameter grid search: defaults already near-optimal
Ran a 27-cell GridSearchCV over XGBoost (n_estimators × max_depth × learning_rate) plus a follow-up min_child_weight/gamma sweep. Holdout AUC improved by 0.007, but LOCO median AUC dropped by 0.003, and the best params lose in 7 of the 10 most-active countries. The current defaults are at the practical ceiling for this feature set. Future gains require feature engineering, not hyperparameter tuning.
#methodology#ml#forecast#hyperparameters#honest-no-improvement#distribution-shift - 2026-05-21
Per-ASN forecasting: not viable today. The probe network needs 5× more ASN coverage first.
We prototyped per-ASN granular forecasting (one model per ISP/AS) per Saha et al. WebSci 2025. Of 168 ASN-tagged ASs in our evidence corpus, only 6 had ≥30 measurement days — and only 1 had enough class variance to train. The data isn't there yet. Filing this as a probe-network expansion priority instead.
#methodology#ml#forecast#per-asn#data-density#honest-not-yet - 2026-05-21
Stealth blackout detector: 458 candidate days where BGP held but the data plane didn't
Aryapour 2025 (arXiv 2507.14183) showed Iran can run a "stealth blackout" — keep BGP routes UP while throttling DNS/HTTP/HTTPS. Invisible to BGP-based IODA. We built a heuristic detector: ping-slash24 critical alerts ≥ 5 AND BGP relatively stable AND OONI blocking/interference corroboration. Found 458 candidate country-days (149 strong) — all already in our incidents table but with empty mechanism fields. The detector lets us back-classify opaque "Internet disruption" incidents as stealth-blackout-flavored.
#methodology#detection#stealth-blackout#iran#ioda#aryapour - 2026-05-21
Atlas Score v2: base-rate weighting promotes chronic blockers (CN, RU, KP).
v1 of the score rewarded change over level — Russia/China/North Korea scored as B- because nothing was actively changing. v2 weights 50% structural baseline (12-month censorship-weighted incidents + tier floor) and only 20% recent forecast. Result: CN +33pts, RU +24pts, KP +32pts. Iran moves to #1 at F grade. v2 is experimental at /v1/atlas/score-v2; v1 remains the default until grade bands are tuned.
#methodology#atlas-score#base-rate#china#russia#experimental - 2026-05-21
Multi-horizon forecast shipped: 1-day, 7-day, 30-day separate models
Voidly's forecast is no longer single-horizon. We trained 3 separate XGBoost + isotonic models (1d, 7d, 30d), all clearing honest thresholds (AUC 0.91 / 0.88 / 0.84 LOCO). Each horizon has its own conformal interval + per-horizon top-5 SHAP features. The drivers differ by horizon: 1d is operational telemetry, 7d is political tension (GDELT), 30d is repeat-risk + seasonal. SoTA literature (TFT, Sun et al. spatio-temporal conformal) says multi-horizon beats single. We confirmed and shipped.
#methodology#ml#forecast#multi-horizon#shap#conformal - 2026-05-20
Forecast retrain unblocked: dual-holdout gate (legacy + temporal)
The weekly forecast retrain has rejected every new model since May 3, 2026, because the frozen 2024-style holdout no longer reflects 2026 reality. We shipped a dual-holdout gate that requires a new model to hold its ground on RECENT data and to avoid a catastrophic regression on the legacy holdout.
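A sketch of what such a gate can look like. The metric (AUC), the dictionary keys, and both tolerance values are hypothetical; the description above only fixes the shape of the rule, strict on the recent holdout and lenient on the legacy one.

```python
def dual_holdout_gate(old, new, recent_tolerance=0.0, legacy_max_drop=0.10):
    """Accept a candidate that holds up on recent data and does not collapse on legacy data.

    old and new are dicts with hypothetical keys "recent_auc" and "legacy_auc".
    """
    recent_ok = new["recent_auc"] >= old["recent_auc"] - recent_tolerance
    legacy_ok = new["legacy_auc"] >= old["legacy_auc"] - legacy_max_drop
    return recent_ok and legacy_ok


current = {"recent_auc": 0.80, "legacy_auc": 0.90}
better_recent = {"recent_auc": 0.82, "legacy_auc": 0.85}   # small legacy dip is tolerated
legacy_collapse = {"recent_auc": 0.82, "legacy_auc": 0.70}  # catastrophic legacy regression

accepted = dual_holdout_gate(current, better_recent)
rejected = dual_holdout_gate(current, legacy_collapse)
```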
#methodology#ml#forecast#retrain#distribution-shift#holdout#gate - 2026-05-21
CenDTect-style DBSCAN unsupervised anomaly: AUC 0.6506, promoted as second-opinion signal
Adapted the CenDTect approach (Aceto & Pescape 2025 — DBSCAN over OONI feature vectors) to Voidly's 80K-row evidence table. Per-country rolling 45-day window, DBSCAN(eps=75th-pct kNN, min_samples=3) on 12 standardized features. AUC vs v3.3 labeled incidents: 0.6506, just above the 0.65 promote floor. Promoted as a SECOND-OPINION signal — the supervised classifier still wins at 0.99, but DBSCAN surfaces shape-anomalous days the labels never saw. Live at /v1/anomaly/dbscan/{cc}.
#methodology#ml#anomaly#unsupervised#dbscan#cendtect#second-opinion#promoted - 2026-05-21
Per-domain HDBSCAN drift surface: novel-blocking detection orthogonal to per-country DBSCAN
Shipped a second unsupervised anomaly axis: per-DOMAIN HDBSCAN drift over the last-28-day feature vector for every domain with >= 10 measurements. Weekly cron compares this week vs last week — new clusters = novel blocking patterns, centroid drift = existing patterns intensifying, per-domain L2 distance = how much a domain's blocking profile changed. Orthogonal to /v1/anomaly/dbscan/{cc} (per-country DBSCAN). First run: 27 domains, 2 new clusters, all top-10 drift domains corroborated by critical/warning evidence in the last 14 days. Live at /v1/anomaly/domain-drift/leaderboard and /v1/anomaly/domain-drift/{domain}.
#methodology#ml#anomaly#unsupervised#hdbscan#drift#per-domain#novel-blocking#second-opinion#promoted - 2026-05-21
Forecast labels cleaned: IODA outages no longer count as confirmed censorship
The forecast target_7day label was treating IODA outage alerts as confirmed censorship — flooding April 2026 with 1,011 disruption labels across 167 countries (94% of all April incidents). We split the labeling so only confirmed-censorship incidents drive the forecast target. April positive rate dropped from 79% to 21% and the dual-gate now accepts new models. New model promoted to production.
#methodology#ml#forecast#labels#data-quality#ioda#fix - 2026-05-21
Adaptive Conformal Inference: forecast calibration that updates itself
The forecast model now ships with Adaptive Conformal Inference (ACI) — an online update from Gibbs and Candes 2021 that keeps 90 percent intervals close to nominal under distribution shift. No retraining required, just a daily cron over observed outcomes.
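The core of ACI is a one-line update to the working miscoverage level. The sketch below follows the update rule from Gibbs and Candes 2021, alpha_next = alpha_t + gamma * (target_alpha - err_t), where err_t is 1 when the interval missed the outcome. The step size 0.005 and the function names are assumptions, not the production settings.

```python
def aci_update(alpha_t, covered, target_alpha=0.10, gamma=0.005):
    """One ACI step: shrink alpha (widen intervals) after a miss, grow it after a hit."""
    err = 0.0 if covered else 1.0
    return alpha_t + gamma * (target_alpha - err)


def run_aci(covered_flags, target_alpha=0.10, gamma=0.005):
    """Replay a sequence of daily hit/miss outcomes and return the alpha trajectory."""
    alpha = target_alpha
    trajectory = [alpha]
    for covered in covered_flags:
        alpha = aci_update(alpha, covered, target_alpha, gamma)
        trajectory.append(alpha)
    return trajectory


def quantile_level(alpha_t):
    """Coverage level to request from the conformal wrapper, clipped to [0, 1]."""
    return min(max(1.0 - alpha_t, 0.0), 1.0)


# A run of misses pushes alpha down, so the next intervals come out wider.
after_misses = run_aci([False, False, False])
```

A miss moves alpha nine times further than a hit does at a 90 percent target, which is what keeps long-run coverage near nominal.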
#ml#forecast#calibration#conformal#aci#online-learning#transparency - 2026-05-21
Row-level measurement classifier: per-measurement censorship scoring (Niaki KDD23 inspired)
New POST /v1/measurement/classify scores a single OONI, CensoredPlanet, IODA, or Voidly measurement and returns a probability + SHAP top-5 explanation. Inspired by Niaki et al. KDD 2023. Honest framing: the model learns to reconstruct the labeling rule, so the high AUC is real but reflects label leakage from raw signals, not novel detection ability.
#ml#classifier#row-level#measurement#xgboost#shap#transparency#niaki-kdd23 - 2026-05-21
GraphSAGE over CAIDA AS-AS topology: LOOCV AUC 0.80 but n=6 is statistically thin
Built a 2-layer GraphSAGE GNN over the May 2026 CAIDA AS-relationship graph (7,060 nodes, 841K edges) to forecast per-ASN 7-day shutdown probability. Leave-one-out CV across the 6 tier-1 ASNs with enough density gives AUC = 0.80, above the 0.65 promote floor — but a permutation test on the 6 fold predictions yields p = 0.32, so we honestly cannot reject the null at any reasonable level. Shipped live at /v1/forecast/asn-gnn/{asn} with passed_promote_floor=false and honest_caveats inline. The actual bottleneck is data sparsity (only 6 ASNs have ≥30 days of evidence), not the GNN architecture. SUPERSEDED 2026-05-22: a better-powered re-evaluation expanded the labeled set to 97 ASNs (62 censoring / 35 clean) with a genuine signal_value-based censorship label and leakage-audited features, then ran leave-one-AS-out and leave-one-COUNTRY-out CV with a 5,000-permutation test. The honest verdict is now SIGNIFICANT — AUC 0.7751, permutation p=0.0002 — and passed_promote_floor has been flipped to true. See the follow-up finding gnn-asn-reeval-genuine-label-2026-05.
#methodology#ml#forecast#per-asn#gnn#graphsage#caida#topology#small-n#superseded - 2026-05-21
TabPFN-v2 lost to v3.3 GradientBoosting (stratified F1 0.719 vs 0.729, LOCO 0.419 vs 0.870) — kept v3.3
We tested TabPFN-v2 (Hollmann et al. 2023, arXiv:2207.01848) as a v3.5 classifier candidate on the same 4,237-sample / 1,116-positive / 131-country / 16-feature dataset that v3.3 GradientBoosting uses. Published TabPFN benchmarks suggested +5-9pp F1 on small (under 10K rows) tabular data. On our dataset the result was the opposite: stratified 5-fold F1 0.719 +/- 0.031 (one point below the v3.3 baseline of 0.729), and median F1 0.419 under LOCO sampled over the 30 largest countries (less than half of the v3.3 median of 0.870). Both promotion gates (0.78 stratified, 0.85 LOCO) failed. v3.3 stays in production unchanged. Honest negative result.
#ml#classifier#negative-result#tabpfn#hollmann-23#honest#kept-v3.3#transformer#small-data - 2026-05-21
Shutdown duration forecast (Random Survival Forest) — c-index 0.728, n=343
Voidly Atlas now forecasts shutdown DURATION as well as probability. Random Survival Forest over 343 confirmed censorship incidents, test-set c-index 0.728, censoring rate 78%. Live at POST /v1/forecast/duration and /atlas/duration.
#ml#forecast#survival-analysis#random-survival-forest#duration#shipped - 2026-05-21
Tabular MAE self-supervised pretrain lost to v3.3 (stratified F1 0.573 vs 0.729, LOCO 0.645 vs 0.870) — kept v3.3
We tested SSL pretraining (tabular masked-autoencoder, He/Bahri-style) on a 9,722 country-day unlabeled superset, then fine-tuned on the same 4,237 labeled rows v3.3 uses. Stratified 5-fold F1 0.573 (v3.3 baseline 0.729) and LOCO median F1 0.645 (v3.3 baseline 0.870). Both promote floors failed by wide margins. v3.3 stays in production unchanged. Second SSL-style negative result after TabPFN — pattern documented.
#ml#classifier#negative-result#ssl#tabular-mae#pytorch#honest#kept-v3.3#self-supervised - 2026-05-21
Quantile regression forecast (p5/p50/p95) — failed promote gate (zero-inflated target), shipped as negative result
Trained three LightGBM quantile regressors (alpha=0.05/0.50/0.95) on target_sum_7day for a journalist-grade p5..p95 band. LOCO coverage: p5=81% (nominal 5%), p50=91% (nominal 50%), p95=98% (nominal 95%). Only the upper bound is within ±5pp of nominal; the lower quantiles fundamentally cannot calibrate on an 80%-zero target. CQR shift = 0.002 (no help). Model saved with promoted=false, partial_promote=true; endpoint not promoted to live API. Documented honestly.
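Why the lower quantiles cannot calibrate follows from simple arithmetic: every zero-valued observation sits at or below any non-negative predicted quantile, so coverage can never fall below the share of zeros. A toy check, assuming predictions are non-negative (the function name and data are made up):

```python
def coverage(y_true, y_quantile):
    """Share of observations at or below the predicted quantile."""
    hits = sum(1 for y, q in zip(y_true, y_quantile) if y <= q)
    return hits / len(y_true)


# Toy target with 80% zeros, like a mostly-quiet 7-day incident count.
y = [0, 0, 0, 0, 0, 0, 0, 0, 3, 5]

# The lowest non-negative p5 prediction possible is 0 everywhere.
p5_floor = coverage(y, [0.0] * len(y))
```

Even this floor prediction covers 80% of the toy observations against a nominal 5%, which mirrors the 81% p5 coverage reported above.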
#ml#forecast#quantile-regression#lightgbm#negative-result#zero-inflation#honest#not-shipped - 2026-05-21
v3.7 stacking ensemble over 4 base learners — failed F1 gate (+1.1pp vs needed +2.0pp), shipped as transparency endpoint
Stacked v3.3 GradientBoosting (OOF), DBSCAN unsupervised anomaly v1, Bayesian corroboration v1, and per-measurement classifier v1 into a meta-learner. Logistic regression won head-to-head vs MLP (16,8): stratified 5-fold F1 0.7534 vs v3.3 OOF baseline 0.7424 (+1.1pp), AUC 0.9033, LOCO median F1 0.8974 across 97 countries. Promote criteria required +2.0pp stratified F1 AND >=0.85 LOCO median — only LOCO gate passed. Endpoint shipped live at /v1/classifier/stacking/{cc} with passed_promote_gates=false for transparency. Coefficient analysis: v3.3 dominates (1.95), DBSCAN flag adds 0.32, per-measurement adds 0.12, Bayes posterior adds ~0 (already implicit in v3.3). Fourth honest negative result.
#ml#classifier#stacking#ensemble#meta-learner#logistic-regression#mlp#negative-result#honest#transparency#shipped-not-promoted - 2026-05-21
v3.8 cross-model meta-ensemble — fused 10 base classifiers, +8.4pp stratified F1 (PASS) but LOCO flat (FAIL), 7th honest negative
Calibrated Bayesian fusion: LogisticRegression + Isotonic over 10 base classifier outputs (v3.3 GBM, DBSCAN anomaly, Bayes corroboration, per-measurement XGB, per-method http+tls, per-category NEWS/ANON/GRP/COMT, STL seasonal z). Stratified 5-fold F1 = 0.8279 vs v3.3 baseline 0.7435 (+8.44pp, PASS +2pp gate). LOCO median F1 = 0.8750 vs gate 0.88 (FAIL by 0.5pp). Coefficient ranking: p_classifier dominates (β=+1.56); p_measurement gets negative weight (β=−0.98, redundancy flip); p_method_tls (β=+0.81) and p_cat_GRP (β=+0.65) carry genuine new signal; STL adds essentially nothing (β=−0.01). Not promoted because cross-country generalization didn't clear the gate: v3.3 stays the default at /v1/classifier/score/{cc}, and v3.8 is additive at /v1/classifier/meta-ensemble/{cc} for transparency. Seventh honest negative result in the Atlas series; it documents that stacking 10 weakly-diverse base models helps in-sample but does NOT meaningfully improve cross-country LOCO.
#ml#classifier#meta-ensemble#stacking#logistic-regression#isotonic-calibration#negative-result#honest#transparency#shipped-not-promoted - 2026-05-21
Per-country F1-optimal thresholds — +4pp median F1 without retraining v3.3
v3.3 uses a single 0.5 decision threshold across 131 countries. Computing per-country F1-optimal thresholds via precision-recall sweep lifts median F1 +4pp (mean +5.4pp) for 73 countries with sufficient labels. 41 countries improve by ≥3pp. CG (+35.7pp), OM (+31.7pp), ZW (+20.1pp) are the biggest movers. Live in /v1/classifier/score/{cc} as label_per_country + threshold_used + threshold_source fields.
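A per-country sweep can be sketched in a few lines. This version tries every observed score as a candidate threshold and keeps the default 0.5 unless something strictly beats it; the function names and toy data are hypothetical, and a threshold chosen on the same labels it is scored on will look better than it is out of sample.

```python
def f1_at(scores, labels, threshold):
    """F1 of the rule 'score >= threshold means positive'."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(scores, labels, default=0.5):
    """Return (threshold, f1); falls back to the default unless a candidate is strictly better."""
    best_t, best_f1 = default, f1_at(scores, labels, default)
    for t in sorted(set(scores)):
        f1 = f1_at(scores, labels, t)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


# Toy country whose positives score between 0.3 and 0.6.
toy_scores = [0.1, 0.2, 0.3, 0.35, 0.4, 0.6]
toy_labels = [0, 0, 1, 1, 1, 1]
default_f1 = f1_at(toy_scores, toy_labels, 0.5)
chosen = best_threshold(toy_scores, toy_labels)
```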
#ml#classifier#v3.3#threshold-calibration#per-country#shipped#no-retrain#free-win - 2026-05-21
Hourly within-day shutdown forecast (K=6/12/24)
New XGBoost forecast predicts P(shutdown in next K hours) at hourly granularity, where the daily 7-day forecast collapses 24h of variance into a single bucket. Trained on a 75K-row country-hour panel; per-country median AUC 0.72 / 0.69 / 0.67 at K=6 / 12 / 24h on a 30-day temporal holdout. All three horizons clear the 0.60 promote floor. Honest caveat — about half the evidence lands at hour 00 UTC because OONI and CensoredPlanet are daily-granular, so hourly precision is bounded above by source granularity. Live at /v1/forecast/{cc}/hourly?horizon=6|12|24.
#methodology#ml#forecast#hourly#xgboost#shipped - 2026-05-21
Impact-aware active-learning ranker: uncertainty × volume × drift
The active-learning queue used to rank candidates by |p − 0.5| alone (uncertainty sampling, Lewis 1994). It treated a Mali day with 1 measurement the same as an Iran day with 762. The new default ranking is a 3-factor product, uncertainty × log-volume × DBSCAN drift, a heuristic stand-in for Expected Error Reduction in the spirit of Settles 2009. Toggle back to the old ranking via ?rank_by=uncertainty. Live at /v1/sentinel/active-learning-queue.
#methodology#ml#active-learning#eer#settles-2009#heuristic#shipped - BZ OM MA UZ CN 2026-05-22
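The three-factor product can be sketched as follows. The exact scaling of each factor is our assumption: uncertainty is mapped to [0, 1] with its peak at p = 0.5, volume uses log1p of the measurement count, and drift is taken as a non-negative score. Names and toy values are hypothetical.

```python
import math


def impact_score(p, n_measurements, drift):
    """uncertainty x log-volume x drift, with higher meaning 'label this first'."""
    uncertainty = 1.0 - 2.0 * abs(p - 0.5)  # 1 at p=0.5, 0 at p=0 or p=1
    return uncertainty * math.log1p(n_measurements) * drift


def rank_candidates(candidates):
    """candidates: list of (name, p, n_measurements, drift) tuples, best first."""
    return sorted(candidates, key=lambda c: impact_score(c[1], c[2], c[3]), reverse=True)


# Two equally uncertain days with equal drift but very different evidence volume.
ranked = rank_candidates([
    ("low-volume day", 0.5, 1, 1.0),
    ("high-volume day", 0.5, 762, 1.0),
])
```

One consequence of a pure product is that a candidate with zero drift scores zero no matter how uncertain or high-volume it is.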
Closing the active-learning loop — and the honest near-zero F1 lift
Voidly Atlas had an active-learning queue that ranked the unlabeled country-days the v3.3 classifier was least sure about — but the loop was OPEN. Human-consensus labels dead-ended at active_learning_labels.json; nothing fed them back, because v3.3 trains from a separate corpus the aggregator never touched. This finding closes the loop end to end: promote-al-labels-to-training.py rebuilds the full 16-feature v3.3 row from live evidence for every consensus-labelled candidate, de-dups on (country,date), excludes IODA disruption incidents from positive labels, and appends to labeled_incidents_v3.3.json with active-learning provenance; once 10 new labels accumulate it queues a classifier retrain via the same retrain-queue.json the weekly cron services (12h cooldown shared with the drift trigger, merges rather than overwrites). GET /v1/sentinel/active-learning-loop-status reports every stage. The honest part: simulate-al-loop-lift.py measured whether feeding the loop actually lifts LOCO F1. On the live queue only 13-19 of 50 candidates were reconstructable; adding them moved LOCO median F1 between -0.0124 and -0.0362 and mean F1 between +0.0002 and -0.0055 across two runs hours apart — negligible, and the runs disagreeing is itself proof the swing is sampling noise, not a stable effect. The per-country breakdown shows the augmented model improving on exactly the hard sparse-data countries the queue surfaces (Belize, Oman, Morocco, Uzbekistan, China all gained double-digit F1 on the larger run) but netting to zero — a volume problem, not a signal problem. A dozen rows over 4,237 is a rounding error. The loop is plumbed and verified but idle until reviewers label a few hundred candidates. We are not claiming a lift that is not there. Filed as an honest negative.
#methodology#ml#active-learning#honest-negative#loco#retraining#accountability#atlas#api#shipped - 2026-05-21
Classifier v3.3 adversarial robustness: 88-93% under realistic evasion, weakest to noise dilution
Honest detection numbers measure performance on data that didn't try to evade us. We ran 200 known-positive incidents through 6 perturbation strategies a censorship regime could plausibly use (halved anomaly rate, doubled noise, halved probe rate, smoothed spike, isolated regime, all combined). Baseline detection is 93%; the most damaging single attack is noise_x2 at 88% detection / 90.3% retention; the combined attack lands at 95% (decision-tree structure means stacking tactics doesn't monotonically help the attacker). Published as a trust-building exercise: these are the numbers a regime would have to design against.
#methodology#ml#classifier#adversarial#robustness#evasion#transparency - 2026-05-21
Zero-shot cross-country transfer: meta-features beat the v3.3 prior on the tail
v3.3 regresses on 16 MENA + former-Soviet countries with <5 labels (OM, UZ, TN, LY, YE, JO, MA…) because no parametric model can learn from so few samples. A meta-feature regressor (regime + geography + historical evidence, evaluated under LOOCV) achieves R² 0.487 and beats v3.3's flat prior on 10/16 tail countries (median +16pp improvement). Live at /v1/forecast/zero-shot/{cc}.
#ml#classifier#zero-shot#transfer-learning#meta-features#tail-countries#mena#shipped - 2026-05-21
Predictive shutdown contagion chain: pairwise XGBoost classifier (0.67 primary AUC)
New predictive piece in the Atlas stack: given country A had a confirmed censorship event today, score every other watched country by P(follow within N days). Pairwise XGBoost on (trigger, follower, horizon) tuples; 0.67 held-out AUC at the 7d primary horizon, 0.69 at 14d, and 92 trigger-follower pairs cross the P>0.30 threshold. Lead-lag is descriptive (who is correlated) and the causal forest is causal (election ATE); this is the missing PREDICTIVE piece. Trained 2024-2026 on 311 events across 34 candidate countries; held-out 45-day test window. Top chains concentrate in MENA → Central Asia (OM/MA/EG/SY → JO/UZ/AZ/PK), consistent with the lead-lag findings and a regional regime-imitation prior.
#methodology#ml#forecast#contagion#predictive#temporal-point-process#xgboost - 2026-05-21
Per-blocking-method specialized classifiers (4 methods, 2 promoted)
v3.3 is a single classifier for "is this country-day censored?" But censorship has different mechanisms. We trained 4 specialized XGBoost classifiers (DNS / TCP / HTTP / TLS), each with the same 16-feature v3.3 input. Per-method positive rates are 2-12%, so F1 at threshold 0.5 is unfair — we added an alt-AUC promote path (strat AUC >= 0.80 AND LOCO median AUC >= 0.75 AND optimal F1 >= 0.50). HTTP-blocking (515 pos, strat AUC 0.901) and TLS-blocking (313 pos, strat AUC 0.918) clear the alt path. DNS (194 pos, AUC 0.952 but opt F1 0.484) and TCP (163 pos, AUC 0.903 but opt F1 0.389) skipped. Endpoints return available=false with honest reason for skipped methods. v3.3 stays as the default classifier — this is additive transparency for journalists asking HOW a country is blocking.
#methodology#ml#classifier#per-method#dns#tcp#http#tls#xgboost#transparency - 2026-05-21
Per-category censorship classifiers (NEWS / ANON / GRP / PORN / COMT / MMED / SRCH)
7 specialized XGBoost classifiers exposing WHAT a country is targeting, not just whether. NEWS, ANON (Tor/VPN/Lantern), GRP, PORN, COMT, MMED, SRCH all promoted via the alt-AUC path (stratified AUC 0.913-0.977, LOCO median AUC 0.849-0.957). 5 of 7 primary categories cleared (needed 4), so family promote passes. POLR (political opposition) and RELR (religion) honest-skipped due to zero coverage in evidence. Labels back-filled via a 37-domain hand-curated Citizen Lab category map because only 18 distinct domains in evidence have domain_category set. Endpoints return available=false with transparent reasons for skipped categories. v3.3 stays as the production default; this is additive transparency for journalists asking "is this regime suppressing journalism, circumvention tools, or messaging?"
#methodology#ml#classifier#per-category#news#anon#citizen-lab#xgboost#transparency - 2026-05-21
Atlas Digest: a daily "what changed in 24h" round-up for journalists + AI labs
Single-call pre-rendered daily summary at /v1/atlas/digest (JSON) and /v1/atlas/digest.html (email-friendly). Eight sections: new incidents in last 24h, forecast movers (>10pp), DBSCAN anomalies, multi-country blocked domains, lead/lag triggers, active-learning queue top-3, cross-source agreement rate, and model-drift watch. Built daily at 04:30 UTC by scripts/build-atlas-digest.py and served as a flat sidecar (no per-request compute). Honest caveat baked in: this is a SNAPSHOT — the live channel is the /v1/incidents/stream SSE feed.
#transparency#digest#sidecar#newsletter#journalists#ai-agents - 2026-05-21
Temporal Fusion Transformer (Lim et al. 2021) vs 3-XGBoost multi-horizon stack
Built a Temporal Fusion Transformer (TFT, Lim et al. arXiv:1912.09363) over the same 38-feature country-day panel that powers our XGBoost forecast. Single attention-based pass predicts a 30-day p10/p50/p90 trajectory, replacing 3 independent GBMs + a Gaussian conformal wrapper. Trained on CPU within a 60-minute budget (hidden=8, 1 attention head, quantile loss over 0.1/0.5/0.9). LOCO median AUC vs the 7-day XGBoost baseline (0.9236) is published inline at /v1/forecast/tft/info. Honest caveats inline: quantile collapse on rare positives, small 21-country corpus, CPU budget. Shipped as a research second opinion, NOT a production replacement — the headline 7-day forecast still flows through /v1/forecast/{cc}/7day.
#ml#forecast#tft#transformer#attention#multi-horizon#lim-2021#transparency - 2026-05-21
Fused anomaly ensemble: composite score over DBSCAN + STL + burst + HDBSCAN
Voidly Atlas runs four independent unsupervised anomaly detectors (DBSCAN per-country shape, STL seasonal residual, multi-country burst coincidence, HDBSCAN per-domain drift) that each surface a different axis of unusual behavior. Journalists previously had to query four separate endpoints and reconcile four different score scales. This fusion ships one composite per country per day, weighted 0.35/0.25/0.20/0.20 (heuristic, documented in every response). Evaluated at the (country, date) of every labeled incident in the last 60 days (n=2,806, 837 positive): raw all-4 fusion scores AUC 0.4949 — actually below chance, because burst (AUC 0.42) and HDBSCAN (0.44) are anti-correlated with the v3.3 label rule for this period. The build script tries 3 strategies (raw / sign-flip / drop-below-chance), picks the highest-AUC, and reports all three plus the choice. Today the winner is "dropped" — DBSCAN + STL only — yielding composite AUC 0.6842, beating the strongest single detector (DBSCAN at 0.6306) by 5pp. Every response still exposes the full all-4 agreement view (n_all4_strong, n_all4_present, all4_strong_flags) so the journalist-facing "how many detectors are firing?" question stays answerable regardless of eval-time weighting. Live at GET /v1/anomaly/fused/{cc} + /leaderboard + /info. Cron daily 05:30 UTC.
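The three-strategy selection described above (raw / sign-flip / drop-below-chance) can be sketched as follows. This is a minimal illustration, not the build script itself. It assumes detector scores lie in [0, 1], that a "sign flip" means 1 - score, and that the 0.35/0.25/0.20/0.20 weights map to the detectors in the order the text lists them. All names are hypothetical.

```python
from typing import Dict, FrozenSet, List, Sequence, Tuple

# Assumption: weights follow the order DBSCAN, STL, burst, HDBSCAN.
WEIGHTS: Dict[str, float] = {"dbscan": 0.35, "stl": 0.25, "burst": 0.20, "hdbscan": 0.20}


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Rank-based ROC AUC: P(positive scores above negative), ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fuse(det: Dict[str, List[float]],
         flip: FrozenSet[str] = frozenset(),
         drop: FrozenSet[str] = frozenset()) -> List[float]:
    """Weighted mean of detector scores, with weights renormalized over kept detectors."""
    keep = [d for d in det if d not in drop]
    if not keep:
        raise ValueError("every detector was dropped")
    total_w = sum(WEIGHTS[d] for d in keep)
    n_rows = len(det[keep[0]])
    fused = []
    for i in range(n_rows):
        s = sum(WEIGHTS[d] * ((1.0 - det[d][i]) if d in flip else det[d][i]) for d in keep)
        fused.append(s / total_w)
    return fused


def choose_strategy(det: Dict[str, List[float]],
                    labels: Sequence[int]) -> Tuple[str, Dict[str, float]]:
    """Evaluate all three strategies and return the highest-AUC one plus all scores."""
    below = frozenset(d for d, s in det.items() if auc(s, labels) < 0.5)
    candidates = {
        "raw": fuse(det),
        "sign_flip": fuse(det, flip=below),
        "dropped": fuse(det, drop=below),
    }
    results = {name: auc(s, labels) for name, s in candidates.items()}
    return max(results, key=results.get), results
```

Returning all three AUCs alongside the winner mirrors the finding's choice to report every strategy, not only the selected one.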
#ml#anomaly#ensemble#fusion#unsupervised#dbscan#stl#hdbscan#second-opinion#transparency#ml-honesty - 2026-05-21
Multilingual semantic search: 50+ languages over the incident corpus
Voidly Atlas semantic search at /v1/atlas/search ran on all-MiniLM-L6-v2, an English-only sentence-transformer. Journalists querying in Arabic, Persian, Russian, Chinese, Hindi, etc. got noise. Added a parallel multilingual index (paraphrase-multilingual-MiniLM-L12-v2, same 384-d shape, 50+ languages) over the same 2,696 incidents, stored in a separate table (incident_embeddings_multilingual) so the English-only endpoint stays byte-identical. New endpoint GET|POST /v1/atlas/search/multilingual?q=... returns the same shape plus `detected_language` from a script-based heuristic (Arabic/Persian/Cyrillic/CJK/Hebrew/Devanagari/Thai/Burmese/Ethiopic blocks). Test queries: AR "إيران الإنترنت" → top-3 Iran incidents (sim 0.71-0.73), FA "اینترنت ایران" → top-3 Iran (0.69-0.70), RU "блокировка интернета Россия" → top-3 Russia (0.76-0.79), CN "中国互联网封锁" → top-3 China (0.74), ES "censura internet" → Mauritius/South Sudan/Turkey censorship rows (0.74-0.77). Latency overhead +33-49ms vs English-only (40ms → 73-89ms) — the L12 encoder is ~3x larger but still well under the 200ms p95 target. Honest caveats inline in every response: (1) incident descriptions are mostly English so cross-language matching relies on the encoder mapping queries into a shared space, (2) multilingual models score 5-10pp lower precision than per-language, (3) script-based language detection cannot distinguish ES vs FR vs IT inside Latin. Weekly cron `5 5 * * 0` rebuilds the multilingual index 5 minutes after the English one.
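A script-based heuristic of the kind described above can be built from Unicode block ranges. The sketch below is an assumption about how such a detector might look, not the endpoint's code: the returned labels are hypothetical, and Persian is separated from Arabic by a handful of letters that Persian adds to the Arabic script (those letters also appear in Urdu and other languages, so this is only a hint).

```python
from collections import Counter

# (label, first code point, last code point) for the blocks named in the text.
SCRIPT_RANGES = [
    ("cyrillic", 0x0400, 0x04FF),
    ("hebrew", 0x0590, 0x05FF),
    ("arabic", 0x0600, 0x06FF),
    ("devanagari", 0x0900, 0x097F),
    ("thai", 0x0E00, 0x0E7F),
    ("burmese", 0x1000, 0x109F),
    ("ethiopic", 0x1200, 0x137F),
    ("cjk", 0x4E00, 0x9FFF),
]

# Assumption: letters used in Persian but not in standard Arabic orthography.
PERSIAN_HINT_LETTERS = set("\u067e\u0686\u0698\u06af\u06a9\u06cc")


def detect_script(text: str) -> str:
    """Return the dominant non-Latin script in text, or 'latin_or_unknown'."""
    counts = Counter()
    for ch in text:
        cp = ord(ch)
        for label, lo, hi in SCRIPT_RANGES:
            if lo <= cp <= hi:
                counts[label] += 1
                break
    if not counts:
        # Latin text cannot be split into ES / FR / IT by script alone.
        return "latin_or_unknown"
    script = counts.most_common(1)[0][0]
    if script == "arabic" and any(ch in PERSIAN_HINT_LETTERS for ch in text):
        return "persian"
    return script
```

The `latin_or_unknown` fallback is the code-level form of caveat (3): every Latin-script query lands in the same bucket.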
#ml#search#multilingual#embeddings#sentence-transformers#i18n#accessibility#transparency - 2026-05-21
Real-time per-day SHAP attribution for the 7-day forecast
The /v1/forecast/{cc}/7day endpoint already exposed an aggregate top_features list (3 SHAP contributions for the whole forecast). Journalists kept asking the same follow-up: "okay, but WHY is day 5 higher than day 0?" Until now the only honest answer was "an event boost gets applied outside the model." This finding ships per-day SHAP attribution as a structured response field. The 7-day forecast response now carries a top_features_per_day array aligned with the 8-day forecast list, where each entry exposes the top-3 signed contributions ranked across two pools: (a) the model-side SHAP contributions from shap.Explainer permutation on the XGBoost + isotonic predict_proba (constant across days because the model takes one country feature vector), and (b) the deterministic post-model overlay (event_within_Xd, day_decay). On 2026-05-21 the IR forecast for day 7 attributes 0.121 risk to event:Iranian-Election-2026-in-4d (+0.0682) + day_decay_t+7 (+0.0350) — both overlay components — while the aggregate model-side top three (ooni_anomaly_7d -0.0253, block_rate_roll30_mean +0.0214, block_rate_lag1 -0.0175) is also returned separately. A new lean endpoint GET /v1/forecast/{cc}/7day-shap returns just the per-day attribution for callers that do not need the full forecast envelope. Cached 6h on (country, hour); permutation SHAP costs ~3ms once the explainer is warm. Latency overhead added vs raw /7day: <1ms p95 on the upstream service (within noise; explainer + cache amortize). Honest caveats baked into every response: SHAP explains the calibrated predict_proba (post-isotonic), the model features do NOT actually change per day (only the post-model overlay does — exposed honestly), and the multi-horizon model has its own per-horizon SHAP at /v1/forecast/{cc}/multi-horizon. Implementation: scripts/patch-forecast-per-day-shap.py + worker/src/routes/hydra.ts handleForecast7daySHAP.
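The "top-3 signed contributions ranked across two pools" step can be sketched as below. Ranking by absolute value is an assumption, and the function and field names are hypothetical; the numbers in the usage note are the IR day-7 values quoted above.

```python
from typing import Dict, List, Tuple


def top_contributions(model_shap: Dict[str, float],
                      overlay: Dict[str, float],
                      k: int = 3) -> List[Tuple[str, float, str]]:
    """Merge model-side SHAP values with post-model overlay terms and
    return the k largest by absolute value, keeping sign and source pool."""
    pooled = [(name, value, "model") for name, value in model_shap.items()]
    pooled += [(name, value, "overlay") for name, value in overlay.items()]
    pooled.sort(key=lambda item: abs(item[1]), reverse=True)
    return pooled[:k]
```

Called with the model pool `{"ooni_anomaly_7d": -0.0253, "block_rate_roll30_mean": 0.0214, "block_rate_lag1": -0.0175}` and the overlay pool `{"event:Iranian-Election-2026-in-4d": 0.0682, "day_decay_t+7": 0.0350}`, the two overlay terms rank first, which matches the day-7 description. Because the model pool is constant across days, only the overlay pool changes the per-day ranking.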
#ml#forecast#shap#per-day#attribution#transparency#explainability#ml-honesty#api - 2026-05-21
Censorship-vs-natural-outage attribution meta-classifier
Many recorded shutdowns are ambiguous — a country goes dark for eight hours, was it state-ordered censorship or a fiber cut, BGP misconfiguration, DDoS, weather, planned maintenance? Confirmed-censorship attribution carries journalistic weight; misattributing a natural outage as censorship is reputation-damaging. This finding ships a meta-classifier at POST /v1/atlas/attribute-outage that takes a country / time window and returns probabilities across censorship | natural_outage (with ddos / infrastructure / weather / maintenance sub-cause hints) | mixed_unknown, plus a verdict, a band (high / medium / low / very_low), and the top-5 contributing features with their current values. Model: GradientBoostingClassifier (200 trees, depth 4, lr 0.05) trained on 2,696 incidents — 399 censorship+mixed positive, 2,297 IODA disruption negative — using 52 features across declared-incident fields, temporal shape (duration, time-of-day, weekday, very-long-flag), and evidence-window shape (signal-type mix, ASN diversity, source diversity, NEWS/ANON/GRP domain category fractions, blocking-method count, critical-signal fraction). Stratified 5-fold CV AUC 0.9974. PASSED both promote floors: censorship recall at P>=0.6 is 99.2% OOF / 100% in-sample (floor 80%), natural-outage correct at P<0.4 is 99.8% OOF / 100% in-sample (floor 60%). Per-incident-type recall on the 343 confirmed censorship rows: 100% in-sample, >99% OOF. 
Honest caveats inline in every response: (1) sub-causes are HEURISTIC feature-shape hints not supervised classes — we have no ground truth for ddos / fiber / weather; (2) the negative class is a PRIOR not labeled ground truth — some IODA disruption rows are probably mislabeled state shutdowns; (3) the headline AUC is partly recovering the labeling rule (top feature is has_ioda_source at 87.6% importance) so we also report honest_auc_no_source_features=0.9978 by retraining with the eight source-identity features dropped — the genuine shape-signal AUC; (4) the [0.40, 0.60) range is reserved as mixed_unknown for ambiguous windows. Implementation: scripts/build-outage-attribution-features.py + train-outage-attribution.py + patch-outage-attribution-endpoint.py.
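The verdict thresholds stated above (censorship at P>=0.6, natural outage at P<0.4, and [0.40, 0.60) reserved as mixed_unknown) map to a small function. This is a sketch of that mapping only; the function name is hypothetical, and the band cut-offs (high / medium / low / very_low) are not given in the text, so they are left out.

```python
def verdict_from_probability(p_censorship: float) -> str:
    """Map a censorship probability to the three verdict classes."""
    if not 0.0 <= p_censorship <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    if p_censorship >= 0.60:
        return "censorship"
    if p_censorship < 0.40:
        return "natural_outage"
    # The half-open range [0.40, 0.60) is reserved for ambiguous windows.
    return "mixed_unknown"
```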
#ml#attribution#meta-classifier#censorship#outage#transparency#ml-honesty#api#gradient-boosting - EG UZ PK RU NG IR TM BR CU MM 2026-05-21
Alert lead-time retrospective: did Sentinel actually warn early? (the accountability number)
Voidly Sentinel fires a forecast_threshold alert when a country's 7-day censorship-risk forecast crosses the alert threshold, and the pitch is "early warning." This finding is the honest audit of that pitch. For every forecast-threshold alert that fired in the last 90 days — 154 alerts across 30 countries — it measures whether a confirmed censorship/mixed incident actually followed within 14 days (a true positive), or never came (a false alarm), and if it did, how many days of lead time the alert gave (incident first_seen minus alert issued_at). The headline numbers, deliberately unsmoothed: 30 true positives (true-positive rate 19.5%), 122 false alarms (false-alarm rate 79.2%), 2 lagging alerts where the forecast reacted to a shutdown already underway. For the 30 true positives the lead-time distribution is median 4.2 days, mean 5.8 days, IQR 2.2-8.9 days, full range 0.9-13.9 days. The 79.2% false-alarm rate is high and the endpoint says so prominently in a headline_warning field rather than burying it — a single Sentinel alert is a watch signal, not a prediction; the genuine early-warning value is in the aggregate lead-time distribution. The per-country split is the real story: where Sentinel works it works (Egypt 3/3 true positives, 100%, median lead 10.9 days; Uzbekistan 7/7, 100%, 1.9 days; Pakistan 5/6, 83%, 2.9 days) and where it does not it does not (Iran 0/7, 100% false alarms — the worst country; Turkmenistan, Brazil, Cuba, Myanmar each 0-for-7). 
Honest caveats baked into every response: (1) "lead time" is alert-issued vs incident-DETECTION, not alert vs the real-world shutdown start; incident detection itself lags (OONI/IODA ingest + the 30-min incident builder), so a positive lead time is a LOWER bound on the true early-warning margin and does not overstate it; (2) the matched incident is a TEMPORAL match (next confirmed incident within the horizon), not a causal one: the alert did not necessarily predict that specific incident; (3) the 14-day horizon means a real incident the alert correctly anticipated but that arrived on day 16 is scored as a false alarm here, so 79.2% is an UPPER bound on true mis-fires; (4) only confirmed censorship/mixed incidents count: IODA disruption rows (real outages never confirmed as censorship) are excluded, which raises the measured false-alarm rate; (5) lagging alerts are reported separately and excluded from the median-lead-time headline; (6) countries with <2 alerts are excluded from the best/worst ranking. Live at GET /v1/sentinel/alert-lead-time + /v1/sentinel/alert-lead-time/{cc} + /v1/sentinel/alert-lead-time/info. Rebuilt daily 04:50 UTC. Paired with /v1/sentinel/accuracy: accuracy grades every forecast, this grades every alert that actually fired. Implementation: scripts/build-alert-lead-time-retrospective.py + patch-alert-lead-time-endpoint.py.
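The scoring rule above (true positive if a confirmed incident is first seen within 14 days of the alert, false alarm otherwise, lagging if a shutdown was already underway) can be sketched as follows. The lagging test used here, an incident whose first_seen..last_seen span contains the alert time, is an assumption; the text does not give the exact rule. All names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Tuple

HORIZON = timedelta(days=14)


def score_alert(issued_at: datetime,
                incidents: List[Tuple[datetime, datetime]]) -> Tuple[str, Optional[float]]:
    """Classify one alert against a country's confirmed censorship/mixed incidents.

    incidents holds (first_seen, last_seen) pairs. Returns (outcome, lead_days).
    """
    # Assumed rule: the alert is lagging if an incident was already in progress.
    if any(first <= issued_at <= last for first, last in incidents):
        return "lagging", None
    upcoming = sorted(first for first, _ in incidents
                      if issued_at < first <= issued_at + HORIZON)
    if upcoming:
        # Lead time = incident first_seen minus alert issued_at, in days.
        lead_days = (upcoming[0] - issued_at).total_seconds() / 86400.0
        return "true_positive", lead_days
    return "false_alarm", None


def headline_rates(true_pos: int, false_alarms: int, lagging: int) -> Dict[str, float]:
    """Percentages over all fired alerts, rounded to one decimal."""
    total = true_pos + false_alarms + lagging
    return {
        "true_positive_rate_pct": round(100.0 * true_pos / total, 1),
        "false_alarm_rate_pct": round(100.0 * false_alarms / total, 1),
    }
```

`headline_rates(30, 122, 2)` reproduces the 19.5% and 79.2% figures from the 154 alerts above. An incident first seen on day 16 falls outside `HORIZON` and scores as a false alarm, which is caveat (3) in code form.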
#sentinel#forecast#early-warning#accountability#lead-time#false-alarm-rate#retrospective#ml-honesty#transparency#api - IR CN RU AE KZ 2026-05-21
Country censorship-behavior similarity graph: which countries block like Iran?
Censorship analysts, journalists and the forecast model all keep asking the same shape of question: "which countries behave like Iran?" — for transfer learning (borrow a label-rich neighbor's signal for a label-poor country), journalist comparisons, and zero-shot priors. Until now Voidly Atlas had no single answer: the DTW shutdown cohorts grouped countries by the SHAPE of their daily signal series, but shape alone misses HOW a country blocks. This finding ships a country censorship-behavior similarity graph: every country is embedded into a 32-dimension vector built from its measured blocking behavior, and for each country the top-10 most behaviorally-similar countries are surfaced by cosine similarity. The feature vector, assembled over a trailing 365-day window, has seven families: blocking-method distribution (tcp-reset / dns-poisoned / ip-blocked / http-redirect fractions over method-tagged rows); signal-family distribution (the raw signal_type vocabulary collapsed into 8 stable families — dns-blocking, tcp-reset-http, interference, generic-blocking, tor-blocking, middlebox, header-manipulation, outage); category distribution (the Citizen Lab NEWS/GRP/COMT/MMED/PORN/SRCH share of block-like category-tagged evidence — what kinds of sites a country blocks); temporal pattern (monthly block-rate seasonality as std across 12 trailing months, burst frequency as the fraction of days exceeding the country's own mean+2sigma, active-day fraction); severity (critical-signal fraction, mean signal_value); DTW cohort membership (one-hot over shutdown_cohorts_v1 clusters, so two countries sharing a daily-signal shape get pulled together); and regime indicators (normalized risk tier, continent one-hot). All features are standardized to zero-mean unit-variance so a high-variance family like outage fraction does not drown out a low-variance one; pairwise cosine similarity ranks neighbors and a UMAP 2D projection (PCA fallback) gives coordinates for a behavioral map. 
The first run on 21 May 2026 embedded 135 countries — every country with at least 20 evidence rows in the trailing window — across 32 features. The neighbor lists are intuitive where they should be: Iran's nearest five (UAE, Thailand, Turkey, Saudi Arabia, Kazakhstan) are all interference-heavy mid-tier states with broad persistent blocking; China's nearest five (Iran, Russia, Vietnam, Pakistan, UAE) recover the expected heavy-censorship cluster without political input beyond a single risk-tier scalar; Russia sits closest to China, then Kazakhstan and Belarus. It also surfaces pairs worth a second look — Czechia and Albania at cosine 0.85 despite a two-tier risk gap — a reminder the graph measures the FOOTPRINT, not the politics. Honest caveats baked into every response: (1) similarity is BEHAVIORAL, not political — two countries are near each other because they BLOCK in similar ways, which does NOT imply comparable governments, laws or intent; a stable democracy with seasonal sports-piracy blocking can sit near an authoritarian state if their measured footprints rhyme; (2) feature vectors are SPARSE for low-coverage countries — a country with few OONI probes inside it populates only the regime/continent/cohort one-hot dimensions, is flagged sparse=true (fewer than 3 of 5 behavioral feature families populated), and its neighbors should be read with low confidence; (3) the 2D graph projection compresses 32 dimensions to 2 and loses information — use the cosine neighbor list, not projected pixel distance, for quantitative claims. Live at GET /v1/atlas/country-similarity/{cc} (top-10 nearest) + /v1/atlas/country-similarity/graph (2D projection of all 135 countries) + /v1/atlas/country-similarity/info. Rebuilt weekly Monday 05:10 UTC. Implementation: scripts/build-country-similarity-graph.py + patch-atlas-country-similarity-endpoint.py.
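The standardize-then-cosine ranking described above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions: population standard deviation, zero-variance columns mapped to 0.0, and hypothetical function names. The real feature construction (32 dimensions, seven families) is not reproduced here.

```python
import math
from typing import Dict, List, Tuple


def standardize(vectors: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Z-score each feature column across countries (zero mean, unit variance)."""
    names = list(vectors)
    dim = len(vectors[names[0]])
    out = {name: [0.0] * dim for name in names}
    for j in range(dim):
        col = [vectors[name][j] for name in names]
        mean = sum(col) / len(col)
        std = math.sqrt(sum((x - mean) ** 2 for x in col) / len(col))
        for name in names:
            # Assumption: a constant column carries no signal and becomes 0.0.
            out[name][j] = (vectors[name][j] - mean) / std if std > 0 else 0.0
    return out


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def top_neighbors(cc: str, z: Dict[str, List[float]], k: int = 10) -> List[Tuple[str, float]]:
    """Rank every other country by cosine similarity to cc, highest first."""
    sims = [(other, cosine(z[cc], vec)) for other, vec in z.items() if other != cc]
    sims.sort(key=lambda item: item[1], reverse=True)
    return sims[:k]
```

Standardizing before the cosine step is what keeps a high-variance family from dominating the ranking, as the finding notes.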
#atlas#similarity#embedding#cosine#umap#transfer-learning#censorship#clustering#zero-shot#ml-honesty#transparency#api - US CA GB FR IN SG 2026-05-21
Probe-node integrity detector: catching a compromised or misconfigured probe before it poisons incidents
The Voidly probe network is 40 nodes — 15 internal (Voidly-run) and 25 community (cp-* IDs, run by volunteers on hardware Voidly does not control). The community half is the trust weak point of the whole pipeline. Probes ARE the ground truth: every confirmed-censorship incident is built on their evidence. So a community probe that LIES — an adversary standing one up to report "everything is blocked" and manufacture a fake shutdown, or "nothing is blocked" to mask a real one, or simply a misconfigured node that mislabels benign traffic — corrupts incidents directly, and a lying ground-truth source is the worst failure mode the system has. The probe-integrity detector exists to catch that without trusting any single probe. For every probe node over a trailing 30-day window it asks one question: when this node tested a (domain, country) target, did its verdict AGREE with the consensus of every OTHER probe and upstream source that tested the same thing? Each evidence row is collapsed to a binary verdict — BLOCK or CLEAR — and the two row types are decoded differently because they encode the verdict differently. Voidly probe rows all land with signal_type=block (the row IS a block-test record, not a verdict) and a structured JSON blob carrying blockType: dns-poisoned / tcp-reset / blockpage / sni-blocked resolve to BLOCK (genuine DPI censorship signatures), while http-redirect resolves to CLEAR — a redirect such as copilot.github.com to github.com/copilot or skype.com to teams.live.com is normal product behaviour, not censorship, and a probe recording it as a block is itself misconfigured. tcp-timeout and unknown are treated as CLEAR (transient-leaning — a bare timeout with no DPI fingerprint cannot vote BLOCK). Upstream-source rows (OONI / CensoredPlanet / IODA, no probe attribution) use signal_type as a real verdict, with IODA outage rows dropped entirely because a country-level connectivity outage is not a domain-specific censorship verdict. 
Consensus is built with a three-tier fallback because the probe_node_id column is genuinely sparse — 35 probe nodes, ~265 attributed rows in 30 days, most (domain, country, day) cells touched by only one probe, so a strict same-cell rule would leave almost every probe row un-scorable. Tier 1 (weight 1.0) is other rows on the same domain+country+day, the gold standard. Tier 2 (weight 0.6) is the same domain+country a different day — blocking policy is sticky day-to-day. Tier 3 (weight 0.3) is the country base block rate over the window built from UPSTREAM SOURCES ONLY: an adversary could stand up many probes and manufacture a fake country consensus, so the base rate is anchored only to the independently-operated OONI/CP/IODA measurement sources. The node's agreement_rate is the tier-weighted mean match over its comparable rows; the probe's own rows are always excluded from the pool it is scored against. Two secondary signals are also computed: a degenerate verdict distribution (a node reporting ONLY blocks or ONLY clears across at least 6 distinct targets — the "everything is blocked" / "nothing is blocked" adversary signature) and a reporting-volume outlier (a node whose block-report count exceeds 5x its same-class peer median). integrity_score (0-1) = clamp01(agreement_rate minus 0.15x degenerate minus 0.15x volume_outlier), with agreement_rate defaulting to a neutral 0.5 when a node has no comparable rows. A node is flagged low-integrity when agreement_rate < 0.70, or a volume outlier, or a degenerate distribution that CORROBORATES a low or borderline agreement (degenerate shape alone is not a standalone flag — internal probes seeded on a low-censorship datacenter domain list and community probes seeded on known-blocked domains both come out uniform for benign reasons, so it only docks the score). The first run scored all 35 probe nodes with attributed evidence and flagged 10: 8 community probes plus 2 internal. 
cp-3e6ixxgs is the clearest case — it reports washingtonpost.com, whatsapp.com, messenger.com and rferl.org as BLOCKED in Great Britain, but the GB country base rate from OONI/CP is 147 clear to 3 block, an agreement_rate of 0.00 across 13 comparable rows, integrity_score 0.0, flagged actionable. Seven of the ten flags carry low confidence (below 0.40, marked "investigate, do not act") because those community nodes have only 4-9 comparable rows — the honest signal that there is not yet enough history to judge them. Honest caveats baked into every API response: (1) a low-agreement node may be CORRECT — a probe sitting in a region with genuinely different blocking than the consensus pool will "disagree" while being right (internal node blr is flagged for exactly this reason: it correctly classifies copilot.github.com as a benign redirect / CLEAR in India while OONI labels the domain blocked), which is precisely why this detector FLAGS for human review and NEVER auto-bans or disables a probe; (2) consensus itself can be wrong — if most probes and sources cluster in one region, and they do, with heavy datacenter and Global-North skew, the majority verdict reflects that region and an honest probe elsewhere is penalized; (3) Tier-2 and Tier-3 consensus are proxies, weaker than same-cell agreement, down-ranked by the tier weights but not made rigorous by them; (4) new and low-volume nodes have little history — a flag with confidence below 0.40 means "not enough evidence to judge"; (5) the probe_node_id column is sparsely populated so coverage grows as attribution improves; (6) degenerate verdict distribution is a weak signal in the current data because of the seeded domain lists, so it nudges the score but agreement_rate is the load-bearing metric. The detector is strictly detection plus flagging — it never auto-disables a probe. 
Live at GET /v1/atlas/probe-integrity (all nodes, filterable by flagged_only / node_class / score band) plus GET /v1/atlas/probe-integrity/{node_id} (per-node detail including the sample of cells where the node disagreed with consensus). Rebuilt daily 06:15 UTC. Implementation: scripts/build-probe-integrity.py.
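The scoring arithmetic described above (tier weights 1.0 / 0.6 / 0.3, a neutral 0.5 when nothing is comparable, and two 0.15 penalties clamped into [0, 1]) can be sketched as follows. The names are hypothetical, and each row's match flag is taken as given, since the text does not spell out how a row is matched against the Tier-3 base rate. The "borderline" cut-off in the flag rule is not stated in the source; the 0.80 default below is a placeholder assumption.

```python
from typing import List, Tuple

TIER_WEIGHTS = {1: 1.0, 2: 0.6, 3: 0.3}


def agreement_rate(rows: List[Tuple[int, bool]]) -> float:
    """Tier-weighted mean match over (tier, matched_consensus) rows."""
    if not rows:
        return 0.5  # neutral default when a node has no comparable rows
    total = sum(TIER_WEIGHTS[tier] for tier, _ in rows)
    matched = sum(TIER_WEIGHTS[tier] for tier, ok in rows if ok)
    return matched / total


def integrity_score(agreement: float, degenerate: bool, volume_outlier: bool) -> float:
    """clamp01(agreement - 0.15 * degenerate - 0.15 * volume_outlier)."""
    raw = agreement - 0.15 * degenerate - 0.15 * volume_outlier
    return min(1.0, max(0.0, raw))


def is_flagged(agreement: float, degenerate: bool, volume_outlier: bool,
               borderline: float = 0.80) -> bool:
    """Flag for human review; never a ban.

    borderline is a hypothetical value: the text says a degenerate
    distribution only counts when it corroborates low or borderline agreement.
    """
    return agreement < 0.70 or volume_outlier or (degenerate and agreement < borderline)
```

For cp-3e6ixxgs, an agreement rate of 0.00 gives `integrity_score(0.0, ...) == 0.0` whatever the secondary signals are, which matches the reported score.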
#atlas#probe-network#integrity#consensus#trust#data-quality#adversarial#security#ml-honesty#transparency#api - 2026-05-21
Concept-drift detector: catching distribution shift before it poisons the models
Voidly Atlas runs two production models — the v3.3 censorship classifier and the v1 7-day forecast — both retrained weekly behind a dual-holdout promotion gate. But there was no principled detector for the question that sits upstream of the gate: has the live data distribution drifted away from what the models were trained on? That gap had a concrete cost. In March-April 2026 a labeling bug let raw IODA connectivity-disruption alerts (fiber cuts, BGP leaks, weather outages) flow into the forecast target_7day labels as confirmed censorship; the April positive rate exploded to 79% against a ~5% training baseline, the model learned to call almost everything a shutdown, and the bug festered for weeks before a human noticed. A distribution-shift detector watching the label rate would have flagged it on day 2. The concept-drift detector closes that gap. A one-time baseline builder freezes each model feature's training distribution — mean, std, a 7-point quantile summary, and the ten decile bin edges plus the training proportion per bin (frozen bins are what make the daily comparison honest: live data is binned into the exact same edges). Every day the detector recomputes each model's features on the last 7 days of live data using the same derivation as the training pipeline, and runs two tests per feature: Population Stability Index (PSI) — sum((live% - train%) * ln(live%/train%)) over the ten frozen bins, the primary signal, with standard industry thresholds 0.2 = significant drift and 0.25 = major drift — and a two-sample Kolmogorov-Smirnov statistic as a secondary cross-check. Separately it tracks LABEL drift: the target positive-rate over the trailing 30 days vs the training baseline — the metric that would have caught the IODA disaster. 
A composite drift_score in [0,1] combines 60% mean feature PSI and 40% label drift, floored at retrain-recommended whenever label drift alone is "major" so the IODA failure mode always triggers a retrain regardless of feature PSI, and maps to a verdict: stable / watch / drifted / retrain-recommended. When a model crosses retrain-recommended the detector writes a flag to the existing drift-trigger queue (retrain-queue.json) that weekly-retrain.sh already reads, so the next retrain runs early — it deliberately does NOT auto-retrain on the spot (retrains are expensive; a 12-hour cooldown serializes drift-driven retrains with the weekly cron). The first run surfaced a built-in trap of any naive PSI monitor: the calendar features month, week_of_year and day_of_week posted PSI of 13 to 17, orders of magnitude past threshold — not drift, but structure, because a 7-day window only ever spans one month and one or two ISO weeks so its decile distribution over a calendar feature can never match a full-year baseline. The detector flags those five cyclical features, still reports their PSI for transparency, but EXCLUDES them from the composite score; before that fix both models read retrain-recommended on calendar noise alone. 
After it, the verdicts are honest: classifier v3.3 scored drift_score 0.08, stable — its real features (anomaly rate, measurement count, probe signals) all well inside the training distribution, label drift 0.01 (live 30-day censorship positive rate 0.274 vs training baseline 0.263, the IODA label fix holding); forecast v1 scored drift_score 0.71, retrain-recommended — a genuine signal, with the block_rate lag and rolling-mean features (14 of 38) showing PSI ~3.4 because over the identical 21-country set the mean 7-day block rate jumped roughly tenfold (0.023 historical baseline to 0.234 in the trailing week), putting the forecast's recent-history inputs well outside the distribution the current model learned from (forecast label drift a modest 0.10 — not the trigger; the feature drift is). Honest caveats baked into every response: (1) drift is not the same as a broken model — a real censorship surge IS distribution shift and the forecast's 10x block-rate jump may be a genuine escalation the model still handles, so the verdict says "the inputs have moved, a retrain is warranted" not "the model is wrong today"; (2) the PSI thresholds 0.2/0.25 are industry convention, not derived from Voidly's data; (3) KS on a 7-day window is noisy — lean on PSI and the composite score, and KS here is computed against a sample reconstructed from stored training quantiles, a coarse cross-check; (4) the classifier's three contagion features (neighbor_*) are not monitored — they need the offline adjacency + regime-correlation pipeline; (5) the forecast baseline excludes the most-recent 30 days so label drift is baseline-vs-recent. Live at GET /v1/atlas/concept-drift (both models, per-feature PSI/KS, drift_score, verdict, label drift) + GET /v1/atlas/concept-drift/{model} (classifier-v3.3 | forecast-v1) + GET /v1/atlas/concept-drift/info. Refreshed daily 05:20 UTC. Implementation: scripts/build-concept-drift-baseline.py + detect-concept-drift.py + patch-concept-drift-endpoint.py.
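The PSI test described above can be sketched as below: live values are binned into frozen training edges, then sum((live% - train%) * ln(live%/train%)) is taken over the bins. The small epsilon that guards empty bins is an assumption, the names are hypothetical, and the composite drift_score is not reproduced because the text does not say how PSI is normalized into [0, 1].

```python
import math
from bisect import bisect_right
from typing import List, Sequence


def bin_proportions(values: Sequence[float], edges: Sequence[float]) -> List[float]:
    """Share of values per bin, using frozen interior cut points (sorted)."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect_right(edges, v)] += 1
    return [c / len(values) for c in counts]


def psi(train_props: Sequence[float], live_props: Sequence[float],
        eps: float = 1e-6) -> float:
    """Population Stability Index over matching bins."""
    total = 0.0
    for t, l in zip(train_props, live_props):
        t, l = max(t, eps), max(l, eps)  # assumption: floor empty bins at eps
        total += (l - t) * math.log(l / t)
    return total


def psi_band(value: float) -> str:
    """Thresholds as stated in the text: 0.2 significant, 0.25 major."""
    if value >= 0.25:
        return "major"
    if value >= 0.2:
        return "significant"
    return "below_threshold"
```

A 7-day window over a calendar feature such as month puts nearly all live mass into one bin while the baseline spreads across all of them. That drives the log terms up sharply, which is the structural trap the first run hit.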
#atlas#concept-drift#distribution-shift#psi#ks-test#data-quality#monitoring#auto-retrain#ml-honesty#transparency#api - 2026-05-22
Empirical-Bayes partial pooling didn't lift the classifier tail. Held.
Classifier v3.3 has a known split: LOCO median F1 0.870 but mean only 0.711, dragged down by ~16 MENA / former-Soviet countries (OM, UZ, TN, LY, YE, JO, MA and similar) that score LOCO F1 between 0.00 and 0.36. This finding is an honest attempt to lift that tail with empirical-Bayes partial pooling: a James-Stein shrinkage layer on top of v3.3 (no retrain) that blends each country's probability with its UN-region mean, keeping weight w = n_country/(n_country+k) on the country's own prediction and putting 1-w on the regional mean, with k tuned by an inner LOCO sweep. It failed the promote gate decisively. Across the k-sweep, LOCO mean F1 is strictly decreasing in k (0.714 at k=4, collapsing to 0.461 at k=60): there is no interior optimum, so every amount of shrinkage hurts and the sweep just picks the smallest k offered. LOCO mean F1 moved 0.7109 to 0.7136 (+0.27pp; gate needs >=0.75, FAIL); only 1 of 16 tail countries improved >=3pp (TN, a single row flipping in a 4-positive country, i.e. noise; gate needs >=10, FAIL); LOCO median held at 0.875. The root cause: the 16 named tail countries are not row-poor. They hold 53 to 85 labeled rows each, so the shrinkage weight w is >=0.93 even at k=4 and the layer is structurally near-inert for exactly the countries the gate watches. What they actually lack is positive labels (2-22 confirmed-censorship days apiece), and their UN-region priors are themselves low (~0.19-0.22), so blending a ~0.18 prediction toward a ~0.20 prior cannot push anything across the 0.5 threshold. Three reformulations were also tested and ruled out: positive-count weighting (w = n_pos/(n_pos+k)), a region-floor blend (shrinkage may only raise a prediction), and a per-country F1-optimal threshold decoupled from the 0.5 cut. Partial pooling NOT promoted; classifier v3.3 stays in production unchanged; no serving endpoint exposes the pooled model.
This is the third Atlas experiment (after v3.4 regime-cluster fine-tuning and the forecast-contagion port) to converge on the same lesson: no post-hoc architecture fixes a positive-label shortage — the honest path for the tail is targeted data labeling, not modeling.
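The near-inert behavior of the shrinkage layer can be checked with a few lines of arithmetic. The sketch assumes the usual partial-pooling blend, where w = n_country/(n_country+k) is the weight kept on the country's own prediction and 1-w goes to the regional mean; the function names are hypothetical.

```python
def shrink_weight(n_country: int, k: float) -> float:
    """Weight retained on the country's own prediction."""
    return n_country / (n_country + k)


def pooled_probability(p_country: float, p_region: float,
                       n_country: int, k: float) -> float:
    """Blend a country-level probability with its regional prior."""
    w = shrink_weight(n_country, k)
    return w * p_country + (1.0 - w) * p_region
```

With the smallest tail country (53 rows) and k=4, `shrink_weight(53, 4)` is 53/57, about 0.93, so a ~0.18 prediction blended with a ~0.20 prior stays near 0.18, far below the 0.5 decision threshold. Even full shrinkage (w = 0) would only reach the ~0.20 prior, so no value of k can flip those predictions.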
#methodology#ml#classifier#partial-pooling#empirical-bayes#james-stein#shrinkage#tail-countries#negative-result#honest-no-promote#atlas
New findings ship as significant censorship events get measured. Subscribe to the Atom feed for new findings + every confirmed incident.