Internet shutdowns are easy to detect. The harder question is why they happened — was it the election? the protest? a court ruling? a policy update? Until today, Voidly could surface the WHAT (incident detected, severity X, mechanism Y) and the WHERE (country, ASN, blocked domains), but the WHY was guesswork.
Today we shipped synthetic difference-in-differences attribution at /sentinel/attribute and /v1/sentinel/attribute. For any country + date pair, we compute a counterfactual block-rate from weighted stable-democracy donors, measure the post-period gap, and run a permutation test to get a p-value for significance.
How it works
- Pull daily block_rate for the treated country, T−14 through T+6, from evidence.
- Pull the same window for 15 candidate donors (stable democracies: US, JP, DE, GB, FR, AU, CA, NL, SE, CH, FI, NO, DK, IE, NZ).
- Filter to donors with ≥5 observations per period.
- Fit synthetic-control weights via scipy SLSQP with constraints w ≥ 0, Σw = 1. Minimize sum-of-squares against treated's pre-period.
- Post-period: causal_effect = treated_post_mean − synthetic_post_mean.
- In-space placebo permutation for p-value.
- Cross-reference events.db for nearby political events (election/protest/coup/policy/religious).
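The weight-fitting and effect steps above can be sketched in a few lines. This is a minimal illustration on toy data, not the code in scripts/sdid_attribution.py: the function names, the toy arrays, and the split of the window into 14 pre-period days (T−14 to T−1) and 7 post-period days (T to T+6) are all assumptions made for the example. It follows the steps as listed (constrained donor weights, then a post-period mean gap) and leaves out the placebo test and event lookup.

```python
import numpy as np
from scipy.optimize import minimize


def fit_synthetic_weights(treated_pre, donors_pre):
    """Fit donor weights with w >= 0 and sum(w) = 1 that minimize the
    sum of squared pre-period gaps between treated and synthetic."""
    n_donors = donors_pre.shape[1]

    def loss(w):
        return np.sum((treated_pre - donors_pre @ w) ** 2)

    result = minimize(
        loss,
        x0=np.full(n_donors, 1.0 / n_donors),  # start from equal weights
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n_donors,
        constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],
        options={"ftol": 1e-12, "maxiter": 500},
    )
    return result.x


def causal_effect(treated_post, donors_post, weights):
    """treated_post_mean minus synthetic_post_mean."""
    return float(np.mean(treated_post) - np.mean(donors_post @ weights))


# Toy data (hypothetical): rows are days, columns are 4 donors.
rng = np.random.default_rng(0)
donors_pre = rng.uniform(0.0, 0.2, size=(14, 4))
donors_post = rng.uniform(0.0, 0.2, size=(7, 4))

# Build a treated series that is an exact donor mix pre-T, then add a
# 10-point jump post-T so the expected answer is known in advance.
true_weights = np.array([0.7, 0.3, 0.0, 0.0])
treated_pre = donors_pre @ true_weights
treated_post = donors_post @ true_weights + 0.10

weights = fit_synthetic_weights(treated_pre, donors_pre)
pre_rmse = float(np.sqrt(np.mean((treated_pre - donors_pre @ weights) ** 2)))
effect = causal_effect(treated_post, donors_post, weights)

print("weights:", np.round(weights, 3))
print("pre-period RMSE:", round(pre_rmse, 4))
print("causal effect:", round(effect, 3))
```

Because the toy treated series is built from the donors, the fit should recover an effect close to the injected 0.10; real series will not fit this cleanly, which is what the pre-period RMSE check is for.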
Why this is novel
The synthetic difference-in-differences technique was developed for economic policy evaluation. Internet Society's NetLoss paper (ACM JCSS 2024) applied a similar method to estimate the GDP impact of internet shutdowns. But nobody publishes the censorship-attribution analog at scale — given a shutdown, attribute it to a triggering event with a defensible counterfactual.
Most prior work treats causality as temporal correlation: “the protest happened and then the shutdown happened.” SDiD lets us say something stronger: given that comparable peer countries had no shutdown, this country's spike is X percentage points above expected, with p-value Y.
Test case: Iran 2026-05-13
| Metric | Value |
|---|---|
| Donor weights | NL 0.95, CA 0.04, AU 0.01 |
| Pre-period RMSE | 0.036 (good fit, well under 0.10 ceiling) |
| Treated post-mean | 92.6% |
| Synthetic post-mean | 95.2% |
| Causal effect | −2.6 pp |
| Permutation p-value | 0.25 (not significant) |
| Nearby event | OONI anomaly signal 2026-05-06 (upstream_anomaly_signal) |
| low_confidence | false |
The negative causal effect is mechanically correct but tells an interesting story: Iran was already at ceiling (96% block_rate) in the pre-period, so there was no “room above” for the trigger to lift things further. SDiD correctly reports the small negative number + non-significant p-value. The script doesn't editorialize — it lets the data speak.
This honest behavior is the point. A naive attribution would have said “Iran had 92% block rate, the OONI anomaly fired, attribution: anomaly.” SDiD says “synthetic Iran was already at 95%, so the anomaly didn't add anything measurable.” Big difference.
When attribution is meaningful
- Treated country was not at ceiling pre-T. Iran's 96% baseline rate makes attribution noisy.
- Pre-RMSE < 0.10. Synthetic must track treated's pre-trend well.
- ≥5 observations in both pre- and post-periods. Sparse data → wide confidence intervals.
- Donor pool isn't too small. Need at least 3-4 donors with good coverage.
The endpoint reports low_confidence: true with a reason when these conditions aren't met. Don't cite the causal effect in those cases.
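As a rough sketch, the four conditions above could be folded into a single gate like the one below. The function name, argument names, and the 0.95 saturation cutoff are hypothetical; only the 0.10 RMSE ceiling, the 5-observation minimum, and the "3-4 donors" floor come from the list. The real endpoint's rules may differ: the Iran test case, for instance, reports low_confidence: false despite its high baseline.

```python
def assess_confidence(pre_rmse, treated_pre_mean, n_pre, n_post, n_donors,
                      rmse_ceiling=0.10, saturation=0.95,
                      min_obs=5, min_donors=3):
    """Return a low_confidence flag plus a human-readable reason.

    saturation is a made-up cutoff for "at ceiling before T";
    the other defaults mirror the thresholds quoted in the post.
    """
    reasons = []
    if treated_pre_mean >= saturation:
        reasons.append("treated country near ceiling before T")
    if pre_rmse >= rmse_ceiling:
        reasons.append("pre-period RMSE at or above ceiling")
    if n_pre < min_obs or n_post < min_obs:
        reasons.append("too few observations in pre- or post-period")
    if n_donors < min_donors:
        reasons.append("donor pool too small")
    return {
        "low_confidence": bool(reasons),
        "reason": "; ".join(reasons) if reasons else None,
    }


# Illustrative inputs only, not real measurements.
healthy = assess_confidence(pre_rmse=0.04, treated_pre_mean=0.30,
                            n_pre=14, n_post=7, n_donors=8)
saturated = assess_confidence(pre_rmse=0.04, treated_pre_mean=0.96,
                              n_pre=14, n_post=7, n_donors=8)
print(healthy)
print(saturated)
```

Collecting every failed condition, rather than returning on the first one, keeps the reason string useful when a result is weak on more than one axis.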
Honest caveats
- Donor pool is hand-picked. Different donors → different counterfactuals. We default to 15 stable democracies; researchers can fork the donor list for their use case.
- events.db is incomplete. It holds 1,519 events in total: 1,380 are auto-mined “signal” rows and only 139 are political events (election/protest/policy). Richer event ingest (Wikipedia + GDELT + IFES election calendars) would materially improve attribution quality.
- SDiD assumes parallel trends in the absence of the trigger. If treated and synthetic diverge for reasons we don't observe, the causal estimate is biased. Standard econometrics caveat.
- p-value uses in-space placebos over donors. The donor pool is small (8-15) so the p-value is granular (0.0, 0.125, 0.25, ...). Don't over-read a 0.25 vs 0.50 difference.
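To make the granularity caveat concrete, here is a sketch of an in-space placebo test on toy data. It assumes one common convention: each donor takes a turn as the pseudo-treated unit, gets its own synthetic control from the remaining donors, and the p-value is the share of placebo effects at least as large in absolute value as the observed one. The function names and the toy arrays are hypothetical, and the production script may count differently. With 8 donors the p-value can only move in steps of 1/8, which is where values like 0.125 and 0.25 come from.

```python
import numpy as np
from scipy.optimize import minimize


def synthetic_effect(target_pre, target_post, pool_pre, pool_post):
    """Fit w >= 0, sum(w) = 1 on the pre-period, return the post-period gap."""
    n = pool_pre.shape[1]
    res = minimize(
        lambda w: np.sum((target_pre - pool_pre @ w) ** 2),
        x0=np.full(n, 1.0 / n),
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n,
        constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],
    )
    return float(np.mean(target_post) - np.mean(pool_post @ res.x))


def placebo_p_value(treated_pre, treated_post, donors_pre, donors_post):
    """Share of donor placebo effects at least as extreme as the observed one."""
    observed = synthetic_effect(treated_pre, treated_post, donors_pre, donors_post)
    n_donors = donors_pre.shape[1]
    placebo_effects = []
    for j in range(n_donors):
        others = [k for k in range(n_donors) if k != j]
        placebo_effects.append(
            synthetic_effect(donors_pre[:, j], donors_post[:, j],
                             donors_pre[:, others], donors_post[:, others])
        )
    extreme = sum(abs(e) >= abs(observed) for e in placebo_effects)
    return extreme / n_donors, observed


# Toy data (hypothetical): 8 quiet donors, treated jumps by 0.30 after T.
rng = np.random.default_rng(1)
donors_pre = rng.uniform(0.0, 0.1, size=(14, 8))
donors_post = rng.uniform(0.0, 0.1, size=(7, 8))
treated_pre = donors_pre.mean(axis=1)
treated_post = donors_post.mean(axis=1) + 0.30

p_value, observed = placebo_p_value(treated_pre, treated_post,
                                    donors_pre, donors_post)
print("observed effect:", round(observed, 3))
print("p-value:", p_value, "(smallest nonzero step:", 1 / donors_pre.shape[1], ")")
```

In this toy setup no donor can produce a placebo gap anywhere near 0.30, so the p-value lands at the bottom of the grid. On real data the same grid applies, which is why neighbouring values should not be over-interpreted.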
What this enables
- Journalists answering “was the Iran election the cause of the shutdown?” with a defensible number.
- Researchers running cross-incident attribution studies (which event types most often trigger shutdowns?).
- AI agents auto-attributing shutdowns in real-time without hardcoded heuristics.
- Voidly Atlas as the first system to publish causal-attribution numbers for individual shutdowns at scale.
Reproducibility
Implementation in scripts/sdid_attribution.py (312 lines).
Endpoint wrapper in scripts/patch-sdid-endpoint.py. Both in the public repo.
Dependencies: numpy + scipy.optimize.minimize (SLSQP). The donor pool, p-value parameters, and confidence thresholds are constants at the top of the script, so they are easy to audit + fork.