voidly

Regional contagion does not improve the 7-day forecast — within-country dynamics already capture it

Censorship spreads regionally: a neighbour's shutdown often precedes your own. That makes regional "contagion" features an obvious candidate for sharpening a 7-day forecast. We tested the idea under the same strict protocol as the production model, and it did not pan out.

The baseline is the production honest_forecast: HistGradientBoosting on 24 within-country features, evaluated with strict temporal rolling-origin CV across 9 folds on independent OONI ground truth (AUC 0.8154 / PR-AUC 0.2934, reproduced here exactly). To it we added three leakage-safe regional features: the same-region (continental, EXCLUDING the country itself) mean anomaly rate and event fraction, both as yesterday's strictly lagged value AND as the contemporaneous value legitimately available at prediction time, plus a 7-day regional rolling mean.

Across the same folds, neither variant beat the base. Lagged contagion scored AUC 0.8140 / PR-AUC 0.2893 (−0.0015 / −0.0041); contemporaneous contagion scored AUC 0.8153 / PR-AUC 0.2900 (−0.0001 / −0.0033). Both are flat to slightly worse on every metric, with no consistent per-fold lift.

The honest read: for a 7-day, country-relative anomaly-spike forecast, a country's OWN recent dynamics (anomaly lags, rolling means, z-score against its trailing 28-day baseline, recent event history) already carry the forward signal. Regional spillover at continental granularity adds noise, not skill. This does not refute contagion as a phenomenon: finer political or regime-similarity neighbourhoods, or a different target such as the country-day classifier (where regime-weighted contagion did help v3.3), may behave differently. At this granularity, though, it is a clean negative, and we did not ship it.

No model change: honest_forecast stays at its validated AUC of 0.815. Reproduce with the rolling-origin harness scripts/honest-forecast-backtest.py.
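To make "leakage-safe" concrete, here is a minimal sketch of how leave-one-out regional features of this kind could be computed. It is not the code in scripts/honest-forecast-backtest.py: the function name `regional_features`, the feature names `region_lag1`, `region_same_day` and `region_roll7`, and the toy data are all hypothetical, and the choice to end the 7-day window at yesterday is an assumption, since the note does not say where the window ends.

```python
from collections import defaultdict
from datetime import date, timedelta


def regional_features(rates, region_of, window=7):
    """Leave-one-out regional mean features (illustrative sketch only).

    rates:     {(country, day): anomaly_rate}
    region_of: {country: region_label}
    Returns    {(country, day): {feature_name: value or None}}
    """
    # Running sum and count of anomaly rates per (region, day).
    totals = defaultdict(lambda: [0.0, 0])
    for (country, day), rate in rates.items():
        bucket = totals[(region_of[country], day)]
        bucket[0] += rate
        bucket[1] += 1

    def loo_mean(country, day):
        # Regional mean on `day`, EXCLUDING the country itself.
        total, count = totals.get((region_of[country], day), (0.0, 0))
        own = rates.get((country, day))
        if own is not None:
            total, count = total - own, count - 1
        return total / count if count > 0 else None

    feats = {}
    for (country, day) in rates:
        # Assumption: the rolling window covers the `window` days ending yesterday.
        past = [loo_mean(country, day - timedelta(days=k)) for k in range(1, window + 1)]
        past = [v for v in past if v is not None]
        feats[(country, day)] = {
            "region_lag1": loo_mean(country, day - timedelta(days=1)),
            "region_same_day": loo_mean(country, day),
            "region_roll7": sum(past) / len(past) if past else None,
        }
    return feats


# Toy data, invented for illustration: three countries in region X, one in region Y.
d0, d1 = date(2024, 1, 1), date(2024, 1, 2)
region_of = {"A": "X", "B": "X", "C": "X", "D": "Y"}
rates = {
    ("A", d0): 0.1, ("B", d0): 0.3, ("C", d0): 0.5, ("D", d0): 0.9,
    ("A", d1): 0.2, ("B", d1): 0.4, ("C", d1): 0.6, ("D", d1): 0.9,
}
feats = regional_features(rates, region_of)
```

Subtracting the country's own value from the regional total is what keeps the feature from simply echoing the within-country signal. A country with no regional neighbours in the data, like D here, gets None rather than a misleading zero.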

#methodology #ml #forecast #honest-negative #contagion #rolling-origin #temporal-cv #leakage-safe #accountability #atlas #api
