voidly

Cutting the Sentinel alert false-alarm rate from 79% to 35%

The alert lead-time retrospective showed that 79.2% of Sentinel forecast-threshold alerts over 90 days were false alarms: roughly four of five alerts that fired were not followed by a confirmed censorship incident. This finding documents the fix, backtested against the same 90-day history and the same 14-day scoring rule.

Four candidates were evaluated: per-country thresholds, a persistence gate, chronic-false-country suppression, and combinations of these. Two are honest negatives, reported as findings. Per-country thresholds collapse to the baseline: the forecast probabilities for censorship-heavy countries cluster just above 0.05 with no separation between true-positive and false-alarm days, so raising the bar kills the true positives too. A 2-day persistence gate nudges the false-alarm rate up rather than down, because Sentinel false alarms come from chronically near-threshold countries, not from transient one-day spikes.

The fix that shipped downgrades 37 of 49 watched countries from alert to watch under three honest rules: chronic false-positive (>= 3 alerts, 0% true-positive rate, catches Iran), no-incident-signal (zero confirmed incidents in the window, catches stable democracies), and low-precision (>= 4 alerts, precision < 0.40, catches Bangladesh / Kazakhstan / Vietnam). A downgraded country still has a fully computed forecast; only the webhook alert is withheld.

Backtest result: false-alarm rate 80.6% -> 35.0%, true-positive rate 17.6% -> 60.0% (recall went up, not down, because the removed alerts were overwhelmingly noise), median lead time essentially unchanged at ~3.9 days.

Honest caveats baked in: thresholds and the suppression list are picked in-sample, so the live forward false-alarm rate will be somewhat worse than 35%; suppression is a downgrade, not a deletion; Iran is suppressed for alert hygiene, not because Iran is safe; multi-signal confirmation was considered but not shipped, because the DBSCAN and contagion signals are point-in-time snapshots with no 90-day history to backtest.
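As an illustration only, the three downgrade rules can be sketched as a small classifier over per-country backtest counts. This is not Sentinel's actual code: the `CountryStats` fields, the function names, and the order in which the rules are checked are assumptions made for the example. Only the thresholds (>= 3 alerts with no true positives, zero confirmed incidents, >= 4 alerts with precision < 0.40) come from the text above.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class CountryStats:
    """Hypothetical per-country counts over the 90-day backtest window."""
    alerts: int               # forecast-threshold alerts that fired
    true_positives: int       # alerts followed by a confirmed incident in the scoring window
    confirmed_incidents: int  # confirmed incidents in the window


def downgrade_reason(s: CountryStats) -> Optional[str]:
    """Return the name of the rule that downgrades a country, or None.

    The rule order is an assumption; the source does not state precedence.
    """
    if s.confirmed_incidents == 0:
        return "no-incident-signal"
    if s.alerts >= 3 and s.true_positives == 0:
        return "chronic-false-positive"
    if s.alerts >= 4 and s.true_positives / s.alerts < 0.40:
        return "low-precision"
    return None


def alert_tier(s: CountryStats) -> str:
    """A downgraded country drops to 'watch'; its forecast is still computed."""
    return "watch" if downgrade_reason(s) is not None else "alert"


def false_alarm_rate(stats: Iterable[CountryStats]) -> Optional[float]:
    """Share of fired alerts that were not true positives, pooled across countries."""
    rows = list(stats)
    total = sum(s.alerts for s in rows)
    if total == 0:
        return None  # no alerts fired, so the rate is undefined
    return (total - sum(s.true_positives for s in rows)) / total
```

Under these assumptions, a country with five alerts and no true positives but some confirmed incidents falls under the chronic false-positive rule, while a country with precision of exactly 0.40 stays at the alert tier because the threshold is strict. Rerunning `false_alarm_rate` over only the countries left at the alert tier is one way to reproduce the before/after comparison in a backtest.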

#methodology #sentinel #alerts #false-alarm-rate #precision #honest-negative #accountability #atlas #api
