Model honesty, public
Every Sentinel shutdown forecast ships with a 90% conformal interval. This page tracks how often the real outcome lands inside that interval — the closer to 90%, the more honest the model. Data lives at /v1/sentinel/calibration/history and updates every 24h.
/v1/forecast/{cc}/7day response under aci_alpha + aci.* fields. Full ACI methodology →
Live forecast accuracy (prod_rolling, 30-day window)
A Brier score below 0.10 is good; above 0.30 is concerning. A calibration MAE below 0.05 means predicted probabilities track observed rates closely. See /sentinel/backtest for the reliability diagram (predicted mean vs. observed rate scatter) and /methodology#validation for the full evaluation methodology and the 3-split honest baselines.
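As a rough illustration of the two metrics above, here is a minimal sketch in plain Python. The function names are hypothetical, and the equal-width 10-bin scheme used for calibration MAE is an assumption: this page does not state how Sentinel bins its predictions, so the production numbers may be computed differently.

```python
from typing import List, Sequence, Tuple


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and of equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def calibration_mae(
    probs: Sequence[float], outcomes: Sequence[int], n_bins: int = 10
) -> float:
    """Unweighted mean, over non-empty bins, of |mean predicted - observed rate|.

    Assumption: equal-width bins on [0, 1]; the real binning is not documented here.
    """
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and of equal length")
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))
    gaps = []
    for members in bins:
        if not members:
            continue  # empty bins carry no evidence either way
        mean_pred = sum(p for p, _ in members) / len(members)
        obs_rate = sum(y for _, y in members) / len(members)
        gaps.append(abs(mean_pred - obs_rate))
    return sum(gaps) / len(gaps)


# Toy data: confident predictions that all turned out right.
toy_probs = [0.1, 0.1, 0.9, 0.9]
toy_outcomes = [0, 0, 1, 1]
print(brier_score(toy_probs, toy_outcomes))
print(calibration_mae(toy_probs, toy_outcomes))
```

On this toy input the Brier score sits well under the 0.10 threshold, while the calibration gap is larger than 0.05 because the model hedged at 0.1 and 0.9 on events that resolved cleanly.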
Empirical coverage — 90-day rolling
The blue line is the actual fraction of forecasts where the real outcome landed inside the 90% conformal interval. The green dashed line is the nominal target (0.90). If the blue stays close to the green, the model is well calibrated.
Blue: empirical coverage · Dashed green: nominal 0.90 target · Orange circles: drift alerts
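The coverage figure plotted above is a simple hit rate. The sketch below shows one way to compute it, assuming each forecast is stored as a (lower, upper) interval paired with the realized outcome; the function names and the data layout are illustrative assumptions, not the schema returned by /v1/sentinel/calibration/history.

```python
from typing import Sequence, Tuple


def empirical_coverage(
    intervals: Sequence[Tuple[float, float]], outcomes: Sequence[float]
) -> float:
    """Fraction of outcomes that fall inside their interval (bounds inclusive)."""
    if len(intervals) != len(outcomes) or not outcomes:
        raise ValueError("intervals and outcomes must be non-empty and of equal length")
    hits = sum(1 for (lo, hi), y in zip(intervals, outcomes) if lo <= y <= hi)
    return hits / len(outcomes)


def coverage_gap(coverage: float, nominal: float = 0.90) -> float:
    """Signed distance from the nominal target; positive means over-coverage."""
    return coverage - nominal


# Toy data: ten forecasts with the same interval, one outcome outside it.
toy_intervals = [(0.0, 1.0)] * 10
toy_outcomes = [0.5] * 9 + [2.0]
cov = empirical_coverage(toy_intervals, toy_outcomes)
print(cov, coverage_gap(cov))
```

A gap near zero corresponds to the blue line hugging the green one. A persistently positive gap means the intervals are wider than they need to be, and a negative one means they are too narrow.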
Last 14 days
| Date | Coverage | q90 | n holdout | Drift? |
|---|---|---|---|---|
| May 21 | 90.5% | 0.048 | 2,100 | — |
| May 20 | 90.5% | 0.048 | 2,100 | — |
| May 19 | 90.5% | 0.048 | 2,100 | — |
| May 18 | 90.5% | 0.048 | 2,100 | — |
| May 17 | 90.5% | 0.048 | 2,100 | — |
| May 16 | 90.5% | 0.048 | 2,100 | — |
| May 15 | 90.5% | 0.048 | 2,100 | — |
| May 14 | 90.5% | 0.048 | 2,100 | — |
| May 13 | 90.5% | 0.048 | 2,100 | — |
| May 12 | 90.5% | 0.048 | 2,100 | — |
| May 11 | 90.5% | 0.048 | 2,100 | — |
| May 10 | 90.5% | 0.048 | 2,100 | — |
| May 9 | 90.5% | 0.048 | 2,100 | — |
| May 8 | 90.5% | 0.048 | 2,100 | — |
What features the model actually uses
Sklearn feature_importances_ on the underlying XGBoost. 39 features total. Top-3 sum: 0.23 · Top-5: 0.326 · Top-10: 0.492. Healthy distribution — no single feature dominates the model.
- 1. gdelt_unrest_30d: 11.2%
- 2. recent_shutdown: 6.1%
- 3. week_of_year: 5.7%
- 4. high_urgency_signals_7d: 5.6%
- 5. month: 4.0%
- 6. election_in_7days: 3.6%
- 7. high_importance_event: 3.4%
- 8. block_rate_roll30_mean: 3.3%
- 9. block_rate_lag14: 3.2%
- 10. ooni_anomaly_7d: 3.1%
- 11. block_rate_roll14_mean: 3.1%
- 12. critical_incident_7d: 3.0%
Interpretation: The forecast model's top feature is gdelt_unrest_30d (11.2%), which captures protest and conflict signals from the GDELT 1.0 global news feed. recent_shutdown, the block_rate rolling means, and incident counts follow. risk_tier, the leaky country-level encoding that dominated our older classifier at 85%, contributes only ~2% here.
Raw JSON: /v1/sentinel/feature-importance
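The top-k sums quoted in this section can be reproduced from the ranked list. The sketch below hard-codes the twelve importances shown on this page (as fractions) and assumes they are the twelve largest of the 39 features; the dictionary and the helper name are illustrative, not the layout of the raw JSON.

```python
from typing import Dict

# Importances copied from the ranked list on this page, as fractions of 1.0.
LISTED_IMPORTANCES: Dict[str, float] = {
    "gdelt_unrest_30d": 0.112,
    "recent_shutdown": 0.061,
    "week_of_year": 0.057,
    "high_urgency_signals_7d": 0.056,
    "month": 0.040,
    "election_in_7days": 0.036,
    "high_importance_event": 0.034,
    "block_rate_roll30_mean": 0.033,
    "block_rate_lag14": 0.032,
    "ooni_anomaly_7d": 0.031,
    "block_rate_roll14_mean": 0.031,
    "critical_incident_7d": 0.030,
}


def top_k_share(importances: Dict[str, float], k: int) -> float:
    """Sum of the k largest importance values."""
    if k < 1 or k > len(importances):
        raise ValueError("k must be between 1 and the number of features")
    ranked = sorted(importances.values(), reverse=True)
    return sum(ranked[:k])


for k in (3, 5, 10):
    print(k, round(top_k_share(LISTED_IMPORTANCES, k), 3))
```

A top-10 share of roughly half the total importance is what supports the reading that no single feature dominates.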
Related
- /methodology#validation — 3-split honest baselines (LOCO 0.91 AUC vs stratified 0.98)
- /v1/sentinel/accuracy — full evaluation JSON, updated nightly
- /atlas/elections — see the forecast in action (90-day upcoming elections)