2026-05-21

Classifier v3.1: trained on 13.5× more data, evaluated on 18× more countries

v3 was the leakage fix. v3.1 is the data fix. Mining the live incidents table for per-country-day labels grows the training set from 314 rows (18 positive, 7 countries) to 4,237 rows (1,116 positive, 131 countries). Leave-one-country-out (LOCO) median F1 is now an honest 0.82 across 127 countries.
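For readers unfamiliar with the metric, here is a minimal sketch of how a LOCO median F1 can be computed: hold out one country at a time, train on the rest, score F1 on the held-out country, then take the median over countries. Everything in it is an assumption for illustration, not the actual build script: the `(country, feature, label)` row shape, the toy `fit_threshold` model, the made-up data, and the choice to skip countries where F1 is undefined.

```python
from collections import defaultdict
from statistics import median


def f1_score(y_true, y_pred):
    """Binary F1; returns None when undefined (no positives and no predictions)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else None


def fit_threshold(train_rows):
    """Hypothetical stand-in model: threshold at the midpoint of the class means.

    Assumes the training rows contain at least one positive and one negative.
    """
    pos = [x for _, x, y in train_rows if y == 1]
    neg = [x for _, x, y in train_rows if y == 0]
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x >= threshold else 0


def loco_median_f1(rows, fit):
    """Leave-one-country-out: each country is scored by a model that never saw it."""
    by_country = defaultdict(list)
    for row in rows:
        by_country[row[0]].append(row)

    scores = {}
    for held_out, test_rows in by_country.items():
        train_rows = [r for r in rows if r[0] != held_out]
        predict = fit(train_rows)
        y_true = [y for _, _, y in test_rows]
        y_pred = [predict(x) for _, x, _ in test_rows]
        score = f1_score(y_true, y_pred)
        if score is not None:  # assumption: skip countries with undefined F1
            scores[held_out] = score
    return median(scores.values()), scores


# Made-up toy data: (country, single feature, label)
rows = [
    ("AA", 0.9, 1), ("AA", 0.1, 0),
    ("BB", 0.8, 1), ("BB", 0.2, 0),
    ("CC", 0.7, 1), ("CC", 0.6, 0),
]
median_f1, per_country = loco_median_f1(rows, fit_threshold)
print(median_f1, per_country)
```

Reporting the median rather than the mean keeps a handful of very easy or very hard countries from dominating the headline number.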

#methodology #ml #classifier #training-data #honest-metrics

Raw data

Live: v3.1 metadata
Live: v3.1 feature importance
Build script
Model registry