voidly

Classifier v3.1: trained on 13.5× more data, evaluated on 18× more countries

v3 was the leakage fix. v3.1 is the data fix. By mining the live incidents table for per-country-day labels, the training set grows from 314 samples (18 positive, 7 countries) to 4,237 samples (1,116 positive, 131 countries). LOCO median F1 is now 0.82, measured across 127 countries instead of 7.

#methodology #ml #classifier #training-data #honest-metrics

After the 2026-05-20 v3 leakage fix shipped, the classifier had honest accuracy numbers but a serious sample-scarcity ceiling: 314 hand-labeled samples, only 18 positive, only 7 countries with ground truth. The leave-one-country-out (LOCO) median F1 of 0.86 looked great, but it was computed across only 7 countries, so it said far less about generalization than it appeared to.

On 2026-05-21 we expanded the training data by mining the live incidents table (2,636 citable incidents) joined to evidence and probe_metrics aggregates. Each (country, day) pair with at least 5 evidence measurements becomes one training sample. The label is positive if an incident was recorded in that country on that day, and negative otherwise.
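The labeling rule can be sketched in a few lines. This is a minimal illustration, not the actual build script: the row shapes, the function name `build_samples`, and the `MIN_EVIDENCE` constant are assumptions made for the example. Only the 5-measurement threshold and the positive/negative rule come from the description above.

```python
from collections import Counter

MIN_EVIDENCE = 5  # threshold stated in the post; the constant name is hypothetical


def build_samples(evidence_rows, incident_rows):
    """Return one labeled sample per (country, day) with enough evidence.

    evidence_rows: iterable of (country, day) tuples, one per measurement
    incident_rows: iterable of (country, day) tuples, one per incident
    Both shapes are hypothetical stand-ins for the real tables.
    """
    evidence_counts = Counter(evidence_rows)
    incident_days = set(incident_rows)
    samples = []
    for (country, day), n in sorted(evidence_counts.items()):
        if n < MIN_EVIDENCE:
            continue  # too little evidence to form a training sample
        samples.append({
            "country": country,
            "day": day,
            "measurement_count": n,
            "label": 1 if (country, day) in incident_days else 0,
        })
    return samples


# Made-up rows, only to exercise the rule.
evidence = (
    [("IR", "2026-05-01")] * 6
    + [("TH", "2026-05-01")] * 5
    + [("SG", "2026-05-01")] * 2
)
incidents = [("IR", "2026-05-01")]
samples = build_samples(evidence, incidents)
for s in samples:
    print(s)
```

In this toy input the SG pair is dropped because it has only 2 measurements, IR becomes a positive sample, and TH becomes a negative one.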

The numbers

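| Metric | v3 | v3.1 | Δ |
| --- | --- | --- | --- |
| Total samples | 314 | 4,237 | +13.5× |
| Positive samples | 18 | 1,116 | +62× |
| Unique countries | 7 | 131 | +18.7× |
| Stratified F1 | 0.464 | 0.673 | +45% |
| Stratified AUC | 0.904 | 0.868 | −4% |
| LOCO mean F1 | 0.566 | 0.710 | +25% |
| LOCO median F1 | 0.857 | 0.818 | −5% |
| LOCO countries evaluated | 7 | 127 | +18× |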
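The Δ column can be reproduced from the v3 and v3.1 values. The sketch below assumes the count rows are reported as a ratio (new / old) and the metric rows as a relative percent change. The helper names are made up for the example.

```python
def ratio(old, new):
    """Multiplicative change, used for the count rows (e.g. 13.5x)."""
    return new / old


def pct_change(old, new):
    """Relative change in percent, used for the metric rows."""
    return (new - old) / old * 100


# (v3, v3.1) pairs copied from the table.
counts = {
    "Total samples": (314, 4237),
    "Positive samples": (18, 1116),
    "Unique countries": (7, 131),
    "LOCO countries evaluated": (7, 127),
}
metrics = {
    "Stratified F1": (0.464, 0.673),
    "Stratified AUC": (0.904, 0.868),
    "LOCO mean F1": (0.566, 0.710),
    "LOCO median F1": (0.857, 0.818),
}

for name, (old, new) in counts.items():
    print(f"{name}: {ratio(old, new):.1f}x")
for name, (old, new) in metrics.items():
    print(f"{name}: {pct_change(old, new):+.0f}%")
```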

Why the LOCO median dropped slightly

v3's LOCO median was computed over 7 countries, mostly easy cases where evidence patterns were stark. v3.1's median is over 127 countries, including tail cases like Pakistan (F1 0.39, AUC 0.64) and Thailand (F1 0.10, AUC 0.54) where incident patterns are harder to learn. The drop reflects a harder evaluation set rather than a weaker model: stratified F1 rose from 0.464 to 0.673 and LOCO mean F1 from 0.566 to 0.710.
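For readers unfamiliar with the evaluation, here is a minimal sketch of a leave-one-country-out loop, assuming a generic `fit` callable in place of the real GradientBoosting model. The data is synthetic, the country codes A/B/C are placeholders, and `fit_threshold` is a toy stand-in, so none of the numbers below are real results.

```python
from statistics import mean, median


def f1_score(y_true, y_pred):
    """Binary F1; returns 0.0 when there are no positives at all."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def loco_f1(samples, fit):
    """Hold out each country in turn, train on the rest, score the held-out one.

    samples: list of (country, feature, label)
    fit: callable(train_samples) -> predict(feature) -> 0 or 1
    """
    scores = {}
    for held_out in sorted({c for c, _, _ in samples}):
        train = [s for s in samples if s[0] != held_out]
        test = [s for s in samples if s[0] == held_out]
        predict = fit(train)
        y_true = [label for _, _, label in test]
        y_pred = [predict(x) for _, x, _ in test]
        scores[held_out] = f1_score(y_true, y_pred)
    return scores


def fit_threshold(train):
    """Toy model: cut halfway between mean positive and mean negative feature."""
    pos = [x for _, x, y in train if y == 1]
    neg = [x for _, x, y in train if y == 0]
    cut = (mean(pos) + mean(neg)) / 2
    return lambda x: 1 if x >= cut else 0


# Synthetic data: A and B are "easy", C has an atypical pattern.
toy = [
    ("A", 0.9, 1), ("A", 0.1, 0),
    ("B", 0.8, 1), ("B", 0.2, 0),
    ("C", 0.3, 1), ("C", 0.4, 0),
]
scores = loco_f1(toy, fit_threshold)
print(scores)
print("median:", median(list(scores.values())))
print("mean:", round(mean(list(scores.values())), 3))
```

In the toy run the median stays high while the mean is pulled down by the one hard country, which is why it helps to read the LOCO mean and median together.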

Where v3.1 wins clearly

  • Iran (IR): F1 0.86 → 0.86, AUC 0.95 → 0.95 (held)
  • Venezuela (VE): F1 0.82, AUC 0.92 (new — wasn't in v3 eval)
  • Egypt (EG): F1 0.80, AUC 0.94 (new)
  • Russia (RU): F1 0.63, AUC 0.79 (new)
  • Ukraine (UA): F1 0.96, AUC 0.99 (new)
  • Nigeria (NG): F1 0.90, AUC 0.97 (new)
  • Philippines (PH): F1 0.83, AUC 0.95 (new)

Where v3.1 is still weak (honest)

  • Thailand (TH): F1 0.10, AUC 0.54 — incident patterns don't cluster well
  • Singapore (SG): F1 0.07, AUC 0.36 — minimal censorship, model overfits to noise
  • Pakistan (PK): F1 0.39, AUC 0.64 — political volatility makes patterns inconsistent

These tail cases are the new frontier. Adding cross-country contagion features (planned next) should help PK by exploiting regional signal from Iran/India.

Feature importance — well-distributed

Top 3 features sum to 71.9% (vs v3's 73.5%):

  1. anomaly_rate — 28.4%
  2. month — 23.3% (seasonal pattern, election cycles)
  3. measurement_count — 20.3%
  4. rate_count_interaction — 8.5%
  5. spike_magnitude — 7.2%

No leakage. No single dominator. The signal is honest.
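The listed percentages can be summed directly as a sanity check. The values below are copied from the list above. The rounded top-three figures add up to 72.0%, so the 71.9% quoted above presumably reflects unrounded importances (an assumption, not something stated here).

```python
# Importances in percent, as listed above (already rounded to one decimal).
importances = {
    "anomaly_rate": 28.4,
    "month": 23.3,
    "measurement_count": 20.3,
    "rate_count_interaction": 8.5,
    "spike_magnitude": 7.2,
}

ranked = sorted(importances.values(), reverse=True)
top3 = round(sum(ranked[:3]), 1)
listed_total = round(sum(ranked), 1)
remainder = round(100 - listed_total, 1)  # share held by features not listed

print("top 3:", top3)
print("five listed features:", listed_total)
print("all other features:", remainder)
```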

What changed in the pipeline

  • scripts/build-classifier-v3.1-expanded-dataset.py — mines incidents + evidence + probe_metrics, outputs labeled_incidents_v3.1.json
  • scripts/train-classifier-v3.1.py — same model architecture (GradientBoosting), trained on the expanded JSON
  • Promoted to /opt/voidly-ai/models/censorship_classifier_v3_promoted.pkl
  • /v1/classifier/info and /v1/classifier/feature-importance automatically picked it up — no code changes needed
  • Backup at ...promoted.pkl.bak.v3.0-2026-05-21 for instant rollback
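The promote-with-backup step could look roughly like the following. This is a hedged sketch, not the deployment script: the function names, the backup suffix, and the file names are hypothetical, and the demo runs in a temporary directory rather than under /opt/voidly-ai/models.

```python
import shutil
import tempfile
from pathlib import Path


def promote(candidate: Path, promoted: Path, backup_suffix: str) -> Path:
    """Copy the current promoted model to a backup, then overwrite it.

    Note: shutil.copy2 is not atomic; a production script may prefer
    writing to a temp file and calling os.replace.
    """
    backup = promoted.with_name(promoted.name + backup_suffix)
    if promoted.exists():
        shutil.copy2(promoted, backup)
    shutil.copy2(candidate, promoted)
    return backup


def rollback(promoted: Path, backup: Path) -> None:
    """Restore the promoted model from its backup copy."""
    shutil.copy2(backup, promoted)


# Demo with placeholder files (names and contents are made up).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    promoted = root / "model_promoted.pkl"
    candidate = root / "model_candidate.pkl"
    promoted.write_bytes(b"old")
    candidate.write_bytes(b"new")

    backup = promote(candidate, promoted, ".bak.example")
    after_promote = promoted.read_bytes()

    rollback(promoted, backup)
    after_rollback = promoted.read_bytes()

print(after_promote, after_rollback)
```

Because the serving endpoints read a fixed promoted path, swapping the file is enough to change the live model, and restoring the backup reverts it.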

Next

v3.2 will add cross-country contagion features (neighbor risk + regional aggregates) to help the weak-tail countries. v4 will add probabilistic calibration via CalibratedClassifierCV for trustworthy confidence numbers downstream.
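One possible shape for a neighbor-risk feature, sketched under assumptions: the adjacency map, the function name `neighbor_risk`, and the choice of averaging neighbors' anomaly_rate are all hypothetical, and the rates are made-up values. The planned v3.2 features may be defined differently.

```python
from statistics import mean


def neighbor_risk(country, day_rates, neighbors):
    """Mean anomaly_rate across a country's neighbors for one day.

    day_rates: {country: anomaly_rate} for a single day
    neighbors: {country: tuple of neighboring countries}
    Returns 0.0 when no neighbor has data for that day.
    """
    rates = [day_rates[n] for n in neighbors.get(country, ()) if n in day_rates]
    return mean(rates) if rates else 0.0


# Hypothetical adjacency, following the Iran/India example for Pakistan.
NEIGHBORS = {"PK": ("IR", "IN")}

# Made-up anomaly rates for one day.
day_rates = {"IR": 0.40, "IN": 0.10, "PK": 0.05}

pk_risk = neighbor_risk("PK", day_rates, NEIGHBORS)
print("PK neighbor risk:", pk_risk)
print("unknown country:", neighbor_risk("XX", day_rates, NEIGHBORS))
```

A feature like this gives the model a signal for a country even on days when its own evidence is quiet, which is the stated motivation for the weak-tail cases.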

Raw data