
The model registry

Every machine-learning model deployed in Voidly Atlas, in one place. Each entry lists the version, training date, honest evaluation metrics, and direct links to the transparency endpoints, so you can cite a specific model in your paper or post without spelunking through internal docs.

Updated hourly · CC BY 4.0

Censorship Classifier v3.3

Classifies raw evidence as incident or noise. v3.3 adds three regime-similarity-weighted cross-country contagion features on top of v3.1's expanded 4,237-sample dataset. LOCO median F1 is 0.870, the best to date. Honest caveat: 16 MENA and former Soviet countries regress vs v3.1 due to sparse neighbor-pair overlap.

promoted 2026-05-21 (regime-weighted contagion)
Algorithm: GradientBoostingClassifier
Trained: 2026-05-21
LOCO median F1 (honest): 87.0%
Stratified F1 (sanity): 72.9%
LOCO countries: 127
Training samples: 4,237
Features (16): anomaly_rate, measurement_count, spike_magnitude, day_of_week, month, is_weekend, rate_count_interaction, probe_block_rate, probe_node_count, probe_avg_confidence, probe_agreement, rate_spike_interaction, high_evidence, neighbor_block_rate_7d, neighbor_incident_count_7d, neighbor_max_anomaly_7d
Three-step ML evolution in 36 hours: v3 (fixed v2's 85% country_risk_tier leakage) → v3.1 (13.5× more training data) → v3.2 (geo contagion experiment, held back) → v3.3 (promoted; regime-similarity-weighted contagion).
Stratified F1: v3 0.46 → v3.1 0.67 → v3.3 0.73
LOCO median F1: v3.1 0.82 → v3.3 0.87
EG: v3.1 0.55 → v3.3 0.73 (+18 pp recovery)
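The registry does not publish the formula behind the contagion features, so the following is only a minimal sketch of one plausible reading of a "regime-similarity-weighted" neighbor feature: a weighted mean of neighbors' block rates over a trailing 7-day window. The function signature, the `block_rates` and `similarity` inputs, and the placeholder country names are all assumptions made for illustration, not Voidly's implementation.

```python
from typing import Dict, List


def neighbor_block_rate_7d(
    country: str,
    block_rates: Dict[str, List[float]],      # hypothetical: daily block rates per country, oldest first
    similarity: Dict[str, Dict[str, float]],  # hypothetical: regime-similarity weights per neighbor pair
) -> float:
    """Similarity-weighted mean of each neighbor's block rate over its last 7 days."""
    weighted_sum = 0.0
    weight_total = 0.0
    for neighbor, weight in similarity.get(country, {}).items():
        window = block_rates.get(neighbor, [])[-7:]
        if not window or weight <= 0.0:
            continue  # no usable data for this neighbor pair
        weighted_sum += weight * (sum(window) / len(window))
        weight_total += weight
    # Assumption: a country with no usable neighbors gets 0.0
    return weighted_sum / weight_total if weight_total else 0.0


# Placeholder countries "A", "B", "C"; not real data.
block_rates = {"B": [0.2] * 7, "C": [0.8] * 7}
similarity = {"A": {"B": 0.75, "C": 0.25}}
value = neighbor_block_rate_7d("A", block_rates, similarity)  # approximately 0.35
```

In this sketch a country with few similar neighbors receives a feature driven by very few pairs, or the 0.0 fallback. That is one way sparse neighbor-pair overlap could degrade a feature of this kind.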

Censorship Classifier v2 (legacy)

(Retired.) Reported 99.8% F1, but the score was inflated by leakage: the top feature, country_risk_tier (85%), is label-derived. Superseded by v3.1.

deprecated 2026-05-21 (superseded by v3.1)
Algorithm: GradientBoostingClassifier
Trained: 2026-02-10
F1 (stratified, inflated): 99.8%
ROC AUC: 1.000
Feature count: 39
Top feature: country_risk_tier (85%)

Transparency endpoints

Still serving production predictions. The 99.8% F1 number is inflated by country_risk_tier leakage — it's a label-derived feature. v3 fixes this. Watch the v3 promotion writeup for swap timing.
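A label-derived feature often shows up as a single feature dominating the fitted model's importances. The sketch below illustrates this on synthetic data with scikit-learn's `GradientBoostingClassifier`. The feature names `weak_a`, `weak_b`, and `leaky` are made up for the example, and the data has nothing to do with the v2 training set.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
n = 2000
y = rng.integers(0, 2, size=n)

# Two ordinary features plus one that is almost a copy of the label.
weak_a = y + rng.normal(0.0, 2.0, size=n)   # weakly informative
weak_b = rng.normal(0.0, 1.0, size=n)       # pure noise
leaky = y + rng.normal(0.0, 0.05, size=n)   # stand-in for a label-derived feature
X = np.column_stack([weak_a, weak_b, leaky])
names = ["weak_a", "weak_b", "leaky"]

clf = GradientBoostingClassifier(random_state=0).fit(X, y)

# feature_importances_ sums to 1, so each value reads as a share.
importances = dict(zip(names, clf.feature_importances_))
top_name = max(importances, key=importances.get)
top_share = importances[top_name]
print(f"top feature: {top_name} ({top_share:.0%} of importance)")
```

A very large share for one feature is a prompt to check how that feature is constructed. It does not prove leakage on its own.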

Sentinel Shutdown Forecast (XGBoost + isotonic)

7-day shutdown-risk probability per watched country, with SHAP drivers and a conformal interval.

serving (calibrated 2026-05-20)
Algorithm: XGBoost classifier + sklearn IsotonicRegression
Trained: 2026-04-17
AUC (held-out): 0.980
F1 (held-out): 0.795
Brier (post-refit): 0.223
Conformal coverage: 90.4%
Features (5): gdelt_unrest_30d, recent_shutdown, week_of_year, high_urgency_signals_7d, month
Isotonic recalibration on 2026-05-20 cut the Brier score from 0.59 to 0.22 and MAE from 0.60 to 0.00 in-sample. It is applied only to the 30 watched censorship-heavy countries. See /sentinel/backtest for the reliability diagram and /sentinel/calibration for the 90-day drift series.
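To make the recalibration step concrete, here is a minimal sketch of isotonic recalibration scored with the Brier score, using scikit-learn's `IsotonicRegression` and `brier_score_loss` on synthetic data. The miscalibrated `raw` scores are invented for the example and stand in for an uncalibrated classifier output; this is not the Sentinel pipeline. MAE is left out because the registry does not state how it is defined.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression
from sklearn.metrics import brier_score_loss

rng = np.random.default_rng(0)
n = 5000
true_p = rng.uniform(0.0, 1.0, size=n)           # synthetic event probabilities
y = (rng.uniform(size=n) < true_p).astype(int)   # observed outcomes

# Synthetic raw scores: correctly ordered but systematically too high.
raw = true_p ** 0.25

# Fit a monotone map from raw score to observed frequency.
iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
calibrated = iso.fit_transform(raw, y)

brier_raw = brier_score_loss(y, raw)
brier_cal = brier_score_loss(y, calibrated)
print(f"Brier before: {brier_raw:.3f}, after (in-sample): {brier_cal:.3f}")
```

Isotonic regression minimizes squared error on the data it is fitted to, so an in-sample Brier improvement is expected by construction. A held-out reliability diagram or drift series is the more informative check.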

Anomaly Detector v1

(Retired) Pre-v2 unsupervised baseline. Kept on disk for historical comparison.

deprecated 2026-05-08
Algorithm: IsolationForest
Trained: 2026-02-10
Status: frozen
Size: 63 MB pickle
File: /opt/voidly-ai/models/anomaly_detector_v1.pkl.deprecated.20260508. The v2/v3 classifier pipelines superseded the unsupervised anomaly approach because models trained on labeled incidents (now ~2,640) outperform threshold-based detection at every operating point.

How models are evaluated

Voidly publishes three accuracy numbers per model, and picks the most honest one for headline claims:

  • Stratified k-fold — easy mode, random splits. Useful as a sanity check. Susceptible to temporal + cross-country leakage. Do not cite for deployment claims.
  • Leave-one-country-out (LOCO) — train on N−1 countries, test on the held-out one. Catches cross-country leakage. The honest cross-country generalization number.
  • Time-based — train pre-T, test post-T. Catches temporal leakage. Often the most pessimistic, especially for novel event types.
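The three splits above can be sketched with scikit-learn as follows. The data is synthetic, and the `country` and `day` arrays, the cutoff of 80, and the small model settings are assumptions chosen to keep the example fast. The sketch shows the shape of each evaluation, not Voidly's actual pipeline.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import LeaveOneGroupOut, StratifiedKFold

rng = np.random.default_rng(0)
n = 600
X = rng.normal(size=(n, 4))
y = (X[:, 0] + 0.5 * rng.normal(size=n) > 0).astype(int)
country = rng.integers(0, 6, size=n)   # hypothetical country ids
day = rng.integers(0, 100, size=n)     # hypothetical observation day


def fit_and_score(train_idx, test_idx):
    """Train on one index set and return F1 on the other."""
    model = GradientBoostingClassifier(n_estimators=30, random_state=0)
    model.fit(X[train_idx], y[train_idx])
    return f1_score(y[test_idx], model.predict(X[test_idx]))


# 1. Stratified k-fold: random splits that preserve the class balance.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
stratified_f1 = float(np.mean([fit_and_score(tr, te) for tr, te in skf.split(X, y)]))

# 2. LOCO: each fold holds out every sample from one country.
logo = LeaveOneGroupOut()
loco_scores = [fit_and_score(tr, te) for tr, te in logo.split(X, y, groups=country)]
loco_median_f1 = float(np.median(loco_scores))

# 3. Time-based: train strictly before the cutoff day, test on or after it.
cutoff = 80
time_f1 = fit_and_score(np.flatnonzero(day < cutoff), np.flatnonzero(day >= cutoff))

print(f"stratified={stratified_f1:.3f} loco_median={loco_median_f1:.3f} time={time_f1:.3f}")
```

On this synthetic data the three numbers should be similar, because the labels depend on neither country nor time. On real data, gaps between them are what expose cross-country or temporal leakage.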

Full methodology + the three honest splits per model at /methodology#validation. Live calibration drift at /sentinel/calibration. Reliability diagram at /sentinel/backtest.
