2026-05-21

Concept-drift detector: catching distribution shift before it poisons the models

Voidly Atlas runs two production models, the v3.3 censorship classifier and the v1 7-day forecast, both retrained weekly behind a dual-holdout promotion gate. But there was no principled detector for the question that sits upstream of the gate: has the live data distribution drifted away from what the models were trained on?

That gap had a concrete cost. In March-April 2026 a labeling bug let raw IODA connectivity-disruption alerts (fiber cuts, BGP leaks, weather outages) flow into the forecast target_7day labels as confirmed censorship. The April positive rate exploded to 79% against a ~5% training baseline, the model learned to call almost everything a shutdown, and the bug festered for weeks before a human noticed. A distribution-shift detector watching the label rate would have flagged it on day 2.

The concept-drift detector closes that gap. A one-time baseline builder freezes each model feature's training distribution: mean, std, a 7-point quantile summary, and the ten decile bin edges plus the training proportion per bin. Frozen bins are what make the daily comparison honest, because live data is binned into the exact same edges. Every day the detector recomputes each model's features on the last 7 days of live data, using the same derivation as the training pipeline, and runs two tests per feature. The primary signal is the Population Stability Index (PSI), sum((live% - train%) * ln(live%/train%)) over the ten frozen bins, with standard industry thresholds of 0.2 for significant drift and 0.25 for major drift. The secondary cross-check is a two-sample Kolmogorov-Smirnov statistic. Separately it tracks LABEL drift: the target positive-rate over the trailing 30 days vs the training baseline, the metric that would have caught the IODA disaster.
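The PSI comparison against frozen bins can be sketched in a few lines of Python. This is a minimal illustration, not the code in detect-concept-drift.py: the function names, the EPS floor for empty bins, and the reading of "decile bin edges" as nine interior cut points that define ten open-ended bins are all assumptions made for the example.

```python
import math
from bisect import bisect_right

EPS = 1e-6  # assumed floor so an empty bin never produces ln(0) or a divide-by-zero


def bin_proportions(values, inner_edges):
    """Share of values per bin, using frozen training edges (never re-fit on live data)."""
    if not values:
        raise ValueError("need at least one value")
    counts = [0] * (len(inner_edges) + 1)
    for v in values:
        # bisect_right returns the index of the bin that v falls into
        counts[bisect_right(inner_edges, v)] += 1
    return [c / len(values) for c in counts]


def psi(train_props, live_props, eps=EPS):
    """Population Stability Index: sum((live - train) * ln(live / train)) over bins."""
    total = 0.0
    for t, l in zip(train_props, live_props):
        t, l = max(t, eps), max(l, eps)
        total += (l - t) * math.log(l / t)
    return total


# Hypothetical frozen baseline: nine interior edges, 10% of training data per bin.
inner_edges = [10, 20, 30, 40, 50, 60, 70, 80, 90]
train_props = [0.1] * 10

live_same = list(range(100))                    # same shape as training
live_shifted = [v + 40 for v in range(100)]     # whole distribution moved upward

print(psi(train_props, bin_proportions(live_same, inner_edges)))
print(psi(train_props, bin_proportions(live_shifted, inner_edges)))
```

Because the edges come from the baseline and are never recomputed, a shifted live window piles up in the top bins and empties the bottom ones, which is what drives PSI past the 0.25 threshold in the second call.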
A composite drift_score in [0,1] combines 60% mean feature PSI and 40% label drift and maps to a verdict: stable / watch / drifted / retrain-recommended. The verdict is floored at retrain-recommended whenever label drift alone is "major", so the IODA failure mode always triggers a retrain regardless of feature PSI. When a model crosses retrain-recommended, the detector writes a flag to the existing drift-trigger queue (retrain-queue.json) that weekly-retrain.sh already reads, so the next retrain runs early. It deliberately does NOT auto-retrain on the spot: retrains are expensive, and a 12-hour cooldown serializes drift-driven retrains with the weekly cron.

The first run surfaced a built-in trap of any naive PSI monitor. The calendar features (month, week_of_year and day_of_week among them) posted PSI of 13 to 17, orders of magnitude past threshold. That is not drift but structure: a 7-day window only ever spans one month and one or two ISO weeks, so its decile distribution over a calendar feature can never match a full-year baseline. The detector flags the five cyclical features and still reports their PSI for transparency, but EXCLUDES them from the composite score. Before that fix, both models read retrain-recommended on calendar noise alone.
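The scoring rules above (60/40 weighting, cyclical features excluded, label-drift floor) can be sketched as follows. Only the 0.6 and 0.4 weights and the verdict names come from this post. Everything else is a hypothetical stand-in chosen for illustration: the PSI_SATURATION value used to squash mean PSI into [0,1], the LABEL_MAJOR threshold, the verdict cutoffs, the function name, and the reading of label drift as the absolute difference in positive rate (which is at least consistent with 0.274 vs 0.263 being reported as 0.01).

```python
from statistics import mean

W_PSI, W_LABEL = 0.6, 0.4   # weights stated in the post
PSI_SATURATION = 0.5        # ASSUMPTION: mean PSI at or above this maps to 1.0
LABEL_MAJOR = 0.25          # ASSUMPTION: label drift treated as "major"
# ASSUMPTION: illustrative cutoffs, highest first
CUTOFFS = [(0.6, "retrain-recommended"), (0.4, "drifted"), (0.2, "watch")]
RETRAIN_CUTOFF = CUTOFFS[0][0]


def drift_verdict(feature_psi, cyclical, live_rate, train_rate):
    """Return (drift_score, verdict) from per-feature PSI and label positive rates."""
    # Cyclical features are still reported elsewhere but never enter the score.
    scored = [p for name, p in feature_psi.items() if name not in cyclical]
    psi_part = min(mean(scored) / PSI_SATURATION, 1.0) if scored else 0.0
    label_part = min(abs(live_rate - train_rate), 1.0)
    score = W_PSI * psi_part + W_LABEL * label_part
    # Floor: major label drift alone forces retrain-recommended.
    if label_part >= LABEL_MAJOR:
        score = max(score, RETRAIN_CUTOFF)
    for cutoff, name in CUTOFFS:
        if score >= cutoff:
            return score, name
    return score, "stable"


feature_psi = {"anomaly_rate": 0.02, "month": 15.0}   # made-up PSI values

print(drift_verdict(feature_psi, {"month"}, 0.274, 0.263))   # calendar feature excluded
print(drift_verdict(feature_psi, set(), 0.274, 0.263))       # naive: calendar feature counted
print(drift_verdict({"anomaly_rate": 0.02}, set(), 0.79, 0.05))  # IODA-style label blow-up
```

Under these assumed constants, the first call stays stable, the second tips into retrain-recommended purely because of the calendar feature, and the third reaches retrain-recommended through the floor even though its only feature is quiet.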
After that fix, the verdicts are honest. Classifier v3.3 scored drift_score 0.08, stable: its real features (anomaly rate, measurement count, probe signals) all sit well inside the training distribution, and label drift is 0.01 (live 30-day censorship positive rate 0.274 vs training baseline 0.263, the IODA label fix holding). Forecast v1 scored drift_score 0.71, retrain-recommended, and that is a genuine signal. The block_rate lag and rolling-mean features (14 of 38) show PSI ~3.4 because, over the identical 21-country set, the mean 7-day block rate jumped roughly tenfold (0.023 historical baseline to 0.234 in the trailing week). That puts the forecast's recent-history inputs well outside the distribution the current model learned from. Forecast label drift is a modest 0.10; it is not the trigger, the feature drift is.

Honest caveats baked into every response:

(1) Drift is not the same as a broken model. A real censorship surge IS distribution shift, and the forecast's 10x block-rate jump may be a genuine escalation the model still handles, so the verdict says "the inputs have moved, a retrain is warranted", not "the model is wrong today".
(2) The PSI thresholds 0.2/0.25 are industry convention, not derived from Voidly's data.
(3) KS on a 7-day window is noisy, so lean on PSI and the composite score. KS here is computed against a sample reconstructed from stored training quantiles, which makes it a coarse cross-check.
(4) The classifier's three contagion features (neighbor_*) are not monitored; they need the offline adjacency + regime-correlation pipeline.
(5) The forecast baseline excludes the most-recent 30 days so label drift is baseline-vs-recent.

Live at GET /v1/atlas/concept-drift (both models, per-feature PSI/KS, drift_score, verdict, label drift) + GET /v1/atlas/concept-drift/{model} (classifier-v3.3 | forecast-v1) + GET /v1/atlas/concept-drift/info. Refreshed daily 05:20 UTC. Implementation: scripts/build-concept-drift-baseline.py + detect-concept-drift.py + patch-concept-drift-endpoint.py.
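To make caveat (3) concrete, here is one way a KS statistic could be computed against a sample rebuilt from a stored quantile summary. It is a sketch under stated assumptions, not the detector's code: the post does not say which seven quantile levels are stored, so ASSUMED_LEVELS is hypothetical, as are the function names and the choice of linear interpolation between stored quantiles.

```python
from bisect import bisect_right

# ASSUMPTION: the post only says "7-point quantile summary"; these levels are made up.
ASSUMED_LEVELS = (0.0, 0.05, 0.25, 0.5, 0.75, 0.95, 1.0)


def reconstruct_sample(quantiles, levels=ASSUMED_LEVELS, n=200):
    """Rebuild a pseudo-sample by linearly interpolating the stored inverse CDF."""
    sample = []
    for i in range(n):
        p = (i + 0.5) / n                      # evenly spaced probabilities in (0, 1)
        j = min(max(bisect_right(levels, p), 1), len(levels) - 1)
        frac = (p - levels[j - 1]) / (levels[j] - levels[j - 1])
        sample.append(quantiles[j - 1] + frac * (quantiles[j] - quantiles[j - 1]))
    return sample


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in a + b:                            # the maximum gap occurs at a data point
        fa = bisect_right(a, x) / len(a)
        fb = bisect_right(b, x) / len(b)
        d = max(d, abs(fa - fb))
    return d


# Hypothetical stored summary for a feature that was roughly uniform on [0, 100].
train_quantiles = [0, 5, 25, 50, 75, 95, 100]
baseline = reconstruct_sample(train_quantiles)

print(ks_statistic(baseline, list(range(100))))        # live window matches training
print(ks_statistic(baseline, list(range(60, 160))))    # live window shifted upward
```

Seven quantiles cannot recover the fine shape of the training distribution, so any bumps between stored levels are flattened by the interpolation. That is the sense in which this check is coarse and why PSI stays the primary signal.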

#atlas #concept-drift #distribution-shift #psi #ks-test #data-quality #monitoring #auto-retrain #ml-honesty #transparency #api
