voidly

Row-level measurement classifier: per-measurement censorship scoring (Niaki KDD23 inspired)

The new POST /v1/measurement/classify endpoint scores a single OONI, CensoredPlanet, IODA, or Voidly measurement and returns a probability plus a SHAP-style top-5 explanation. Inspired by Niaki et al. KDD 2023. Honest framing: the model learns to reconstruct the labeling rule, so the high AUC is real but reflects label leakage from raw signals, not novel detection ability.

#ml #classifier #row-level #measurement #xgboost #shap #transparency #niaki-kdd23

What shipped

Until today we classified censorship only at the country-day level: 84K evidence rows aggregated into per-country features and fed into the v3.3 GradientBoosting classifier. That answers “is country X being censored today?” well, but it cannot answer “is THIS specific measurement evidence of blocking?”

The new endpoint, POST /v1/measurement/classify, takes one measurement (source, country, ASN, domain, signal_value, signal_baseline, observed_at, optional rolling rates) and returns {probability, label, top_features, model_version}, where top_features is a SHAP-style list of the top 5 contributions. A companion endpoint, GET /v1/measurement/info, exposes the model version, feature names, and held-out metrics.
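As a rough illustration of the call shape, the sketch below builds a request body from the inputs listed above and posts it using only the Python standard library. The JSON key names (asn, signal_value, and so on) mirror the prose but are assumptions, as are the ISO 8601 timestamp format and the example values; GET /v1/measurement/info is the place to confirm the real schema. build_payload and classify are hypothetical helper names.

```python
import json
import urllib.request

API_URL = "https://api.voidly.ai/v1/measurement/classify"


def build_payload(source, country, asn, domain, signal_value,
                  signal_baseline, observed_at, **rolling_rates):
    """Assemble the request body. Key names are assumed, not confirmed."""
    payload = {
        "source": source,
        "country": country,
        "asn": asn,
        "domain": domain,
        "signal_value": signal_value,
        "signal_baseline": signal_baseline,
        "observed_at": observed_at,
    }
    # Optional rolling rates are passed through unchanged if supplied.
    payload.update(rolling_rates)
    return payload


def classify(payload, url=API_URL, timeout=10):
    """POST one measurement and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Illustrative values only; AS64496 is a documentation-reserved ASN.
example = build_payload(
    source="ooni",
    country="ZZ",
    asn=64496,
    domain="example.org",
    signal_value=0.8,
    signal_baseline=0.03,
    observed_at="2025-01-01T00:00:00Z",
)
```

Calling classify(example) would then return the {probability, label, top_features, model_version} object described above, assuming the request schema matches.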

Why this is interesting

Journalists and researchers querying a single OONI report previously had to either trust the upstream signal_type field as-is or compute country-level statistics. A per-row probability with an explanation lets you cite a single measurement with confidence and surface the features that drove the call. The design is inspired by Niaki et al. KDD 2023 (Massively Parallel Censorship Probing of DNS and TLS Globally) and the OONI Probe per-measurement anomaly heuristics.

How it was trained

Inputs: all 85,468 evidence rows in voidly_data.db.

Label rule per task spec: positive iff signal_type IN (block, isp-outage) OR signal_type LIKE 'dns-blocking%' OR signal_type LIKE 'http-blocking%' AND confidence >= 0.7.

Balance: 16,798 positive (19.65 percent), 68,670 negative.

Features (24 total, label-leaking columns dropped): signal_value, signal_baseline, signal_deviation, signal_ratio, risk_tier, time-of-day, day-of-week, weekend flag, month, has_artifact_hash, has_source_url, has_domain, has_asn, asn (integer), 7-day rolling positive rate per country, per ASN, and per country+domain, and hash-encoded source / kind / country / continent / un_subregion / region / domain_category.

Model: XGBoost(max_depth=4, n_estimators=100) on a stratified 80/20 split (seed 42).
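The label rule above can be read two ways, because SQL's AND binds more tightly than OR. The sketch below spells out both readings in plain Python: one where the confidence threshold gates every matching signal_type, and one that follows strict SQL precedence so the threshold applies only to the http-blocking pattern. Which reading the training script uses is not stated here, so both functions are hypothetical; prefix matching with startswith is also an assumption standing in for LIKE.

```python
BLOCK_TYPES = ("block", "isp-outage")


def is_positive_grouped(signal_type, confidence, threshold=0.7):
    """Reading 1: (any type pattern matches) AND confidence >= threshold."""
    st = signal_type or ""
    type_match = (
        st in BLOCK_TYPES
        or st.startswith("dns-blocking")
        or st.startswith("http-blocking")
    )
    return type_match and confidence >= threshold


def is_positive_sql_precedence(signal_type, confidence, threshold=0.7):
    """Reading 2: AND binds only to the last clause, as unparenthesized SQL would."""
    st = signal_type or ""
    return (
        st in BLOCK_TYPES
        or st.startswith("dns-blocking")
        or (st.startswith("http-blocking") and confidence >= threshold)
    )


# A low-confidence "block" row is where the two readings disagree.
low_confidence_block = ("block", 0.5)
grouped_label = is_positive_grouped(*low_confidence_block)
sql_label = is_positive_sql_precedence(*low_confidence_block)
```

Writing the rule with explicit parentheses in the SQL itself would remove the ambiguity either way.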

The result, and the honest caveat

Held-out 20 percent ROC AUC: 1.000 (n=17,094). F1: 1.000. Precision: 1.000. Recall: 1.000. Confusion matrix at threshold 0.5: [[13734, 0], [0, 3360]]. Per-source AUC where computable: OONI 1.000, CensoredPlanet 1.000, Voidly 1.000. IODA and Voidly-Community are one-class slices in the test fold (IODA: 0 positives by construction because we excluded outage from the label rule; Voidly-Community: 100 percent positives because every community probe entry confirms a block) so AUC is undefined for those.
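The reported precision, recall, and F1 follow directly from the confusion matrix, which the short sketch below recomputes. It assumes the common scikit-learn layout, [[TN, FP], [FN, TP]] with rows as true labels; the helper names are hypothetical. The second helper shows why AUC is undefined on a one-class slice: ranking positives against negatives needs at least one of each.

```python
def metrics_from_confusion(cm):
    """Derive summary metrics from a 2x2 matrix laid out as [[TN, FP], [FN, TP]]."""
    (tn, fp), (fn, tp) = cm
    n = tn + fp + fn + tp
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = float("nan")
    return {
        "n": n,
        "positive_share": (tp + fn) / n,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def auc_is_defined(labels):
    """ROC AUC needs both classes present in the evaluated slice."""
    return len(set(labels)) == 2


held_out = metrics_from_confusion([[13734, 0], [0, 3360]])
```

The positive share of the held-out fold comes out close to the 19.65 percent of the full dataset, which is what a stratified split should produce.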

Honest framing: AUC=1.000 is real but it is not a breakthrough. The label rule depends deterministically on signal_type, and each source uses distinctive signal_value / signal_baseline patterns (CensoredPlanet positives nearly always have value=1.0; OONI positives have baseline=0.03 with values in the 0.5-1.0 range; IODA has zero positives because we explicitly excluded outage-only rows). XGBoost can essentially memorize the labeling function from the raw signal columns plus source. The top feature is the 7-day rolling per-ASN positive rate (81 percent gain), which also overlaps with same-period labels for ASNs that only appear in a tight window. We drop signal_type, blocking_method, and signal_level from the feature matrix to avoid the worst leakage, but residual coupling between raw signal values and provider judgement is unavoidable.
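One common way to limit the overlap between a rolling positive rate and same-period labels is to compute the rate from strictly earlier rows only. The sketch below is not the code in scripts/build-measurement-features.py; it is a hypothetical standard-library version with assumed row fields (asn, observed_at as a datetime, label as 0 or 1), shown only to make the idea concrete.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def prior_positive_rate(rows, key="asn", window=timedelta(days=7), default=0.0):
    """For each row, the share of positive labels among rows with the same key
    observed strictly before it and no more than `window` earlier.

    Quadratic per key, which is fine for a sketch but not for production.
    """
    by_key = defaultdict(list)
    for row in rows:
        by_key[row[key]].append((row["observed_at"], row["label"]))

    rates = []
    for row in rows:
        t = row["observed_at"]
        # Strict "< t" keeps the row itself and same-timestamp rows out.
        prior = [label for ts, label in by_key[row[key]] if t - window <= ts < t]
        rates.append(sum(prior) / len(prior) if prior else default)
    return rates


rows = [
    {"asn": 64496, "observed_at": datetime(2025, 1, 1), "label": 1},
    {"asn": 64496, "observed_at": datetime(2025, 1, 2), "label": 0},
    {"asn": 64496, "observed_at": datetime(2025, 1, 3), "label": 1},
    {"asn": 64496, "observed_at": datetime(2025, 1, 20), "label": 1},
]
rates = prior_positive_rate(rows)
```

An ASN that appears only inside one tight window still gets most of its rate from its neighbours' labels, so this narrows the leakage without removing it.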

What this is useful for: returning a calibrated probability and a SHAP-style explanation per measurement in one call, with the same labeling rule the rest of the pipeline uses. What it is NOT useful for: claiming we can detect censorship that the upstream providers missed. The v3.3 country-day classifier remains the model that actually generalizes across countries; this one is a per-row interface layer on the same evidence.

Promotion + endpoint

Promotion gate: AUC >= 0.85 AND every per-source AUC >= 0.75. Both met (the only sources excluded from per-source AUC are one-class slices, not failures). Endpoint live at POST https://api.voidly.ai/v1/measurement/classify and GET https://api.voidly.ai/v1/measurement/info. Build scripts: scripts/build-measurement-features.py, scripts/train-measurement-classifier.py, scripts/patch-measurement-classify-endpoint.py (idempotent, ast-validated).
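The gate logic can be sketched in a few lines. The function below is a hypothetical reconstruction from the rule as stated, not the actual promotion script: it represents one-class slices as None and skips them, matching the note that they are excluded rather than counted as failures.

```python
def passes_promotion_gate(overall_auc, per_source_auc,
                          min_overall=0.85, min_source=0.75):
    """Return True if the overall AUC and every computable per-source AUC clear the bar.

    Sources whose AUC is None (one-class slices) are skipped, not failed.
    """
    scored = {src: auc for src, auc in per_source_auc.items() if auc is not None}
    return overall_auc >= min_overall and all(
        auc >= min_source for auc in scored.values()
    )


# Values as reported in this post; None marks the one-class slices.
reported = {
    "OONI": 1.0,
    "CensoredPlanet": 1.0,
    "Voidly": 1.0,
    "IODA": None,
    "Voidly-Community": None,
}
promoted = passes_promotion_gate(1.0, reported)
```

Skipping undefined slices means a source with no scored rows cannot block promotion, so it is worth checking separately that at least one source was actually scored.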

Raw data