2026-05-21

Cross-protocol classifier: per-port blocking probability (8 protocol groups)

Eight small XGBoost classifiers, one per protocol group (HTTP-80, HTTP-headers, web_connectivity, TLS-WhatsApp, TLS-Signal, TLS-Telegram, TLS-FB-Messenger, Tor). Given a measurement to a (host, port) in a given country on a given day, each model returns the probability that the port is blocked. The models are built on OONI evidence, with the N/M anomalous-measurements ratio parsed out of the upstream_claim text; this is necessary because tests other than web_connectivity store rows only when blocking occurs. All 8 cleared the promote floor: LOCO pooled-OOF AUC 0.98 to 0.999, per-country median AUC 0.89 to 1.0. Honest caveat: the high AUCs come from strong port-level blocking persistence (history is the dominant feature), not from novel signal, and web_connectivity labels partially overlap with TLS-app labels (both touch 443). Live at GET /v1/classifier/protocol/{proto}/{cc} and /v1/classifier/protocol/info.
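The ratio-parsing step can be sketched roughly as follows. This is a minimal illustration, not the production parser: the exact wording of upstream_claim is not shown here, so the pattern assumes the ratio appears as two integers separated by a slash (for example "12/40"), and the function names parse_anomaly_ratio and anomaly_fraction are hypothetical.

```python
import re
from typing import Optional, Tuple

# Assumed format: "N/M" as two integers around a slash. The lookarounds
# reject dates such as 2026/05/21 and dotted numbers such as 3.3/4.
_RATIO = re.compile(r"(?<![\d/.])(\d+)\s*/\s*(\d+)(?![\d/.])")


def parse_anomaly_ratio(claim: str) -> Optional[Tuple[int, int]]:
    """Return (anomalous, total) from the first plausible N/M in the text."""
    for match in _RATIO.finditer(claim):
        anomalous, total = int(match.group(1)), int(match.group(2))
        # Skip candidates that cannot be a measurement ratio.
        if total > 0 and anomalous <= total:
            return anomalous, total
    return None


def anomaly_fraction(claim: str) -> Optional[float]:
    """Return N/M as a float, or None when no usable ratio is present."""
    parsed = parse_anomaly_ratio(claim)
    if parsed is None:
        return None
    anomalous, total = parsed
    return anomalous / total
```

Returning None instead of 0.0 when no ratio is found keeps "no evidence" distinct from "measured and not anomalous", which matters when rows exist only for blocking events.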

#classifier #protocol #ml #per-port #ml-honesty

Raw data

Endpoint info + per-protocol metrics
P(Tor blocked in CN)
P(Signal blocked in IR)
P(WhatsApp blocked in IR)
P(HTTP-80 middlebox in RU)
P(Telegram blocked in BY)
Generic v3.3 country classifier (for comparison)
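
As a rough guide to how the per-protocol queries above map onto the two endpoints, the helper below only builds request URLs. The base URL and the protocol slugs ("tor", "signal") are placeholders assumed for illustration; the actual host and accepted {proto} values are not stated here and should be taken from /v1/classifier/protocol/info.

```python
from urllib.parse import quote

# Hypothetical host; substitute the real API base URL.
BASE_URL = "https://api.example.invalid"


def protocol_url(base_url: str, proto: str, cc: str) -> str:
    """Build the URL for GET /v1/classifier/protocol/{proto}/{cc}."""
    root = base_url.rstrip("/")
    # Percent-encode path segments so unexpected characters stay inside them.
    return f"{root}/v1/classifier/protocol/{quote(proto, safe='')}/{quote(cc, safe='')}"


def info_url(base_url: str) -> str:
    """Build the URL for GET /v1/classifier/protocol/info."""
    return f"{base_url.rstrip('/')}/v1/classifier/protocol/info"


# Assumed slugs; the real ones may differ.
queries = [("tor", "CN"), ("signal", "IR")]
urls = [protocol_url(BASE_URL, proto, cc) for proto, cc in queries]
```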