US CA GB FR IN SG
2026-05-21

Probe-node integrity detector: catching a compromised or misconfigured probe before it poisons incidents

The Voidly probe network has 40 nodes: 15 internal (Voidly-run) and 25 community (cp-* IDs, run by volunteers on hardware Voidly does not control). The community half is the trust weak point of the whole pipeline. Probes ARE the ground truth: every confirmed-censorship incident is built on their evidence. A community probe that LIES therefore corrupts incidents directly, whether it is stood up by an adversary to report "everything is blocked" and manufacture a fake shutdown, or to report "nothing is blocked" and mask a real one, or is simply a misconfigured node that mislabels benign traffic. A lying ground-truth source is the worst failure mode the system has.
The probe-integrity detector exists to catch that without trusting any single probe. For every probe node, over a trailing 30-day window, it asks one question: when this node tested a (domain, country) target, did its verdict AGREE with the consensus of every OTHER probe and upstream source that tested the same thing?
Each evidence row is collapsed to a binary verdict, BLOCK or CLEAR. The two row types encode the verdict differently, so they are decoded differently. Voidly probe rows all land with signal_type=block (the row IS a block-test record, not a verdict) and a structured JSON blob carrying blockType. The values dns-poisoned, tcp-reset, blockpage and sni-blocked resolve to BLOCK (genuine DPI censorship signatures). http-redirect resolves to CLEAR: a redirect such as copilot.github.com to github.com/copilot, or skype.com to teams.live.com, is normal product behaviour, not censorship, and a probe recording it as a block is itself misconfigured. tcp-timeout and unknown are also treated as CLEAR (transient-leaning: a bare timeout with no DPI fingerprint cannot vote BLOCK). Upstream-source rows (OONI / CensoredPlanet / IODA, no probe attribution) use signal_type as a real verdict, with IODA outage rows dropped entirely because a country-level connectivity outage is not a domain-specific censorship verdict.
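The decoding rules above can be sketched in a few lines of Python. This is a minimal illustration, not the code in scripts/build-probe-integrity.py: the field names evidence_json and source are hypothetical, and treating any upstream signal_type other than "block" as CLEAR, or any unrecognised blockType as CLEAR, is an assumption made for the example.

```python
import json
from typing import Optional

BLOCK, CLEAR = "BLOCK", "CLEAR"

# blockType values the text names as genuine DPI censorship signatures.
BLOCK_TYPES = {"dns-poisoned", "tcp-reset", "blockpage", "sni-blocked"}


def decode_verdict(row: dict) -> Optional[str]:
    """Collapse one evidence row to BLOCK, CLEAR, or None (row dropped)."""
    if row.get("probe_node_id"):
        # Probe rows always carry signal_type=block, so the verdict comes
        # from blockType inside the JSON blob (hypothetical column name).
        blob = json.loads(row.get("evidence_json") or "{}")
        block_type = blob.get("blockType", "unknown")
        # http-redirect, tcp-timeout and unknown all fall through to CLEAR.
        return BLOCK if block_type in BLOCK_TYPES else CLEAR
    # Upstream rows have no probe attribution; signal_type is the verdict.
    if row.get("source") == "ioda":
        return None  # country-level outage, not a domain-specific verdict
    return BLOCK if row.get("signal_type") == "block" else CLEAR
```

Under these assumptions a probe row with blockType http-redirect decodes to CLEAR even though its signal_type says block, which is the misconfiguration case the text describes.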
Consensus is built with a three-tier fallback because the probe_node_id column is genuinely sparse: 35 probe nodes have attributed evidence, with roughly 265 attributed rows in 30 days, and most (domain, country, day) cells are touched by only one probe, so a strict same-cell rule would leave almost every probe row un-scorable. Tier 1 (weight 1.0) is other rows on the same domain+country+day, the gold standard. Tier 2 (weight 0.6) is the same domain+country on a different day, since blocking policy is sticky day-to-day. Tier 3 (weight 0.3) is the country base block rate over the window, built from UPSTREAM SOURCES ONLY: an adversary could stand up many probes and manufacture a fake country consensus, so the base rate is anchored only to the independently operated OONI/CP/IODA measurement sources. The node's agreement_rate is the tier-weighted mean match over its comparable rows; the probe's own rows are always excluded from the pool it is scored against.
Two secondary signals are also computed. The first is a degenerate verdict distribution: a node reporting ONLY blocks or ONLY clears across at least 6 distinct targets, the "everything is blocked" / "nothing is blocked" adversary signature. The second is a reporting-volume outlier: a node whose block-report count exceeds 5x its same-class peer median.
integrity_score (0-1) = clamp01(agreement_rate - 0.15 * degenerate - 0.15 * volume_outlier), with agreement_rate defaulting to a neutral 0.5 when a node has no comparable rows. A node is flagged low-integrity when agreement_rate < 0.70, or when it is a volume outlier, or when a degenerate distribution CORROBORATES a low or borderline agreement. Degenerate shape alone is not a standalone flag: internal probes seeded on a low-censorship datacenter domain list and community probes seeded on known-blocked domains both come out uniform for benign reasons, so on its own it only docks the score.
The first run scored all 35 probe nodes with attributed evidence and flagged 10: 8 community probes plus 2 internal.
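As a rough sketch of the scoring arithmetic described above, the following Python reads "tier-weighted mean match" as a weighted average of per-row matches. That reading, the function names, and the 0.80 "borderline" cut-off are assumptions for illustration; the text gives the tier weights, the 0.15 penalties, the 0.5 default and the 0.70 threshold, but not the borderline value.

```python
from typing import List, Tuple

# Tier weights from the text: same cell, same target on another day, country base rate.
TIER_WEIGHTS = {1: 1.0, 2: 0.6, 3: 0.3}


def agreement_rate(comparisons: List[Tuple[int, bool]]) -> float:
    """Tier-weighted mean match over (tier, matched) pairs.

    Returns the neutral 0.5 when the node has no comparable rows.
    """
    if not comparisons:
        return 0.5
    total = sum(TIER_WEIGHTS[tier] for tier, _ in comparisons)
    matched = sum(TIER_WEIGHTS[tier] for tier, ok in comparisons if ok)
    return matched / total


def integrity_score(agreement: float, degenerate: bool, volume_outlier: bool) -> float:
    """clamp01(agreement_rate - 0.15 * degenerate - 0.15 * volume_outlier)."""
    raw = agreement - 0.15 * degenerate - 0.15 * volume_outlier
    return max(0.0, min(1.0, raw))


def is_flagged(agreement: float, degenerate: bool, volume_outlier: bool,
               borderline: float = 0.80) -> bool:
    """Flag rule; `borderline` is a hypothetical threshold, not from the source."""
    if agreement < 0.70 or volume_outlier:
        return True
    # A degenerate distribution only flags when agreement is also borderline.
    return bool(degenerate and agreement < borderline)
```

With these definitions, a node whose comparable rows all mismatch gets an agreement_rate of 0.0 and an integrity_score of 0.0 regardless of tier, while a degenerate node with high agreement loses 0.15 from its score but is not flagged.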
cp-3e6ixxgs is the clearest case. It reports washingtonpost.com, whatsapp.com, messenger.com and rferl.org as BLOCKED in Great Britain, but the GB country base rate from OONI/CP is 147 clear to 3 block. That gives an agreement_rate of 0.00 across 13 comparable rows and an integrity_score of 0.0, so the node is flagged actionable. Seven of the ten flags carry low confidence (below 0.40, marked "investigate, do not act") because those community nodes have only 4-9 comparable rows, the honest signal that there is not yet enough history to judge them.
Honest caveats baked into every API response:
(1) A low-agreement node may be CORRECT. A probe sitting in a region with genuinely different blocking than the consensus pool will "disagree" while being right. Internal node blr is flagged for exactly this reason: it correctly classifies copilot.github.com as a benign redirect (CLEAR) in India while OONI labels the domain blocked. This is precisely why the detector FLAGS for human review and NEVER auto-bans or disables a probe.
(2) Consensus itself can be wrong. Most probes and sources do cluster in one region, with heavy datacenter and Global-North skew, so the majority verdict reflects that region and an honest probe elsewhere is penalized.
(3) Tier-2 and Tier-3 consensus are proxies, weaker than same-cell agreement. The tier weights down-rank them but do not make them rigorous.
(4) New and low-volume nodes have little history. A flag with confidence below 0.40 means "not enough evidence to judge".
(5) The probe_node_id column is sparsely populated, so coverage grows as attribution improves.
(6) Degenerate verdict distribution is a weak signal in the current data because of the seeded domain lists, so it nudges the score while agreement_rate remains the load-bearing metric.
The detector is strictly detection plus flagging: it never auto-disables a probe.
Live at GET /v1/atlas/probe-integrity (all nodes, filterable by flagged_only / node_class / score band) plus GET /v1/atlas/probe-integrity/{node_id} (per-node detail including the sample of cells where the node disagreed with consensus). Rebuilt daily 06:15 UTC. Implementation: scripts/build-probe-integrity.py.
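For readers who want to query the endpoints, a small URL-building helper might look like the sketch below. The base URL is a placeholder, and the filter value formats ("true", "community") are assumptions; only the two paths and the flagged_only and node_class filter names come from the text, and the parameter name for the score band is not given, so it is left out.

```python
from typing import Optional
from urllib.parse import quote, urlencode

PATH = "/v1/atlas/probe-integrity"


def probe_integrity_url(base_url: str, node_id: Optional[str] = None, **filters) -> str:
    """Build the list URL (with optional filters) or the per-node detail URL."""
    root = base_url.rstrip("/") + PATH
    if node_id is not None:
        # Per-node detail: GET /v1/atlas/probe-integrity/{node_id}
        return f"{root}/{quote(node_id, safe='')}"
    # Drop unset filters, then encode the rest as a query string.
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    return f"{root}?{query}" if query else root


# Placeholder host; substitute the real API base URL.
BASE = "https://api.example.invalid"
list_url = probe_integrity_url(BASE, flagged_only="true", node_class="community")
detail_url = probe_integrity_url(BASE, node_id="cp-3e6ixxgs")
```

The resulting strings can be passed to any HTTP client; no request is made here because the host above is not a real one.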

#atlas #probe-network #integrity #consensus #trust #data-quality #adversarial #security #ml-honesty #transparency #api
