voidly

DPI fingerprint library: heuristic vendor attribution for 19,506 evidence rows across 10 device families

Voidly Atlas previously told you HOW a country blocks (DNS / TCP / TLS / blockpage) but not WHICH VENDOR. The new DPI Fingerprint Library v1 closes that gap with a curated, public, citation-backed library of 14 deep-packet-inspection devices: 7 state-deployed (Russia TSPU, China GFW, Iran ARIA DPI, Belarus Beltelecom, Turkey BTK, Myanmar Junta, Pakistan PTA) and 7 commercial appliances (FortiGate, Sangfor, Netsweeper, Blue Coat, Smartfilter, Cisco WSA, Palo Alto).

Each fingerprint is a hand-curated rule with up to four components (country_prior, which acts as a HARD GATE; signal_type_in; blocking_method_in; and an optional upstream_claim_regex) plus a confidence floor. We emit at most one vendor match per evidence row, choosing the highest weighted-sum confidence above the floor; ties go to the rule with the most specific signal (regex > blocking_method > country_prior).

Backfilled across 85,549 evidence rows: 19,506 matches (22.8%) spread across 10 of 14 vendors. Matches by vendor: China GFW (6,378), Russia TSPU (5,197), Iran ARIA DPI (1,870), Myanmar Junta DPI (1,709), Pakistan PTA WMS (1,449), Turkey BTK DPI (1,399), Belarus Beltelecom DPI (937), FortiGate (519), Blue Coat (30), Netsweeper (18).

Honest caveats inline: the matching is heuristic, not ML; public fingerprints lag vendor updates; and an evidence row matching a vendor does NOT prove that vendor performed the block, only that the signal is consistent with that vendor's known behaviour pattern. The four vendors that did NOT fire (Sangfor, Cisco WSA, Palo Alto, Smartfilter) need blockpage HTML text in upstream_claim, which OONI does not always preserve. That gap is honest, not hidden.

Live at GET /v1/atlas/dpi-fingerprints, GET /v1/atlas/dpi-fingerprints/{vendor_slug}, and GET /v1/atlas/dpi-distribution.
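To make the matching rule concrete, here is a minimal Python sketch of how a hard country gate, a weighted-sum confidence, a floor, and the specificity tie-break could fit together. It is an illustration under stated assumptions, not Voidly's implementation: the component weights, the 0-100 integer scale, the evidence-row keys (country, signal_type, blocking_method, upstream_claim), the vendor slugs, and the example rule values are all hypothetical. It also assumes the floor is inclusive and that signal_type plays no part in the tie-break, since the post only ranks regex, blocking_method, and country_prior.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical weights on a 0-100 integer scale (integers keep ties exact).
# The real weights are not published in the post.
WEIGHTS = {"country": 40, "signal_type": 20, "blocking_method": 20, "regex": 20}


@dataclass(frozen=True)
class Fingerprint:
    vendor_slug: str
    country_prior: frozenset                      # hard gate
    signal_type_in: frozenset = frozenset()
    blocking_method_in: frozenset = frozenset()
    upstream_claim_regex: Optional[str] = None    # optional component
    confidence_floor: int = 50


def score(fp, row):
    """Return (confidence, specificity), or None if gated out or below floor."""
    # Hard gate: a row from outside the rule's countries can never match.
    if row.get("country") not in fp.country_prior:
        return None
    confidence, specificity = WEIGHTS["country"], 1
    if row.get("signal_type") in fp.signal_type_in:
        confidence += WEIGHTS["signal_type"]
    if row.get("blocking_method") in fp.blocking_method_in:
        confidence += WEIGHTS["blocking_method"]
        specificity = 2
    claim = row.get("upstream_claim") or ""
    if fp.upstream_claim_regex and re.search(fp.upstream_claim_regex, claim, re.I):
        confidence += WEIGHTS["regex"]
        specificity = 3
    # Assumption: a score exactly equal to the floor still counts as a match.
    if confidence < fp.confidence_floor:
        return None
    return confidence, specificity


def attribute(row, library):
    """Return at most one vendor slug per evidence row, or None."""
    best = None
    for fp in library:
        result = score(fp, row)
        # Tuple comparison: higher confidence wins, then higher specificity.
        if result is not None and (best is None or result > best[0]):
            best = (result, fp.vendor_slug)
    return None if best is None else best[1]


# Hypothetical two-rule library, for illustration only.
LIBRARY = [
    Fingerprint("russia-tspu", frozenset({"RU"}),
                signal_type_in=frozenset({"tls"}),
                blocking_method_in=frozenset({"tcp-reset"}),
                confidence_floor=60),
    Fingerprint("fortigate", frozenset({"RU", "TR"}),
                blocking_method_in=frozenset({"blockpage"}),
                upstream_claim_regex=r"fortiguard",
                confidence_floor=60),
]

row = {"country": "RU", "signal_type": "tls",
       "blocking_method": "tcp-reset", "upstream_claim": None}
print(attribute(row, LIBRARY))  # russia-tspu
```

In this sketch a row whose country fails every gate returns None, which mirrors the post's point that most evidence rows (77.2% in the backfill) receive no vendor attribution at all.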

#dpi #vendor-attribution #fingerprints #investigative #transparency #ml-honesty
