Methodology
How we measure global internet censorship
Overview
Each country receives a composite score built from three data sources and processed through a federated ML pipeline. Scores update continuously based on live network measurements.
When a government blocks a new service, our data typically reflects the change within hours, and within 24 hours at most (see Limitations).
Data Sources
OONI Measurements
Sensor Network
User Telemetry
ML Model
The model is a gradient boosting classifier trained on 37K labeled censorship incidents. Federated learning across 16 nodes ensures that no raw user data leaves local systems.
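One way to picture the "no raw user data leaves local systems" property is that each node reduces its measurements to aggregate statistics and shares only those. The sketch below is illustrative, not a description of the actual pipeline: the function names, the bin width, and the choice of per-bin counts as the shared statistic are all assumptions.

```python
from collections import Counter

def local_summary(measurements, bin_width=0.1):
    """Runs on a node: reduce raw (feature_value, blocked) pairs to per-bin counts.

    Hypothetical helper; only the returned counts would leave the node.
    """
    summary = Counter()
    for value, blocked in measurements:
        bin_index = int(value / bin_width)
        summary[(bin_index, bool(blocked))] += 1
    return summary

def aggregate(summaries):
    """Runs on the coordinator: sum the per-bin counts from all nodes."""
    total = Counter()
    for summary in summaries:
        total.update(summary)  # Counter.update adds counts
    return total

# Made-up raw measurements that stay on their respective nodes.
node_a = [(0.05, False), (0.92, True), (0.95, True)]
node_b = [(0.07, False), (0.91, True)]

merged = aggregate([local_summary(node_a), local_summary(node_b)])
```

The coordinator sees only `merged`, a table of counts per bin and label, which is the kind of summary a histogram-based boosting step can be trained on.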
Model Specifications
Feature Importance
Scoring System
Scores run on a 0-100 scale: 0 means complete freedom and 100 means total censorship.
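As a rough illustration of how three source-level signals could be folded into a single 0-100 value, here is a minimal sketch. It assumes each source (OONI measurements, sensor network, user telemetry) has already been reduced to a blocking signal between 0 and 1. The function name and the weights are hypothetical and are not taken from the actual scoring system.

```python
def composite_score(ooni, sensors, telemetry, weights=(0.4, 0.4, 0.2)):
    """Combine three per-source signals in [0, 1] into a 0-100 score.

    The weights are placeholder values chosen for illustration only.
    """
    signals = (ooni, sensors, telemetry)
    if any(not 0.0 <= s <= 1.0 for s in signals):
        raise ValueError("each signal must lie between 0 and 1")
    # Weighted average, normalized so the weights need not sum to 1.
    weighted = sum(w * s for w, s in zip(weights, signals)) / sum(weights)
    return round(100 * weighted, 1)

no_blocking = composite_score(0.0, 0.0, 0.0)
full_blocking = composite_score(1.0, 1.0, 1.0)
mixed = composite_score(0.8, 0.6, 0.3)
```

Normalizing by the sum of the weights keeps the result on the 0-100 scale even if the weights are later retuned.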
Limitations
- Scores are national averages; regional variations are not captured.
- VPN detection is underreported in highly restricted environments.
- Sample sizes vary by country, which affects confidence levels.
- Real-time events may take up to 24 hours to be reflected in scores.
- Content filtering and throttling are harder to detect than blocking.
- Self-censorship and legal restrictions are not measured.
Confidence Intervals
Each country score includes a confidence interval reflecting measurement certainty. Wider intervals indicate less data or greater variability.
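To show why less data produces a wider interval, the sketch below computes a normal-approximation 95% interval around the mean of a set of per-measurement scores. The method is an assumption, as is the function name; this page does not say how the published intervals are computed.

```python
import math
import statistics

def mean_with_ci(samples, z=1.96):
    """Return (mean, low, high) using a normal approximation.

    Hypothetical helper; bounds are clamped to the 0-100 score range.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = statistics.fmean(samples)
    half_width = z * statistics.stdev(samples) / math.sqrt(n)
    return mean, max(0.0, mean - half_width), min(100.0, mean + half_width)

# Made-up scores: the same spread of values, observed 3 times vs. 60 times.
few = [60, 70, 80]
many = few * 20

_, few_low, few_high = mean_with_ci(few)
_, many_low, many_high = mean_with_ci(many)
few_width = few_high - few_low
many_width = many_high - many_low
```

Both samples have the same mean, but the interval for `many` is several times narrower because the standard error shrinks with the square root of the sample size.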
Validation
Scores are validated against external benchmarks and known censorship events. Continuous evaluation ensures model accuracy over time.
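A simple way to check detections against known censorship events is to measure precision and recall over a set of event identifiers. The sketch below assumes that approach; the function name and the event IDs are placeholders, and the actual benchmarks and metrics are not specified here.

```python
def precision_recall(detected, known):
    """Compare detected event IDs with a reference set of known events."""
    detected, known = set(detected), set(known)
    true_positives = len(detected & known)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(known) if known else 0.0
    return precision, recall

# Placeholder identifiers, not real events.
detected_events = {"event-1", "event-2", "event-3"}
known_events = {"event-2", "event-3", "event-4", "event-5"}

precision, recall = precision_recall(detected_events, known_events)
```

Precision shows how many flagged events were real, and recall shows how many known events were caught. Tracking both over time is one way to notice model drift.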
Update Pipeline
Citation
Use this data in research? Please cite:
APA Format
Voidly Research. (2026). Global Censorship Index. https://voidly.ai/censorship-index
BibTeX
@misc{voidly_censorship_index,
  author = {Voidly Research},
  title = {Global Censorship Index},
  year = {2026},
  url = {https://voidly.ai/censorship-index}
}
License: CC BY 4.0. Free to use with attribution.