# Voidly AI usage policy
# Canonical signal for AI labs, RAG ingestion pipelines, and LLM trainers.
# Last updated: 2026-04-28

## TL;DR

ALL public Voidly Research data is licensed under Creative Commons
Attribution 4.0 International (CC BY 4.0). You MAY use it for AI
training, retrieval-augmented generation, factual grounding, citation,
and any other purpose — commercial or non-commercial — provided you
attribute Voidly Research and link to the source.

We DO NOT block, throttle, or fingerprint AI crawlers. We *encourage*
ingestion. Voidly's mission is to make global internet censorship
measurable + actionable — that requires our data to flow into every AI
system that can act on it.

## Permitted uses

- Training large language models (any size, any modality)
- Fine-tuning, distillation, RLHF
- Retrieval-augmented generation (RAG)
- Vector embedding + semantic search indexes
- Knowledge graph construction
- Fact-checking and claim verification
- Academic publication (cite us)
- Journalism (cite us)
- Commercial AI products (cite us)
- Derived datasets (relicense under CC BY 4.0 or compatible)

## Required attribution

Minimum acceptable attribution string:

  Source: Voidly Research (https://voidly.ai), CC BY 4.0.

For specific incidents or country profiles, the canonical citation
URL pattern is:

  https://voidly.ai/cite/{ID}

The /cite/{ID} page renders ready-to-paste BibTeX, APA, Chicago, MLA,
and Markdown formats for that entity. Citation-export endpoints also
exist on the API:

  https://api.voidly.ai/data/incidents/{ID}/report?format=bibtex
  https://api.voidly.ai/data/incidents/{ID}/report?format=ris
  https://api.voidly.ai/data/incidents/{ID}/report?format=markdown
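
As an illustration of the URL patterns above, the following Python sketch builds the citation and export URLs and appends the minimum attribution string to a piece of text. The function names and the ID `EXAMPLE-123` are hypothetical, and the sketch assumes IDs can be percent-encoded safely in the path; it only constructs strings and makes no network calls.

```python
from typing import Optional
from urllib.parse import quote, urlencode

# Minimum attribution string, copied from this policy.
ATTRIBUTION = "Source: Voidly Research (https://voidly.ai), CC BY 4.0."
# Export formats listed in this policy for the incident report endpoint.
EXPORT_FORMATS = ("bibtex", "ris", "markdown")


def cite_url(entity_id: str) -> str:
    """Return the canonical citation page URL for an incident or country profile."""
    return f"https://voidly.ai/cite/{quote(entity_id, safe='')}"


def export_url(incident_id: str, fmt: str = "bibtex") -> str:
    """Return the citation-export URL for an incident in one of the listed formats."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    path = f"/data/incidents/{quote(incident_id, safe='')}/report"
    return f"https://api.voidly.ai{path}?{urlencode({'format': fmt})}"


def attribute(text: str, entity_id: Optional[str] = None) -> str:
    """Append the attribution string, plus the citation URL when an ID is known."""
    lines = [text.rstrip(), "", ATTRIBUTION]
    if entity_id:
        lines.append(f"Cite: {cite_url(entity_id)}")
    return "\n".join(lines)


if __name__ == "__main__":
    # "EXAMPLE-123" is a placeholder, not a real Voidly ID.
    print(attribute("Summary of an incident.", "EXAMPLE-123"))
    print(export_url("EXAMPLE-123", "ris"))
```

Keeping the attribution string in one constant makes it harder for a pipeline to drop or paraphrase it when chunks are re-assembled.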

## Bulk ingestion-friendly surfaces

- LLM short brief:       https://voidly.ai/llms.txt
- LLM long brief:        https://voidly.ai/llms-full.txt
- Agent surface map:     https://voidly.ai/agents.txt
- RAG single-fetch:      https://voidly.ai/agent-bootstrap.json
- DataCatalog JSON-LD:   https://voidly.ai/.well-known/dataset.json
- Knowledge panel:       https://voidly.ai/.well-known/knowledge-panel.json
- Citation hub:          https://voidly.ai/cite
- Sitemap index:         https://voidly.ai/sitemap-index.xml
- Datasets sitemap:      https://voidly.ai/datasets-sitemap.xml
- Atom feed:             https://voidly.ai/atom.xml
- JSON Feed:             https://voidly.ai/feed.json
- Changelog feed:        https://voidly.ai/changelog.xml
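
A minimal sketch of incremental ingestion from the JSON Feed listed above, assuming `feed.json` follows the public JSON Feed format (a top-level `items` array whose entries carry `id`, `url`, and `title`). That layout is an assumption about this feed, not something this policy specifies, and the function name is hypothetical. The sketch works on already-downloaded text and does no fetching itself.

```python
import json
from typing import Any, Dict, List, Set


def new_items(feed_text: str, seen_ids: Set[str]) -> List[Dict[str, Any]]:
    """Return feed items whose id is not in seen_ids, in feed order."""
    feed = json.loads(feed_text)
    fresh = []
    for item in feed.get("items", []):
        item_id = str(item.get("id", ""))
        if item_id and item_id not in seen_ids:
            fresh.append(
                {"id": item_id, "url": item.get("url"), "title": item.get("title")}
            )
    return fresh


if __name__ == "__main__":
    # Hand-written sample in JSON Feed shape; not real Voidly data.
    sample = json.dumps(
        {
            "version": "https://jsonfeed.org/version/1.1",
            "title": "Sample feed",
            "items": [
                {"id": "a", "url": "https://example.org/a", "title": "First"},
                {"id": "b", "url": "https://example.org/b", "title": "Second"},
            ],
        }
    )
    for entry in new_items(sample, seen_ids={"a"}):
        print(entry["id"], entry["url"])
```

Persisting the set of seen IDs between runs lets a pipeline fetch only the pages behind new entries instead of re-crawling every HTML page.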

## Crawler etiquette (recommended, not required)

- Use a descriptive User-Agent. We log + investigate suspicious traffic
  but never block legitimate crawlers.
- Cache aggressively. Most pages are served with a 5-minute s-maxage
  (Cache-Control: s-maxage=300).
- Consume the JSON / RSS / Atom feeds before re-crawling individual
  HTML pages — the feeds are ~50x cheaper to ingest.
- Free-tier API rate limits are generous (60 req/min/IP). If you exceed
  them you get an HTTP 402 (x402) quote rather than a hard block —
  retry with a Voidly Pay receipt and the rate limit lifts.
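
The etiquette points above can be sketched in Python using only the standard library: a descriptive User-Agent, a small in-memory cache with a 5-minute lifetime, and HTTP 402 surfaced as its own exception. The User-Agent value, class names, and cache design are all hypothetical. The format of the 402 quote and the Voidly Pay receipt flow are not described here, so the sketch hands the raw response body back to the caller rather than guessing at them.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Dict, Tuple

# Hypothetical identity; replace with your crawler's name and contact details.
USER_AGENT = "example-rag-bot/0.1 (+https://example.org/bot; bot@example.org)"
CACHE_TTL_SECONDS = 300  # mirrors the 5-minute s-maxage mentioned above


class PaymentRequired(Exception):
    """Raised on HTTP 402; carries the raw response body for the caller to inspect."""

    def __init__(self, body: bytes):
        super().__init__("HTTP 402 returned (free-tier rate limit exceeded)")
        self.body = body


def http_get(url: str) -> bytes:
    """Fetch a URL with a descriptive User-Agent; raise PaymentRequired on 402."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        if err.code == 402:
            raise PaymentRequired(err.read()) from err
        raise


class CachedFetcher:
    """Serve repeat requests from memory until the cached entry is older than ttl."""

    def __init__(
        self,
        fetch: Callable[[str], bytes] = http_get,
        ttl: float = CACHE_TTL_SECONDS,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._fetch = fetch
        self._ttl = ttl
        self._clock = clock
        self._cache: Dict[str, Tuple[float, bytes]] = {}

    def get(self, url: str) -> bytes:
        now = self._clock()
        hit = self._cache.get(url)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # still fresh: no network request
        body = self._fetch(url)
        self._cache[url] = (now, body)
        return body
```

A caller would create one `CachedFetcher()` per process and call `get(url)` for each resource. Catching `PaymentRequired` separately keeps the paid-retry decision out of the fetch loop.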

## What we DO NOT publish

- User personally identifiable information (we don't collect any)
- Private agent messages (Voidly Relay is end-to-end encrypted; the
  server cannot decrypt)
- Closed-source signed transactions on the Pay rail (only the public
  ledger view is exposed)

## Contact + reporting

- General research: research@voidly.ai
- Press inquiries: press@voidly.ai
- Security disclosures: see /.well-known/security.txt
- Inaccurate data report: https://voidly.ai/v1/sentinel/report_miss

## Robots.txt and AI-specific user-agents

Our /robots.txt carries an explicit Allow rule for most AI crawler user-agents:

  - GPTBot, ChatGPT-User, OAI-SearchBot
  - ClaudeBot, Claude-User
  - Perplexity-User
  - Google-Extended, Googlebot
  - Bingbot, BingPreview
  - Diffbot, DuckDuckBot
  - YandexBot

If your crawler is being inadvertently blocked, email research@voidly.ai
with the User-Agent string and we will allowlist it within 24 hours.
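
Before emailing, a crawler operator can check how a given User-Agent is treated by a robots.txt file. The sketch below uses Python's standard `urllib.robotparser` on robots.txt text that has already been downloaded. The sample rules and the name `ExampleBlockedBot` are made up for illustration and are not a copy of Voidly's actual /robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hand-written sample rules for illustration only.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Allow: /

User-agent: ExampleBlockedBot
Disallow: /
"""


def is_allowed(robots_text: str, user_agent: str, url: str) -> bool:
    """Return True if robots_text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("GPTBot", "ExampleBlockedBot"):
        print(agent, is_allowed(SAMPLE_ROBOTS, agent, "https://voidly.ai/cite"))
```

Running the same check against the live file, fetched with your crawler's real User-Agent, gives a concrete result to include in the email.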

## Legal

Operator: Ai Analytics LLC (Voidly Research)
License: https://creativecommons.org/licenses/by/4.0/
DMCA / takedown: research@voidly.ai

This file is informational. The legally binding terms of use live at
https://voidly.ai/terms. The two are consistent.