Per-country F1-optimal thresholds — +4pp median F1 without retraining v3.3
v3.3 applies a single 0.5 decision threshold across all 131 countries. Picking a per-country F1-optimal threshold via a precision-recall sweep lifts median F1 by 4pp (mean +5.4pp) across the 73 countries with sufficient labels. 41 countries improve by ≥3pp. The biggest movers are CG (+35.7pp), OM (+31.7pp), and ZW (+20.1pp). Live in /v1/classifier/score/{cc} as the label_per_country, threshold_used, and threshold_source fields.
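For readers who want to reproduce the idea, here is a minimal sketch of a per-country threshold sweep. It is not the production code: the `(cc, score, label)` row layout, the `MIN_LABELS` cutoff of 50, and the `"per_country"` / `"global_default"` values for threshold_source are all assumptions made for illustration. The sweep tries every observed score as a candidate threshold and keeps 0.5 as the baseline, so the chosen threshold never scores below the global one on the labels it was fit to.

```python
from collections import defaultdict

GLOBAL_THRESHOLD = 0.5  # the single v3.3 threshold
MIN_LABELS = 50         # hypothetical cutoff for "sufficient labels"


def f1_at(scores, labels, threshold):
    """F1 of the rule `score >= threshold` against binary labels (0/1)."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_optimal_threshold(scores, labels):
    """Sweep observed scores as candidate thresholds; return (threshold, f1)."""
    best_t = GLOBAL_THRESHOLD
    best_f1 = f1_at(scores, labels, GLOBAL_THRESHOLD)
    for t in sorted(set(scores)):
        f1 = f1_at(scores, labels, t)
        if f1 > best_f1:  # strict: ties keep the earlier candidate
            best_t, best_f1 = t, f1
    return best_t, best_f1


def per_country_thresholds(rows, min_labels=MIN_LABELS):
    """rows: iterable of (cc, score, label). Returns {cc: threshold info}."""
    scores_by_cc = defaultdict(list)
    labels_by_cc = defaultdict(list)
    for cc, score, label in rows:
        scores_by_cc[cc].append(score)
        labels_by_cc[cc].append(label)

    result = {}
    for cc, labels in labels_by_cc.items():
        # Fall back to the global threshold when labels are too few
        # or contain no positives (F1 is uninformative in that case).
        if len(labels) >= min_labels and sum(labels) > 0:
            t, _ = f1_optimal_threshold(scores_by_cc[cc], labels)
            result[cc] = {"threshold_used": t, "threshold_source": "per_country"}
        else:
            result[cc] = {
                "threshold_used": GLOBAL_THRESHOLD,
                "threshold_source": "global_default",
            }
    return result
```

Thresholds fit and evaluated on the same labels will overstate the gain, so a held-out split per country is the safer way to measure the lift.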
#ml#classifier#v3.3#threshold-calibration#per-country#shipped#no-retrain#free-win