v3.7 stacking ensemble over 4 base learners — failed F1 gate (+1.1pp vs needed +2.0pp), shipped as transparency endpoint
Stacked four base learners into a meta-learner: v3.3 GradientBoosting (out-of-fold predictions), DBSCAN unsupervised anomaly v1, Bayesian corroboration v1, and per-measurement classifier v1. Logistic regression beat an MLP (16,8) head-to-head as the meta-learner: stratified 5-fold F1 0.7534 vs. the v3.3 OOF baseline of 0.7424 (+1.1pp), AUC 0.9033, and leave-one-country-out (LOCO) median F1 0.8974 across 97 countries. Promotion required both a +2.0pp gain in stratified F1 and a LOCO median of at least 0.85; only the LOCO gate passed. The endpoint is nonetheless live at /v1/classifier/stacking/{cc}, reporting passed_promote_gates=false for transparency. Coefficient analysis: v3.3 dominates (1.95), the DBSCAN flag adds 0.32, the per-measurement classifier adds 0.12, and the Bayes posterior adds ~0 (its signal is already implicit in v3.3). This is the fourth honest negative result.
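The gate arithmetic above can be checked in a few lines. This is a minimal sketch, not the project's actual gate code: the names `GateResult` and `evaluate_promote_gates` are hypothetical, and only the thresholds and scores come from the summary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateResult:
    f1_gain_pp: float        # stratified F1 gain over baseline, in percentage points
    f1_gate_passed: bool
    loco_gate_passed: bool

    @property
    def passed_promote_gates(self) -> bool:
        # Promotion needs BOTH gates, mirroring the AND in the promote criteria.
        return self.f1_gate_passed and self.loco_gate_passed


def evaluate_promote_gates(
    candidate_f1: float,
    baseline_f1: float,
    loco_median_f1: float,
    min_f1_gain_pp: float = 2.0,       # threshold quoted in the summary
    min_loco_median_f1: float = 0.85,  # threshold quoted in the summary
) -> GateResult:
    """Hypothetical helper: compare reported scores against the two gates."""
    gain_pp = (candidate_f1 - baseline_f1) * 100.0
    return GateResult(
        f1_gain_pp=gain_pp,
        f1_gate_passed=gain_pp >= min_f1_gain_pp,
        loco_gate_passed=loco_median_f1 >= min_loco_median_f1,
    )


# Scores reported for v3.7 (logistic-regression meta-learner).
result = evaluate_promote_gates(
    candidate_f1=0.7534, baseline_f1=0.7424, loco_median_f1=0.8974
)
print(f"F1 gain: {result.f1_gain_pp:.1f}pp, F1 gate passed: {result.f1_gate_passed}")
print(f"LOCO gate passed: {result.loco_gate_passed}")
print(f"passed_promote_gates: {result.passed_promote_gates}")
```

With the reported numbers, the F1 gain falls short of the 2.0pp threshold while the LOCO median clears 0.85, so the combined flag is false, matching what the endpoint reports.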
#ml #classifier #stacking #ensemble #meta-learner #logistic-regression #mlp #negative-result #honest #transparency #shipped-not-promoted
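To make the coefficient analysis in the summary concrete, here is a rough sketch of how a logistic-regression meta-learner combines base-learner outputs. Several things are assumptions: the feature key names are invented for illustration, the intercept is not reported and is set to 0.0, the input values are made up, and the share calculation is only meaningful if the base outputs sit on comparable scales.

```python
import math

# Coefficients as reported in the summary; the Bayes posterior is listed as ~0.
# The dictionary keys are hypothetical names, not the project's feature names.
COEFFICIENTS = {
    "v3_3_gradient_boosting": 1.95,
    "dbscan_anomaly_flag": 0.32,
    "per_measurement": 0.12,
    "bayes_posterior": 0.0,
}
INTERCEPT = 0.0  # assumption: the fitted intercept is not reported


def stacked_probability(base_outputs, coefficients=COEFFICIENTS, intercept=INTERCEPT):
    """Logistic meta-learner: sigmoid of a weighted sum of base-learner outputs."""
    logit = intercept + sum(
        coefficients[name] * base_outputs[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-logit))


def coefficient_share(coefficients=COEFFICIENTS):
    """Each learner's share of total absolute coefficient mass."""
    total = sum(abs(c) for c in coefficients.values())
    return {name: abs(c) / total for name, c in coefficients.items()}


# Made-up base-learner outputs for a single country, for illustration only.
example = {
    "v3_3_gradient_boosting": 0.9,
    "dbscan_anomaly_flag": 1.0,
    "per_measurement": 0.5,
    "bayes_posterior": 0.7,
}
print(f"stacked probability: {stacked_probability(example):.3f}")
for name, share in coefficient_share().items():
    print(f"{name}: {share:.1%}")
```

Under these assumptions v3.3 carries roughly 82% of the coefficient mass (1.95 of 2.39), which is consistent with the small gain over the v3.3 baseline: the other three learners contribute little that v3.3 does not already capture.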
Raw data
- Live: stacking info + gates + coefficients
- Live: per-country score (Iran)
- Base learner: v3.3 GradientBoosting (info)
- Base learner: DBSCAN anomaly v1 (info)
- Base learner: Bayesian corroboration v1 (info)
- Base learner: per-measurement classifier v1 (info)
- Wolpert 1992 — Stacked Generalization
- Build script
- Train script