Farm-Out Sprint: 18 Specs, 7 Signals Tested, 1 Promoted to Shadow
Five-day sprint across 3 workstreams. inter-model-disagreement passes 9/10 gates (+1.1% marginal) and is promoted to shadow. /pod dashboard rebuilt with real alpha metrics. v3 regression test confirms that noisy match-level xG beats shot-level xG for regression detection. Full 2,048-combination factorial synced.
Farm-Out Sprint: Apr 7–12 Results
Five days, 18 specs across 3 workstreams. Here's what we found.
Signal Approvals (7 signals tested)
| Signal | Gates | Marginal | Verdict |
|---|---|---|---|
| inter-model-disagreement | **9/10** | **+1.1%** | Promoted to shadow — best signal found |
| finishingLuck (lambda) | 7/10 | +0.6% | Rejected — IS/OOS gap, suspicious N |
| contextXg (lambda) | 8/10 | +0.5% | Rejected — boundary case, not stat sig |
| layered-threshold-variance | 8/10 | +0.3% | Rejected — effect too small |
| multi-source-regression-confidence | 8/10 | +0.2% | Rejected — correct direction, tiny effect |
| v3-model-residual | 8/10 | +0.1% | Rejected — near zero marginal |
| overperformance-decomposition | 7/10 | -0.6% | Rejected — wrong direction, dead end |
Key finding: inter-model-disagreement (+1.1% marginal, 9/10 gates) is the most promising signal discovered in this sprint. The only gate it failed was bootstrap significance (p=0.23); more live data may bring the p-value under the threshold.
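For context on the bootstrap gate, here is a minimal sketch of a one-sided percentile bootstrap on per-bet return differences. The function name, the resample count, and the choice of statistic are illustrative assumptions; the sprint's actual gate may be implemented differently.

```python
import random

def bootstrap_p_value(diffs, n_resamples=2000, seed=0):
    """One-sided percentile-bootstrap p-value for "mean difference > 0".

    `diffs` holds per-bet return differences (with signal minus without).
    Hypothetical helper: returns the share of resampled means that are <= 0.
    """
    rng = random.Random(seed)
    n = len(diffs)
    at_or_below_zero = 0
    for _ in range(n_resamples):
        # Resample n bets with replacement and recompute the mean.
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        if sum(sample) / n <= 0:
            at_or_below_zero += 1
    return at_or_below_zero / n_resamples
```

Under this kind of test, a small positive mean with wide per-bet variance can easily give a p-value well above a conventional cutoff. Additional bets narrow the resampled distribution, which is why more live data can change the verdict without the marginal itself moving.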
Multi-Source xG Infrastructure
Built a complete multi-source xG pipeline:
- 90,699 matches computed with 4 xG sources (FotMob match, Understat/FootyStats, FotMob shot-level, v3 XGBoost model)
- v3 model inference runs our 500-tree XGBoost directly on shot coordinates
- 6 enrichment flags added to every BetRecord for conditional combination testing
/pod Dashboard Rebuilt (Specs F–J)
- Fixed alpha/beta/IR, which previously displayed all zeros: cross-chunk pairing now works, with 1,024 pairs per lambda signal
- Split filter/lambda tables with DEPLOY/REMOVE/HOLD badges
- Live gauntlet data connected — shows shadow P&L alongside factorial metrics
- P&L Attribution card — decomposes total P&L into model edge vs settling variance
- Top 10 configs, redundancy warnings, portfolio waterfall
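The figure of 1,024 pairs per signal is what a full on/off factorial produces if the 2,048 combinations come from 11 binary signals (2^11 = 2,048; the signal count is inferred here, not stated in the report). Each signal is paired across configs that differ only in that one signal. A sketch, using bitmask config IDs as a hypothetical encoding:

```python
N_SIGNALS = 11  # assumption: 2**11 == 2048 combinations

def pairs_for_signal(bit, n_signals=N_SIGNALS):
    """Return (off_config, on_config) pairs that differ only in `bit`."""
    mask = 1 << bit
    return [(cfg, cfg | mask) for cfg in range(1 << n_signals) if not cfg & mask]

def marginal_effect(bit, metric, n_signals=N_SIGNALS):
    """Mean of metric[on] - metric[off] over all pairs for one signal.

    `metric` maps a config ID to any per-config number (ROI, CLV, ...).
    """
    pairs = pairs_for_signal(bit, n_signals)
    return sum(metric[on] - metric[off] for off, on in pairs) / len(pairs)

print(len(pairs_for_signal(0)))  # number of pairs available for one signal
```

If configs are stored in chunks, the two halves of a pair will often sit in different chunks, so the pairing step has to look up both configs across chunk boundaries. A pairing that only searches within a chunk would find few or no matches.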
v3 Regression Test
Confirmed: noisy match-level xG (corr 0.35) beats precise shot-level xG (corr 0.55) for regression detection by +0.14pp entry-adjusted ROI. The variance filter stays on FotMob match-level xG.
Gauntlet Per-Bet Fix
The .p models (v2.p, contextXg.p) on /gauntlet now show real per-bet P&L from shadow config backtest data instead of dummy production mirrors.
Full Factorial Synced
All 64 chunks (2,048 signal combinations) synced from cloud-lab. Scorecard regenerated with complete data. Best config: regime+crossBtts+leagueExcl+finishingLuck (CLV +13.7%).
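A sync of this shape can be checked with simple arithmetic: 64 chunks covering 2,048 combinations means 32 combinations per chunk. The sketch below assumes chunks hold consecutive config IDs, which is a guess about the layout; the helper names are hypothetical.

```python
TOTAL_COMBOS = 2048
N_CHUNKS = 64
CHUNK_SIZE = TOTAL_COMBOS // N_CHUNKS  # 32 combinations per chunk

def chunk_of(config_id):
    """Chunk index for a config ID, assuming consecutive-ID chunking."""
    return config_id // CHUNK_SIZE

def missing_chunks(synced_config_ids):
    """Chunks with at least one combination absent from the synced set."""
    seen = set(synced_config_ids)
    return sorted({chunk_of(c) for c in range(TOTAL_COMBOS) if c not in seen})
```

An empty result from `missing_chunks` is the condition under which a scorecard can be regenerated from complete data.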
Graduation Status
No signals ready for production graduation yet:
- fsAll: 41% (70 of the 169 bets needed, ~4 weeks to go)
- HFA/FF: 18% (70 of the 379 bets needed, ~60 weeks to go)
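The graduation percentages and time estimates can be reproduced with a few lines of arithmetic. This sketch assumes the first number is bets accumulated so far and the second is the graduation requirement; the function name is hypothetical.

```python
def graduation_progress(bets_so_far, bets_required, weeks_to_go):
    """Return (percent complete, bets still needed, implied bets per week)."""
    remaining = bets_required - bets_so_far
    percent = 100.0 * bets_so_far / bets_required
    return percent, remaining, remaining / weeks_to_go

# Figures taken from the list above.
for name, done, required, weeks in [("fsAll", 70, 169, 4), ("HFA/FF", 70, 379, 60)]:
    percent, remaining, per_week = graduation_progress(done, required, weeks)
    print(f"{name}: {percent:.0f}% done, {remaining} bets to go, ~{per_week:.1f} bets/week")
```

Read this way, fsAll needs 99 more bets and HFA/FF needs 309. The gap between their time estimates comes mostly from the much lower implied bet rate for HFA/FF.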
What's Next
- Wire inter-model-disagreement into gauntlet for live shadow accumulation
- Wait for fsAll graduation (~4 weeks)
- Queue xG-weight solver A/B when cloud-lab finishes current batch