Farm-Out Sprint: 18 Specs, 7 Signals Tested, 1 Promoted to Shadow

April 12, 2026|research|INVESTIGATION

Five-day sprint across 3 workstreams. The inter-model-disagreement signal passes 9/10 gates (+1.1% marginal) and is promoted to shadow. The /pod dashboard is rebuilt with real alpha metrics. The v3 regression test confirms that noisy match-level xG beats shot-level xG. The full 2,048-combination factorial is synced.

Farm-Out Sprint: Apr 7–12 Results

Five days, 18 specs across 3 workstreams. Here's what we found.

Signal Approvals (7 signals tested)

Signal	Gates	Marginal	Verdict
inter-model-disagreement	9/10	+1.1%	Promoted to shadow — best signal found
finishingLuck (lambda)	7/10	+0.6%	Rejected — IS/OOS gap, suspicious N
contextXg (lambda)	8/10	+0.5%	Rejected — boundary case, not stat sig
layered-threshold-variance	8/10	+0.3%	Rejected — effect too small
multi-source-regression-confidence	8/10	+0.2%	Rejected — correct direction, tiny effect
v3-model-residual	8/10	+0.1%	Rejected — near zero marginal
overperformance-decomposition	7/10	-0.6%	Rejected — wrong direction, dead end

Key finding: inter-model-disagreement (+1.1% marginal, 9/10 gates) is the most promising signal discovered in this sprint. The only gate it failed was bootstrap significance (p=0.23). More live data may push it past the threshold.
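To make the failed gate concrete, here is a minimal sketch of a one-sided bootstrap test on per-bet return differences. It is not the gate's actual implementation: the function name, the paired-difference input, and the resample count are all assumptions made for illustration.

```python
import random

def bootstrap_p_value(diffs, n_resamples=10_000, seed=0):
    """One-sided bootstrap p-value for 'mean improvement > 0'.

    diffs: hypothetical per-bet return differences (with signal minus
    without signal). Returns the share of resamples whose mean is <= 0.
    """
    rng = random.Random(seed)
    n = len(diffs)
    not_better = 0
    for _ in range(n_resamples):
        # Resample the bets with replacement and recompute the mean.
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        if sum(sample) / n <= 0:
            not_better += 1
    return not_better / n_resamples
```

Under this reading, p=0.23 would mean that roughly 23% of resamples show no improvement. A larger live sample narrows the resampled distribution, which is why more data can move the p-value even if the marginal effect stays the same.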

Multi-Source xG Infrastructure

Built a complete multi-source xG pipeline:

90,699 matches computed with 4 xG sources (FotMob match, Understat/FootyStats, FotMob shot-level, v3 XGBoost model)
v3 model inference runs our 500-tree XGBoost directly on shot coordinates
6 enrichment flags added to every BetRecord for conditional combination testing

/pod Dashboard Rebuilt (Specs F–J)

Fixed alpha/beta/IR — cross-chunk pairing now works with 1,024 pairs per lambda signal (previously all zeros)
Split filter/lambda tables with DEPLOY/REMOVE/HOLD badges
Live gauntlet data connected — shows shadow P&L alongside factorial metrics
P&L Attribution card — decomposes total P&L into model edge vs settling variance
Top 10 configs, redundancy warnings, portfolio waterfall
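The 1,024 pairs per signal follow from the factorial design: if the 2,048 combinations come from 11 on/off signals (an assumption, since 2^11 = 2,048), then each signal has 1,024 config pairs that differ only in that signal. The sketch below encodes configs as bitmasks and assumes 32 consecutive configs per chunk (2,048 / 64). The encoding, chunk layout, and function names are hypothetical.

```python
N_SIGNALS = 11    # assumption: 2**11 = 2,048 combinations
CHUNK_SIZE = 32   # assumption: 2,048 configs split evenly over 64 chunks

def pairs_for_signal(bit):
    """Return (off_config, on_config) pairs that differ only in `bit`."""
    mask = 1 << bit
    return [(cfg, cfg | mask)
            for cfg in range(1 << N_SIGNALS)
            if not cfg & mask]

def chunk_of(cfg):
    """Index of the chunk that holds a config under the assumed layout."""
    return cfg // CHUNK_SIZE

def cross_chunk_pairs(bit):
    """Count pairs whose two members live in different chunks."""
    return sum(chunk_of(off) != chunk_of(on)
               for off, on in pairs_for_signal(bit))

def marginal_effect(metric, bit):
    """Mean paired difference of a per-config metric (on minus off)."""
    pairs = pairs_for_signal(bit)
    return sum(metric[on] - metric[off] for off, on in pairs) / len(pairs)
```

With this layout, a signal stored in one of the low five bits pairs entirely within a chunk, while any higher bit puts both members of every pair in different chunks. A pairing step that only looks inside one chunk would find no partners for those signals, which is consistent with the all-zero metrics seen before the fix.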

v3 Regression Test

Confirmed: noisy match-level xG (corr 0.35) beats precise shot-level xG (corr 0.55) for regression detection by +0.14pp entry-adjusted ROI. The variance filter stays on FotMob match-level xG.

Gauntlet Per-Bet Fix

The .p models (v2.p, contextXg.p) on /gauntlet now show real per-bet P&L from shadow config backtest data instead of dummy production mirrors.

Full Factorial Synced

All 64 chunks (2,048 signal combinations) synced from cloud-lab. Scorecard regenerated with complete data. Best config: regime+crossBtts+leagueExcl+finishingLuck (CLV +13.7%).

Graduation Status

No signals are ready for production graduation yet:

fsAll: 41% (70/169 bets needed, ~4 weeks)
HFA/FF: 18% (70/379 bets needed, ~60 weeks)
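The percentages and time estimates above can be sanity-checked with a few lines of arithmetic. This sketch assumes "70/169" means 70 bets accumulated out of 169 required. The function names are illustrative, and the weekly rate is only what the quoted week counts imply, not a measured figure.

```python
def graduation_progress(bets_so_far, bets_required):
    """Return (percent complete, bets still needed)."""
    pct = 100 * bets_so_far / bets_required
    return pct, bets_required - bets_so_far

def implied_weekly_rate(bets_remaining, weeks_quoted):
    """Bets per week implied by the quoted time-to-graduation."""
    return bets_remaining / weeks_quoted

for name, done, required, weeks in [("fsAll", 70, 169, 4),
                                    ("HFA/FF", 70, 379, 60)]:
    pct, remaining = graduation_progress(done, required)
    rate = implied_weekly_rate(remaining, weeks)
    print(f"{name}: {pct:.1f}% done, {remaining} bets to go, "
          f"~{rate:.1f} bets/week implied")
```

On these numbers fsAll would need about 25 qualifying bets per week and HFA/FF about 5 per week. That gap in bet frequency, more than the gap in required sample size, accounts for the difference between 4 and 60 weeks.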

What's Next

Wire inter-model-disagreement into gauntlet for live shadow accumulation
Wait for fsAll graduation (~4 weeks)
Queue xG-weight solver A/B when cloud-lab finishes current batch