Sports Dashboard

MI Bivariate Poisson + Dixon-Coles + Elo

Farm-Out Sprint: 18 Specs, 7 Signals Tested, 1 Promoted to Shadow

Five-day sprint across 3 workstreams. inter-model-disagreement passed 9/10 gates (+1.1% marginal) and was promoted to shadow. The /pod dashboard was rebuilt with real alpha metrics. The v3 regression test confirms that noisy match-level xG wins. The full 2,048-combo factorial is synced.

Farm-Out Sprint: Apr 7–12 Results

Five days, 18 specs across 3 workstreams. Here's what we found.

Signal Approvals (7 signals tested)

| Signal | Gates | Marginal | Verdict |
|---|---|---|---|
| inter-model-disagreement | **9/10** | **+1.1%** | Promoted to shadow (best signal found) |
| finishingLuck (lambda) | 7/10 | +0.6% | Rejected: IS/OOS gap, suspicious N |
| contextXg (lambda) | 8/10 | +0.5% | Rejected: boundary case, not statistically significant |
| layered-threshold-variance | 8/10 | +0.3% | Rejected: effect too small |
| multi-source-regression-confidence | 8/10 | +0.2% | Rejected: correct direction, tiny effect |
| v3-model-residual | 8/10 | +0.1% | Rejected: near-zero marginal |
| overperformance-decomposition | 7/10 | -0.6% | Rejected: wrong direction, dead end |

Key finding: inter-model-disagreement (+1.1% marginal, 9/10 gates) is the most promising signal found in this sprint. The only gate it failed was bootstrap significance (p=0.23). More live data may push it past the threshold.
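For readers unfamiliar with a bootstrap significance gate, here is a minimal sketch of one common way to compute such a p-value. This is an assumption for illustration only: the post does not describe how the gate is actually implemented. The function name `bootstrap_p_value`, the `deltas` input (per-pair ROI differences, signal on minus signal off), and the resample count are all hypothetical.

```python
import random

def bootstrap_p_value(deltas, n_resamples=10_000, seed=0):
    """One-sided bootstrap p-value for 'the mean lift is not positive'.

    deltas: hypothetical per-pair ROI differences (signal on minus off).
    Returns the share of resampled means that are at or below zero.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(deltas)
    at_or_below_zero = 0
    for _ in range(n_resamples):
        # Resample the observed differences with replacement.
        sample = [deltas[rng.randrange(n)] for _ in range(n)]
        if sum(sample) / n <= 0:
            at_or_below_zero += 1
    return at_or_below_zero / n_resamples
```

Under this kind of test, a small positive mean with noisy differences yields a large p-value, and the p-value tends to shrink as more observations accumulate if the effect is real.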

Multi-Source xG Infrastructure

Built a complete multi-source xG pipeline:

  • 90,699 matches computed with 4 xG sources (FotMob match, Understat/FootyStats, FotMob shot-level, v3 XGBoost model)
  • v3 model inference runs our 500-tree XGBoost directly on shot coordinates
  • 6 enrichment flags added to every BetRecord for conditional combination testing

/pod Dashboard Rebuilt (Specs F–J)

  • Fixed alpha/beta/IR — cross-chunk pairing now works with 1,024 pairs per lambda signal (previously all zeros)
  • Split filter/lambda tables with DEPLOY/REMOVE/HOLD badges
  • Live gauntlet data connected — shows shadow P&L alongside factorial metrics
  • P&L Attribution card — decomposes total P&L into model edge vs settling variance
  • Top 10 configs, redundancy warnings, portfolio waterfall
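The 1,024 pairs per signal are what a full on/off factorial would produce: each config with a signal off is matched to the otherwise identical config with it on. The sketch below assumes 11 binary signals encoded as a bitmask (2^11 = 2,048 configs) and contiguous chunks of 32 configs; the post confirms neither the encoding nor the chunk layout. The names `paired_configs`, `marginal_stats`, and `chunk_of`, and the simple mean/stdev information ratio, are hypothetical.

```python
from statistics import mean, stdev

def paired_configs(n_signals, signal_idx):
    """Yield (off, on) config bitmasks that differ only in one signal."""
    bit = 1 << signal_idx
    for cfg in range(1 << n_signals):
        if not cfg & bit:
            yield cfg, cfg | bit

def chunk_of(cfg, chunk_size=32):
    """Chunk index under an assumed contiguous layout (2,048 / 64 = 32)."""
    return cfg // chunk_size

def marginal_stats(roi_by_config, n_signals, signal_idx):
    """Mean paired ROI lift, a simple IR (mean / stdev), and pair count."""
    deltas = [roi_by_config[on] - roi_by_config[off]
              for off, on in paired_configs(n_signals, signal_idx)]
    spread = stdev(deltas)
    ir = mean(deltas) / spread if spread > 0 else float("nan")
    return mean(deltas), ir, len(deltas)
```

Under these assumptions, toggling any of the upper six bits moves a config into a different chunk, so its partner cannot be found without joining across chunks. A pairing step that only looks inside one chunk would find no partners for those signals.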

v3 Regression Test

Confirmed: for regression detection, noisy match-level xG (corr 0.35) beats precise shot-level xG (corr 0.55) by +0.14pp entry-adjusted ROI. The variance filter stays on FotMob match-level xG.

Gauntlet Per-Bet Fix

.p models (v2.p, contextXg.p) on /gauntlet now show real per-bet P&L from shadow config backtest data instead of dummy production mirrors.

Full Factorial Synced

All 64 chunks (2,048 signal combinations) synced from cloud-lab. Scorecard regenerated with complete data. Best config: regime+crossBtts+leagueExcl+finishingLuck (CLV +13.7%).

Graduation Status

No signals are ready for production graduation yet:

  • fsAll: 41% (70 of 169 required bets, ~4 weeks to go)
  • HFA/FF: 18% (70 of 379 required bets, ~60 weeks to go)
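The graduation percentages are bets settled divided by bets required, and the week estimates imply a bet rate for each signal. The helper below is a hypothetical sketch of that arithmetic, not the dashboard's code; the function name and the derived weekly rates are assumptions back-calculated from the figures in the list.

```python
def graduation_progress(bets_so_far, bets_needed, eta_weeks):
    """Return (percent complete, bets remaining, implied bets per week)."""
    pct = 100 * bets_so_far / bets_needed
    remaining = bets_needed - bets_so_far
    implied_rate = remaining / eta_weeks
    return pct, remaining, implied_rate

# fsAll: 70 of 169 bets, about 4 weeks to go
fs_pct, fs_left, fs_rate = graduation_progress(70, 169, 4)

# HFA/FF: 70 of 379 bets, about 60 weeks to go
hfa_pct, hfa_left, hfa_rate = graduation_progress(70, 379, 60)
```

On these figures fsAll needs 99 more bets at roughly 25 per week, while HFA/FF needs 309 more at roughly 5 per week, which is why its estimate is so much longer.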

What's Next

  1. Wire inter-model-disagreement into gauntlet for live shadow accumulation
  2. Wait for fsAll graduation (~4 weeks)
  3. Queue xG-weight solver A/B when cloud-lab finishes current batch