After the Bet Settles: Our Post-Paper-Trading Analysis Process
What happens after the model makes picks and real money is on the line. Daily settlement, 3-layer health checks, loss classification (5 categories), regime change detection, kill switches, and the feedback loop that turns every loss into a new alpha hypothesis.
You deployed a signal. Bets are firing. Some win, some lose. Now what?
Most betting systems stop at deployment. We don't. Every settled bet feeds a structured analysis loop that answers three questions: Is this still working? Why did we lose? And what can we learn from the losses to find new edge?
This post documents our post-paper-trading analysis process — the thing that happens after the model makes its picks and real money is on the line.
The Daily Settlement Cycle
Four cron jobs run the daily loop:
| Time (UTC) | Job | What happens |
|---|---|---|
| 7:00 AM | `settle` | Yesterday's bets get results: goals, profit/loss, CLV vs closing line |
| 8:00 AM | `accumulate-xg` | Fetch match xG from Fotmob across 38 leagues |
| 9:00 AM | `cron-odds` | Collect fresh Pinnacle odds for today's matches |
| 12:00 PM | `log` | Generate new picks, apply filters, log bets |
Settlement is where the learning happens. For every bet, we record:
- Did we beat the closing line? (CLV — the model quality signal)
- Did we make money? (profit/loss — the outcome signal)
- What was the slippage? (entry odds vs closing — the execution signal)
- What regime were we in? (HFA, season phase, motivation — the context)
- What signals fired? (multi-signal attribution — which signals contributed)
CLV and profit can disagree. A bet with +8% CLV that loses is a good bet that got unlucky. A bet with -2% CLV that wins is a bad bet that got lucky. Per-bet CLV is far less noisy than per-bet profit, so over 200+ bets mean CLV is a more reliable estimate of edge than ROI on the same sample.
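To make the CLV-versus-profit distinction concrete, here is a minimal sketch. It assumes CLV is measured as the ratio of entry odds to closing odds minus one (a common convention, though the post does not say which definition the system uses), and the `SettledBet` fields are hypothetical names, not the system's schema.

```python
from dataclasses import dataclass


@dataclass
class SettledBet:
    # Hypothetical field names, chosen for illustration only.
    entry_odds: float    # decimal odds at which the bet was placed
    closing_odds: float  # decimal closing odds on the same selection
    stake: float
    won: bool


def clv(bet: SettledBet) -> float:
    """Closing line value as a fraction; positive means we beat the close.

    Assumes the simple odds-ratio definition, with no vig removal.
    """
    return bet.entry_odds / bet.closing_odds - 1.0


def profit(bet: SettledBet) -> float:
    """Realized profit or loss in the same units as the stake."""
    return bet.stake * (bet.entry_odds - 1.0) if bet.won else -bet.stake


# A good bet that lost, and a bad bet that won.
unlucky = SettledBet(entry_odds=2.16, closing_odds=2.00, stake=100.0, won=False)
lucky = SettledBet(entry_odds=1.96, closing_odds=2.00, stake=100.0, won=True)

for name, bet in (("unlucky", unlucky), ("lucky", lucky)):
    print(f"{name}: CLV {clv(bet):+.1%}, profit {profit(bet):+.2f}")
```

The two example bets reproduce the +8% CLV loser and the -2% CLV winner described above: the signs of CLV and profit point in opposite directions on both.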
The Three-Layer Health Check
After settlement, we check three layers — each at a different time horizon:
Layer 1: Immediate (per-bet, daily)
Every settled bet gets classified into one of five categories:
| Category | Detection | Response |
|---|---|---|
| **Variance** | CLV positive, model direction correct, just lost | None — expected noise |
| **Model error** | CLV positive but match dynamics contradicted model | Track frequency. If >30% → blind spot |
| **Stale input** | `oddsFlags` shows stale/thin market, injury data old | Fix pipeline immediately |
| **Regime shift** | CLV negative, line moved against us before close | Reduce exposure, investigate |
| **Execution leak** | Large entry-vs-closing gap, high slippage | Optimize timing/book selection |
The classification uses data already on every bet — no manual tagging needed.
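The table can be read as a rule cascade over fields already stored on the bet. The sketch below is one possible ordering, not the system's actual classifier: the `LostBet` fields, the `HIGH_SLIPPAGE` cutoff, and the check order are all assumptions made for illustration. Only the `stale`/`thin` flag values come from the table's description of `oddsFlags`.

```python
from dataclasses import dataclass
from enum import Enum


class LossCategory(Enum):
    VARIANCE = "variance"
    MODEL_ERROR = "model_error"
    STALE_INPUT = "stale_input"
    REGIME_SHIFT = "regime_shift"
    EXECUTION_LEAK = "execution_leak"


@dataclass
class LostBet:
    # Hypothetical schema for a settled losing bet.
    clv: float                             # fraction, e.g. 0.08 for +8%
    slippage: float                        # fraction of price lost in execution
    odds_flags: frozenset = frozenset()    # e.g. {"stale"} or {"thin"}
    injury_data_stale: bool = False
    dynamics_contradicted_model: bool = False


HIGH_SLIPPAGE = 0.03  # assumed cutoff; the post gives no number


def classify_loss(bet: LostBet) -> LossCategory:
    """Assign one of the five loss categories to a losing bet."""
    # Data problems come first: bad inputs invalidate every other reading.
    if bet.odds_flags & {"stale", "thin"} or bet.injury_data_stale:
        return LossCategory.STALE_INPUT
    # Execution problems next: the price we got was much worse than expected.
    if bet.slippage >= HIGH_SLIPPAGE:
        return LossCategory.EXECUTION_LEAK
    # Negative CLV means the line moved against us before close.
    if bet.clv < 0:
        return LossCategory.REGIME_SHIFT
    # Positive CLV, but the match did not play out the way the model assumed.
    if bet.dynamics_contradicted_model:
        return LossCategory.MODEL_ERROR
    # Positive CLV and nothing else wrong: expected noise.
    return LossCategory.VARIANCE
```

Checking data and execution problems before CLV is a design choice: a stale price can produce a misleading CLV in either direction, so it seems safer to rule it out first.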
Layer 2: Rolling (per-signal, weekly)
Every signal gets a health scorecard computed from its last 30 and 50 bets:
- CLV trend: Up, down, or flat? (first 25 vs last 25 of rolling 50)
- ROI vs backtest baseline: Z-score — how far from expected?
- Hit rate: Below 35% = critical, below 40% = warning
- CUSUM alarm: Cumulative sum control chart detects downward shifts in mean CLV
When a signal's z-score drops below -2.0 or CUSUM triggers, it gets flagged for review.
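The scorecard checks above could be sketched as follows. This is an illustration under stated assumptions rather than the production code: the trend tolerance, the standard-error form of the z-score, and the CUSUM `slack` and `threshold` parameters are all hypothetical, and the CUSUM shown is the standard one-sided tabular form for detecting a downward shift in the mean.

```python
import math
from statistics import mean


def clv_trend(clvs: list[float], tolerance: float = 0.005) -> str:
    """Compare the older and newer halves of the rolling 50-bet window."""
    window = clvs[-50:]
    half = len(window) // 2
    if half == 0:
        return "flat"
    delta = mean(window[half:]) - mean(window[:half])
    if delta > tolerance:
        return "up"
    if delta < -tolerance:
        return "down"
    return "flat"


def roi_z_score(returns: list[float], backtest_roi: float, backtest_std: float) -> float:
    """Distance of live ROI from the backtest baseline, in standard errors.

    `returns` holds per-bet profit per unit staked; `backtest_std` is the
    per-bet standard deviation observed in the backtest (assumed inputs).
    """
    if not returns:
        raise ValueError("need at least one settled bet")
    standard_error = backtest_std / math.sqrt(len(returns))
    return (mean(returns) - backtest_roi) / standard_error


def hit_rate_status(hit_rate: float) -> str:
    """Thresholds from the scorecard: below 35% critical, below 40% warning."""
    if hit_rate < 0.35:
        return "critical"
    if hit_rate < 0.40:
        return "warning"
    return "healthy"


def cusum_alarm(clvs: list[float], target: float, slack: float, threshold: float) -> bool:
    """One-sided CUSUM: accumulate shortfalls below `target`, minus `slack`."""
    cumulative = 0.0
    for value in clvs:
        cumulative = max(0.0, cumulative + (target - value) - slack)
        if cumulative > threshold:
            return True
    return False


def needs_review(z_score: float, cusum_triggered: bool) -> bool:
    """Flag rule from the text: z-score below -2.0 or a CUSUM alarm."""
    return z_score < -2.0 or cusum_triggered
```

The `slack` term keeps small, ordinary dips from accumulating, so the alarm fires only when CLV stays below target for several bets in a row.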
Layer 3: Structural (per-league/market, monthly)
- Per-league CLV: Is one league's edge eroding while others hold?
- Market-type performance: AH vs 1X2 vs OU — has the mix shifted?
- Pass-rate drift: Are league/market/direction hit rates drifting below 50%?
- Dependency health: Is Fotmob data fresh? Are odds complete? Team name resolution rate?
The Regime Change Decision
The hardest call in live betting: bad week or structural shift?
Is rolling 30-bet CLV still positive?
│
├─ YES → Losses are variance or execution.
│        Model still sees edge. Focus on execution.
│
└─ NO → Edge is eroding.
        │
        ├─ One league? → League-specific regime shift
        ├─ One signal? → Signal decay (revalidate via /signal-test)
        └─ System-wide? → Fundamental issue (check solver, data quality)
CLV is the leading indicator. If CLV is positive and ROI is negative, the model is correct and variance is running against us, so wait it out. If CLV turns negative, the model is wrong: act immediately.
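The decision tree translates fairly directly into code. In the sketch below, the bet dictionaries and their `clv`, `league`, and `signal` keys are hypothetical, and "one league" or "one signal" is interpreted as exactly one group with negative mean CLV in the window, which is an assumption about how the branches are told apart.

```python
from collections import defaultdict
from statistics import mean


def negative_groups(bets: list[dict], key: str) -> list[str]:
    """Names of groups (leagues or signals) whose mean CLV is negative."""
    groups = defaultdict(list)
    for bet in bets:
        groups[bet[key]].append(bet["clv"])
    return [name for name, clvs in groups.items() if mean(clvs) < 0]


def diagnose(bets: list[dict]) -> str:
    """Walk the regime-change tree over the last 30 settled bets."""
    recent = bets[-30:]
    if mean(bet["clv"] for bet in recent) > 0:
        # Model still sees edge: losses are variance or execution.
        return "variance_or_execution"

    bad_leagues = negative_groups(recent, "league")
    bad_signals = negative_groups(recent, "signal")
    league_count = len({bet["league"] for bet in recent})
    signal_count = len({bet["signal"] for bet in recent})

    # A single bad group only counts as localized if other groups exist.
    if len(bad_leagues) == 1 and league_count > 1:
        return f"league_regime_shift:{bad_leagues[0]}"
    if len(bad_signals) == 1 and signal_count > 1:
        return f"signal_decay:{bad_signals[0]}"
    return "system_wide"
```

Returning a string keeps the sketch short; a real implementation would more likely return a structured result that the kill-switch logic can act on.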
Kill Switches and Re-Enable Gates
Automatic Disable Triggers
| Trigger | Threshold | Action |
|---|---|---|
| CLV collapse | Rolling 20-bet CLV < -2% per league/market | Disable that combo |
| Data staleness | Any source >48h stale | Block new bets |
| Model divergence | Solver non-convergence on 3+ consecutive solves | Flag for manual review |
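Each row of the trigger table is a small predicate. The sketch below encodes the three thresholds from the table; the function names and input shapes are hypothetical, and requiring a full 20-bet window before the CLV trigger can fire is an assumption.

```python
from datetime import datetime, timedelta, timezone
from statistics import mean

CLV_DISABLE_THRESHOLD = -0.02        # rolling 20-bet CLV below -2%
ROLLING_WINDOW = 20
MAX_SOURCE_AGE = timedelta(hours=48)
MAX_FAILED_SOLVES = 3


def should_disable_combo(clvs: list[float]) -> bool:
    """CLV collapse for one league/market combo (most recent bet last)."""
    window = clvs[-ROLLING_WINDOW:]
    # Assumption: do not fire on fewer than 20 bets.
    return len(window) == ROLLING_WINDOW and mean(window) < CLV_DISABLE_THRESHOLD


def should_block_new_bets(last_updated: dict[str, datetime], now: datetime) -> bool:
    """Data staleness: any source older than 48 hours blocks new bets."""
    return any(now - stamp > MAX_SOURCE_AGE for stamp in last_updated.values())


def should_flag_solver(converged_history: list[bool]) -> bool:
    """Model divergence: the last 3 solves all failed to converge."""
    tail = converged_history[-MAX_FAILED_SOLVES:]
    return len(tail) == MAX_FAILED_SOLVES and not any(tail)


now = datetime(2025, 1, 3, 12, 0, tzinfo=timezone.utc)
sources = {
    "odds": now - timedelta(hours=2),
    "xg": now - timedelta(hours=49),  # stale by one hour
}
print(should_block_new_bets(sources, now))
```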
Re-Enable Criteria (all must be met)
- 20+ new bets with CLV ≥ 0% in the disabled market
- 14-day minimum cool-off (prevents disable/enable thrashing)
- Statistical significance: 2σ above the disable threshold
We learned this from Championship AH Home. It was showing -29% CLV, so we disabled it, traced the problem to home-field bias in the solver, found the fix (`ahHomeAdvantageDiscount: 0.85`), and re-enabled it only after the discount had been validated through a parameter-sweep backtest.
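A re-enable gate along these lines might look like the sketch below. Two interpretations here are assumptions, since the criteria can be read more than one way: "CLV ≥ 0%" is taken as the mean CLV over the new bets, and "2σ above the disable threshold" is taken as the mean sitting at least two standard errors above -2%. The function name and arguments are hypothetical.

```python
import math
from datetime import date, timedelta
from statistics import mean, stdev

MIN_NEW_BETS = 20
COOL_OFF = timedelta(days=14)


def can_re_enable(
    new_clvs: list[float],
    disabled_on: date,
    today: date,
    disable_threshold: float = -0.02,
) -> bool:
    """Return True only when all three re-enable criteria are met."""
    if len(new_clvs) < MIN_NEW_BETS:
        return False
    if today - disabled_on < COOL_OFF:
        return False

    average = mean(new_clvs)
    if average < 0.0:
        return False

    # Assumed reading of "2 sigma": the lower edge of a two-standard-error
    # band around the mean must clear the disable threshold.
    standard_error = stdev(new_clvs) / math.sqrt(len(new_clvs))
    return average - 2.0 * standard_error > disable_threshold
```

The cool-off check is independent of the data on purpose: even a strong run of 20 bets inside two weeks should not flip the switch back on.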
Turning Losses Into New Alpha
This is the part that makes the system get smarter over time.
The Loss Mining Process
Every week, we run a full `/post-deployment` audit. For each non-variance loss category, we extract patterns:
Model errors → blind spot hypotheses. "The model mispriced 4 matches this week where the away team was in European competition. Do Europa League teams systematically underperform domestically the week after a European fixture?" That's a testable hypothesis. Register it: `/signal-test europa-league-hangover-effect`.
Regime shifts → adaptation hypotheses. "Pinnacle's CLV on Championship AH has been negative for 15 bets. Did a new bookmaker enter that market? Are closing lines tighter?" Check the data. If systematic, register: `/signal-test championship-market-efficiency-shift`.
Execution leaks → timing hypotheses. "We entered 12 bets at odds worse than closing this week. All were morning entries on Saturday matches." Test whether entry timing correlates with slippage: `/signal-test weekend-morning-entry-slippage`.
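Pattern extraction of this kind can start as simple counting. The sketch below groups non-variance losses by category and context tag and surfaces combinations that recur. The loss dictionaries, the tag names, and the `min_count` cutoff are all hypothetical, and the output is only a list of candidates for a human to turn into hypotheses, not a replacement for `/signal-test`.

```python
from collections import Counter


def recurring_patterns(losses: list[dict], min_count: int = 3) -> list[tuple[str, str, int]]:
    """Return (category, tag, count) for non-variance patterns seen often.

    Each loss is assumed to carry a 'category' string and a list of
    context 'tags' attached at settlement time.
    """
    counts = Counter(
        (loss["category"], tag)
        for loss in losses
        if loss["category"] != "variance"
        for tag in loss["tags"]
    )
    return [
        (category, tag, count)
        for (category, tag), count in counts.most_common()
        if count >= min_count
    ]


week = (
    [{"category": "model_error", "tags": ["away_team_in_europe"]}] * 4
    + [{"category": "variance", "tags": ["away_team_in_europe"]}] * 2
    + [{"category": "execution_leak", "tags": ["saturday_morning_entry"]}]
)
for category, tag, count in recurring_patterns(week):
    print(f"{category}: {tag} seen {count} times")
```

Variance losses are excluded before counting, so a tag that only shows up on unlucky bets never becomes a hypothesis.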
The Virtuous Cycle
Deploy signal
↓
Monitor live performance (daily settlement)
↓
Classify losses (5 categories)
↓
Extract patterns from non-variance losses
↓
Register new hypotheses (/signal-test)
↓
Test through 10-gate approval
↓
Deploy new signal (the loop restarts at monitoring)
Losses feed discovery. Discovery feeds deployment. The system's edge doesn't just erode; it evolves, because every failure teaches you where to look next.
What We Track on the Dashboard
The `/model-health` page and signal health API show:
- Overall: CLV, ROI, hit rate, P&L, CUSUM status (healthy/warning/critical)
- Per signal: N, mean CLV, ROI, trend direction, z-score vs backtest
- Per league: CLV, ROI, bet count, pass rate
- Per market type: AH, 1X2, OU25 performance breakdowns
- Realism scenarios: Pessimistic P&L under 3 slippage models (1%/2.5%/3.5%)
- Cherry-pick spread: How much edge comes from best-book vs Pinnacle vs median
- Disabled markets: What's off, when it was disabled, and what re-enable criteria remain
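The realism scenarios can be approximated by replaying settled bets at worse prices. The sketch below assumes slippage is applied as a multiplicative haircut on the decimal odds of each bet, which is one plausible model but not necessarily the one behind the dashboard. The tuple layout for a bet is hypothetical.

```python
SLIPPAGE_SCENARIOS = (0.01, 0.025, 0.035)  # the three levels shown on the dashboard


def pnl_with_slippage(bets: list[tuple[float, float, bool]], slippage: float) -> float:
    """Total P&L if every bet had been filled at odds worse by `slippage`.

    Each bet is (stake, decimal_odds, won). Losing bets are unaffected,
    since the stake is lost regardless of the price obtained.
    """
    total = 0.0
    for stake, odds, won in bets:
        haircut_odds = odds * (1.0 - slippage)
        total += stake * (haircut_odds - 1.0) if won else -stake
    return total


history = [(100.0, 2.0, True), (100.0, 2.0, False)]
print(f"as recorded: {pnl_with_slippage(history, 0.0):+.2f}")
for level in SLIPPAGE_SCENARIOS:
    print(f"slippage {level:.1%}: {pnl_with_slippage(history, level):+.2f}")
```

Because only winning bets are touched, the haircut hits hardest on strategies with high hit rates at short odds, where each win pays little above the stake.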
The Minimum Viable Audit
Don't have time for the full weekly process? Here's the 60-second version:
- Check rolling 30-bet CLV. Positive → you're fine. Negative → investigate.
- Count losses by category. If stale input > 0, fix the pipeline.
- Look at the worst-performing league. If CLV < 0% on 20+ bets, consider disabling.
- Check the signal health z-scores. Anything below -2.0 gets a `/signal-test` revalidation.
If all four pass, the system is healthy. Move on to finding new edge.
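The four checks are simple enough to script. The sketch below returns a list of findings, with an empty list meaning all four passed. The function name and the shapes of its inputs are hypothetical, and it assumes every CLV list passed in is non-empty.

```python
from statistics import mean


def quick_audit(
    recent_clvs: list[float],
    loss_counts: dict[str, int],
    league_clvs: dict[str, list[float]],
    signal_z_scores: dict[str, float],
) -> list[str]:
    """Run the four 60-second checks and collect anything that fails."""
    findings = []

    # 1. Rolling 30-bet CLV must be positive.
    if mean(recent_clvs[-30:]) <= 0:
        findings.append("rolling 30-bet CLV is not positive: investigate")

    # 2. Any stale-input loss means the pipeline needs fixing.
    if loss_counts.get("stale_input", 0) > 0:
        findings.append("stale input losses present: fix the pipeline")

    # 3. Worst league by mean CLV, acted on only with 20+ bets behind it.
    if league_clvs:
        worst = min(league_clvs, key=lambda name: mean(league_clvs[name]))
        clvs = league_clvs[worst]
        if len(clvs) >= 20 and mean(clvs) < 0:
            findings.append(f"{worst}: negative CLV on {len(clvs)} bets, consider disabling")

    # 4. Signals with a z-score below -2.0 go back through /signal-test.
    for signal, z_score in signal_z_scores.items():
        if z_score < -2.0:
            findings.append(f"{signal}: z-score {z_score:.1f}, revalidate via /signal-test")

    return findings
```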
Current System State
| Metric | Value | Status |
|---|---|---|
| Model CLV | +11.1% | Healthy |
| In-sample (IS) ROI (6 leagues) | -1.4% | Near breakeven |
| Out-of-sample (OOS) ROI (20 leagues) | -3.6% | Execution gap |
| Drift detector | CUSUM active | Monitoring |
| Strongest filter | odds-cap-2.0 (+4.2pp) | Deployed |
| Kill switches | CLV < -2% threshold | Active |
| Loss → alpha pipeline | `/post-deployment` | Ready |
| Discovery pipeline | `/signal-test` (10 gates) | Ready |
| Pending hypotheses | 46 | Waiting to test |