The Market IS the Prior: Why Elo and Marcel Can't Beat Pinnacle
Tested three Bayesian priors for early-season predictions. Elo warm-start made predictions WORSE (+0.004 Brier). Marcel early-prior confirmed at 0.0pp marginal. Player finishing xG calibration shows -0.00375 Brier but needs walk-forward. Key insight: the solver already fits to Pinnacle market odds, which IS the best prior. Path A (solver priors) is closed. Path C (shot-level xG) remains open.
We tested three Bayesian priors for improving early-season predictions. Two failed decisively. One showed promise but needs proper validation.
The Question
Our system skips the first 5 matchdays of every season because the solver doesn't have enough data to estimate team strength. That's ~250 bets/season thrown away. Could pre-season information (player projections, Elo ratings, squad market values) fill the gap?
What We Tested
H6: Marcel Player Projections as Early-Season Prior
Re-tested the Marcel early-season prior after fixing the runWithoutSignal bug that had been breaking the marginal measurement.
| Metric | With Marcel | Without Marcel | Delta |
|---|---|---|---|
| Entry-adj ROI | +6.9% | +6.9% | **0.0pp** |
| Bets | 17,020 | 16,984 | +36 |
The focused test's +1.1pp on 27 recovered bets was real, but those 27 bets are invisible in a 17,000-bet pool. Marcel data covers only 2 of 12 backtest seasons, so the signal is a no-op for the other 10 seasons, roughly 83% of the evaluation.
H5: Elo Warm-Start for the Solver
End-of-season Elo ratings → convert to attack/defense lambdas → warm-start solver instead of cold-starting at 1.0/1.0.
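To make the pipeline above concrete, here is a minimal sketch of one way an Elo rating could be turned into attack/defense starting values centred on 1.0. The log-linear mapping, the `scale` parameter, and the function name `elo_to_warm_start` are all assumptions for illustration; the conversion actually used in the test is not described here.

```python
import math
from typing import Dict, Tuple


def elo_to_warm_start(
    elos: Dict[str, float], scale: float = 800.0
) -> Dict[str, Tuple[float, float]]:
    """Map end-of-season Elo ratings to (attack, defense) multipliers.

    ASSUMPTION: a log-linear mapping with a hypothetical `scale`.
    A league-average team maps to (1.0, 1.0), the cold-start value.
    """
    mean_elo = sum(elos.values()) / len(elos)
    warm_start = {}
    for team, elo in elos.items():
        strength = math.exp((elo - mean_elo) / scale)
        # attack > 1.0: expected to score more than average
        # defense < 1.0: expected to concede less than average
        warm_start[team] = (strength, 1.0 / strength)
    return warm_start


# Hypothetical ratings, for illustration only
priors = elo_to_warm_start({"Strong FC": 1700.0, "Average FC": 1500.0, "Weak FC": 1300.0})
for team, (attack, defense) in priors.items():
    print(f"{team}: attack={attack:.3f} defense={defense:.3f}")
```

The solver would then start its fit from these values instead of 1.0/1.0; whether that helps is exactly what the table below measures.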
| Gameweek | Elo Prior Brier | Baseline Brier | Delta |
|---|---|---|---|
| GW1-5 | 0.60821 | 0.60380 | **+0.004 (worse)** |
| GW6-10 | — | — | **+0.008 (worse)** |
Elo priors made predictions *worse*: +0.004 Brier in GW1-5, and the gap *widened* to +0.008 in GW6-10. Only 3 of 16 leagues showed any improvement.
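For reference, Brier values around 0.60 are consistent with the three-outcome (home/draw/away) form of the score, where squared errors are summed over all three outcomes. The write-up does not state which form is used, so the sketch below is an assumption, and the function names are hypothetical.

```python
from typing import Sequence

OUTCOMES = ("home", "draw", "away")


def brier_1x2(probs: Sequence[float], outcome: str) -> float:
    """Multi-class Brier score for one match: squared error summed over H/D/A.

    ASSUMPTION: three-outcome sum form; lower is better, 0.0 is a perfect forecast.
    """
    return sum(
        (p - (1.0 if label == outcome else 0.0)) ** 2
        for p, label in zip(probs, OUTCOMES)
    )


def mean_brier(forecasts: Sequence[Sequence[float]], outcomes: Sequence[str]) -> float:
    """Average the per-match score over a set of matches."""
    scores = [brier_1x2(p, o) for p, o in zip(forecasts, outcomes)]
    return sum(scores) / len(scores)


# A uniform 1/3-1/3-1/3 forecast scores about 0.667 under this form,
# which puts the ~0.60 figures in the table in context.
print(round(brier_1x2((1 / 3, 1 / 3, 1 / 3), "home"), 3))

# GW1-5 delta from the table: Elo prior minus baseline (positive = worse)
delta_gw1_5 = 0.60821 - 0.60380
print(round(delta_gw1_5, 5))
```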
H9: Player Finishing Multiplier as xG Calibration
Multiply each shot's xG by the shooter's finishing multiplier (Haaland 1.45×, Pulisic 0.46×).
| Bucket | Shots | Brier Change |
|---|---|---|
| Elite finishers (>1.2) | 13,700 | **-0.0084** |
| Poor finishers (<0.8) | 14,600 | **-0.0089** |
| Middle (0.8-1.2) | 48,100 | -0.0009 |
| **Overall** | **76,400** | **-0.00375** |
The overall change clears the -0.002 threshold, but the multipliers were fit on the full sample, so they were trained and tested on the same shots. Low-confidence multipliers helping MORE than high-confidence ones is a red flag for in-sample overfitting. Walk-forward validation is required.
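As a sanity check on the table above, the overall figure should be the shot-weighted average of the three bucket deltas. The sketch below recomputes it from the published numbers only; the variable names are illustrative.

```python
# (shots, Brier change) per bucket, copied from the table above
buckets = {
    "elite (>1.2)": (13_700, -0.0084),
    "poor (<0.8)": (14_600, -0.0089),
    "middle (0.8-1.2)": (48_100, -0.0009),
}

total_shots = sum(n for n, _ in buckets.values())
weighted_sum = sum(n * delta for n, delta in buckets.values())
overall = weighted_sum / total_shots

# Share of the total improvement, and of the shots, that sits in the two tail buckets
tail_improvement = sum(
    n * delta for name, (n, delta) in buckets.items() if not name.startswith("middle")
)
tail_share = tail_improvement / weighted_sum
tail_shot_share = 1 - buckets["middle (0.8-1.2)"][0] / total_shots

print(total_shots, round(overall, 5), round(tail_share, 2), round(tail_shot_share, 2))
```

The weighted average comes out near -0.0038, in line with the reported -0.00375 given that the bucket figures are rounded. Roughly 85% of the improvement comes from the two tail buckets, which hold about 37% of the shots, so the walk-forward test mostly needs to answer whether the extreme multipliers survive out of sample.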
The Big Insight: The Market IS the Prior
The H5 result is the most informative failure. Our solver cold-starts from devigged Pinnacle closing odds, which already embed the market's team strength estimate. Even at GW1 — before the season has kicked off — the solver is fitting to market-implied lambdas. Pinnacle's odds reflect squad investment, transfers, pre-season form, expert opinion, and betting volume.
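For readers unfamiliar with devigging, the sketch below shows the simplest version: invert the decimal odds and normalise so the probabilities sum to 1. Proportional normalisation is an assumption here, since the devig method the solver actually uses is not stated, and the odds in the example are made up.

```python
from typing import List, Sequence


def devig_proportional(decimal_odds: Sequence[float]) -> List[float]:
    """Strip the bookmaker margin by normalising inverse odds to sum to 1.

    ASSUMPTION: proportional devig; other methods (e.g. power or Shin)
    distribute the margin differently across outcomes.
    """
    raw = [1.0 / odds for odds in decimal_odds]
    overround = sum(raw)  # > 1.0 when the book carries a margin
    return [r / overround for r in raw]


# Hypothetical home/draw/away closing odds, for illustration only
home, draw, away = devig_proportional([2.10, 3.50, 3.60])
print(f"home={home:.3f} draw={draw:.3f} away={away:.3f}")
```

These probabilities are what the solver's lambdas are fit to, which is why they already carry the market's view of team strength before any in-season results exist.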
Elo is a backward-looking statistic derived from historical results. The market is forward-looking. Of course the market wins.
This likely kills the entire Path A (solver-level priors) approach:
- If Elo can't beat market odds, squad value warm-start (H1) won't either
- Previous season finish (H4) is even weaker than Elo
- The market has already done this work for us
But Path C (shot-level xG) is different. The market prices match outcomes, not individual shots. When Haaland shoots from 15 yards, the market doesn't adjust the xG — but we can. The finishing multiplier adds information that exists below the market's resolution.
What Didn't Work
Both team-strength priors (Marcel and Elo) failed for the same fundamental reason: the solver already has access to market-derived team strength via Pinnacle odds. Adding a weaker signal on top of a stronger one produces noise, not improvement.
The Marcel prior is particularly instructive. It's a good projection system (a 15.9% RMSE improvement over naive for individual players), but the system that USES those projections, the solver, doesn't need them because it has something better: the market's collective estimate.
What This Means
Path A is closed. The solver's cold-start from market odds is the best available prior. Don't try to improve it with historical data — the market already processes that information.
Path C (xG model) remains open. The finishing multiplier's -0.00375 Brier improvement is compelling but unvalidated. If it holds under walk-forward, it's deployable as a post-hoc xG calibration layer.
What's Next
Walk-forward validation for H9:
- Compute season-specific finishing multipliers (train on seasons ≤N-1)
- Test calibrated xG on season N shots
- If Brier improvement holds: deploy as xG calibration layer
- Then test downstream: does improved xG → better variance filter → better bets?
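The first two steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the production pipeline: the shot record layout (`season`, `player`, `xg`, `goal`), the definition of the multiplier as prior goals divided by prior xG, and the `min_shots` cutoff are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def walk_forward_brier_delta(shots: List[dict], min_shots: int = 30) -> Dict[int, float]:
    """For each season N, fit multipliers on seasons < N and score season N shots.

    Returns {season: calibrated Brier minus raw Brier}; negative = improvement.
    ASSUMPTIONS: multiplier = prior goals / prior xG; players with fewer than
    `min_shots` prior shots fall back to a neutral 1.0.
    """
    seasons = sorted({s["season"] for s in shots})
    deltas = {}
    for season in seasons[1:]:  # the first season has no training data
        goals = defaultdict(float)
        xg_sum = defaultdict(float)
        n_shots = defaultdict(int)
        for s in shots:
            if s["season"] < season:  # strictly earlier seasons only
                goals[s["player"]] += s["goal"]
                xg_sum[s["player"]] += s["xg"]
                n_shots[s["player"]] += 1

        raw_err = cal_err = 0.0
        count = 0
        for s in shots:
            if s["season"] != season:
                continue
            p = s["player"]
            enough = n_shots[p] >= min_shots and xg_sum[p] > 0
            mult = goals[p] / xg_sum[p] if enough else 1.0
            calibrated = min(1.0, s["xg"] * mult)  # keep it a valid probability
            raw_err += (s["xg"] - s["goal"]) ** 2
            cal_err += (calibrated - s["goal"]) ** 2
            count += 1
        deltas[season] = (cal_err - raw_err) / count
    return deltas
```

If the per-season deltas stay at or below the -0.002 threshold out of sample, the in-sample -0.00375 was not purely an overfitting artifact; if they collapse toward zero, the red flag noted earlier was warranted.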