Sports Dashboard

MI Bivariate Poisson + Dixon-Coles + Elo

|Research|REJECTED

The Market IS the Prior: Why Elo and Marcel Can't Beat Pinnacle

Tested three Bayesian priors for early-season predictions. Elo warm-start made predictions WORSE (+0.004 Brier). Marcel early-prior confirmed at 0.0pp marginal. Player finishing xG calibration shows -0.00375 Brier but needs walk-forward. Key insight: the solver already fits to Pinnacle market odds, which IS the best prior. Path A (solver priors) is closed. Path C (shot-level xG) remains open.

| Hypothesis | Status | Result |
|---|---|---|
| H5 Elo | REJECTED | +0.004 Brier (worse) |
| H6 Marcel | REJECTED | 0.0pp marginal |
| H9 Finishing | PENDING | -0.00375 Brier (needs WF) |

We tested three Bayesian priors for improving early-season predictions. Two failed decisively. One showed promise but needs proper validation.

The Question

Our system skips the first 5 matchdays of every season because the solver doesn't have enough data to estimate team strength. That's ~250 bets/season thrown away. Could pre-season information (player projections, Elo ratings, squad market values) fill the gap?

What We Tested

H6: Marcel Player Projections as Early-Season Prior

Re-tested the Marcel early-season prior after fixing the runWithoutSignal bug that was breaking the marginal measurement.

| Metric | With Marcel | Without Marcel | Delta |
|---|---|---|---|
| Entry-adj ROI | +6.9% | +6.9% | **0.0pp** |
| Bets | 17,020 | 16,984 | +36 |

The focused test's +1.1pp on 27 recovered bets was real, but those 27 bets are invisible in a 17,000-bet pool. Marcel data covers only 2 of 12 backtest seasons, so the signal is a no-op for the other 10 (about 83% of the evaluation by season count).

H5: Elo Warm-Start for the Solver

End-of-season Elo ratings → convert to attack/defense lambdas → warm-start solver instead of cold-starting at 1.0/1.0.

| Gameweek | Elo Prior Brier | Baseline Brier | Delta |
|---|---|---|---|
| GW1-5 | 0.60821 | 0.60380 | **+0.004 (worse)** |
| GW6-10 | | | **+0.008 (worse)** |

Elo priors made predictions *worse*. Not by a little: by 0.004 Brier points in GW1-5, and the gap *widened* to 0.008 in GW6-10. Only 3 of 16 leagues showed any improvement.
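For readers who want to reproduce the comparison, here is a minimal sketch of the metric. It assumes the three-outcome (home/draw/away) sum-of-squares form of the Brier score, which is consistent with values around 0.60 but is not spelled out in this post. The function names are ours, for illustration only.

```python
def brier_1x2(probs, outcome):
    """Multi-class Brier score for one match.

    probs:   (p_home, p_draw, p_away), expected to sum to 1
    outcome: index of the realised result (0 = home, 1 = draw, 2 = away)
    """
    return sum(
        (p - (1.0 if i == outcome else 0.0)) ** 2
        for i, p in enumerate(probs)
    )


def mean_brier(predictions, outcomes):
    """Average per-match Brier score; lower is better."""
    scores = [brier_1x2(p, o) for p, o in zip(predictions, outcomes)]
    return sum(scores) / len(scores)


# Toy data, not from the backtest.
preds = [(0.5, 0.3, 0.2), (0.2, 0.3, 0.5)]
results = [0, 1]
print(mean_brier(preds, results))
```

A positive delta (prior minus baseline) means the prior made things worse, which is how the table above should be read.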

H9: Player Finishing Multiplier as xG Calibration

Multiply each shot's xG by the shooter's finishing multiplier (Haaland 1.45×, Pulisic 0.46×).
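In code the adjustment is a one-liner. The sketch below assumes the calibrated value is capped at 1.0 so it stays a valid probability; the cap and the function names are our illustration, not a description of the production pipeline.

```python
def calibrate_xg(xg, multiplier):
    """Scale a shot's xG by the shooter's finishing multiplier.

    The result is capped at 1.0 (an assumption made here) so that it
    remains a valid goal probability.
    """
    if not 0.0 <= xg <= 1.0:
        raise ValueError("xg must be a probability in [0, 1]")
    return min(1.0, xg * multiplier)


def shot_brier(p, goal):
    """Binary Brier score for one shot; goal is 1 if scored, else 0."""
    return (p - goal) ** 2


# A 0.20 xG chance, using the two multipliers quoted above.
elite = calibrate_xg(0.20, 1.45)
poor = calibrate_xg(0.20, 0.46)
print(elite, poor)
```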

| Bucket | Shots | Brier Change |
|---|---|---|
| Elite finishers (>1.2) | 13,700 | **-0.0084** |
| Poor finishers (<0.8) | 14,600 | **-0.0089** |
| Middle (0.8-1.2) | 48,100 | -0.0009 |
| **Overall** | **76,400** | **-0.00375** |

Passes the -0.002 threshold. But this used full-sample multipliers (trained and tested on the same data). The low-confidence multipliers helping MORE than high-confidence is a red flag for in-sample overfitting. Walk-forward validation required.
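As a quick consistency check on the table, the overall figure should be close to the shot-weighted average of the three buckets, assuming each bucket value is a mean per-shot Brier change. The numbers below are copied from the table; nothing else is assumed.

```python
# (bucket, shots, Brier change) as reported in the table above.
buckets = [
    ("Elite finishers (>1.2)", 13_700, -0.0084),
    ("Poor finishers (<0.8)", 14_600, -0.0089),
    ("Middle (0.8-1.2)", 48_100, -0.0009),
]

total_shots = sum(n for _, n, _ in buckets)
overall = sum(n * delta for _, n, delta in buckets) / total_shots

print(total_shots)        # bucket counts should add up to the Overall row
print(round(overall, 5))  # shot-weighted Brier change
```

The weighted average comes out at roughly -0.0038, which matches the reported -0.00375 to within the rounding of the bucket values. It also shows that most of the gain comes from the two tail buckets, which together hold only about 37% of the shots.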

The Big Insight: The Market IS the Prior

The H5 result is the most informative failure. Our solver cold-starts from devigged Pinnacle closing odds, which already embed the market's team strength estimate. Even at GW1 — before the season has kicked off — the solver is fitting to market-implied lambdas. Pinnacle's odds reflect squad investment, transfers, pre-season form, expert opinion, and betting volume.
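To make "devigged" concrete, here is a minimal sketch using proportional (multiplicative) normalisation. This post does not say which devig method the solver uses, so treat the method, the function name, and the example odds as assumptions.

```python
def devig(decimal_odds):
    """Convert decimal odds to probabilities that sum to 1.

    Uses proportional normalisation: invert each price, then divide
    by the total so the bookmaker's margin is removed evenly.
    Returns (probabilities, margin).
    """
    if any(o <= 1.0 for o in decimal_odds):
        raise ValueError("decimal odds must be greater than 1.0")
    raw = [1.0 / o for o in decimal_odds]
    overround = sum(raw)
    return [r / overround for r in raw], overround - 1.0


# Hypothetical home/draw/away prices, for illustration only.
probs, margin = devig([2.10, 3.40, 3.60])
print(probs, margin)
```

The resulting probabilities are what a solver could then fit its market-implied lambdas to.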

Elo is a backward-looking statistic derived from historical results. The market is forward-looking. Of course the market wins.

This likely kills the entire Path A (solver-level priors) approach:

  • If Elo can't beat market odds, squad value warm-start (H1) won't either
  • Previous season finish (H4) is even weaker than Elo
  • The market has already done this work for us

But Path C (shot-level xG) is different. The market prices match outcomes, not individual shots. When Haaland shoots from 15 yards, the market doesn't adjust the xG — but we can. The finishing multiplier adds information that exists below the market's resolution.

What Didn't Work

Both team-strength priors (Marcel and Elo) failed for the same fundamental reason: the solver already has access to market-derived team strength via Pinnacle odds. Adding a weaker signal on top of a stronger one produces noise, not improvement.

The Marcel prior is particularly instructive. It's a good projection system (a 15.9% RMSE improvement over a naive baseline for individual players), but the system that USES those projections, the solver, doesn't need them because it has something better: the market's collective estimate.

What This Means

Path A is closed. The solver's cold-start from market odds is the best available prior. Don't try to improve it with historical data — the market already processes that information.

Path C (xG model) remains open. The finishing multiplier's -0.00375 Brier improvement is compelling but unvalidated. If it holds under walk-forward, it's deployable as a post-hoc xG calibration layer.

What's Next

Walk-forward validation for H9:

  1. Compute season-specific finishing multipliers (train on seasons ≤N-1)
  2. Test calibrated xG on season N shots
  3. If Brier improvement holds: deploy as xG calibration layer
  4. Then test downstream: does improved xG → better variance filter → better bets?
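
Steps 1 and 2 can be sketched as follows. This is an illustration under stated assumptions, not the planned implementation: shots are plain dicts with hypothetical keys `season`, `player`, `xg`, and `goal`; the multiplier is goals divided by xG, shrunk toward 1.0 by a made-up `prior_weight`; and players unseen in training get a multiplier of 1.0.

```python
from collections import defaultdict


def fit_multipliers(shots, prior_weight=5.0):
    """Per-player goals / xG, shrunk toward 1.0.

    prior_weight is a hypothetical knob: larger values pull
    small-sample players harder toward a neutral 1.0.
    """
    goals = defaultdict(float)
    xg = defaultdict(float)
    for s in shots:
        goals[s["player"]] += s["goal"]
        xg[s["player"]] += s["xg"]
    return {p: (goals[p] + prior_weight) / (xg[p] + prior_weight) for p in xg}


def brier(shots, multipliers=None):
    """Mean binary Brier score over shots, optionally calibrated."""
    total = 0.0
    for s in shots:
        m = multipliers.get(s["player"], 1.0) if multipliers else 1.0
        p = min(1.0, s["xg"] * m)  # cap so p stays a probability
        total += (p - s["goal"]) ** 2
    return total / len(shots)


def walk_forward(shots):
    """For each season N after the first: fit on seasons < N, test on N.

    Returns {season: calibrated Brier - raw Brier}; negative is better.
    """
    seasons = sorted({s["season"] for s in shots})
    deltas = {}
    for n in seasons[1:]:
        train = [s for s in shots if s["season"] < n]
        test = [s for s in shots if s["season"] == n]
        deltas[n] = brier(test, fit_multipliers(train)) - brier(test)
    return deltas
```

The point of the structure is that season N's multipliers never see season N's shots, which is exactly what the full-sample H9 result lacked.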
REJECTED | Signal: bayesian-priors-experiment | 2026-04-13