Sports Dashboard

MI Bivariate Poisson + Dixon-Coles + Elo

Model Architecture | ACCEPTED

The Grid Was Wrong About Draws: How Dixon-Coles Fixed Our Biggest Blind Spot

Applied Dixon-Coles tau correction (rho=+0.05) to the Bivariate Poisson score grid. Dev +1.60pp (p=0.025), holdout +0.80pp, production -2.98% to -1.72%. The grid overestimated 0-0/1-1 for AH markets. Combined with market-only: +1.25pp total, 189u saved.

Dev Delta: +1.60pp (p=0.025)
Holdout: +0.80pp (directionally consistent)
Production: -1.72%, from -2.98% (+1.25pp)
Total Saved: +189u (26 leagues combined)

We applied a nearly 30-year-old statistical correction (Dixon and Coles, 1997) to our Bivariate Poisson grid and got the strongest result of the session: +1.60pp on dev (p=0.025), validated on holdout, and deployed. The model was systematically overestimating the probability of 0-0 and 1-1 scorelines, inflating edges on markets that depend on these outcomes.

The Question

The MI Bivariate Poisson model generates a probability grid for every possible scoreline. From this grid, we derive all market probabilities — 1X2, Asian Handicap, Over/Under, BTTS.
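
As a rough illustration of how market probabilities fall out of a score grid, here is a minimal sketch. It assumes an independent Poisson grid as a stand-in for the MI Bivariate Poisson model (whose internals are not shown in this post), and the names `score_grid` and `market_probs` and the goal rates are hypothetical.

```python
from math import exp, factorial


def poisson_pmf(k, rate):
    """Probability of exactly k goals under a Poisson with the given rate."""
    return exp(-rate) * rate**k / factorial(k)


def score_grid(home_rate, away_rate, max_goals=10):
    """grid[h][a] = P(home scores h, away scores a).

    Assumption: independent Poisson marginals, truncated at max_goals and
    renormalized. The production model is a Bivariate Poisson; this is only
    a stand-in for illustration.
    """
    grid = [
        [poisson_pmf(h, home_rate) * poisson_pmf(a, away_rate)
         for a in range(max_goals + 1)]
        for h in range(max_goals + 1)
    ]
    total = sum(map(sum, grid))
    return [[p / total for p in row] for row in grid]


def market_probs(grid):
    """Sum grid cells into 1X2, Over 2.5 and BTTS probabilities."""
    cells = [(h, a, p) for h, row in enumerate(grid) for a, p in enumerate(row)]
    return {
        "home": sum(p for h, a, p in cells if h > a),
        "draw": sum(p for h, a, p in cells if h == a),
        "away": sum(p for h, a, p in cells if h < a),
        "over_2.5": sum(p for h, a, p in cells if h + a > 2.5),
        "btts": sum(p for h, a, p in cells if h > 0 and a > 0),
    }


# Hypothetical goal rates, chosen only for the example.
probs = market_probs(score_grid(1.5, 1.1))
```

Every market is a sum over cells of the same grid, so a miscalibrated cell affects every market that includes it.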

We knew the grid had a calibration problem. Today's edge shrinkage test proved the model is overconfident on its largest edges. The question was: which part of the grid is wrong?

Dixon and Coles (1997) identified a specific weakness in Poisson models for football: they mispredict the frequency of low-scoring outcomes (0-0, 1-0, 0-1, 1-1). Their fix: a single parameter rho that adjusts these four cells post-hoc.
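
The correction can be sketched as below, using the tau function in the parameterization from the 1997 paper. The base grid is assumed to be independent Poisson for simplicity (the production grid is Bivariate Poisson), and the function names and goal rates are hypothetical. Under this parameterization, positive rho shrinks 0-0 and 1-1 and inflates 1-0 and 0-1.

```python
from math import exp, factorial


def tau(h, a, home_rate, away_rate, rho):
    """Dixon-Coles adjustment factor; only four low-score cells differ from 1."""
    if h == 0 and a == 0:
        return 1.0 - home_rate * away_rate * rho
    if h == 0 and a == 1:
        return 1.0 + home_rate * rho
    if h == 1 and a == 0:
        return 1.0 + away_rate * rho
    if h == 1 and a == 1:
        return 1.0 - rho
    return 1.0


def dixon_coles_grid(home_rate, away_rate, rho, max_goals=10):
    """Score grid with the tau correction applied, then renormalized.

    Assumption: independent Poisson base probabilities.
    """
    def pmf(k, rate):
        return exp(-rate) * rate**k / factorial(k)

    grid = [
        [tau(h, a, home_rate, away_rate, rho)
         * pmf(h, home_rate) * pmf(a, away_rate)
         for a in range(max_goals + 1)]
        for h in range(max_goals + 1)
    ]
    total = sum(map(sum, grid))
    return [[p / total for p in row] for row in grid]


# Hypothetical goal rates; rho=+0.05 is the value adopted in this post.
base = dixon_coles_grid(1.5, 1.1, rho=0.0)
adjusted = dixon_coles_grid(1.5, 1.1, rho=0.05)
```

For an independent Poisson base, the four adjustments cancel exactly, so the correction moves probability between the four cells without changing the total. The renormalization only handles the truncation at `max_goals`.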

What We Found

We swept 7 values of rho on the 10-league dev set:

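| rho | AH ROI | Delta | p-value |
|---|---|---|---|
| -0.12 | -2.52% | +0.77pp | 0.190 |
| -0.10 | -2.41% | +0.88pp | 0.147 |
| -0.08 | -2.46% | +0.83pp | 0.160 |
| -0.05 | -2.43% | +0.86pp | 0.153 |
| -0.03 | -2.34% | +0.95pp | 0.128 |
| +0.03 | -2.20% | +1.09pp | 0.090 |
| **+0.05** | **-1.69%** | **+1.60pp** | **0.025** |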

Positive rho wins. The Poisson grid OVERESTIMATES 0-0 and 1-1; reducing their probability gives the best AH ROI in the sweep and the only delta significant at p<0.05.

Validation Chain

| Stage | AH ROI Delta | Status |
|---|---|---|
| Dev (10 leagues) | +1.60pp | PASS (p=0.025, 3/3 years, CLV 9.82%) |
| Holdout (9 leagues) | +0.80pp | PASS (directional, 0.50x ratio) |
| Production (26 leagues) | +1.25pp combined | Deployed |

The Nuance

Why Positive Rho?

Standard football analytics assumes Poisson underpredicts draws (defensive football → more 0-0). But for AH BETTING, the opposite is true. Here's why:

The AH market prices are ALREADY calibrated for draws. When we compare our model to Pinnacle AH odds, overestimating draws means overestimating the probability that the AH line results in a push or half-loss. This makes the model see "edge" where there isn't any.

By reducing draw probabilities (positive rho), the model's AH edges become more accurate. The bets we take are better calibrated.
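
To make the link between draw mass and AH settlement concrete, here is a minimal sketch of how a score grid maps onto settlement outcomes for the home side. The function name `ah_settlement_probs` is hypothetical and the 3x3 grid holds made-up toy numbers, not model output. Quarter lines are settled the usual way, as two half-stakes on the adjacent lines.

```python
def ah_settlement_probs(grid, line):
    """Settlement probabilities for a home-side AH bet at `line` (e.g. -0.25).

    grid[h][a] is P(home scores h, away scores a). Quarter lines are split
    into two half-stakes at line - 0.25 and line + 0.25.
    """
    is_quarter = round(line * 4) % 2 != 0
    parts = [line - 0.25, line + 0.25] if is_quarter else [line, line]
    labels = {2: "win", 1: "half_win", 0: "push", -1: "half_loss", -2: "loss"}
    out = dict.fromkeys(labels.values(), 0.0)
    for h, row in enumerate(grid):
        for a, p in enumerate(row):
            score = 0
            for part in parts:
                margin = h - a + part
                score += (margin > 0) - (margin < 0)  # +1 win, 0 push, -1 loss
            out[labels[score]] += p
    return out


# Toy grid (rows: home goals 0-2, columns: away goals 0-2). Illustrative only.
toy_grid = [
    [0.10, 0.09, 0.05],
    [0.15, 0.14, 0.07],
    [0.14, 0.13, 0.13],
]
draw_prob = sum(toy_grid[k][k] for k in range(3))

level_line = ah_settlement_probs(toy_grid, 0.0)      # draw -> push
quarter_line = ah_settlement_probs(toy_grid, -0.25)  # draw -> half-loss
```

On these two lines all of the draw probability lands in the push or half-loss bucket, so an error in the diagonal cells of the grid passes directly into the modeled AH outcome distribution.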

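The Gradient Is Roughly Monotonic

Every rho value from -0.12 to +0.05 improves over baseline. Across the negative values the improvement is roughly flat (+0.77pp to +0.95pp, with small reversals between -0.10 and -0.05), then it accelerates sharply at positive values: +1.09pp at +0.03 and +1.60pp at +0.05. This suggests the Poisson grid's draw overestimation is the primary calibration error, not a secondary effect.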

Per-League: Broad-Based Improvement

On dev set with rho=+0.05, only 10/30 league-year cells are worse than baseline (33%). This is the broadest improvement of any config tested today.

The Full Improvement Chain

| Step | Config | AH ROI (26 leagues) | Cumulative |
|---|---|---|---|
| Original | outcome=0.3, xg=0.2, rho=0 | -2.98% | -- |
| + market-only | outcome=0, xg=0 | -2.19% | +0.79pp |
| + Dixon-Coles | rho=+0.05 | -1.72% | +1.25pp |

Two architectural changes. +1.25pp. 189u saved. 42% less negative.
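
The arithmetic behind the chain can be checked directly from the rounded ROI figures in the table above. The variable names are ours. The recomputed total comes out at 1.26pp against the reported +1.25pp, which we assume is a rounding difference in the displayed values.

```python
# AH ROI in percent for each step, as reported in the table above.
roi = {"original": -2.98, "market_only": -2.19, "dixon_coles": -1.72}

# Marginal contribution of each step, in percentage points.
step_market_only = round(roi["market_only"] - roi["original"], 2)
step_dixon_coles = round(roi["dixon_coles"] - roi["market_only"], 2)

# Total improvement, and its size relative to the original loss.
total_pp = round(roi["dixon_coles"] - roi["original"], 2)
relative_pct = round(100 * total_pp / abs(roi["original"]))

print(step_market_only, step_dixon_coles, total_pp, relative_pct)
# prints: 0.79 0.47 1.26 42
```

On these figures, market-only contributes about 0.79pp and Dixon-Coles about 0.47pp of the production improvement.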

What This Means

  1. The score grid was the bottleneck. The Poisson model's overestimation of 0-0/1-1 was inflating AH edges and causing overconfident predictions. Fixing 4 cells in the grid had more impact than any signal, filter, or parameter we've tested.
  2. Still not profitable. -1.72% ROI. But the gap is closing: from -2.98% to -1.72% in one session. The remaining -1.72% may be addressable through further grid improvements (Negative Binomial for extreme scores) or the structural regime findings (quarter-line routing at +4.3% vs whole-line -7.8%).
  3. The methodology works. Dev/holdout validation caught the max-edge-cap (overfit to dev) while confirming Dixon-Coles (generalizes to holdout). The framework is earning trust.

What's Next

  • Test rho=+0.07 and +0.10 (the gradient suggests more improvement may exist)
  • Apply Negative Binomial distribution to all grids (not just O/U) for extreme score calibration
  • Quarter-line routing (12pp structural spread, walk-forward confirmed)
  • Re-test top signals on the improved baseline (now -1.72% instead of -2.98%)
ACCEPTED | Signal: dixon-coles-rho-correction | 2026-03-19