Sports Dashboard

MI Bivariate Poisson + Dixon-Coles + Elo

Signal Test | REJECTED

Testing Finishing Persistence: More Data, Same Dead End

Re-tested the finishing persistence signal after backfilling FotMob shots to 19 leagues (5,144 matches, 127K shots). Deduplication cut the player count to 2,040, and split-half r improved to 0.411. Affected bets doubled (899 vs 427), but marginal ROI fell from +0.3pp to +0.2pp, below the +0.5pp gate. The finishing effect is real, but a binary filter can't capture it; it needs a continuous lambda adjustment.

Marginal ROI: +0.2pp (below +0.5pp gate)
Bets Affected: 899 of 16,167 (5.6%)
Split-Half r: 0.411 (2,040 players)

We know finishing skill is real. Split-half correlation of 0.411 across 2,040 players — Dembele consistently converts at 1.76x his xG, Pulisic at 0.46x his. The question was whether this knowledge could improve our variance filter: don't fade teams carried by elite finishers.
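For readers who want to reproduce the reliability number, here is a minimal sketch of a split-half correlation. The data layout (a dict of per-player `(xg, is_goal)` tuples), the random split, and the `min_shots` cutoff are assumptions made for illustration; the post does not say how the halves were formed.

```python
import random
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def split_half_r(shots_by_player, min_shots=20, seed=0):
    """shots_by_player maps player -> list of (xg, is_goal) tuples.

    Each player's shots are shuffled and split into two halves; the
    goals/xG ratio in half A is then correlated with the ratio in half B.
    min_shots is a hypothetical cutoff, not a value from the post.
    """
    rng = random.Random(seed)
    ratios_a, ratios_b = [], []
    for shots in shots_by_player.values():
        if len(shots) < min_shots:
            continue  # too few shots for a meaningful ratio
        shuffled = shots[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        a, b = shuffled[:half], shuffled[half:]
        xg_a = sum(xg for xg, _ in a)
        xg_b = sum(xg for xg, _ in b)
        if xg_a == 0 or xg_b == 0:
            continue  # ratio undefined without xG in both halves
        ratios_a.append(sum(goal for _, goal in a) / xg_a)
        ratios_b.append(sum(goal for _, goal in b) / xg_b)
    return pearson_r(ratios_a, ratios_b)
```

If finishing were pure luck, a player's ratio in one half would say nothing about the other half and r would sit near zero.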

The Question

Our variance filter detects teams overperforming their xG and skips bets that back them (they'll regress). But what about teams where the overperformance is driven by a genuinely clinical striker? Fading them is wrong — they're not lucky, they're good.

The hypothesis: identify teams with persistent top-scorer finishing multipliers (>1.15) and exempt them from variance fading. More FotMob shot data (from 101 EPL matches to 5,144 matches across 19 leagues) should increase the signal's reach and push marginal ROI past the +0.5pp deployment gate.
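A minimal sketch of the exemption logic described above. Only the 1.15 threshold comes from the test; the `TeamSnapshot` type, the function names, and the 1.10 overperformance cutoff are hypothetical stand-ins for whatever the production filter uses.

```python
from dataclasses import dataclass

FINISHING_THRESHOLD = 1.15  # multiplier cutoff from the hypothesis


@dataclass
class TeamSnapshot:
    name: str
    goals: float
    xg: float
    top_scorer_multiplier: float  # persistent finishing multiplier


def is_overperforming(team, ratio_cutoff=1.10):
    """Variance-filter trigger. The 1.10 cutoff is a placeholder, not from the post."""
    return team.xg > 0 and team.goals / team.xg > ratio_cutoff


def should_skip_bet_backing(team, finishing_filter_enabled):
    """True when the variance filter would skip a bet that backs this team."""
    if not is_overperforming(team):
        return False
    if finishing_filter_enabled and team.top_scorer_multiplier > FINISHING_THRESHOLD:
        return False  # exempt: overperformance attributed to finishing skill
    return True
```

Note that the exemption is all-or-nothing: any multiplier above 1.15 produces the same decision, whether it is 1.16 or 1.76.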

What We Found

| Metric | Previous (Apr 2) | Re-Test (Apr 7) |
|---|---|---|
| Players with multipliers | 2,558 | 2,040 |
| Split-half correlation | 0.290 | 0.411 |
| Bets affected | 427 / 16,167 | 899 / 16,167 |
| Marginal ROI (closing) | +0.3pp | +0.2pp |
| Marginal ROI (entry-adj) | +0.3pp | +0.2pp |

More data. Better player model. Twice as many affected bets. And the marginal ROI went *down*.
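To make the gate check concrete, here is a small sketch. It assumes "marginal ROI" means the ROI of the bet set with the filter applied minus the ROI of the baseline set, in percentage points; the post does not spell out the definition, and the `(stake, profit)` tuple layout is hypothetical.

```python
DEPLOYMENT_GATE_PP = 0.5  # deployment gate stated in the post


def roi_pct(bets):
    """ROI in percent for a list of (stake, profit) tuples."""
    staked = sum(stake for stake, _ in bets)
    return 100.0 * sum(profit for _, profit in bets) / staked


def marginal_roi_pp(baseline_bets, filtered_bets):
    """ROI with the filter minus ROI without it, in percentage points (assumed definition)."""
    return roi_pct(filtered_bets) - roi_pct(baseline_bets)


def passes_gate(marginal_pp, gate_pp=DEPLOYMENT_GATE_PP):
    return marginal_pp >= gate_pp


# Share of the 16,167-bet sample touched by the filter in each run.
previous_share = round(100 * 427 / 16167, 1)  # 2.6
retest_share = round(100 * 899 / 16167, 1)    # 5.6
```

Both runs fail the gate: `passes_gate(0.3)` and `passes_gate(0.2)` are both `False`.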

The Nuance

The player-level finishing effect got *stronger* with better data — the split-half correlation jumped from 0.290 to 0.411 after deduplication cleaned out noise. The model is more accurate now.

But the signal got *weaker* as a betting filter. Here's why:

The binary filter is too blunt. A player with a 1.76 multiplier (Dembele, 17 goals from 7.4 xG) is fundamentally different from a player at 1.16, barely above the threshold. The filter treats them identically: "has persistent finisher → don't fade this team." At 427 bets, it caught only the extreme cases. At 899, it catches marginal cases too, and those marginal cases are a coin flip.

The dedup matters. Player count dropped from 2,558 to 2,040 because the previous data had duplicate matches across FotMob files. The old model was double-counting some players' shots, inflating their apparent sample size and making the filter more aggressive than warranted.
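A sketch of the kind of deduplication involved, assuming shot rows are dicts and that a `(match_id, shot_id)` pair identifies a shot. Those field names, and the idea of a minimum-shot cutoff for inclusion, are illustrative assumptions rather than the actual FotMob schema or pipeline.

```python
from collections import Counter


def dedupe_shots(shot_rows):
    """Keep the first copy of each shot across all loaded files."""
    seen = set()
    unique = []
    for row in shot_rows:
        key = (row["match_id"], row["shot_id"])  # hypothetical key fields
        if key in seen:
            continue  # same shot already loaded from another file
        seen.add(key)
        unique.append(row)
    return unique


def players_with_min_shots(shot_rows, min_shots):
    """Players whose shot count meets the (hypothetical) inclusion cutoff."""
    counts = Counter(row["player"] for row in shot_rows)
    return {player for player, n in counts.items() if n >= min_shots}
```

Running `players_with_min_shots` before and after `dedupe_shots` shows how double-counted matches can push a player over a sample-size cutoff they have not actually earned.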

Top finishers (50+ shots):

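| Player | Multiplier | Goals/xG | Shots | Leagues |
|---|---|---|---|---|
| Dembele | 1.764 | 17/7.4 | 62 | Ligue 1 |
| Semenyo | 1.637 | 19/9.6 | 70 | EPL |
| Cunha | 1.612 | 10/4.2 | 61 | EPL |
| McTominay | 1.529 | 18/10.0 | 95 | EPL, Serie A |
| Sesko | 1.477 | 19/11.2 | 82 | Bundesliga, EPL |
| Lamine Yamal | 1.452 | 19/11.5 | 101 | La Liga |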

Bottom finishers:

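| Player | Multiplier | Goals/xG | Shots | Leagues |
|---|---|---|---|---|
| Pulisic | 0.456 | 1/8.4 | 64 | Serie A |
| Xavi Simons | 0.552 | 1/6.1 | 75 | Bundesliga, EPL, Eredivisie |
| Kvaratskhelia | 0.597 | 4/10.3 | 75 | Ligue 1, Serie A |
| Morgan Rogers | 0.593 | 4/10.3 | 129 | Championship, EPL |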

These are *real* effects. Pulisic's 0.456 multiplier across 64 shots isn't noise. But the binary filter can't translate this granularity into a betting edge.
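One detail worth flagging when reading the tables: the listed multipliers sit closer to 1.0 than the raw goals/xG ratio (Dembele's 17 goals from 7.4 xG is about 2.30 raw, against 1.764 listed). That pattern is consistent with some regression toward the mean, although the post does not state the formula. The sketch below shows one common form, additive shrinkage, with a prior weight of 5 xG chosen purely as an assumption because it lands near the listed values.

```python
def raw_ratio(goals, xg):
    """Unadjusted goals-per-xG conversion ratio."""
    return goals / xg


def shrunk_multiplier(goals, xg, prior_xg=5.0):
    """Pull goals/xG toward 1.0 by adding prior_xg of 'average' finishing.

    prior_xg=5.0 is an assumed value for illustration, not a documented setting.
    """
    return (goals + prior_xg) / (xg + prior_xg)


dembele = shrunk_multiplier(17, 7.4)  # about 1.77 (table lists 1.764)
pulisic = shrunk_multiplier(1, 8.4)   # about 0.45 (table lists 0.456)
```

Whatever the exact method, shrinkage of this kind means low-volume players drift toward 1.0, so extreme multipliers in the tables have survived some discounting already.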

What Didn't Work

Two tests of the binary filter, both below the +0.5pp gate, plus one structural problem:

  1. Binary filter (this test): +0.2pp marginal ROI. Too blunt.
  2. Original test (smaller data): +0.3pp. Appeared better only because it caught fewer, more extreme cases.
  3. Structural mismatch: the filter logic operates on team-level data (does the team's top scorer have a multiplier > 1.15?) when the effect is player-level. A team whose clinical striker plays 60% of minutes is different from one whose clinical player plays 90%.
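The minutes point in item 3 can be illustrated with a simple weighting. This is a sketch under the assumption that a player's finishing edge reaches the team in proportion to the share of minutes he plays; the function name and the linear form are hypothetical, not part of the current filter.

```python
def effective_multiplier(multiplier, minutes_share):
    """Scale a player's deviation from 1.0 by the share of minutes played.

    minutes_share is a fraction in [0, 1]; the linear weighting is an
    illustrative assumption.
    """
    if not 0.0 <= minutes_share <= 1.0:
        raise ValueError("minutes_share must be between 0 and 1")
    return 1.0 + minutes_share * (multiplier - 1.0)


part_time = effective_multiplier(1.76, 0.60)  # about 1.456
ever_present = effective_multiplier(1.76, 0.90)  # about 1.684
```

The binary filter sees both teams as "has a 1.76 finisher", while the weighted view separates them by roughly 0.23.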

What This Means

The signal stays parked. It's not deployed, and the code is wired but disabled (finishingPersistenceFilter: false in production).

The finishing effect is *infrastructure*, not a signal. It validates that our variance filter has a systematic blind spot (fading teams whose overperformance is skill, not luck), but the binary filter approach can't exploit it at the margin.

What's Next

Two future paths worth trying:

  1. Continuous lambda adjustment: Instead of a binary fade/don't-fade decision, scale the regression expectation by the multiplier magnitude. A team with a 1.80 finisher should get, say, 30% less regression applied, rather than no regression at all.
  2. Marcel integration: Combine with the Marcel projection system (validated this session at +15.9% RMSE improvement). Project each team's finishing quality from their roster's individual multipliers, weighted by projected minutes. This is the player-aware team strength model the system eventually needs.
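A rough sketch of what the continuous adjustment could look like. The linear ramp, the function names, and the idea that "full regression" moves the observed scoring rate all the way to the xG rate are assumptions; the only anchors taken from the text are the 1.15 threshold and the example of 30% less regression at a 1.80 multiplier.

```python
def regression_scale(multiplier, threshold=1.15, anchor=1.80, anchor_reduction=0.30):
    """Fraction of the usual regression to apply, between 0 and 1.

    At or below `threshold` the full regression applies (1.0). Above it, the
    reduction grows linearly so that `anchor` gets `anchor_reduction` less.
    The linear shape is an illustrative assumption.
    """
    if multiplier <= threshold:
        return 1.0
    reduction = anchor_reduction * (multiplier - threshold) / (anchor - threshold)
    return max(0.0, 1.0 - reduction)


def adjusted_lambda(observed_rate, xg_rate, multiplier):
    """Regress the observed goal rate toward the xG rate, scaled by finishing skill."""
    scale = regression_scale(multiplier)
    return observed_rate + scale * (xg_rate - observed_rate)
```

Under these assumptions, a team scoring 2.0 per match on 1.5 xG would be regressed to 1.5 with an ordinary top scorer but only to about 1.65 with a 1.80 finisher, while a 1.16 finisher changes almost nothing. That graded response is exactly what the binary filter lacks.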

Neither is urgent. The variance filter works well without this refinement (+12.2% CLV), and the marginal improvement is likely <1pp even with perfect implementation. File under "known limitation, not blocking."

REJECTED | Signal: xg-finishing-persistence | 2026-04-07