Testing GK PSxG+/-: The Signal That Markets Already Priced In
Explored whether opponent goalkeeper quality (PSxG+/-) predicts Asian handicap (AH) outcomes. Expanded GK data from 4 to 22 leagues. Exploration found a +15.8pp ROI spread, but the formal 10-gate approval rejected the filter: marginal ROI was +0.4pp (p=0.36), and walk-forward ROI decayed from +1.4% in 2023 to -9.0% in 2025. Markets appear to have adapted.
Goalkeeper quality should matter for betting. A team facing Alisson or Courtois has a harder time scoring than one facing a League Two reserve keeper. Our model's variance regression assumes defensive over/underperformance will regress — but what if a good keeper is *sustaining* that overperformance?
We tested whether filtering out bets against top-PSxG goalkeepers improves AH outcomes.
The Question
When our model identifies value in a team's AH line, is the expected regression toward the mean dampened by the opposing goalkeeper's shot-stopping quality? Specifically: should we skip bets where the opponent's GK has a goals-prevented figure (PSxG+/-) above +2.0?
The hypothesis stems from a gap in our variance regression logic: it treats all defensive outperformance as luck-driven, when some of it is genuine GK skill.
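The rule under test can be written as a one-line predicate. The sketch below is illustrative only: the `Bet` fields and the choice to leave bets without GK coverage untouched are assumptions, not a description of the production filter stack.

```python
from dataclasses import dataclass
from typing import Optional

GK_THRESHOLD = 2.0  # goals prevented (PSxG+/-); the cutoff tested in this post


@dataclass
class Bet:
    # Hypothetical record layout, for illustration only.
    opp_gk_psxg_diff: Optional[float]  # opponent keeper's PSxG+/-; None if no coverage
    stake: float
    profit: float  # net P&L in units


def passes_gk_filter(bet: Bet, threshold: float = GK_THRESHOLD) -> bool:
    """Return True if the bet survives the GK filter."""
    if bet.opp_gk_psxg_diff is None:
        # Assumption: bets without GK data are neither helped nor removed.
        return True
    return bet.opp_gk_psxg_diff <= threshold
```

With this definition, a bet against a keeper at +2.5 is skipped, while one at exactly +2.0 is kept.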
What We Found
The exploration was exciting
We expanded GK data from 4 leagues to 22 (scraping Fotmob's goals prevented stat for 67 league-seasons). Coverage went from 17.4% to 59.4% of AH bets.
The directional effect is real and large:
| Opponent GK Quality | N | ROI | Hit Rate |
|---|---|---|---|
| Good GK (PSxG+/- > +2) | 566 | **-8.5%** | 51.6% |
| Average GK (PSxG+/- -2 to +2) | 507 | -0.0% | 57.0% |
| Bad GK (PSxG+/- < -2) | 576 | **+7.3%** | 61.1% |
That is a 15.8 percentage point spread: betting against bad keepers returns +7.3% ROI, while betting against good keepers returns -8.5%. The mechanism is intuitive: good keepers suppress the scoring regression our model expects.
The backed team's own GK quality matters too: teams with good keepers produce +3.9% ROI when backed, bad keepers -4.6%.
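The bucketing behind the table can be sketched in a few lines. This is a minimal illustration, assuming each bet is a `(opp_gk_psxg_diff, stake, profit)` tuple; the function names and the toy data are hypothetical, and only the ±2 cutoffs come from the table.

```python
from collections import defaultdict


def gk_bucket(psxg_diff, cutoff=2.0):
    """Map an opponent keeper's PSxG+/- to the three buckets used above."""
    if psxg_diff > cutoff:
        return "good"
    if psxg_diff < -cutoff:
        return "bad"
    return "average"


def roi_by_bucket(bets, cutoff=2.0):
    """bets: iterable of (opp_gk_psxg_diff, stake, profit). Returns ROI per bucket."""
    stake = defaultdict(float)
    profit = defaultdict(float)
    for diff, s, p in bets:
        bucket = gk_bucket(diff, cutoff)
        stake[bucket] += s
        profit[bucket] += p
    return {bucket: profit[bucket] / stake[bucket] for bucket in stake}


# Toy data, not real results.
toy_bets = [
    (3.1, 1.0, -1.0), (2.4, 1.0, 0.9),                       # good keepers
    (0.5, 1.0, 0.95),                                         # average keeper
    (-2.6, 1.0, 0.9), (-3.0, 1.0, -1.0), (-4.0, 1.0, 0.95),  # bad keepers
]
toy_roi = roi_by_bucket(toy_bets)
toy_spread = toy_roi["bad"] - toy_roi["good"]
```

The spread is simply the "bad" bucket ROI minus the "good" bucket ROI, which for the table above is 7.3 - (-8.5) = 15.8pp.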
The gate was sobering
When layered onto the existing filter stack, the signal barely moves the needle:
| Metric | With Filter | Without | Delta |
|---|---|---|---|
| Bets | 5,958 | 6,606 | -648 |
| ROI | -2.5% | -3.0% | **+0.4pp** |
| P&L | -151.6u | -195.9u | +44.3u |
Marginal ROI: +0.4pp. Bootstrap p-value: 0.362. Not significant, not practical.
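For readers who want to reproduce this kind of check, here is a minimal sketch of a marginal-ROI comparison with a one-sided bootstrap. It assumes unit stakes (consistent with the table, where ROI equals P&L divided by bets) and resamples bets with replacement; the gate's actual bootstrap procedure is not described in this post, so treat the resampling scheme and function names as assumptions.

```python
import random


def roi(profits):
    """ROI for unit-stake bets: total profit divided by number of bets."""
    return sum(profits) / len(profits)


def marginal_roi(profits, removed):
    """ROI of the filtered stack minus ROI of the unfiltered stack."""
    kept = [p for p, r in zip(profits, removed) if not r]
    return roi(kept) - roi(profits)


def bootstrap_p_value(profits, removed, n_boot=2000, seed=0):
    """Share of resamples in which the filter does not improve ROI."""
    rng = random.Random(seed)
    pairs = list(zip(profits, removed))
    n = len(pairs)
    not_better = 0
    for _ in range(n_boot):
        sample = [pairs[rng.randrange(n)] for _ in range(n)]
        ps = [p for p, _ in sample]
        rs = [r for _, r in sample]
        # A resample where every bet is removed has no filtered stack;
        # count it against the filter.
        if all(rs) or marginal_roi(ps, rs) <= 0:
            not_better += 1
    return not_better / n_boot


# Sanity check against the table (unit stakes, so ROI = P&L / bets).
table_delta = (-151.6 / 5958) - (-195.9 / 6606)  # roughly 0.004, about +0.4pp
```

A p-value of 0.36 under this kind of scheme would mean that in roughly a third of resamples the filter failed to improve ROI at all.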
6/10 gates passed. The failures included:
- Gate 5 (Bootstrap): p=0.36 — indistinguishable from noise
- Gate 9 (Practical significance): +0.4pp < 0.5pp threshold
- Gate 10 (Walk-forward): This is the killer:
| Season | N | ROI | Verdict |
|---|---|---|---|
| 2022 | 978 | +1.3% | OK |
| 2023 | 1,875 | +1.4% | OK |
| 2024 | 2,018 | -4.6% | FAIL |
| 2025 | 1,087 | -9.0% | FAIL |
The signal worked in 2022-23 and then *reversed* in 2024-25. Not noise degradation — active reversal.
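A per-season breakdown like the one above takes only a small helper. This sketch assumes unit stakes and a simple pass rule (a season is OK when its ROI is at or above `min_roi`, defaulting to zero); the real Gate 10 criterion is not spelled out here, so that rule is an assumption.

```python
def walk_forward(rows, min_roi=0.0):
    """rows: iterable of (season, profit) for unit-stake bets.

    Returns a list of (season, n_bets, roi, verdict), ordered by season.
    """
    totals = {}
    for season, profit in rows:
        n, pnl = totals.get(season, (0, 0.0))
        totals[season] = (n + 1, pnl + profit)

    report = []
    for season in sorted(totals):
        n, pnl = totals[season]
        season_roi = pnl / n
        verdict = "OK" if season_roi >= min_roi else "FAIL"  # assumed pass rule
        report.append((season, n, season_roi, verdict))
    return report


def gate_passes(report):
    """The gate passes only if every season passes."""
    return all(verdict == "OK" for _, _, _, verdict in report)


# Toy data, not real results.
toy_rows = [(2022, 0.9), (2022, -1.0), (2022, 0.9), (2023, -1.0), (2023, 0.5)]
toy_report = walk_forward(toy_rows)
```

Requiring every season to pass is deliberately strict: a signal that averages out positive but reverses in the most recent seasons, as this one did, still fails.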
The Nuance
Why the exploration looked better than the gate
The exploration bucketed bets by opponent GK quality and showed dramatic differences (+15.8pp spread). But the approval gate tests *marginal contribution to the existing stack* — and the existing stack already captures most of the value. The 648 bets removed by the GK filter are a mix of genuinely bad bets and false positives where the model had real edge despite facing a good keeper.
Market adaptation is the likely explanation
The walk-forward decay pattern (positive 2022-23, negative 2024-25) is consistent with markets pricing in GK quality over time. As PSxG data became more widely available, closing lines started reflecting goalkeeper performance. Our model uses closing lines as inputs — so the edge was already being captured, and filtering on it adds no marginal value in recent data.
This is actually a *positive* finding for our model architecture: the Pinnacle closing line already incorporates GK quality, which flows into our devigged probabilities.
Per-league heterogeneity
The signal showed strong heterogeneity across leagues:
- Strong positive: Ligue 1 (+28.5%), League One (+22.4%), Championship (+15.6%)
- Negative: EPL (-9.0%), Turkish Super (-20.1%)
This pattern fits the market efficiency story: the EPL and Turkish Super Lig have the deepest betting markets, where GK quality is most likely already priced in. Thinner markets (English League One and League Two, Ligue 1) may still underprice GK quality, but the sample sizes are too small to deploy league-specific filters.
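A rough way to see why small samples block league-specific deployment is to compare a league's ROI with its standard error. The sketch below assumes unit stakes and a per-bet profit standard deviation of about 1.0 (a reasonable approximation for near-even-odds bets, but an assumption); the bet count in the example is hypothetical, since per-league sample sizes are not reported here.

```python
import math


def roi_standard_error(n_bets, per_bet_sd=1.0):
    """Approximate standard error of ROI over n unit-stake bets."""
    return per_bet_sd / math.sqrt(n_bets)


def roi_z_score(roi, n_bets, per_bet_sd=1.0):
    """How many standard errors the observed ROI sits from zero."""
    return roi / roi_standard_error(n_bets, per_bet_sd)


# Hypothetical: a +15.6% league ROI observed over 100 bets.
example_z = roi_z_score(0.156, 100)
```

Under these assumptions, a +15.6% ROI on 100 bets sits only about 1.6 standard errors from zero, which is not enough to separate a real league effect from noise.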
What Didn't Work
- PSxG > +2.0 threshold — too broad, catches many average-good keepers
- All-market application — applying to both 1X2 and AH; the mechanism is stronger for sides than totals
- Static threshold across leagues — GK quality matters more in thin markets
What This Means
Not deployed. The filter is rejected. Our model's closing-line architecture already captures most of the GK quality information through the odds themselves.
What's Next
Worth revisiting with:
- Higher threshold (PSxG+/- > +4): only filter against truly elite keepers
- Combined own + opponent GK: the "backed team bad GK" effect (-4.6% ROI) might have a stronger marginal impact
- Variance regression interaction: only apply when the variance filter is also active
- League-specific deployment: thin-market leagues only (League One, Championship, Ligue 1)
The walk-forward decay is the strongest evidence that markets adapted. GK PSxG data went from niche analytics to mainstream between 2022 and 2025, and closing lines now reflect it. The signal was real — it just got priced in.
Second-look candidates: higher threshold (PSxG > +4 for truly elite keepers only), combined own+opponent GK filter, or league-specific thin-market deployment where closing lines are less efficient (League One, Championship, Ligue 1 all showed positive effects).
The GK data expansion (4 to 22 leagues, 67 files) is valuable infrastructure regardless of this signal's outcome.