Do Old Players Actually Decline? What Our Data Says
Scaled from 789 to 9,050 player-seasons (12 years, Wikidata birth dates). The aging signal is real — 11% decline from peak to 35 — but still doesn't improve predictions. Marcel's recency weighting already handles decline implicitly. Hardcoded Caley curves are definitively the worst option.
Every football analyst knows the story: players peak around 27, then decline. Michael Caley's Marcel projection system bakes this in — a set of age curves that adjust projected output based on how old a player is. We've been using hardcoded approximations of Caley's curves in our own Marcel engine. This week, we tried to do better by fitting empirical curves from our own data.
The short version: age curves made our projections worse, not better — even after scaling up from 789 to 9,050 player-season observations. The aging signal is real and measurable. It just doesn't help predict next season's output.
What We Did
We used Understat's league API to pull player-season aggregates for all Big 5 leagues from 2014 to 2025 — 32,459 player-seasons across 12 years. We matched birth dates from two sources: FotMob (2,614 current players) and a Wikidata bulk query of 231,138 professional footballers. Combined match rate: 79%.
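Here's a minimal sketch of how a two-source birth date match with a fallback could work. It assumes matching on accent-stripped, lowercased names, which is a simplification (name-only matching can collide on namesakes). The function names and the toy records are hypothetical, not our pipeline code.

```python
import unicodedata


def normalize_name(name: str) -> str:
    """Lowercase, strip accents and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.lower().split())


def match_birth_dates(players, primary, fallback):
    """Return ({player: birth_date}, match_rate), preferring `primary`."""
    primary_idx = {normalize_name(k): v for k, v in primary.items()}
    fallback_idx = {normalize_name(k): v for k, v in fallback.items()}
    matched = {}
    for player in players:
        key = normalize_name(player)
        birth_date = primary_idx.get(key) or fallback_idx.get(key)
        if birth_date:
            matched[player] = birth_date
    return matched, len(matched) / len(players)


# Toy records with made-up names and dates (not real data).
players = ["João Exemplo", "Max Mustermann", "Jean Dupont", "Unknown Player"]
primary_source = {"Joao Exemplo": "1999-03-14"}
fallback_source = {"Max Mustermann": "1991-07-02", "João Exemplo": "1999-03-15"}

matched, match_rate = match_birth_dates(players, primary_source, fallback_source)
print(matched, f"{match_rate:.0%}")
```

When both sources have a player, the primary source wins, so the order of the two lookups is a design choice worth making explicit.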
The method follows Caley's "Building Marcel Part II":
- For each player-season, compute the Marcel projection without any age adjustment (2-1-1-5 weighted seasons + regression to league mean)
- Calculate the delta: actual per-90 minus projected per-90
- Smooth that delta against age using LOWESS, weighted by minutes played
- The smoothed curve becomes the age adjustment
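The first two steps above can be sketched roughly like this. It is an illustration under assumptions, not our engine: it reads the season weights as 2, 1, 1 (most recent first), weights each season by minutes, and regresses toward the league mean by adding a block of league-average minutes. How the trailing 5 in "2-1-1-5" enters is not spelled out here, so the regression strength is left as a hypothetical `prior_minutes` parameter.

```python
def marcel_projection(seasons, league_mean, season_weights=(2, 1, 1), prior_minutes=900):
    """Project next-season per-90 output with no age adjustment.

    seasons: list of (per90, minutes) tuples, most recent season first.
    prior_minutes: hypothetical amount of league-average playing time
    blended in to regress the projection toward the league mean.
    """
    weighted_sum = prior_minutes * league_mean
    weight_total = prior_minutes
    for (per90, minutes), weight in zip(seasons, season_weights):
        weighted_sum += weight * minutes * per90
        weight_total += weight * minutes
    return weighted_sum / weight_total


def age_delta(actual_per90, projected_per90):
    """Residual that later gets smoothed against age."""
    return actual_per90 - projected_per90


# Toy player (made-up numbers): three prior seasons, most recent first.
history = [(0.40, 2000), (0.30, 1800), (0.35, 1500)]
projection = marcel_projection(history, league_mean=0.20)
delta = age_delta(actual_per90=0.33, projected_per90=projection)
print(round(projection, 3), round(delta, 3))
```

A player with few minutes ends up close to `league_mean`, while a regular starter is dominated by their own recent output.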
We did this for 7 statistics (npxG/90, xA/90, xGI/90, shots/90, key passes/90, xGChain/90, xGBuildup/90), split by position group (FW, MF, DF). Training data: 9,050 player-season observations across target seasons 2017-2024. Validation: 938 players in 2025.
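For the smoothing step, here is a minimal pure-Python sketch of a minutes-weighted local linear smoother in the LOWESS style. It would be run once per statistic and position group. The tricube kernel, the `frac` default, and the omission of LOWESS's robustness iterations are all assumptions made for this illustration.

```python
def weighted_lowess(ages, deltas, minutes, frac=0.5):
    """Local linear fit of delta on age, evaluated at each distinct age.

    Each observation's weight is its minutes times a tricube kernel
    of its distance (in years) from the age being evaluated.
    """
    n = len(ages)
    k = max(2, min(n, int(round(frac * n))))  # neighbours per local fit
    fitted = {}
    for x0 in sorted(set(ages)):
        distances = sorted(abs(a - x0) for a in ages)
        bandwidth = max(distances[k - 1], 1e-9)
        sw = swx = swy = swxx = swxy = 0.0
        for a, d, m in zip(ages, deltas, minutes):
            u = abs(a - x0) / bandwidth
            if u >= 1.0:
                continue
            w = m * (1 - u ** 3) ** 3
            dx = a - x0
            sw += w
            swx += w * dx
            swy += w * d
            swxx += w * dx * dx
            swxy += w * dx * d
        denom = sw * swxx - swx * swx
        if abs(denom) < 1e-12:
            fitted[x0] = swy / sw  # only one distinct age in the window
        else:
            # Intercept of the weighted line at dx = 0, i.e. the value at x0.
            fitted[x0] = (swxx * swy - swx * swxy) / denom
    return fitted


# Synthetic check: deltas that fall exactly linearly with age.
ages = list(range(18, 37))
minutes = [450 + 100 * (i % 5) for i in range(len(ages))]
deltas = [0.05 - 0.004 * (a - 25) for a in ages]
curve = weighted_lowess(ages, deltas, minutes)
print({a: round(v, 3) for a, v in curve.items()})
```

Because every local fit is a straight line, a perfectly linear input comes back unchanged, which makes for a handy sanity check before running on real residuals.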
The Curves Are Real This Time
With 9,050 observations (up from 789), the curves finally resolve into smooth, believable shapes. Sample sizes hold up reasonably well at older ages, though they thin out quickly: 224 at age 33, 157 at 34, 90 at 35, 47 at 36, and 34 at 37.
npxG/90 (shooting output): peaks at 23, declines smoothly to age 37.
| Age | Multiplier | N |
|---|---|---|
| 20 | 0.982 | 304 |
| 23 | **1.000** | 737 |
| 26 | 0.990 | 861 |
| 28 | 0.966 | 728 |
| 30 | 0.941 | 599 |
| 33 | 0.911 | 224 |
| 35 | 0.888 | 90 |
| 37 | 0.867 | 34 |
That's an 11% decline from peak to 35, and 13% by 37. The curve shape matches Caley's expectation: shooting peaks earliest and declines fastest.
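The arithmetic behind those percentages, using only the multipliers in the table above. The compound annual rate is an extra derived figure; it assumes a constant yearly decline between the peak and the target age.

```python
# npxG/90 multipliers copied from the table above.
multipliers = {20: 0.982, 23: 1.000, 26: 0.990, 28: 0.966,
               30: 0.941, 33: 0.911, 35: 0.888, 37: 0.867}

peak_age = max(multipliers, key=multipliers.get)


def decline_pct(age):
    """Total decline from the peak multiplier to `age`, in percent."""
    return (1 - multipliers[age] / multipliers[peak_age]) * 100


def annual_decline_pct(age):
    """Constant yearly rate that would produce the same total decline."""
    years = age - peak_age
    ratio = multipliers[age] / multipliers[peak_age]
    return (1 - ratio ** (1 / years)) * 100


for age in (35, 37):
    print(age, round(decline_pct(age), 1), round(annual_decline_pct(age), 2))
```

Spread over the 12 years from 23 to 35, the total works out to roughly 1% per year.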
xA/90 (chance creation): declines from age 17 onward. This is the one stat where survivorship bias still dominates the young end — only elite teenagers get 450+ minutes in Big 5 leagues, inflating their deltas. But the decline from 22 to 35 is 11%, consistent across position groups.
xGI/90 (combined involvement): peaks at 23 like npxG, with a 10% decline by 35. The shape is what you'd expect from blending a shooting peak and a creation decline.
Position-Specific Peaks
| Stat | FW peak | MF peak | DF peak |
|---|---|---|---|
| npxG/90 | 26 | 19 | 23 |
| xA/90 | 18 | 23 | 24 |
| xGI/90 | 26 | 23 | 24 |
| shots/90 | 23 | 19 | 22 |
Forwards peak latest for shooting output (npxG/90 at 26), which is plausible if strikers keep refining their movement and positioning into their late 20s. Midfielders peak early for shots (19), probably because the young ones who take lots of shots either transition to different roles or get screened out.
But the Curves Still Don't Help Predictions
Walk-forward validation (train 2017-2024, test on 2025, N=938):
| Age Mode | npxG/90 | xA/90 | xGI/90 | shots/90 | keyPasses/90 |
|---|---|---|---|---|---|
| **No age adj** | **+21.4%** | **+15.1%** | **+18.0%** | **+13.6%** | **+8.9%** |
| Hardcoded | +20.0% | +14.8% | +15.5% | +11.3% | +7.4% |
| Empirical | +21.0% | +15.1% | +17.2% | +13.3% | +8.6% |
(Numbers are % improvement vs naive last-year baseline. Higher is better.)
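For readers who want to reproduce this kind of score, here is a minimal sketch of a "% improvement vs naive last-year baseline" calculation. It assumes mean absolute error as the error metric, which is an assumption made for this example, and the three players' numbers are made up.

```python
def mean_abs_error(predicted, actual):
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def improvement_vs_naive(projected, last_year, actual):
    """Percent reduction in error relative to predicting last year's value."""
    model_error = mean_abs_error(projected, actual)
    naive_error = mean_abs_error(last_year, actual)
    return (1 - model_error / naive_error) * 100


# Toy per-90 values for three players (not real data).
last_year = [0.50, 0.20, 0.35]
projected = [0.44, 0.23, 0.32]
actual = [0.40, 0.25, 0.30]
score = improvement_vs_naive(projected, last_year, actual)
print(round(score, 1))
```

A score of 0 means the projection is no better than copying last season; a negative score means it is worse.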
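No-age wins 5 out of 5. Empirical curves are very close, within 0.05 to 0.88 percentage points, but can't quite beat doing nothing (the xA/90 gap is too small to show at the table's one-decimal precision). Meanwhile, the hardcoded Caley approximations lose on every stat, by 0.3 percentage points on xA/90 and by 1.4 to 2.5 on the other four. They're clearly the worst of the three options.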
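The gaps between modes can be read straight off the validation table. This snippet uses the rounded table values, so it reports differences in percentage points at one-decimal precision only.

```python
stats = ["npxG/90", "xA/90", "xGI/90", "shots/90", "keyPasses/90"]

# Rows copied from the validation table (% improvement vs naive baseline).
results = {
    "no_age":    [21.4, 15.1, 18.0, 13.6, 8.9],
    "hardcoded": [20.0, 14.8, 15.5, 11.3, 7.4],
    "empirical": [21.0, 15.1, 17.2, 13.3, 8.6],
}


def gap_to_no_age(mode):
    """Percentage-point shortfall of `mode` relative to no age adjustment."""
    return {
        stat: round(base - value, 1)
        for stat, base, value in zip(stats, results["no_age"], results[mode])
    }


hardcoded_gap = gap_to_no_age("hardcoded")
empirical_gap = gap_to_no_age("empirical")
print(hardcoded_gap)
print(empirical_gap)
```

On these rounded figures the hardcoded curves trail by 0.3 to 2.5 points, and the empirical curves by 0.0 to 0.8.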
Why Real Curves Don't Improve Predictions
This is the puzzle. The curves are clearly measuring something real — a 23-year-old forward genuinely produces more npxG/90 than a 35-year-old, all else equal. So why doesn't applying this knowledge help?
Three hypotheses:
1. Marcel already handles aging implicitly. The 2-1-1-5 weighting scheme emphasizes last season (weight 2) over older seasons (weight 1 each). A declining 34-year-old already gets projected closer to their recent (lower) output. Adding an explicit age multiplier on top may be double-counting the decline.
2. LOWESS smoothing introduces noise. Even with 9,050 observations, the smoothed curves aren't perfect. At the margins — age 17 (N=27) and age 37 (N=34) — the LOWESS estimates are uncertain enough to hurt predictions more than they help. The curve is trying to correct for a 5-10% effect while introducing 2-3% noise.
3. Survivorship bias distorts the young end. xA/90, keyPasses/90, xGChain/90, and xGBuildup/90 all "peak" at 17 — which isn't a real peak, it's selection. Only Musiala-tier teenagers get enough minutes. This pulls the entire curve up at the left end, making the age adjustment systematically wrong for 20-25 year olds.
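To put a rough number on the first hypothesis, here is a noise-free toy calculation. It assumes a player declining at a steady 1% per year past 23, season weights of 2, 1, 1, equal minutes each season, and no regression to the mean. All of those are simplifications chosen for illustration.

```python
def recency_weighted_rate(rates, weights=(2, 1, 1)):
    """Weighted average of per-90 rates, most recent season first."""
    used = weights[:len(rates)]
    return sum(w * r for w, r in zip(used, rates)) / sum(used)


def simulate_overshoot(target_age=34, peak_rate=0.30, annual_decline=0.01, peak_age=23):
    """How far a recency-weighted projection overshoots a steadily declining player."""
    def true_rate(age):
        return peak_rate * (1 - annual_decline) ** max(0, age - peak_age)

    history = [true_rate(target_age - lag) for lag in (1, 2, 3)]
    projected = recency_weighted_rate(history)
    actual = true_rate(target_age)
    overshoot_pct = (projected / actual - 1) * 100
    return projected, actual, overshoot_pct


projected, actual, overshoot_pct = simulate_overshoot()
print(round(projected, 4), round(actual, 4), round(overshoot_pct, 2))
```

Under these assumptions the unadjusted projection overshoots by a little under 2%, because the weighted history is on average 1.75 seasons old. That is the whole residual an explicit age curve has left to correct, which is small next to the estimation noise described in the second hypothesis.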
What We Confirmed
Hardcoded Caley approximations are definitively wrong for our use case. They hurt every stat, by 0.3 to 2.5 percentage points. If you're using Marcel with hardcoded age curves from the blog posts, you'd be better off removing them entirely.
The aging signal is real but small. A ~10% decline from 23 to 35 is approximately 1% per year. For a player producing 0.30 npxG/90 at 23, that's 0.27 at 35 — a difference smaller than the typical year-to-year variance for any individual player.
Data volume was not the blocker. Going from 789 to 9,050 observations made the curves smooth and believable, but didn't change the validation result. The problem isn't statistical power — it's that the adjustment introduces as much error as it corrects.
Key Next Steps
1. Age-conditional Marcel weighting. Instead of multiplying the projection output, adjust the *input weights*. For a 34-year-old, weight the most recent season at 3x instead of 2x, and drop the oldest season. This lets the weighting scheme handle aging naturally without an explicit multiplicative curve. The advantage: it doesn't fight the regression step.
2. Asymmetric age adjustment. Only apply age curves for players 30+, where the signal is strongest and survivorship bias is weakest. Leave 17-29 year olds unadjusted. This avoids the noisy young end where the curves are most distorted.
3. Lower-league player data. Our curves only see Big 5 players. The 34-year-olds who "disappear" from our data are playing in the Championship, Serie B, and Ligue 2 — and declining. Adding lower-league data would let us observe the full aging trajectory instead of just the survivors.
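4. Parametric curves instead of LOWESS. Fit a simple quadratic or log-linear decline model instead of nonparametric smoothing. Fewer parameters = less noise. A model like multiplier = 1 - 0.005 * max(0, age - 23)^1.5 has the right general shape and leaves very little room for overfitting. The 0.005 is only illustrative: as written it gives a multiplier of about 0.79 at age 35, steeper than the 0.888 we measured for npxG/90, so the coefficient would have to be fitted to the data.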
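As a sketch of the parametric idea in item 4, the coefficient can be fitted in closed form if the exponent and peak age are held fixed. The snippet below fits it to the npxG/90 table from earlier, weighting each age by its sample size. Fixing the exponent at 1.5 and the peak at 23 follows the example formula and is an assumption, not a tested choice.

```python
# (age, multiplier, N) rows copied from the npxG/90 table.
table = [(20, 0.982, 304), (23, 1.000, 737), (26, 0.990, 861), (28, 0.966, 728),
         (30, 0.941, 599), (33, 0.911, 224), (35, 0.888, 90), (37, 0.867, 34)]


def parametric_multiplier(age, coef, power=1.5, peak_age=23):
    return 1 - coef * max(0, age - peak_age) ** power


def fit_coef(rows, power=1.5, peak_age=23):
    """Weighted least squares for coef in (1 - mult) = coef * x, with x fixed."""
    numerator = denominator = 0.0
    for age, mult, n in rows:
        x = max(0, age - peak_age) ** power
        numerator += n * x * (1 - mult)
        denominator += n * x * x
    return numerator / denominator


coef = fit_coef(table)
print(round(coef, 4))
for age, mult, _ in table:
    print(age, mult, round(parametric_multiplier(age, coef), 3))
```

The fitted coefficient comes out well below 0.005. The one-parameter form also predicts exactly 1.0 for every age up to 23, so it cannot represent the rise from 20 to 23 that the table shows.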
Update: We Tried Everything
After the initial results, we ran two more experiments and collected lower-league data to capture the "invisible decline."
Experiment: Age-Conditional Marcel Weighting. Instead of a multiplicative curve, adjust the 2-1-1-5 input weights by age. Young players (≤22): weight recent season 3x. Veterans (34+): weight recent 3x, drop oldest. Result: won keyPasses/90 by +0.05% vs no-age, lost the other 4. Helps the 17-22 age group (wins 4/5 stats within that bucket) but too aggressive for 34+.
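A minimal sketch of the age-conditional weights described above. The tuple layout (most recent season first) and the omission of minutes weighting and regression to the mean are simplifications for illustration; the example rates are made up.

```python
def age_conditional_weights(age):
    """Season weights, most recent first, chosen by player age."""
    if age <= 22:
        return (3, 1, 1)   # young: lean harder on the latest season
    if age >= 34:
        return (3, 1)      # veteran: lean on the latest, drop the oldest
    return (2, 1, 1)       # default recency weights


def weighted_rate(rates, weights):
    """Weighted average of per-90 rates, most recent season first."""
    used = weights[:len(rates)]
    return sum(w * r for w, r in zip(used, rates)) / sum(used)


# Toy declining veteran: 0.20 last season, 0.26 and 0.30 before that.
veteran_rates = [0.20, 0.26, 0.30]
default_projection = weighted_rate(veteran_rates, age_conditional_weights(28))
veteran_projection = weighted_rate(veteran_rates, age_conditional_weights(35))
print(round(default_projection, 3), round(veteran_projection, 3))
```

For this toy player the veteran weights pull the projection from 0.24 down to 0.215. That shows how strongly a single weak season can dominate, which may be part of why the scheme was too aggressive for the 34+ group.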
Experiment: Asymmetric 30+ Only. Apply empirical curves only for players 30+, leave 17-29 untouched. Won xA/90 in the first run, but lost it when lower-league data was added. No consistent improvement.
Lower-League Data. Scraped 15,794 player-seasons from Sofascore across the Championship, Serie B, 2. Bundesliga, Ligue 2, and Segunda (2020-2025). Found 249 players who moved from a Big 5 league to a lower league at age 30+. Average change in xG/90: -38.2%, and 85% of them declined. But even this extra data didn't change the curve-fitting verdict: no-age still wins 4/5.
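The two migration summary figures can be computed along these lines. The sketch assumes the average is a mean of per-player percentage changes, which is one of several reasonable definitions, and the four before/after pairs are made-up toy values.

```python
def pct_change(before, after):
    return (after - before) / before * 100


def migration_summary(pairs):
    """pairs: (xG/90 in last Big 5 season, xG/90 in first lower-league season)."""
    changes = [pct_change(before, after) for before, after in pairs]
    mean_change = sum(changes) / len(changes)
    share_declined = sum(c < 0 for c in changes) / len(changes) * 100
    return mean_change, share_declined


# Toy players (not real data).
toy_pairs = [(0.40, 0.20), (0.30, 0.24), (0.25, 0.10), (0.20, 0.22)]
mean_change, share_declined = migration_summary(toy_pairs)
print(round(mean_change, 1), round(share_declined, 1))
```

Averaging per-player changes gives low-output players the same weight as high-output ones, so a minutes-weighted or pooled version could give a noticeably different figure.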
Final score across all 5 modes, 5 stats:
| Mode | Wins |
|---|---|
| No-age | 4/5 |
| Age-conditional weights | 1/5 |
| Hardcoded Caley | 0/5 |
| Empirical (full curve) | 0/5 |
| Asymmetric 30+ | 0/5 |
The Lesson
The aging signal in football is real, measurable, and too small to improve season-ahead predictions. Marcel's 2-1-1-5 recency weighting already handles decline implicitly — a 34-year-old's projection is dominated by last season, which already reflects their current level. Adding an explicit age curve is adjusting for gravity twice.
But the migration data tells a different story for a different question. When 85% of the players who drop a division after 30 see their xG/90 fall, and the average drop is 38%, that's not noise. It's a signal about squad aging risk. Teams built around 30+ players are one bad season from losing their best contributors to lower leagues entirely. That's useful for the pod-shop, even if it doesn't help Marcel. [More on that here](/blog/2026-04-14-where-aging-footballers-go-to-decline).
The data pipeline is in place — 32,459 player-seasons, 15,794 lower-league observations, 5,869 matched birth dates, curves extending to age 37. No age adjustment remains the default for Marcel projections.