Did We Beat the Closing Line? Now We Know.

March 19, 2026|Infrastructure|DEPLOYED

Every bet now gets a post-settlement execution verdict: did we get a better price than closing? The bridge between 'model found edge' and 'we captured that edge.' Three new fields, zero new data sources — just connecting dots that were already there.

Every bet we log now gets a post-settlement verdict: did we beat the closing line, or did the market move against us before kickoff? This is the execution quality signal we've been missing — the bridge between "the model found edge" and "we captured that edge in practice."

Why This Matters

We've known for weeks that CLV is positive across every league (+5% to +11%) while ROI is negative in most markets. That gap between finding edge and making money has three possible explanations:

Model error — the model is overconfident (calibration gap)
Execution leak — we're betting at worse prices than the model assumes
Structural variance — 3-way outcomes (1X2) and totals add too much noise

Until now, we couldn't distinguish between these. We had CLV (model quality) and ROI (outcome quality) but nothing measuring execution quality — whether the price we actually got was better or worse than where the line closed.

What We Built

Three additions to the paper trading pipeline:

1. Line Movement Analysis (per bet)

Every settled bet now gets a lineMovementAnalysis object:

beatClosing — did our entry odds beat the closing price? (higher odds = better for us)
slippagePct — how much did the line move against us? (closing - entry) / entry × 100
pinnacleAtEntry vs pinnacleAtClose — Pinnacle-specific movement (the sharpest signal)
pinnacleMovePct — percentage Pinnacle line shifted
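
As a rough illustration of the fields above, here is a minimal Python sketch of the per-bet computation. It is not the settler's actual code: the function name, the argument names, and the choice to treat a tie (entry equals closing) as not beating the close are assumptions, and pinnacleMovePct is assumed to use the same relative-change formula as slippagePct. Odds are decimal odds.

```python
from typing import Optional


def line_movement_analysis(
    entry_odds: float,
    closing_odds: float,
    pinnacle_at_entry: Optional[float] = None,
    pinnacle_at_close: Optional[float] = None,
) -> dict:
    """Build a lineMovementAnalysis-style dict for one settled bet (decimal odds)."""
    analysis = {
        # Higher entry odds than closing odds means we got the better price.
        # Assumption: a tie does not count as beating the close.
        "beatClosing": entry_odds > closing_odds,
        # (closing - entry) / entry x 100, as described in the post.
        # Positive: closing odds ended above our entry. Negative: we beat closing.
        "slippagePct": (closing_odds - entry_odds) / entry_odds * 100,
        "pinnacleAtEntry": pinnacle_at_entry,
        "pinnacleAtClose": pinnacle_at_close,
        "pinnacleMovePct": None,
    }
    if pinnacle_at_entry and pinnacle_at_close:
        # Assumption: same relative-change formula as slippagePct.
        analysis["pinnacleMovePct"] = (
            (pinnacle_at_close - pinnacle_at_entry) / pinnacle_at_entry * 100
        )
    return analysis


# Illustrative numbers only: we took 2.10 and the line closed at 2.00.
example = line_movement_analysis(2.10, 2.00, pinnacle_at_entry=2.08, pinnacle_at_close=1.98)
```

With these illustrative inputs, beatClosing is true and slippagePct is negative (roughly -4.8), which matches the sign convention used in the aggregate stats: negative slippage means we got a better price than the close.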

2. Execution Stats (aggregate)

The stats API now returns an executionStats block:

beatClosingRate — what fraction of our bets got a better price than closing? Should be ~50% if timing is random. Higher = good execution. Lower = we're betting too late or the market is moving against us.
avgSlippagePct — average line movement against us. Negative = we're consistently getting better prices than closing (good). Positive = execution leak.
beatClosingROI vs missedClosingROI — ROI split by whether we beat closing. If beatClosingROI is dramatically better, execution timing is the key variable.
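
The aggregate block could be derived from the per-bet values along these lines. This is a sketch under assumptions, not the stats API's implementation: the `stake` and `profit` keys, the function names, ROI defined as total profit over total stake, and beatClosingRate expressed as a fraction between 0 and 1 are all hypothetical choices. The sample bets are made-up numbers.

```python
from statistics import mean


def roi_pct(bets):
    """ROI in percent: total profit over total stake (None if nothing was staked)."""
    staked = sum(b["stake"] for b in bets)
    return sum(b["profit"] for b in bets) / staked * 100 if staked else None


def execution_stats(bets):
    """Aggregate per-bet values into an executionStats-style dict."""
    if not bets:
        return None
    beat = [b for b in bets if b["beatClosing"]]
    missed = [b for b in bets if not b["beatClosing"]]
    return {
        "beatClosingRate": len(beat) / len(bets),
        "avgSlippagePct": mean(b["slippagePct"] for b in bets),
        "beatClosingROI": roi_pct(beat),
        "missedClosingROI": roi_pct(missed),
    }


# Made-up settled bets, for illustration only.
bets = [
    {"beatClosing": True, "slippagePct": -4.0, "stake": 10.0, "profit": 11.0},
    {"beatClosing": True, "slippagePct": -2.0, "stake": 10.0, "profit": -10.0},
    {"beatClosing": False, "slippagePct": 3.0, "stake": 10.0, "profit": -10.0},
    {"beatClosing": False, "slippagePct": 5.0, "stake": 10.0, "profit": 9.0},
]
stats = execution_stats(bets)
```

Splitting ROI by the beatClosing flag keeps the comparison simple: both groups use the same ROI definition, so any large difference between them points at timing rather than at how ROI is measured.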

3. Automatic Population

No manual work needed. The settler already loads closing odds from Pinnacle (football-data.co.uk) and Odds API snapshots. The new fields are computed automatically during settlement, using data that was already there but not being compared.

What This Unlocks

Loss Classification. Every losing bet can now be categorized:

Model saw edge + we beat closing + still lost = variance (bad luck, keep betting)
Model saw edge + we missed closing + lost = execution leak (fix timing)
Model saw edge + closing moved away from us = edge decay (bet earlier)
Model didn't see edge at closing = model error (investigate signal)
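
A minimal sketch of how these rules might be applied to a losing bet that the model flagged at entry. Several things here are assumptions rather than anything the pipeline is known to do: the `edge_at_close` input (the model's edge re-evaluated at the closing price), the label strings, and checking model error first. The rules above do not say how an execution leak is told apart from edge decay, since both involve a close that ended up better than our entry, so the sketch returns one combined label for that case instead of guessing.

```python
def classify_loss(edge_at_close: float, beat_closing: bool) -> str:
    """Label a losing bet that the model flagged as having edge at entry.

    edge_at_close is a hypothetical input: the model's edge recomputed
    at the closing price (positive means the model still saw value).
    """
    if edge_at_close <= 0:
        # Model did not see edge at closing.
        return "model_error"
    if beat_closing:
        # Edge held up and we got a better price than the close.
        return "variance"
    # Edge held up but the close was better than our entry. The post
    # lists two labels for this situation without a rule to separate them.
    return "execution_leak_or_edge_decay"


labels = [
    classify_loss(edge_at_close=0.04, beat_closing=True),
    classify_loss(edge_at_close=0.04, beat_closing=False),
    classify_loss(edge_at_close=-0.01, beat_closing=True),
]
```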

Timing Optimization. If beatClosingRate varies by time of day or hours before kickoff, we can shift the cron schedule to capture better prices.
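
One way to check this is to bucket settled bets by lead time and compute beatClosingRate per bucket. The sketch below assumes a hypothetical `hoursBeforeKickoff` value on each bet and an arbitrary 6-hour bucket width; neither comes from the pipeline, and the sample bets are made up.

```python
from collections import defaultdict


def beat_closing_rate_by_lead_time(bets, bucket_hours=6):
    """Group bets into lead-time buckets and compute beatClosingRate per bucket."""
    counts = defaultdict(lambda: [0, 0])  # bucket start hour -> [beat, total]
    for bet in bets:
        # Floor the lead time to the start of its bucket, e.g. 20h -> 18 for 6h buckets.
        bucket = int(bet["hoursBeforeKickoff"] // bucket_hours) * bucket_hours
        counts[bucket][0] += int(bet["beatClosing"])
        counts[bucket][1] += 1
    return {start: beat / total for start, (beat, total) in sorted(counts.items())}


# Made-up bets, for illustration only.
sample = [
    {"hoursBeforeKickoff": 2.0, "beatClosing": False},
    {"hoursBeforeKickoff": 5.5, "beatClosing": True},
    {"hoursBeforeKickoff": 20.0, "beatClosing": True},
    {"hoursBeforeKickoff": 23.0, "beatClosing": True},
]
rates = beat_closing_rate_by_lead_time(sample)
```

Grouping on a different key, such as league, would give the per-league view in the same way. With only a handful of bets per bucket the rates are noisy, so bucket width is worth tuning against sample size.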

League-Level Execution. slippagePct by league tells us where execution cost is highest. Leagues with consistently negative slippage (we beat closing) deserve more capital. Leagues with positive slippage (closing beats us) need timing fixes or smaller stakes.

Capital Allocation. Combined with the existing market-type sizing (AH 1.0x, 1X2 0.25x, OU25 0.25x) and tier-based conviction, execution quality becomes the third dimension of position sizing: *what* to bet (market type) × *where* to bet (league tier) × *when* to bet (execution timing).

The Data Already Existed

This is a recurring pattern in our system. The settler was already loading closing odds, computing CLV, and computing execution CLV. It had the entry price (from the logger) and the closing price (from football-data.co.uk). It just wasn't comparing them directly.

The implementation touched 3 files and added ~50 lines of new code. No new data sources. No new API calls. Just connecting dots that were already there.

The most valuable features are often the ones that extract more signal from data you're already collecting.