The Blind Backtest

Methodology Receipt · 2017–2025

Could the model actually win an ODB Fantasy contest?

We rebuilt the WSOP fantasy projection for each of the seven contest years from 2017 to 2025 (2020 and 2021 are excluded) using only the data available before that year's Main Event kickoff. No ridge regression. No knowledge of who broke through. The model picked an 8-player lineup at the actual ODB auction prices, added the highest-projected bonus pick from that year's curated bonus pool, and was scored against the real ODB Fantasy field using that year's contemporaneous rules.

Here's exactly what happened.

3/7

Cash rate

3 of 7 years in the money

0

Wins

no crowns — best finish 16th of 478

+143%

7-year ROI

Net $5K on $3.5K in entries

68%

Avg finish

Beat ~68% of the field on average · rank 164

The headline: The model doesn't win crowns. But across seven years it cashed three times for $8.5K on $3.5K in entries (7 × $500), a real edge produced mostly by four top-quartile finishes rather than home runs. The biggest miss came in 2022, when our most expensive pick (Hellmuth) underperformed and we missed Koray Aldemir as a $2 sleeper.

The Method

What the model is allowed to see, and what it isn't.

Projector: Marcel, a per-player projection equal to the recency-weighted (5/4/3) average of fantasy points across Y-1, Y-2, Y-3, regressed toward 10 points with weight events/(events+30) on the player's own average. No ridge regression, no future data, no information from year Y or beyond.
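The projector description can be sketched in a few lines. This is a minimal illustration, not the production code: the function name, the definition of `events` (events played across the three seasons), and the choice to renormalize weights when a season has no data are all assumptions on our part.

```python
from typing import Optional, Sequence

WEIGHTS = (5, 4, 3)      # recency weights for Y-1, Y-2, Y-3
PRIOR_POINTS = 10.0      # value the projection is regressed toward
PRIOR_EVENTS = 30.0      # shrinkage constant in events / (events + 30)


def marcel_projection(points: Sequence[Optional[float]], events: int) -> float:
    """Project year-Y fantasy points from the three prior seasons.

    points: fantasy points for (Y-1, Y-2, Y-3); None marks a season with no data.
    events: events played across those seasons (assumed definition).
    """
    # Keep only seasons with data and renormalize the weights (assumption).
    pairs = [(w, p) for w, p in zip(WEIGHTS, points) if p is not None]
    if not pairs or events <= 0:
        return PRIOR_POINTS  # no history at all: fall back to the prior
    weighted_avg = sum(w * p for w, p in pairs) / sum(w for w, _ in pairs)
    # More events means more trust in the player's own average.
    reliability = events / (events + PRIOR_EVENTS)
    return reliability * weighted_avg + (1 - reliability) * PRIOR_POINTS


print(marcel_projection((120, 90, 60), events=30))  # 52.5
```

With 30 events the player's own average (95) and the prior (10) get equal weight, which is why the example lands at 52.5. A player with only a handful of events stays close to 10 no matter how good those events were.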

Pool + prices: Each year's actual ODB Fantasy player pool (primary + bonus, as published before that year's WSOP). Each year's actual ODB Fantasy auction (the prices contestants paid).

Bonus rule: ODB Fantasy has used 1 free bonus pick per roster every year since 2017. The blind model picks the top-projected name from the year's curated bonus pool.

Scoring + ranking: Each year's scoring rules (event multipliers + field bonuses from fantasy_event_entries). Final scores are ranked against each year's actual ODB Fantasy entries, the same teams the human owners submitted.

Excluded years: 2020, 2021 (WSOP cancelled / virtualized due to COVID.)

Why this is the honest backtest

Most projection models report flattering numbers because their projections were tuned with full knowledge of how players actually performed. Here, the 2025 lineup was picked using only 2022, 2023, and 2024 data. The 2024 lineup used 2021-2023. The 2017 lineup used 2014-2016. No back-fitting, no peeking at the answer key. The page you're reading was generated by hitting the blind-backtest endpoint, so you can verify it in the open.

Year by year

The seven lineups. The seven verdicts.

Each card shows the lineup the model picked using only pre-year data, what each player projected to score, what they actually scored, and how that team finished against the human-submitted field.

Two Projectors, One Test

What if we used ridge regression — also strictly blind?

Marcel is a simple 3-year recency-weighted average. Critics will fairly ask: what about your actual production ridge model? Does it win when you force it to be blind? We re-trained the ridge regression model seven times, once per test year, using only data from strictly prior years (the 2017 ridge model trained on 2011-2016; the 2025 ridge model trained on 2011-2024, minus 2020 and 2021). No future leakage in either. Result: the simpler projector wins.

Marcel

★ Winner

3-year weighted average · no price feature · no training

3/7

Cash rate

0

Wins

+143%

ROI

$5K

Net

68%

Avg %ile

Ridge Regression (blind)

Ridge regression (λ=100), re-trained for each test year using only data from strictly prior years. No future leakage.

1/7

Cash rate

0

Wins

+0%

ROI

$0

Net

36%

Avg %ile

Why the simpler model wins (and what it teaches us)

The ridge regression model uses draft_price as a feature — and the training signal “expensive players score more” holds in historical data. So when blind ridge meets a year where the most expensive picks bust (looking at you, 2025: Ausmus, Seiver, Schulman all underperformed), it walks straight into the trap. Marcel ignores price entirely and just trusts each player's last three years of fantasy points, which turns out to be more robust when the auction is overpricing chalk. This is the kind of finding you only see when you actually run the blind test, instead of grading your own homework with a model that's seen the answer key.

Ridge per-year detail

Year	Trained on	Rank	Score	Marcel rank	Result
2017	6 yrs · 579 rows	222/255	523	26/255	miss
2018	7 yrs · 641 rows	28/287	998	143/287	$4K
2019	8 yrs · 759 rows	283/478	782	16/478	miss
2022	9 yrs · 854 rows	346/433	862	308/433	miss
2023	10 yrs · 962 rows	311/594	922	83/594	miss
2024	11 yrs · 1111 rows	447/709	834	421/709	miss
2025	12 yrs · 1257 rows	858/873	438	148/873	miss
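The "Trained on" column follows from a simple walk-forward rule. Here is a minimal sketch of that rule; the helper name is ours, and the assumption that 2020 and 2021 are dropped from training is inferred from the year counts in the table rather than stated outright.

```python
FIRST_DATA_YEAR = 2011            # earliest training year named in the text
SKIPPED_YEARS = {2020, 2021}      # excluded seasons (assumed absent from training too)
TEST_YEARS = [2017, 2018, 2019, 2022, 2023, 2024, 2025]


def training_years(test_year: int) -> list[int]:
    """Seasons a blind model for `test_year` may train on: strictly earlier ones."""
    return [y for y in range(FIRST_DATA_YEAR, test_year) if y not in SKIPPED_YEARS]


for year in TEST_YEARS:
    seasons = training_years(year)
    assert max(seasons) < year  # nothing from the test year or later leaks in
    print(year, f"{len(seasons)} yrs", f"{seasons[0]}-{seasons[-1]}")
```

Under that assumption the counts run from 6 seasons for 2017 up to 12 for 2025, matching the table row by row.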

Multi-Entry Portfolio Test

Three lineups, not one. Did diversification help?

The single-entry result above is volatile by design: one team, one shot at the field. Real fantasy contestants often run multiple lineups to smooth variance. So we tested a three-entry portfolio: a chalk lineup maximizing raw projection, a value lineup maximizing projection per dollar, and a contrarian lineup that excludes the top 25% of names by projection and then maximizes over the remaining pool. All three use the same blind Marcel projector. Each year costs $500 × 3 = $1,500 in entries.
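To make the three objectives concrete, here is a toy sketch. Everything in it is hypothetical: the player pool, the prices, the 3-player roster, and the $60 budget are made up for illustration (the real contest uses 8-player lineups at actual auction prices), "projection per dollar" is read as total projection divided by total price, and the brute-force search stands in for whatever optimizer the real pipeline uses.

```python
from itertools import combinations

# Toy pool: (name, auction price in $, projected points). All values are made up.
POOL = [
    ("A", 40, 90.0), ("B", 30, 70.0), ("C", 25, 65.0), ("D", 15, 40.0),
    ("E", 10, 30.0), ("F", 5, 18.0), ("G", 2, 9.0), ("H", 1, 5.0),
]


def best_lineup(pool, size, budget, objective):
    """Exhaustively pick the `size`-player lineup within `budget` that maximizes `objective`."""
    affordable = (c for c in combinations(pool, size)
                  if sum(price for _, price, _ in c) <= budget)
    return max(affordable, key=objective)


def total_projection(lineup):
    return sum(proj for _, _, proj in lineup)


def projection_per_dollar(lineup):
    return total_projection(lineup) / sum(price for _, price, _ in lineup)


def without_top_quartile(pool):
    """Drop the top 25% of the pool by projection (the contrarian filter)."""
    ranked = sorted(pool, key=lambda p: p[2], reverse=True)
    return ranked[len(ranked) // 4:]


chalk = best_lineup(POOL, 3, 60, total_projection)
value = best_lineup(POOL, 3, 60, projection_per_dollar)
contrarian = best_lineup(without_top_quartile(POOL), 3, 60, total_projection)

for label, lineup in [("chalk", chalk), ("value", value), ("contrarian", contrarian)]:
    print(label, [name for name, _, _ in lineup], total_projection(lineup))
```

On this toy pool the value objective ends up with the three cheapest players, which shows how a pure per-dollar objective can drift toward stacks of low-priced names.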

$10,500

Total entry

3 entries × 7 yrs

$9,500

Total payout

3/7 yrs cashed something

-$1,000

Net

vs. K=1 chalk: see below

-10%

ROI

portfolio over 7 years

The honest answer

Diversification didn't help in our blind test. The chalk lineup alone went 3 cashes / +143% ROI at $500 entry × 7 years. Adding value + contrarian lineups added $1,000 in extra payout but cost $7,000 extra in entries — net negative. The value strategy in particular gets killed by stacking $1 sleepers who collectively underperform their modest projections; the contrarian strategy avoids chalk but doesn't reliably find the next breakout. A more sophisticated multi-entry strategy — e.g., Monte-Carlo correlation-minimizing portfolio construction with adjusted bonus picks — is the next iteration. For now, the single chalk lineup beats this naive diversification.
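The arithmetic behind those figures is short enough to check by hand. In the sketch below, the chalk payout of $8,500 is not quoted directly on this page; it is backed out from the $9,500 portfolio payout minus the $1,000 the extra lineups added, so treat it as a derived number.

```python
ENTRY_FEE = 500   # dollars per lineup per year
YEARS = 7


def roi(total_payout: float, total_entry: float) -> float:
    """Return on investment as a fraction: net winnings over money staked."""
    return (total_payout - total_entry) / total_entry


portfolio_entry = ENTRY_FEE * 3 * YEARS      # three lineups per year
portfolio_payout = 9_500
chalk_entry = ENTRY_FEE * 1 * YEARS          # one lineup per year
chalk_payout = portfolio_payout - 1_000      # derived: extra lineups added $1,000

print(f"portfolio: net {portfolio_payout - portfolio_entry:+,}  "
      f"ROI {roi(portfolio_payout, portfolio_entry):+.1%}")
print(f"chalk:     net {chalk_payout - chalk_entry:+,}  "
      f"ROI {roi(chalk_payout, chalk_entry):+.1%}")
```

The portfolio comes out at roughly -9.5% (shown as -10% in the card) and the chalk lineup at roughly +143%, so the two extra lineups cost $7,000 to bring back $1,000.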

The Receipt, Stacked

Best year to worst, by finishing rank.

2019

16th of 478 · 1,173 pts vs winner's 1,379

97%ile

$4K

2017

26th of 255 · 960 pts vs winner's 1,297

90%ile

$4K

2023

83rd of 594 · 1,157 pts vs winner's 1,813

86%ile

$1K

2018

143rd of 287 · 724 pts vs winner's 1,259

50%ile

miss

2025

148th of 873 · 1,156 pts vs winner's 1,615

83%ile

miss

2022

308th of 433 · 939 pts vs winner's 1,952

29%ile

miss

2024

421st of 709 · 850 pts vs winner's 1,679

41%ile

miss
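The percentile figures in this list can be reproduced from rank and field size alone. The formula below, the share of the field finishing behind the lineup, is our inference: it matches every published number, but the page does not spell it out.

```python
# (year, rank, field size) for the blind Marcel lineup, copied from the list above
FINISHES = [
    (2019, 16, 478), (2017, 26, 255), (2023, 83, 594), (2018, 143, 287),
    (2025, 148, 873), (2022, 308, 433), (2024, 421, 709),
]


def percentile(rank: int, field: int) -> float:
    """Share of the field finishing below `rank`, as a percentage."""
    return 100 * (field - rank) / field


per_year = {year: round(percentile(rank, field)) for year, rank, field in FINISHES}
avg_percentile = sum(percentile(r, f) for _, r, f in FINISHES) / len(FINISHES)
avg_rank = sum(r for _, r, _ in FINISHES) / len(FINISHES)

print(per_year)
print(round(avg_percentile), round(avg_rank))
```

The averages round to 68 and 164, the same figures shown in the headline card.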

What this isn't

Three caveats no honest researcher should skip.

1. The pool itself wasn't blind.

We used the actual ODB Fantasy player pool from each year, which the contest's organizers curated based on their own forward-looking judgment of who'd be relevant. Our model didn't pick the universe; it picked within the universe. If the curators had a systematic blind spot (say, missing a hot mid-stakes pro who later broke out), our model inherited that blind spot.

2. One entry per year is high variance.

Real ODB Fantasy owners often submit multiple lineups, and the big winners often have 3 to 10 different submissions. We're scoring a single optimal lineup against fields where the top finishers may have spent 10× the entry fee. A 3-of-7 cash rate from a single annual entry is still well above what a typical entry manages: in 2025, beating 83% of the field was not enough to cash. Better-constructed diversified entries might cash more often, though the naive three-entry portfolio tested above did not, and they would still crown rarely.

3. The scoring rules drift.

ODB introduced the bracelet bonus around 2025. Earlier years had a simpler structure. We apply each year's actual rules from fantasy_event_entries, but a meta-strategy that exploits the bracelet bonus (e.g., loading up on elite mixed-game specialists) only became correct in the last two years. Our Marcel projector doesn't know about the rule change; it just projects fantasy points using the rules in effect at scoring time.

The takeaway

The model has real signal: on average it finishes in the top third of a 255-to-873 person field. But it isn't a contest-winning machine on a single-entry basis. The realistic path to a crown is probably a diversified three-to-five entry portfolio rather than a single optimum, though the naive three-entry version tested here lost money, so the construction matters. That's the calibrated honest claim, and these receipts are how we get there.