Using a dataset of 1,016,310 de-vigged moneyline odds from 40+ sportsbooks across 21,688 games in four major US sports (2023-2025), we examine the relationship between the size of cross-book pricing deviations and the realized win rate of the deviant selection. The relationship is monotonically negative: small deviations (0.5-1.0%) are associated with win rates of 59-61%, while deviations above 3.0% are associated with win rates of 26-29%. We attribute this to longshot amplification: the relative deviation formula inflates small absolute errors when the base probability is low, producing large apparent edges that reflect noise rather than genuine mispricing.
When one sportsbook prices a team differently than the rest of the market, a natural interpretation is that the outlier book is "wrong" and the deviation represents a betting opportunity. The larger the deviation, the larger the opportunity — or so the reasoning goes.
We tested this premise on two years of historical data. It does not hold. In fact, the relationship between edge size and win rate inverts, and it does so consistently across sports. The biggest apparent edges are the worst bets in the dataset.
This result is not as paradoxical as it first appears. It follows directly from the arithmetic of how edges are computed. But the fact that it follows from arithmetic does not make it obvious to practitioners, many of whom use cross-book deviation as a primary signal for value.
We use 1,016,310 closing moneyline odds from 40+ sportsbooks on 21,688 games across the 2023-24 and 2024-25 NBA, NHL, MLB, and NCAAB seasons. All probabilities are de-vigged using the power method (Clarke, Kovalchik, Ingram 2017), which accounts for the favorite-longshot bias. For each game, we estimate a fair value probability using a reference anchor book and compute the percentage deviation of every other book from that anchor:
$$\text{edge}\% = \frac{p_{\text{book}} - p_{\text{fair}}}{p_{\text{fair}}} \times 100$$
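As a rough illustration of the two steps just described, the sketch below de-vigs a pair of raw implied probabilities with the power method and then applies the edge formula. The function names `devig_power` and `edge_pct`, the bisection root finder, and the example prices are assumptions made for illustration; the source does not describe its implementation or say which book serves as the anchor.

```python
from typing import List, Sequence


def devig_power(raw_probs: Sequence[float], tol: float = 1e-12) -> List[float]:
    """Power-method de-vig: find k such that sum(q_i ** k) == 1.

    raw_probs are the vig-inclusive implied probabilities of every outcome
    in one market (they normally sum to more than 1).
    """
    if len(raw_probs) < 2 or any(not 0.0 < q < 1.0 for q in raw_probs):
        raise ValueError("need at least two raw probabilities strictly between 0 and 1")
    # sum(q ** k) falls monotonically from n (at k = 0) toward 0 as k grows,
    # so a simple bisection on k is enough (an assumed choice of solver).
    lo, hi = 0.0, 100.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if sum(q ** mid for q in raw_probs) > 1.0:
            lo = mid  # total probability still above 1: raise the exponent
        else:
            hi = mid
    k = (lo + hi) / 2.0
    return [q ** k for q in raw_probs]


def edge_pct(p_book: float, p_fair: float) -> float:
    """Relative deviation of a book's de-vigged probability from the anchor."""
    return (p_book - p_fair) / p_fair * 100.0


if __name__ == "__main__":
    # Hypothetical two-way market: anchor book vs. one other book.
    anchor = devig_power([0.70, 0.35])
    other = devig_power([0.69, 0.36])
    print(edge_pct(other[1], anchor[1]))  # deviation on the underdog side
```

Because the exponent is shared across outcomes, the power method removes proportionally more margin from the longshot than a simple normalization would, which is why it is usually chosen when favorite-longshot bias is a concern.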
We then group these deviations into tiers and compute the realized win rate of the backed selection in each tier.
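The grouping step can be sketched as follows. The tier boundaries come from the tables in this article; everything else is an assumption for illustration: the names `tier_for` and `win_rate_by_tier`, the convention that a tier includes its lower bound and excludes its upper bound, and the input shape of `(edge_pct, won)` pairs.

```python
from bisect import bisect_right
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

# Lower bounds of each tier, in percent, as used in the article's tables.
TIER_EDGES = [0.5, 0.7, 1.0, 1.5, 2.0, 3.0]
TIER_LABELS = ["0.5-0.7%", "0.7-1.0%", "1.0-1.5%", "1.5-2.0%", "2.0-3.0%", "3.0%+"]


def tier_for(edge: float) -> Optional[str]:
    """Return the tier label for an edge, or None if it is below 0.5%.

    Assumption: lower bounds are inclusive, upper bounds exclusive.
    """
    i = bisect_right(TIER_EDGES, edge)
    return TIER_LABELS[i - 1] if i > 0 else None


def win_rate_by_tier(rows: Iterable[Tuple[float, bool]]) -> Dict[str, float]:
    """Aggregate (edge_pct, won) observations into a win rate per tier."""
    counts = defaultdict(lambda: [0, 0])  # label -> [wins, observations]
    for edge, won in rows:
        label = tier_for(edge)
        if label is None:
            continue  # edges below the smallest tier are ignored
        counts[label][0] += int(won)
        counts[label][1] += 1
    return {label: wins / n for label, (wins, n) in counts.items()}


if __name__ == "__main__":
    # Made-up observations, only to show the call shape.
    sample = [(0.6, True), (0.6, False), (4.0, False)]
    print(win_rate_by_tier(sample))
```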
| Edge tier | Observations | Wins | Win rate | Signal |
|---|---|---|---|---|
| 0.5 – 0.7% | 2,482 | 1,513 | 61.0% | Positive |
| 0.7 – 1.0% | 3,299 | 1,971 | 59.7% | Positive |
| 1.0 – 1.5% | 4,229 | 2,279 | 53.9% | Marginal |
| 1.5 – 2.0% | 2,946 | 1,486 | 50.4% | None |
| 2.0 – 3.0% | 3,032 | 1,343 | 44.3% | Negative |
| 3.0%+ | 4,375 | 1,253 | 28.6% | Strongly negative |

Note: de-vigged using the power method; fair value from the reference anchor book; h2h (moneyline) markets only.
The pattern is monotonic: the win rate falls at every step up in edge size. At 0.5-0.7%, the backed selection wins 61.0% of the time. At 3%+, it wins 28.6% of the time. That is not a marginal deterioration; a 28.6% win rate on what is supposed to be a positive-edge bet is worse than simply picking the underdog at random.
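The win rates in the first table can be recomputed directly from its observation and win counts. The short check below does that and confirms that the rate strictly decreases from tier to tier; the variable names are illustrative, and the numbers are copied from the table.

```python
# (tier, observations, wins) copied from the first table above.
TABLE = [
    ("0.5-0.7%", 2482, 1513),
    ("0.7-1.0%", 3299, 1971),
    ("1.0-1.5%", 4229, 2279),
    ("1.5-2.0%", 2946, 1486),
    ("2.0-3.0%", 3032, 1343),
    ("3.0%+", 4375, 1253),
]

# Win rate in percent, rounded to one decimal as in the table.
rates = [round(100 * wins / obs, 1) for _, obs, wins in TABLE]

# Strictly decreasing from the smallest tier to the largest.
is_monotone = all(a > b for a, b in zip(rates, rates[1:]))

for (tier, _, _), rate in zip(TABLE, rates):
    print(f"{tier:>9}  {rate:.1f}%")
print("monotonically decreasing:", is_monotone)
```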
The same structure holds in NCAAB:
| Edge tier | Observations | Win rate |
|---|---|---|
| 0.5 – 0.7% | 4,058 | 61.0% |
| 0.7 – 1.0% | 5,240 | 60.2% |
| 1.0 – 1.5% | 7,247 | 58.3% |
| 1.5 – 2.0% | 5,141 | 53.7% |
| 2.0 – 3.0% | 6,423 | 49.7% |
| 3.0%+ | 12,203 | 26.2% |

Note: the 3%+ tier is the largest NCAAB tier (12,203 observations), not a small-sample artifact.
Note the sample size in the 3%+ tier for NCAAB: 12,203 observations. This is not a statistical fluke. It is the single largest tier in the college basketball dataset, because NCAAB is full of lopsided matchups that produce heavy favorites and corresponding longshots. Those longshots are the source of the large-percentage edges, and they lose at exactly the rate their probability suggests they should.
The explanation is a consequence of the edge formula. Edge percentage is a relative deviation from a fair probability estimate. Relative deviations are amplified when the denominator is small.
Suppose a team's true win probability is 25% (roughly +300 American odds). If a sportsbook prices them at 24%, an absolute error of one percentage point and well within the precision of any pricing model, the edge formula returns a deviation of |24 − 25| / 25 = 4.0% in magnitude. A four-percent "edge."
Now consider a team at 67% probability (roughly -200). The same one-point absolute error, a book pricing them at 66%, yields |66 − 67| / 67 ≈ 1.5%. A smaller edge, but on a probability that both the model and the market can estimate with far greater precision. The 1.5% edge on the favorite is informative. The 4.0% edge on the longshot is noise wearing a suit.
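The amplification is easy to see by holding the absolute error fixed at one percentage point and varying only the base probability. The helper `relative_edge_pct` below is a hypothetical name for that calculation; the 25% and 67% rows correspond to the two examples above, and the other base probabilities are added only to show the trend.

```python
def relative_edge_pct(p_fair_pct: float, abs_error_pts: float = 1.0) -> float:
    """Size of the relative deviation (in %) produced by a fixed absolute
    pricing error, for a given fair probability expressed in percent."""
    return abs_error_pts / p_fair_pct * 100.0


if __name__ == "__main__":
    # Same one-point error, different base probabilities.
    for p in (10, 25, 50, 67, 90):
        print(f"fair {p:>2}%  ->  apparent edge {relative_edge_pct(p):.1f}%")
```

The apparent edge scales as 1 / p_fair, so the same pricing noise looks several times larger on a longshot than on a heavy favorite.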
The 3%+ tier in our data is populated overwhelmingly by these longshot bets: underdogs at +250 and longer, where the probability denominator is small enough to inflate trivial absolute errors into attention-grabbing edge percentages.
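For readers who think in American odds rather than probabilities, the standard conversion is sketched below. The function name `american_to_implied` is illustrative, and the result is the raw implied probability, before any de-vigging.

```python
def american_to_implied(odds: int) -> float:
    """Convert American odds to the raw (vig-inclusive) implied probability."""
    if odds > 0:
        # Underdog: risk 100 to win `odds`.
        return 100.0 / (odds + 100.0)
    if odds < 0:
        # Favorite: risk `-odds` to win 100.
        return -odds / (-odds + 100.0)
    raise ValueError("American odds cannot be zero")


if __name__ == "__main__":
    for odds in (250, 300, -200):
        print(f"{odds:+d}  ->  {american_to_implied(odds):.4f}")
```

By this conversion, +250 corresponds to an implied probability of 100 / 350, or about 28.6%, so every selection at +250 or longer sits at or below that base probability.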
NHL is worth mentioning separately. NHL win rates range from 52.6% at the 0.5-0.7% tier to 40.7% at the 3%+ tier: the same inversion pattern, but starting from a much lower baseline. Even the best tier barely clears a coin flip. This is consistent with the well-documented observation that hockey moneylines are among the lowest-signal markets in US sports.
MLB shows no profitable tier on closing moneylines. Win rates range from about 51% at the tightest edges to 45% at the widest, with most tiers hovering near 50%. Baseball moneylines at close are efficiently priced, at least in the aggregate.
For anyone using cross-book deviation as a betting signal, the data suggests a ceiling. Edges above approximately 3% are overwhelmingly longshot noise rather than genuine mispricings. Between 0.5% and 1.0%, the signal is reliably positive for NBA and NCAAB. Between 1.0% and 2.0%, it is weaker and sport-dependent. Above 2.0%, it deteriorates. Above 3.0%, it inverts.
There is a temptation to interpret large edges as large opportunities. The data indicates the opposite. In this market, as in most quantitative contexts, the quiet edges — the small, persistent, precisely estimated ones — are the ones that compound. The loud ones are usually wrong.