The Sabermetric Revolution: Assessing the Growth of Analytics in Baseball

Hitting

Like baserunning, the evaluation of hitters is complicated by the fact that teammates usually act in concert in order to push runners across the plate. For example, if the leadoff hitter walks, advances to third on a single by the second hitter, and then comes in to score on a sacrifice fly by the third hitter, then clearly all three players have contributed something to the scoring of that run. Conventional scorekeeping credits the leadoff hitter with a run scored (R), and the third hitter with a run batted in (RBI), but the second hitter is not directly credited for his contribution to that run. For this reason, sabermetricians have long lamented the focus on runs scored and RBIs in award voting, and sought a more accurate, balanced, and systematic way to credit each hitter for his contributions to run scoring.

One approach to doing this is to again sum the values of the changes in the expected run matrix. But this can be a complicated calculation that requires reams of detailed play-by-play data. A simpler, more accessible approach is to engineer a statistic that corresponds closely to run scoring for teams, and then apply that to individual players. For many years, it was assumed that batting average was the best way to do this. But while batting average does a decent job of explaining the variation in the number of runs that a team scores, it leaves much to be desired. In Figure 2, each dot represents one team in one season, and all team-seasons from 1954 to 2011 are represented (as in Figure 1, each dot is slightly transparent, so that the appearance of a darker cluster indicates several overlaid dots). It is clear that as a team’s batting average increases, the number of runs that they score also increases. Statisticians use a measurement called the correlation coefficient to describe the strength of that linear relationship. The correlation coefficient of 0.82 shown below suggests that about 67 percent (0.82 × 0.82) of the variation in runs scored is explained by batting average.[10] Thus, knowing a team’s aggregate batting average (and nothing else) gives you a decent understanding of how many runs that team will score.
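To make the arithmetic behind the correlation coefficient concrete, here is a minimal sketch in Python. The team figures are invented for illustration and are not drawn from the 1954 to 2011 data; the function name pearson_r is likewise our own. Only the value 0.82 comes from the text.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical team-seasons (batting average, runs scored); NOT real data.
batting_avg = [0.245, 0.252, 0.260, 0.268, 0.275, 0.281]
runs_scored = [640, 690, 675, 740, 760, 800]

r = pearson_r(batting_avg, runs_scored)
share_explained = r * r  # squaring r gives the share of variation explained
print(f"r = {r:.2f}, share of variation explained = {share_explained:.0%}")

# The figure quoted in the text: r = 0.82, so 0.82 * 0.82 = 0.6724,
# or about 67 percent.
chapter_share = 0.82 * 0.82
```

Squaring the coefficient is what turns a statement about the strength of a linear relationship into a statement about the share of variation explained.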

But there is a long way to go from 67 percent to 100 percent, and sabermetricians have sought to close that gap in inventive ways. One of the reasons that the A’s focused so heavily on on-base percentage (OBP) in the Moneyball era was that it does a better job of predicting runs scored than batting average. In Figure 3, the relationship between runs scored and OBP is more closely linear than it is for batting average, and the correlation coefficient is higher (0.88). Further improvements were made by OPS (the simple sum of OBP and slugging percentage [SLG]; its correlation with runs scored is about 0.95), which Lewis erroneously claims is “a much better indicator than any other offensive statistic of the number of runs a team would score.”[11] In fact, it was known at that time that OPS was bested by a bevy of run estimators, most notably by Runs Created, a nonlinear invention of James that in its simplest form is the product of OBP and SLG.
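For readers who want to see how these quantities are assembled from a box-score line, the following Python sketch computes OBP, SLG, their sum (OPS), and their product. The OBP and SLG formulas are the standard definitions; the team line is hypothetical, and the function and variable names are our own.

```python
def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """OBP: times on base divided by the plate appearances that count toward it."""
    return (hits + walks + hit_by_pitch) / (at_bats + walks + hit_by_pitch + sac_flies)

def slugging_percentage(singles, doubles, triples, home_runs, at_bats):
    """SLG: total bases per at-bat."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats

# Hypothetical team-season line, invented for illustration (NOT real data).
singles, doubles, triples, home_runs = 950, 300, 30, 170
hits = singles + doubles + triples + home_runs
at_bats, walks, hit_by_pitch, sac_flies = 5500, 550, 50, 45

obp = on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies)
slg = slugging_percentage(singles, doubles, triples, home_runs, at_bats)

ops = obp + slg            # the simple sum described in the text
obp_times_slg = obp * slg  # the product underlying the simplest Runs Created

print(f"OBP = {obp:.3f}, SLG = {slg:.3f}, OPS = {ops:.3f}, OBP x SLG = {obp_times_slg:.3f}")
```

The sum can be done in one's head; the product cannot, which is part of why OPS caught on despite being the slightly weaker estimator.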

Figure 2. Team Runs Scored Versus Batting Average, 1954–2011

The relationship between a team’s batting average over the course of a season and the number of runs that they score. The correlation is strong (0.82), but can we do better?

But suppose that rather than invent a formula out of thin air, you wanted to derive a formula based on simple assumptions about what the formula should look like. A common set of assumptions is that each offensive event should be associated with an average run value, and those run values should be summed based on their frequency. Clearly, a home run is worth, on average, more than one run, but how much more? Sabermetricians have devised two completely independent techniques for deriving these average run values, and arrived at similar answers. The first technique is to simply average the number of runs scored on each play. The second is to apply a well-known statistical technique called multiple regression using team statistics. It can be proven mathematically that the latter technique will provide the best fit to the data under a variety of assumptions. Thus, it represents essentially the best estimator that can be constructed with the idea that each batting event affects run scoring in a linear fashion. Any run estimator that obeys these assumptions belongs to the class of linear weights formulas. While there are many linear weights formulas, we will use one known as eXtrapolated Runs (XR).[12] In XR, a home run is worth 1.44 runs, and its correlation with runs scored is about 0.95.[13]
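The regression technique can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure used to build XR: the team-seasons are synthetic and noise-free, only three events are modeled, there is no intercept term, and the single and walk values are placeholders. Only the home run value of 1.44 is taken from the text. The point is simply that least squares recovers the run values that generated the data.

```python
import random

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x

def fit_linear_weights(event_counts, runs):
    """Least-squares run values from the normal equations (X^T X) w = X^T y."""
    k = len(event_counts[0])
    xtx = [[sum(row[i] * row[j] for row in event_counts) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(event_counts, runs))
           for i in range(k)]
    return solve_linear_system(xtx, xty)

# Assumed run values per (single, walk, home run). Only 1.44 is from the text;
# the other two are placeholders for illustration.
ASSUMED_WEIGHTS = [0.50, 0.34, 1.44]

# Synthetic team-seasons: random event counts, runs generated without noise.
rng = random.Random(0)
teams = [[rng.randint(850, 1100), rng.randint(400, 650), rng.randint(90, 240)]
         for _ in range(30)]
runs = [sum(w * c for w, c in zip(ASSUMED_WEIGHTS, team)) for team in teams]

weights = fit_linear_weights(teams, runs)
print([round(w, 2) for w in weights])
```

With real team data the fit would not be exact, and the estimated values would carry some uncertainty; the synthetic setup here removes that complication on purpose.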

Figure 3. Team Runs Scored Versus On-Base Percentage, 1954–2011

The relationship between a team’s on-base percentage (OBP) over the course of a season and the number of runs that they score. The correlation is stronger (0.88) than it is for batting average.

Figure 4. Team Runs Scored Versus OPS, 1954–2011

The relationship between a team’s cumulative OPS over the course of a season and the number of runs that they score. The correlation is very strong (0.95).

The value of OPS is that it provides a simple way to translate a player’s hitting statistics into an estimator that we know corresponds closely to runs scored. Thus, knowing the OPS of each player gives us a better understanding of how many runs he is contributing to his team than if we knew only his batting average. Two players with identical batting averages could have significantly different OPSs, and we would be remiss in thinking that they were making equal contributions to their team’s offense.

The popularity of OPS has much to do with its combination of simplicity and accuracy. Since it is the simple sum of two existing and commonly reported statistics (OBP and SLG), it can be computed in one’s head quickly, and as we have shown above, provides a very good estimate of runs scored. The superior estimators outlined above are only slightly more accurate, but much more complicated to compute (try multiplying OBP and SLG in your head!). Nevertheless, since OBP and SLG are on different scales (the former is truly a percentage, the latter is not), the question of why the simple sum should be so accurate quickly arose. DePodesta supposedly found that by valuing a point of OBP at three times the value of a point of SLG, the fit to runs scored could be made even tighter.[14] Subsequent research has suggested that the true value is closer to 1.8, as opposed to three.[15] Nevertheless, DePodesta’s calculation presented Beane with evidence that OBP was even more important relative to power than sabermetricians thought. This discrepancy made DePodesta’s argument “heresy,” which Beane interpreted as “good,” and the “best argument he had heard in a long time.”[16] It is implied that DePodesta’s calculation was the primary motivation behind the A’s emphasis on OBP.
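The effect of reweighting OBP can be seen with a small Python sketch. The two hitters are hypothetical, chosen so that their plain OPS is identical, and the function name weighted_ops is our own; the weights of 1.8 and 3 are the ones discussed above.

```python
def weighted_ops(obp, slg, obp_weight=1.0):
    """OBP scaled by a weight, plus SLG. A weight of 1.0 gives ordinary OPS."""
    return obp_weight * obp + slg

# Two hypothetical hitters as (OBP, SLG) pairs with the same plain OPS of .800.
patient_hitter = (0.400, 0.400)
free_swinger = (0.320, 0.480)

for weight in (1.0, 1.8, 3.0):
    a = weighted_ops(*patient_hitter, obp_weight=weight)
    b = weighted_ops(*free_swinger, obp_weight=weight)
    print(f"OBP weight {weight}: patient = {a:.3f}, free swinger = {b:.3f}")
```

Under plain OPS the two hitters are indistinguishable. Any weight above one favors the hitter who reaches base more often, and the gap widens as the weight grows, which is the practical consequence of DePodesta's argument.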

Predictive Analytics

The preceding argument should make it clear that some of the more recent sabermetric hitting statistics (e.g., OPS, Runs Created, linear weights) are quantifiably better at estimating the number of runs that a team will score than older statistics such as batting average. Thus, it stands to reason that when applied to an individual hitter, they do a better job of quantifying his contribution to his team’s offense. This captures the notion of the accuracy of the statistic. A separate but perhaps equally important question is whether those statistics do a better job of predicting that hitter’s future performance.

To summarize, we now know that if one wants to estimate a player’s offensive contribution, one is better off knowing that hitter’s OPS as opposed to his batting average. The next question is, if one wants to predict a hitter’s OPS next season, what statistic is most helpful? The answer is not obvious, and, as we will illustrate below, it is not as simple as knowing his OPS this season.

Forecasting the future performance of baseball players is a task that falls under the umbrella of “predictive analytics,” a term that is increasingly common in the data-driven world in which we live. The task, as Lewis suggests, is not unlike what many financial analysts do, and, in fact, Lewis (a former financier) describes DePodesta as “just the sort of person” who might otherwise find himself doing exactly that.[17] In order to predict the future performance of baseball players, one needs a model that separates the elements of baseball that are predictable from those that are not. For clarity of exposition, let’s call the former “skills,” and the latter “luck,” “chance,” or “randomness.” (In Nate Silver’s parlance, these are the “signal” and the “noise.”)[18] Randomness is by definition unpredictable (despite the best efforts of those financial wizards performing “technical analysis”). It is hopeless to try to predict the outcome of a random process, but what one can do is attempt to understand the distribution of that randomness, and conversely focus one’s efforts on predicting skills accurately. This requires a nuanced understanding of what various baseball statistics actually reflect. Are they measuring skills? Or just luck? Or (as is most often the case) some combination of both? How much of each?

Every game exists on a spectrum that describes the amount of chance inherent to it. Some games, like chess, are entirely skill-based and involve no element of chance. Others, like the lottery, are completely based on luck and admit no skill. Baseball is a game that lies somewhere in between. Clearly, tremendous skills are required to play the game at the major league level, and players who are more skilled perform better over a long period of time. But chance also plays a large role, as the outcomes of games, and even championship seasons, can be determined by the tiniest unpredictable element (think Jeffrey Maier, Bernie Carbo, Bill Buckner, etc.). The same is true of statistics like OPS and batting average, which obviously capture the skills of different hitters to some extent, but just as obviously measure outcomes that have little to do with the hitter (e.g., the bloop single that happened to fall in for a hit). As in the game itself, it is not always obvious how much of what these statistics are capturing is skill, and how much is chance.

The notion of reliability helps statisticians distinguish between the signal (skills) and the noise (chance) present in a measurement. That is, if a measurement (e.g., batting average) is actually measuring something that tends to stay the same (e.g., a skill of a particular player), then repeated measurements should also tend to stay the same. Conversely, if what was being measured was actually due to chance, then one would expect to get wildly different measurements. For example, the height of an adult can be considered the ultimate skill, in the sense that it does not change (much or at all) over time, so repeated measurements of one’s height are very likely to be the same again and again and again. Conversely, the number of five-dollar bills that one has in one’s pocket is not a reliable measurement of an attribute of that person, because the quantity will fluctuate significantly depending on what one is doing and the change one received after one’s last cash transaction. While attributes of people clearly do change over time (height changes rather slowly or not at all; weight changes more frequently), we expect that attributes of people are likely to be similar over short periods of time. Thus, if a statistic, when applied to the same player, remains similar over time, it provides evidence that what is actually being measured by that statistic is an attribute of that player. On the other hand, if we observe the statistic fluctuating appreciably over a short period of time, then it suggests that either what the statistic is measuring is not really an attribute of that player, or the attribute itself has little predictive value.
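One way to see reliability at work is to simulate it. The Python sketch below compares two made-up statistics across two seasons: one in which each hitter has his own underlying success rate (a skill), and one in which every hitter shares the same underlying rate, so that all differences between players are chance. Every number here (200 hitters, 500 at-bats, the range of rates) is an assumption chosen for illustration, not an estimate from real data.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def simulate_season(true_rate, at_bats, rng):
    """Observed success rate over one season of independent at-bats."""
    successes = sum(1 for _ in range(at_bats) if rng.random() < true_rate)
    return successes / at_bats

rng = random.Random(42)
N_PLAYERS, AT_BATS = 200, 500  # assumed sizes, for illustration only

# "Skill" statistic: each hitter has his own underlying rate (assumed range),
# and the same rate drives both seasons.
true_rates = [rng.uniform(0.220, 0.320) for _ in range(N_PLAYERS)]
skill_year1 = [simulate_season(p, AT_BATS, rng) for p in true_rates]
skill_year2 = [simulate_season(p, AT_BATS, rng) for p in true_rates]

# "Luck" statistic: every hitter shares one rate, so differences are chance.
luck_year1 = [simulate_season(0.270, AT_BATS, rng) for _ in range(N_PLAYERS)]
luck_year2 = [simulate_season(0.270, AT_BATS, rng) for _ in range(N_PLAYERS)]

skill_reliability = pearson_r(skill_year1, skill_year2)
luck_reliability = pearson_r(luck_year1, luck_year2)
print(f"year-to-year correlation, skill statistic: {skill_reliability:.2f}")
print(f"year-to-year correlation, luck statistic:  {luck_reliability:.2f}")
```

The skill-driven statistic correlates with itself from one season to the next, while the chance-driven one hovers near zero. Even the skill-driven statistic falls well short of a perfect correlation, because a single season of at-bats still contains a good deal of noise.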
