The Sabermetric Revolution: Assessing the Growth of Analytics in Baseball

Oakland, St. Louis, Atlanta, the Chicago White Sox, Minnesota, and Florida have the largest average residuals.

Table 24. The Relationship Between Win Percentage and Payroll, Without Team Fixed Effects

Building a Composite Sabermetric Index

Our goal here is to build a single index that will measure sabermetric intensity. To do this, we need weights for each of the elements of saber-intensity. A reasonable choice for those weights is the normalized coefficients from a double-log model for WPCT as a function of the sabermetric variables that we are tracking. That model is presented below. Since the baserunning statistic can be negative, we transform it so that it is always positive. The natural log of WPCT is the dependent variable.

The value of these coefficients gives us a sense of the relative impact that each has on WPCT. Note that while a team’s fielding metric (DER) appears to have the largest impact upon WPCT, each of the six terms has a statistically significant effect at the 10 percent level or better.

In order to use these coefficients as weights, we drop the coefficient for the intercept term, take their absolute values, and then normalize them.
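
The weighting step described above can be sketched in Python as follows. This is only an illustration: the coefficient values are hypothetical placeholders (the Table 25 estimates are not reproduced here), and `normalize_weights` is an illustrative name, not part of the original analysis.

```python
def normalize_weights(coefficients, intercept_name="intercept"):
    """Drop the intercept, take absolute values, and scale so the weights sum to 1."""
    magnitudes = {
        name: abs(value)
        for name, value in coefficients.items()
        if name != intercept_name
    }
    total = sum(magnitudes.values())
    if total == 0:
        raise ValueError("all non-intercept coefficients are zero")
    return {name: value / total for name, value in magnitudes.items()}


# Hypothetical double-log coefficients, for illustration only.
example_coefficients = {
    "intercept": -0.9,
    "log(OBP)": 1.2,
    "log(FIP)": -1.5,  # a negative sign is assumed here; only its magnitude is used
    "log(DER)": 2.3,
}

weights = normalize_weights(example_coefficients)
```

Taking absolute values means a variable's weight reflects the size of its estimated effect, regardless of whether higher values help or hurt WPCT.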

Table 25. Double Log Model to Weight Components of SI Index

Table 26. Final Component Weights in SI Index

Component       Weight
log(OBP)        0.2367
log(ISO)        0.0374
log(FIP)        0.2903
log(DER)        0.4328
log(brun)       0.0008
log(sacbunt)    0.0020

This suggests that roughly 43 percent of our sabermetric index will consist of a team’s fielding metric (DER). These weights are then applied to our six saber-intensity elements to generate our composite index of saber-intensity (SI). The results for the top thirty teams in SI between 1985 and 2011 are presented in Table 17 in Chapter 7.
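
As a rough illustration of how the Table 26 weights could be combined, the sketch below forms a weighted sum of six component scores. It assumes that each component has already been scaled so that the league-average team scores 1 on it; that scaling is an assumption of this sketch, since the exact transformation is not spelled out here. The names `SI_WEIGHTS` and `saber_intensity` are hypothetical.

```python
# Weights taken from Table 26.
SI_WEIGHTS = {
    "OBP": 0.2367,
    "ISO": 0.0374,
    "FIP": 0.2903,
    "DER": 0.4328,
    "brun": 0.0008,
    "sacbunt": 0.0020,
}


def saber_intensity(scaled_components, weights=SI_WEIGHTS):
    """Weighted sum of component scores (each assumed scaled so league average = 1)."""
    missing = set(weights) - set(scaled_components)
    if missing:
        raise KeyError(f"missing components: {sorted(missing)}")
    return sum(weights[name] * scaled_components[name] for name in weights)


# A team that is exactly league average on every component.
league_average = {name: 1.0 for name in SI_WEIGHTS}
average_si = saber_intensity(league_average)

# The same team, but 10 percent better than average on DER only.
strong_fielding = dict(league_average, DER=1.10)
strong_fielding_si = saber_intensity(strong_fielding)
```

Because the weights sum to 1, a team that is average on every component receives an SI of 1 under this scaling.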

The results of regressing the residuals from the WPCT and payroll regression on our saber-intensity index are presented in Table 27.

Note that since the average team has an SI of 1, the predicted WPCT residual (the expected impact of all factors other than payroll) is zero for the average team.⁶ It is worth reiterating that there is no reason to believe, independent of sabermetrics, that a team with a high sabermetric index would be more successful than its payroll would indicate. Yet the regression model shows that our sabermetric index explains nearly 37 percent of the variation in team winning percentage that is not explained by payroll.
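
The two-step logic (first remove the payroll effect, then regress what is left on SI) can be sketched with a hand-rolled simple regression. The six data points below are made up purely for illustration and are not drawn from the book's data set; `simple_ols` and `residuals` are hypothetical helper names.

```python
def simple_ols(x, y):
    """Fit y = a + b*x by least squares; return (intercept, slope, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, 1.0 - ss_res / ss_tot


def residuals(x, y):
    """Residuals from the simple regression of y on x."""
    intercept, slope, _ = simple_ols(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Made-up team-seasons (illustration only).
payroll = [0.6, 0.8, 1.0, 1.2, 1.4, 1.0]
wpct = [0.450, 0.520, 0.480, 0.540, 0.530, 0.500]
si = [0.98, 1.03, 0.97, 1.02, 0.99, 1.01]

# Step 1: the part of WPCT that payroll does not explain.
wpct_residuals = residuals(payroll, wpct)

# Step 2: how much of that remainder SI accounts for.
_, si_slope, si_r_squared = simple_ols(si, wpct_residuals)
```

In this sketch, `si_r_squared` plays the role of the share of non-payroll variation in WPCT that is attributed to saber-intensity.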

Table 27. Relationship of Win Percentage and Team Saber-Intensity (SI)

MODELING THE SHIFTING INEFFICIENCIES IN MLB LABOR MARKETS

In this section we describe how we modeled our test for the morphing inefficiencies in baseball’s labor market. Our approach is an extension of the one employed by Hakes and Sauer in 2006.⁷

There are two main components to this procedure. First, we construct a model for team performance in terms of simple performance metrics. In this manner, we gain an understanding of what skills translate into team success. Second, we construct a model for how those skills are compensated on the labor market. Market inefficiencies are reflected in the differences in the estimates between these two models.

Hakes and Sauer identified three largely orthogonal qualities that reflect on-field performance for both batters and pitchers: Eye (walks plus hit by pitch per plate appearance), Bat (batting average), and Power (slugging percentage divided by batting average). Since there is a natural symmetry in the way a team’s offense and defense contribute to its success, the Hakes and Sauer model constrains the coefficients so that the Eye of a team’s hitters and the Eye it allows to opposing hitters (EyeA) make contributions of equal magnitude and opposite sign; the same constraint applies to Bat and Power. The dependent variable is team winning percentage minus .500. Thus, the full model is:

$$\mathrm{WPCT} - 0.5 = \beta_0 + \beta_1(\mathrm{Eye} - \mathrm{EyeA}) + \beta_2(\mathrm{Bat} - \mathrm{BatA}) + \beta_3(\mathrm{Power} - \mathrm{PowerA})$$
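
To make the constraint in this specification concrete, the sketch below evaluates the model for one team. The coefficient values and team statistics are hypothetical, chosen only to show that a gain in a team's own Eye and an equal reduction in the Eye it allows move the prediction by the same amount; `predicted_wpct` is an illustrative name.

```python
def predicted_wpct(team, allowed, betas):
    """0.5 + b0 + b1*(Eye - EyeA) + b2*(Bat - BatA) + b3*(Power - PowerA)."""
    b0, b1, b2, b3 = betas
    return (
        0.5
        + b0
        + b1 * (team["eye"] - allowed["eye"])
        + b2 * (team["bat"] - allowed["bat"])
        + b3 * (team["power"] - allowed["power"])
    )


# Hypothetical coefficients (b0, b1, b2, b3); not the Table 28 estimates.
HYPOTHETICAL_BETAS = (0.0, 2.0, 2.5, 0.3)

# Hypothetical team: what its hitters produce and what its pitchers allow.
hitters = {"eye": 0.100, "bat": 0.265, "power": 1.60}
allowed = {"eye": 0.090, "bat": 0.260, "power": 1.55}

baseline = predicted_wpct(hitters, allowed, HYPOTHETICAL_BETAS)

# Raising the hitters' Eye by 0.010 ...
better_offense = predicted_wpct(dict(hitters, eye=0.110), allowed, HYPOTHETICAL_BETAS)
# ... has the same effect as cutting the Eye allowed by 0.010.
better_defense = predicted_wpct(hitters, dict(allowed, eye=0.080), HYPOTHETICAL_BETAS)
```

Because each skill enters only as a difference, a single coefficient per skill governs both the offensive and the defensive side.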

In Table 28, we present the coefficients, standard errors, and R² values for this model applied to different periods of time. Our results largely correspond with those presented by Hakes and Sauer, although with the benefit of hindsight, we are able to draw more nuanced conclusions.

Table 28. Effect of Hitting Skills on WPCT

The second component of this evaluation is to model how these skills are valued on the labor market. Here, we construct a model for the natural log of the salary of an individual player as a function of his performance in these three statistics in the previous season. Control variables are added for plate appearances, free agency and arbitration eligibility, indicator variables for catchers and infielders, and fixed effects for each year. Only players with at least 130 plate appearances are included. The results are shown in Table 29.

Table 29. The Effect of Hitting Skills on Player Salary

The foregoing represents our basic approach. Various permutations in the modeling were attempted, including taking logs of each of the independent variables. Notable results are discussed in Chapter 7.

NOTES

Preface

1. To be sure, Taylor was concerned with other elements of workplace design and control as well. See, for instance, Charles Wrege and Amedeo Perroni, “Taylor’s Pig-Tale: A Historical Analysis of Frederick W. Taylor’s Pig-Iron Experiments,” Academy of Management Journal 17, no. 1 (March 1974), 6–27, and Harry Braverman, Labor and Monopoly Capital (New York: Monthly Review Press, 1974).

2. Prior to the Industrial Revolution, some of the earliest uses of measurement and numbers were attached to sport. For an excellent discussion of the historical evolution of measurement in sports, see Allen Guttmann, From Ritual to Record: The Nature of Modern Sports (New York: Columbia University Press, 1978).

Chapter 1. Revisiting Moneyball

1. Hyperbole is equally present in the 2012 anti-Moneyball movie, Trouble with the Curve, wherein the sabermetricians are portrayed as unlikeable buffoons and the scouts as insightful mavens. As with Moneyball, Trouble with the Curve’s emotional appeal is largely driven by a father-daughter relationship.

2. Scott Sherman, “Rethinking America’s Pastime: The Paul DePodesta Story,” Harvard Crimson, May 5, 2012.

3. Also see, for instance, Michael Lewis, Moneyball (New York: W. W. Norton, 2003), p. 256.

4. See, for one, Alan Schwarz, The Numbers Game: Baseball’s Lifelong Fascination with Statistics (New York: St. Martin’s Press, 2004), p. 75, citing Lindsey’s 1959 article in the journal Operations Research.

5. See, for example, Tom Tango, Mitchel G. Lichtman, and Andrew E. Dolphin, The Book: Playing the Percentages in Baseball (Washington, D.C.: Potomac Books, 2007), chapter 11. For further reading see J. Click, “What if Rickey Henderson Had Pete Incaviglia’s Legs?” in Baseball Between the Numbers: Why Everything You Know About the Game Is Wrong (New York: Basic Books, 2007), chapter 4-1, 112–126; and B. Baumer, J. Piette, and B. Null, “Parsing the Relationship Between Baserunning and Batting Abilities Within Lineups,” Journal of Quantitative Analysis in Sports 8, no. 2 (2012).
