Lies, Damn Lies and Baseball Statistics

I am going to take a flyer, because after all this is my blog, and post a digression that really has nothing to do with the subjects of this blog, other than the fact that it relates to the significance of statistics and their impact on litigation, including financial litigation involving ERISA plans. I have talked before about the extent to which the role of statistics in this area – as in all of litigation, frankly – can quickly move the parties into the netherworld of half-truths, inaccurate modeling, and misleading statistics-based assumptions that spawned the well-worn phrase “lies, damn lies and statistics.” The phrase, for those of you not familiar with it, basically reflects the idea that statistics, used for evil rather than good, can be used to make something appear true that is, in fact, not so, and that, absent the cover of statistical analysis and complexity, would quickly be seen as either a lie or a damn lie.

This problem permeates litigation whenever one moves into areas that require extrapolating from a small set of examples, such as determining damages across an entire class based on the data relevant to a limited number of class members or, in another type of scenario, determining contractual rates of performance by extrapolating from a defendant’s performance in a subset of cases. Cases built around such statistical approaches can run off the tracks if one side’s statistician applies suspect methodology. Most fun of all, to me, is that methodological errors, if the lawyers on the other side are astute enough to recognize them, can invalidate one side’s expert testimony and preclude its submission to a jury.

The extent to which statistics can be used or misused in this way, either to illuminate or to obfuscate what would otherwise be discernible to the observer, has never been so well illustrated as it has been in recent years by the explosion in the use of statistical analysis in major league baseball, a point both illustrated by and discussed in this article, from the new, overtly literate sports website Grantland (something I see as an attempt by ESPN to expand its tentacles to swallow up the few of us remaining sports fans who don’t think talking heads on a television screen yelling at each other about sports is all that interesting). Statistical analysis has devoured the front offices of baseball teams, replacing, in many instances, career observers and students of baseball, like a Pat Gillick, with laptop-cradling, statistics-obsessed graduates of top-flight universities who, absent this opportunity to use math to grab central roles in baseball front offices, would have taken their math skills to Wall Street, or become well-paid actuaries.

However, what is getting lost in the shuffle here, and perhaps more than that in the smoke, mirrors and obfuscation that statistical analysis can generate when not properly placed in context, is that the statistical analysis being provided by the new, younger generation of baseball executives is not a “new” way of looking at baseball at all, but is simply the art of reducing observed reality to data that anyone with an understanding of the math can grasp and apply. What the data and their crunchers are doing is reducing the events that make up baseball games to numbers, so that people without vast experience observing the game and its myriad variations can understand and act on the game’s nuances. These analysts are not, however, doing anything more than illuminating facts that astute observers have previously mastered through decades of watching thousands of games and their possible outcomes in different circumstances, without ever reducing that knowledge to mere numbers and statistical formulas. In essence, astute and experienced people who have seen enough baseball can tell you the same thing, and make the same judgment calls about which players are good at what, how to use players, what tactical decisions to make, and what players to sign, without ever needing the crutch of statistical analysis to do so. This, however, takes not just perception – something that not every baseball insider has – but also the decades of experience needed to see enough variation in the game, its players and its outcomes to be able to forecast outcomes. It is neither a coincidence nor an indication of youthful expertise and comfort with math that the statistical revolution in baseball, such as it is, has resulted in ever younger baseball executives replacing much older, experienced warhorses. Rather, it is because turning the reality of baseball into numbers that can be analyzed independent of actual experience with the sport allows anyone astute enough to manipulate statistics to become an expert of a sort on baseball, without the need to first observe however many tens of thousands of hours of the sport that the old guard, without access to such data, had to see firsthand before they could make the same evaluations and reach the same conclusions.

Want proof? You can find it right here in this article on Moneyball on Grantland, with its discussion of fielding being undervalued statistically, and the use of that fact by some teams to improve their ballclubs by focusing their spending on buying fielding. But there is nothing new about the idea that defense wins and that you can win by putting excellent fielders out there; all that is new is the reduction of that maxim to data points. Earl Weaver, whose expertise was developed over a lifetime spent watching baseball games, and who by doing so collected in his head – whether he realized it or not, although I think he did – all of the same data points that the new egghead baseball thinkers collate obsessively and then reduce to formulas on their laptops, always emphasized the importance of defense, yet he retired when most of the current generation of young baseball executives were in elementary school, assuming they were even born yet. Want more? Before the current statistical obsession turned to proving the role and importance of fielding, the baseball stat people were obsessed with demonstrating the lack of importance of bunting, the hit and run, and anything else that gave up outs and at-bats for free, in comparison to the importance of taking advantage of those opportunities at the plate; that whole idea is nothing more than the reduction to numbers of the Earl of Baltimore’s well-known hatred of bunting and the hit and run (something which some observers, incidentally, still think cost his team the 1979 World Series).

The point of this is not to belittle the new generation of baseball experts, who interpret baseball reality by what numbers tell them, but to illustrate a fact which too often gets missed: that statistics in baseball are not a new way of looking at the game, but are instead merely the reduction to numbers of a reality that others were already able to see. It has always troubled me that this simple fact has long been ignored in the glorification of the new baseball statistics, and in the idea that the older generation of baseball experts – people like Jack McKeon, who somehow, despite not running SPSS packages for fun at night, still won a championship – suffered from a blind spot and did not know what the numbers showed.

And I suppose if I have to tie it back into the subject of this blog, and to my own professional preoccupations – which include the need to communicate clearly to juries and to understand when expert analysis is only interfering with doing that – I would point out that this is the perfect illustration of exactly how statistics, misused, can misrepresent reality in the courtroom. It is always important to grasp when statistics are demonstrating a reality that cannot otherwise be seen, and when they are instead simply illustrating what could be seen without statistics, through careful observation. If you think about it, under the rules of evidence, statistical evidence doesn’t belong in a case in the latter instance, but can be truly illuminating in the former. The use of statistics in baseball, and how we think of them, is a perfect representation of this distinction.