Underlying Logic: Baseball MVP talk: quality, value, and chance

For a starting point, I'll take this column by Sean McAdam supporting David Ortiz over Alex Rodriguez for MVP in the American League.

Now, this is an unusually stupid column. A writer who says that "it's impossible to imagine that anyone could be more valuable to his team than David Ortiz is to the Boston Red Sox" is simply not taking language seriously. Sadly, however, the column does seem to reflect the level of thinking among most writers who explain their votes--and the writers elect the MVP.

First, I'm going to articulate what I think would be the traditional "stathead" position on McAdam's column, a position I support almost entirely, and then I'll explain a complication I've come to consider in the statheaded approach.

The most fundamental problem with McAdam's argument is that he's using statistics as an advocate rather than as an analyst. He cites a hodgepodge of stats, ranging from those that do a good job of measuring individual hitting production (slugging percentage) to traditional triple-crown stats that have long been shown to be lacking because they depend on teammates' performance (RBI) and exclude important information such as a hitter's walks and doubles. McAdam's standard is simply to cite the evidence that makes Ortiz look good. One name for that approach is intellectual dishonesty. Another is sports opinion journalism.

The problem is not that some sports opinion writers say thoughtless things or twist evidence to make their cases. They are paid to generate readership (or viewership), and partisan columns can serve that purpose well. But the need for a writer to present an original angle in a debate is directly at odds with the writer's function as a voter in the awards race. To analytical purists, the awards would ideally reflect the application of the best analytical practices we know of; thoughtful people can disagree about the details of the standards, but they must agree that an even-handed account of available evidence is the only reasonable starting point. But sports opinion writers can't do that, for reasons I'll return to.

Baseball offers analysts more objective evidence about individual performance than other sports do. In football, the performance of running backs depends on that of everyone else on the team--the rest of the offense has to create running opportunities, the coaches need to call running plays, and the defense needs to maintain control of the game to avoid a desperate pass-based comeback attempt. Baseball's pitchers and hitters, however, are almost entirely on their own, and the team-based elements of their performance are fairly easy to recognize and disregard in the data generated by baseball's uniquely long seasons. Therefore, statheads say that we can and should factor out statistics that depend on team performance (pitchers' W-L records, hitters' RBI and runs scored) and test measures of individual performance based on their demonstrable effectiveness.

For hitters, the quick statheaded way to account for nearly all of offensive production is to add on-base percentage and slugging percentage to create a stat called OPS, for "on-base plus slugging." As it happens, this year's MVP race is a no-brainer by that standard: Rodriguez led Ortiz easily in on-base percentage (so McAdam didn't mention that stat), and he also overtook Ortiz in slugging at the very end of the season, finally leading Ortiz in OPS, 1.036 to .999. If Ortiz were a valuable defensive player, his contributions could still justify an MVP award, but, of course, defense is also in Rodriguez's favor, as he played a solid third base every day while Ortiz did not take the field. Because defense hurts his argument, McAdam writes, "Defense has never been much of a factor in MVP voting. If it were, Ozzie Smith, Mark Belanger and Bill Mazeroski would have been serious contenders. They weren't." But this is patent silliness: it's simple and accurate to say that hitting is more important than fielding, but fielding still counts for something--especially when one player plays a skill position, allowing his team to pack more offense into its lineup, and the other clogs the DH hole, robbing his team of offensive flexibility.

For all these reasons, Rodriguez clearly had the better individual season, and the fact that I like the Red Sox and Ortiz better than the Yankees and Rodriguez won't change that. A good stathead applies the same standards every year and knows why those standards are better than others. By those standards, the MVP is A-Rod's, hands down. And the infuriating problem with the situation is that his case will be damaged because it's too easy to make: Rodriguez was widely considered the best player in the AL before the season started, and he played better than anybody else. Nobody's going to attract readers with that storyline. And that's why I believe that sports journalists should be stripped of their voting power; the conflict of interest is too great to overcome when voters explain their logic in print for money.
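For anyone who wants the arithmetic behind OPS, here is a minimal sketch in Python. It uses the standard definitions of on-base percentage and slugging percentage; the `BattingLine` class and the stat line fed into it are hypothetical, invented purely for illustration, and are not Rodriguez's or Ortiz's actual numbers.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Season counting stats for one hitter (hypothetical container for this sketch)."""
    at_bats: int
    hits: int
    doubles: int
    triples: int
    home_runs: int
    walks: int
    hit_by_pitch: int
    sacrifice_flies: int

    def on_base_percentage(self) -> float:
        # Times on base divided by the plate appearances that count toward OBP.
        times_on_base = self.hits + self.walks + self.hit_by_pitch
        opportunities = (
            self.at_bats + self.walks + self.hit_by_pitch + self.sacrifice_flies
        )
        return times_on_base / opportunities

    def slugging_percentage(self) -> float:
        # Total bases per at-bat: singles count 1, doubles 2, triples 3, homers 4.
        singles = self.hits - self.doubles - self.triples - self.home_runs
        total_bases = (
            singles + 2 * self.doubles + 3 * self.triples + 4 * self.home_runs
        )
        return total_bases / self.at_bats

    def ops(self) -> float:
        # OPS is simply the sum of the two rates.
        return self.on_base_percentage() + self.slugging_percentage()


# Made-up stat line, for illustration only.
hypothetical_hitter = BattingLine(
    at_bats=500, hits=150, doubles=30, triples=2, home_runs=30,
    walks=70, hit_by_pitch=5, sacrifice_flies=5,
)
print(f"OBP {hypothetical_hitter.on_base_percentage():.3f}")
print(f"SLG {hypothetical_hitter.slugging_percentage():.3f}")
print(f"OPS {hypothetical_hitter.ops():.3f}")
```

Note that walks show up only in the on-base half and extra bases only in the slugging half, which is why OPS captures information that batting average and RBI leave out.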

Now here's a twist, where I'm going to diverge a little from statheaded methods. I've addressed the distinction between individual and team-dependent stats, but there's a third category: situational stats, which, for hitters, generally measure performance in "clutch" situations, variously defined: in the pennant race, at the ends of close games, with runners on, and so forth. Some such stats are easily dismissed: in a one-run game, a home run in the first inning is not less valuable than a home run in the ninth, even if the latter is more memorable. The more interesting question is how we should evaluate a single that drives in two runs versus a single with two outs and nobody on.

The statheaded approach, grounded in a lot of careful analysis, has been to contend that the two singles should count the same. At the major league level, hitters do not seem to have special "clutch" abilities; good and bad clutch performance in a given season seems to result mostly or entirely from chance variations rather than special psychological characteristics. If two hitters have similar seasons and one happens to drive in more runs (because of timely hitting rather than more opportunities), statheads say that the difference essentially doesn't count because the hitter could not control it. You shouldn't get credit for luck.

And that was my position, without reservations, for a long time. But about four years ago, in research summarized here, Voros McCracken introduced what he calls DIPS, based on a compelling thesis that pitchers can control a few factors consistently (strikeouts, walks, and home runs allowed), but the number of fair balls that drop for hits against them is largely random. The details are beside the point here; the short version is that McCracken introduced the idea that we can separate a pitcher's performance from his results: if two pitchers each allow four runs per nine innings (and all else is equal), McCracken's method might tell us that one of them was lucky and one unlucky--they had the same results, but one pitched better.
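To make the "same performance, different results" idea concrete, here is a small simulation sketch in Python. It is not McCracken's DIPS method; it only assumes, for the sake of argument, that every ball in play against a pitcher falls for a hit with the same fixed probability. The function names, the 600 balls in play, and the 0.30 hit probability are all assumptions chosen for illustration.

```python
import random


def simulate_hits_on_balls_in_play(balls_in_play, hit_probability, rng):
    """Count hits when each ball in play is an independent chance event."""
    return sum(1 for _ in range(balls_in_play) if rng.random() < hit_probability)


def simulate_seasons(n_seasons, balls_in_play=600, hit_probability=0.30, seed=2005):
    """Simulate many seasons for pitchers who are identical by construction.

    Every simulated pitcher has the same underlying hit probability, so any
    difference in hits allowed between seasons is chance and nothing else.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [
        simulate_hits_on_balls_in_play(balls_in_play, hit_probability, rng)
        for _ in range(n_seasons)
    ]


seasons = simulate_seasons(1000)
luckiest = min(seasons)    # fewest hits allowed
unluckiest = max(seasons)  # most hits allowed
print(f"Hits allowed on 600 balls in play ranged from {luckiest} to {unluckiest}")
```

Under these assumptions, the luckiest and unluckiest seasons belong to pitchers who pitched exactly equally well; the gap between them is the kind of difference the argument above attributes to chance.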

This insight is extremely valuable to people investing in baseball players for the future--you want the guy who really pitched better on your team next year, not the guy who got lucky. The consequences of this approach raise a troubling issue for individual awards based on the past, however: these two pitchers were, demonstrably, equally valuable to their teams, but we can reasonably say that one of them pitched better. And the logic underlying everything I said above is that being better and being more valuable are the same. By traditional stathead logic, in which we credit players only for achievements stripped of demonstrably random effects, we could give Cy Young awards based on normalized hypothetical results for pitchers rather than what opposing hitters actually did against them.

I'm not ready to do that, so to be consistent, I must entertain this question: if David Ortiz was blessed by fate in ways that enabled his performance to benefit his team because of chance, should he get a little credit for that? By McCracken's logic, I'm giving that kind of credit every time I compare pitchers by ERA.

Honestly, I still don't want to give Ortiz bonus points for pleasing Fate, and I certainly don't think such credit should overcome a clear-cut MVP choice like that of Rodriguez over Ortiz. But I do think our new insights into evaluating performances separately from the results they produce raise serious theoretical questions about statistical analysis of sports performance.


Monday, October 03, 2005
