Wednesday, January 30, 2008

A small observation about Michael Pollan, The Omnivore's Dilemma

I'm well behind the rest of the eggheaded world in reading The Omnivore's Dilemma: A Natural History of Four Meals. I recommend it highly for reasons that are now widely known: it's an entertaining and memorable introduction to many varieties of food production. I'm delighted to understand the many inflections of the "organic" label, for example, and to know the etymologies of "corned beef" (salt used to be among the many grains known broadly as "corn") and "corn-hole" (actually, I'm not as delighted to know this one, but I'll never forget it).

But here's a moment from the book that has bothered me since I read it a few weeks ago. A hero of Pollan's story is Joel Salatin, who operates a 550-acre anti-industrial farm in Swoope, Virginia. Salatin's regionalism forms the foil to conventional farming practices in the book. Here is the passage I have in mind:


Before we got off the phone, I asked Salatin if he could ship me one of his chickens and maybe a steak, too. He said that he couldn't do that. I figured he meant he wasn't set up for shipping, so offered him my FedEx account number.

"No, I don't think you understand. I don't believe it's sustainable--or 'organic,' if you will--to FedEx meat all around the country. I'm sorry, but I can't do it."

This man was serious.

"Just because we can ship organic lettuce from the Salinas Valley, or organic cut flowers from Peru, doesn't mean we should do it, not if we're really serious about energy and seasonality and bioregionalism. I'm afraid if you want to buy one of our chickens, you're going to have to drive down here to Swoope and pick it up."

Which is eventually what I did.


Think about that, reader. Joel "really serious about energy" Salatin has an alternative to conventional methods of specialty food delivery: individual customers can drive to his farm to pick up one chicken!

I am trying to come up with a more madly wasteful means of delivering food than this one. Renting a coal-powered locomotive to haul a pork chop from Cedar Rapids to Iowa City, perhaps?

This may seem the picking of a nit, but I think the passage indicates an important blind spot in Pollan's argument and in much similar rhetoric: the "really serious about energy" folks sometimes talk about mass transit for people and foods in opposite ways. In the same way that relatively wealthy liberals who drive Camrys (we own two) feel superior to SUV owners, even if the Camry people drive 30,000 miles a year because they have jobs in two different cities and shop at Costco in a third (ask me how I know), Pollan fails to note the ways in which his values contradict each other. Whether we like it or not, the availability of Mexican asparagus in Iowa for two bucks a pound is a sign of astonishing energy efficiency in the means of delivery.

Which is more damaging: buying lunch at our local McDonald's or driving our Camry 55 miles to buy the ingredients of lunch from a farmer who herself drove 20 miles to a farmer's market? I suspect the latter, by a large margin, but I don't presume to know the answer. When Joel Salatin tells Michael Pollan to drive to his farm for a chicken, I wish Pollan would at least ask some questions.

Sunday, December 23, 2007

The strangeness of Rudolph the Red-Nosed Reindeer

Only this year did I stop to ponder the opening lines of Rudolph the Red-Nosed Reindeer:


You know Dasher and Dancer
And Prancer and Vixen,
Comet and Cupid
And Donner and Blitzen.
But do you recall
The most famous reindeer of all?


Hold on: these kids "know" the likes of Comet but might have forgotten Rudolph? And you can't say they're just working up to knowing the big one because they would have to "recall" the acknowledged "most famous reindeer." Nonsense and bollocks and humbug.

But we let that pass. Here's the part I've been thinking about more:


Then one foggy Christmas Eve
Santa came to say
Rudolph with your nose so bright
Won't you guide my sleigh tonight?
Then all the reindeer loved him ...


My first thought was that this is a prototypical nerd's fantasy, the dream of a world in which gaining the favor of a parent or teacher results, instantly and without explanation, in attaining the love of one's peers.

Maybe there's something to that reading, but I've come to a more universal one that I like better: that the song is less about the child's perspective than the adult's--the parent's. This is the fantasy of beholding a child subjected to laughter and name-calling and transforming the social world into one of approval and love. What power could a parent or teacher desire more, and what power is less attainable?

At this moment, the Santa myth meets the Christmas story in a beautifully complicated way: Santa's approval of Rudolph involves the God-like prevention of social wounds; the Christmas story has God subject God's child to the world's woundedness. And at some level, they both raise the problem of preventable evil: until a moment of dramatic redemption, Santa and God both allow suffering they ostensibly have the power to stop.

The Rudolph story may gain its greatest complexity and interest, and its strongest connection to the more complicated mythologies of Christmas, when we imagine Rudolph going to bed Christmas night, exhausted and happy and loved, and wondering what will happen if the next Christmas Eve brings a clear sky.

Wednesday, December 19, 2007

The Iowa Democratic Caucus, Education, and Poll Reporting

This morning's Washington Post has a routine article about the paper's latest political poll in Iowa. The main thrust of the story is equally routine: Obama has a small lead, but everything will come down to turnout. As an Iowa caucusgoer myself, I scraped up enough motivation to click to the second page of the story and found this paragraph:

Considering other turnout factors brings no additional clarity. Age and education are two key predictors of caucus participation, with older and more highly educated people disproportionately showing up to vote. While Clinton outpaces Obama among older voters, particularly those aged 65 and up, Obama outperforms her nearly 3 to 1 among those with an education of a college degree or more.

THREE TO ONE? Obama is outpolling Clinton three to one among college grads? I am gobsmacked: I've read a lot of coverage of this race, and I would guess that I've seen a hundred times as much coverage of race and gender as education level. Yet there it is: alongside relatively tiny differences in other areas, an enormous gap based on one variable that almost nobody is talking about. Note that the gap isn't even the main topic of the Post's own paragraph: the gap is presented as a turnout factor, not as the crucial difference between the Iowans who prefer Obama and those who prefer Clinton.

In this race, the education level of voters also seems to work against some of the race's main narratives; for example, given the Clintons' alleged association with cultural elites, would we have heard more about this story if the numbers were reversed? Do we even know how to talk about Hillary Clinton as someone who connects with common people but flops among college graduates? I'm not sure we do.

But I also wonder whether this case illustrates a blind spot in political journalism more generally. I imagine so, at least to some extent. It might be easier, and it seems to me more conventional, to talk about political preference in terms of race, gender, and age than education. If I'm right that there is such a blind spot, does it relate to ways in which we do and don't discuss social inequality in America?

Tuesday, October 23, 2007

Book review: Harry Potter and the Deathly Hallows, J. K. Rowling

I largely agree with Stephen King’s advocacy of the merits of J. K. Rowling: the Harry Potter novels—especially the later ones—manage a combination of imaginativeness and pacing equaled by few other writers. While acknowledging Rowling’s achievement, however, and counting myself among the deeply absorbed readers of the series to the end, I want to comment on my dissatisfaction with its last installment.

The primary flaw of this book lies in its cavalier dismissal of the moral complications involved in the use of extreme force. This dismissal violates the values of Rowling’s make-believe world, returning the reader to everyday relativism with an anticlimactic thump. From the beginning, the books led us to understand that the wizarding world operates with a code roughly analogous to, but fundamentally different from, human theories of justified violence and war. The wizarding system of morality draws a clear line between minor offences and three Unforgivable Curses: those that murder, torture, and enslave others. The rules of wizarding set the limits of legitimate violence with a specific, strong prohibition.

In Harry Potter and the Deathly Hallows, Harry crosses the line, unequivocally and shamelessly torturing an enemy with an Unforgivable Curse. The situation allows Rowling the opportunity to grapple with the moral difficulties of just war theory: if wartime requires good people to act immorally—even unforgivably, by conventional standards—how do we assess the human consequences of just war for the perpetrators as well as the victims of violence? The novel’s answer: no sweat! We win! The narrative forgives the supposedly unforgivable in advance, framing the action to encourage rooting for Harry to go ahead with it already, and then never hints at any consequences of Harry’s choice.

This choice to deflate the problem of unforgivable curses is one example of the novel’s larger inability to play by its own rules. Two other problems both contribute to the cheeriness of the epilogue, which trades epic complexity for cuteness. The simpler of these is the logic by which Harry claims the elder wand. The climactic duel between Harry and the incarnation of ultimate evil turns on the results of an ordinary skirmish in which Harry has disarmed Draco Malfoy and thus gained ownership of the elder wand, though Draco does not possess the wand at the time. By this reasoning, anybody who disarms Harry becomes master of the world’s most powerful weapon, but nobody seems interested in trying.

The more complex problem involves the logic of sacrifice. Harry’s ability to save a world by embracing his own death parallels the Christian myth of sacrifice in many ways, and, not coincidentally, it runs into some of the same logical problems. The Christian version of redemption by sacrifice has caused theological trouble for millennia. Why does the sacrifice of one being redeem the sins of others? Doesn’t that redemption require some kind of deal in which Satan accepts the trade? If so, why does an omnipotent God have to negotiate? If not, why does an omnipotent God have to sacrifice anything, let alone God’s only child? The story gains great emotional power from the idea of sacrifice, but explaining the necessity for and nature of the sacrificial transaction requires some seriously complex theology because common sense doesn’t do the trick. The complexity of the problem is a danger sign for the novelist who would take Christian sacrifice as a model for a plot.

Rowling’s version borrows the Christian mystery of sacrifices that protect other beings from evil, but Rowling removes the complicating factors that make the Christian story so interesting and troublesome. Harry is only mostly dead, for one thing, and his victory is disconcertingly complete. Whereas the whole series of Harry Potter novels drew its energy from the suggestion that dark wizarding came not only from Voldemort but also from every character’s susceptibility to temptation, the Battle of Hogwarts allows Harry a total victory. Voldemort’s literal death is unsurprising, but his metaphorical death in the elimination of evil people and even significantly evil thoughts at the end of the book provides cheap, simple satisfaction. The book sidesteps the central question of the Christian story of victory through sacrifice: why does evil persist after the redemption?

Hence the lack of occupations in the epilogue, where the central characters wallow in domestic bliss with no jobs or, more importantly, vocations. I know Rowling has assigned them work in post-publication interviews, but I’m enough of a formalist to say that’s cheating: the key issue is that the end of the last book doesn’t provide a substantive logic for continued conflict, and the epilogue flows smoothly from that lack of conflict. If you’re going to posit the continuance of evil (or, say, the homosexuality of a leading character), you need to make it work in the book—and the book ends, “And all was well.” As much as I enjoyed reading these books, and that is very much indeed, all is not well when an epic ends without grappling with the persistence of evil.

Thursday, October 18, 2007

For my convenience

Here's a great moment in technology. I just got an email that was sent to the whole faculty. It begins, "For your convenience, I have attached a PDF file of this email."

I admit that I couldn't resist opening the file to confirm that statement. Yup: it's a PDF with the same formatting and text as the email. Perhaps next time, we could also receive an image file of the PDF of the email.

Tuesday, October 02, 2007

Polls and the Iowa Caucuses

As an Iowan who experienced the Democratic caucus last time around, I'd like to offer a perspective that sometimes attracts a little coverage (as in this 2004 piece) but then disappears for a long time.

When you go to the caucus in Iowa, the first stage of the process is like a live, public primary: each candidate has a designated space, and his or her supporters go to that space. But then a viability rule kicks in: any candidate with less than 15 percent support (the threshold can be higher, depending on the situation) is declared non-viable, and his or her supporters go to other groups.

Based on today's poll numbers, therefore, a typical precinct will see everybody but the Big Three eliminated right away. Your Richardsons or Bidens might survive in a precinct or two, but every precinct will have a significant chunk of voters, probably somewhere between 15 and 30 percent, who aren't able to support their first-choice candidates. If Iowa remains close, those voters could play a large, even decisive, role in determining the state winner. This two-stage process will reward candidates with broad support and low negative ratings--the ones most likely to be the second choice of those Dodd supporters who need to find a new horse to ride on caucus night.
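The realignment stage described above can be sketched as a toy simulation. The precinct counts and second choices below are invented for illustration, not real polling numbers:

```python
# A toy sketch of the caucus viability rule: supporters of any candidate
# below the threshold move to their second-choice candidate. All names and
# numbers here are hypothetical.

def realign(first_choices, second_choice, threshold=0.15):
    """Reallocate supporters of candidates below the viability threshold."""
    total = sum(first_choices.values())
    cutoff = threshold * total
    viable = {c: n for c, n in first_choices.items() if n >= cutoff}
    # Supporters of non-viable candidates join their second-choice group.
    for cand, n in first_choices.items():
        if cand not in viable:
            target = second_choice[cand]
            viable[target] = viable.get(target, 0) + n
    return viable

precinct = {"Clinton": 60, "Obama": 55, "Edwards": 45,
            "Richardson": 25, "Biden": 15}
seconds = {"Richardson": "Obama", "Biden": "Clinton"}
print(realign(precinct, seconds))
# -> {'Clinton': 75, 'Obama': 80, 'Edwards': 45}
```

In this made-up precinct, the second choices of the Richardson and Biden supporters flip the winner from Clinton to Obama, which is exactly the dynamic that makes the second-choice voters so important.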

My guess is that the caucus process will result in Obama surpassing expectations based on his pre-caucus poll numbers, but that's just my guess. My main point is that anyone considering the possibility of a candidate catching a wave in Iowa should consider the poll numbers in the context of the caucus process.

Monday, October 01, 2007

A bizarre argument for arts education

I'm a big fan and proud veteran of public-school programs in the arts, especially music. Because I wish school arts programs well, I hope they enjoy better supporting arguments than this one, offered by Ellen Winner and Lois Hetland in the Boston Globe. (Winner and Hetland teach at Boston College and the Massachusetts College of Art, respectively.)

Starting with the big question, "Why do we teach the arts in schools?" Winner and Hetland argue, in brief,

1. The common claim that the arts make students "smarter" (or higher achievers) in other subject areas has not held up to scrutiny.

2. However, arts classes are valuable in another way because their teachers tend to use techniques that develop "life skills" such as critical self-examination more than teachers in other classes.

3. This "arts-like approach" can be adapted to teaching other subject areas.

That last step is the kicker: if the only demonstrable benefits of the "arts-like approach" can be exported straightforwardly out of the arts classroom, why should we bother with the arts classroom as anything but a transitional space, where certain (not very revolutionary) teaching techniques are examined and extracted until the arts themselves become wholly unnecessary?

Again, I write as a supporter of arts education, but the logic of this article, ostensibly in support of the cause, constitutes one of the most effective attacks on it that I have encountered.

Wednesday, August 15, 2007

The informal economics of class size at Grinnell

I was involved recently in an Internet discussion of the effects of class sizes in public schools. If you have followed such discussions, you can guess how this one went. When data from studies finding very small effects resulting from very large investments in smaller classes came up, the teachers in the discussion protested mightily, offering tales of the difficulties of teaching huge classes in the present system. My own teaching experience also leads me to think of class size as an important factor. Here, however, I want to sidestep the normal policy debate to share my experience watching students negotiate the marketplace of course registration at Grinnell.

For starters, let's note that Grinnell students tend to be a politically liberal bunch who chose to attend a school that aggressively advertises the smallness of its classes; I'll wager almost all of them, if asked in abstract terms, would say that they value small classes as a policy objective and a personal preference.

Now here's what I mean about the marketplace of course registration. Let's say you have 50 students who can choose between two sections of the same class. The students choose in order, always knowing the current enrollment of each section. For whatever reason, they believe that the teacher of Section A is preferable to that of Section B.

We can model this easily. If the students all believe that class size is the only value worth considering, the two sections will each end up with 25. (#1 will choose Section A because the sizes are equal and the teacher is preferable, then #2 will go to B, #3 to A, and so forth.) If the students all believe that teacher quality is the only factor worth considering, you'll see 50 in section A and 0 in section B. More likely, you would actually see some kind of weighted preference, where students consider both factors and begin to choose section B as section A gets bigger--a 40-10 split would indicate a weaker preference for small classes than a 30-20 split.

In other words, the bigger the variation in freely chosen class sizes, the more weight students are putting on teacher quality relative to class size. The enrollments in the sections give you a lot of information about the population's values.

Viewing the choice from this perspective reveals that students tend to accept fairly large differences in class size before they let it trump perceived teacher quality. That's why every secondary school I know of (public or private, in any social setting) tries to make switching sections extremely hard. We can't know what choices students would make, but the barriers to switching imply a widespread assumption that left to their own devices, the students would choose exactly the model that some libertarian economists propose: bigger classes with the best teachers.

At Grinnell, students can often make exactly that kind of choice among sections or (more often) among classes that perform the same function in their course plan. Based on what I've seen, I would say that Grinnell students value perceived teacher quality much more than class size, to the point where most will readily become, say, the 21st person in the desired section rather than the ninth in another. I have seen students make switches because they value lower class sizes, but only in the most extreme cases by Grinnell standards (switching from, say, a section of 40 to one of 13), and even in those cases, very few students make the switch. I'm sure there are contrary anecdotes out there, but having seen a lot of preregistration numbers, I'm confident in asserting the general pattern.

I don't mean to imply that the Grinnell model would apply to other educational situations. I understand the problems with that translation. But I find this situation interesting because it involves a set of people making decisions that don't seem to match their abstract values.

Thursday, April 26, 2007

Ask me for my opinion

If I saw this result (or an analogous one) in a public opinion poll, my faith in the public's opinion would rise tenfold:

Q12: How closely have you been following 
developments in the war in Iraq?

96% very closely
3% closely
1% a little
0% not at all

Q13: Would a near-term pullout of American forces
help or harm Iraq and its residents in the long term?

1% help
2% harm
97% don't know

Tuesday, October 17, 2006

The Irony of George W. Bush

Back in July of this year, 2006, a lot of people made a fuss when a live microphone captured a private exchange between George W. Bush and Tony Blair. The fuss came about largely because the President, though already established as something of a pottymouth, added a new entry to the catalog of his documented obscenities. Here's the key line:

"The irony is, what they really need to do is to get Syria to get Hezbollah to stop doing this shit, and it's over."

As you may remember, the release of this audio prompted a range of commentaries. News outlets had to decide how their obscenity policies worked when the President dropped an s-bomb while talking politics. A number of commentators noted that Bush uttered the line with his mouth full, chomping through his words as though the chefs of the G8 summit had served him gristly cud. More serious commentary addressed the content of the remark, weighing the accuracy of Bush's characterization of the Syrian role in the conflict between Israel and Hizbullah. I propose that all of these angles missed the most important point to be made about Bush's comment:

George W. Bush does not understand the meaning of the word "irony."

Let's assume that Bush was correct that "what they [the U.N.?] really need to do is to get Syria to get Hezbollah to stop doing this shit, and it's over." There's nothing ironic about that sentiment. On the contrary, it displays Bush's characteristically blunt cause-and-effect logic of diplomacy; in this case, one body pushes another, which pushes a third, and the desired reaction comes about. No irony, right?

Now consider this statement, made a couple of weeks ago as part of Bush's pre-election offensive against Democrats:

"You do not create terrorism by fighting terrorism."

Of course not! That would be ironic! When you have no understanding of irony, the word or the concept, it makes no sense that fighting terrorism (badly) can create terrorism, that a show of strength can create weakness, that the rhetoric of certainty can mask anxiety, that the public faces of moral self-congratulation can be overwhelmed by corruption.

Bush and his party have thrived on convincing voters that the biggest hammer is the best tool for any nail on any wall. The upcoming elections may be a referendum on Bush, but they will also be a referendum on irony, as many politicians of both parties now run on positions that assume the ironic consequences of Bush's policies and look for ways to escape them.

It may be that the failure of Bush's policies, by creating such wrenching tragedies that voters can no longer ignore the ironies beneath the President's unflagging certitude, will teach a generation of young people the notoriously tricky concept of irony. If so, the students will understand by example what the teacher himself does not grasp. How ironic.

Saturday, October 07, 2006

How to get serious about steroids in sports

I have a proposal for dealing with steroids and other performance-enhancing drugs in sports.

I have the case of Major League Baseball in mind because of what I see as the scapegoating of Barry Bonds to cover up the more important underlying scandal: if Bonds did use steroids when he is alleged to have done so, he did not break the rules as they stood at the time. Since a huge range of substances could qualify as "performance-enhancing drugs" in sports—can any among us explain why caffeine doesn't count?—the rule-makers must take responsibility for creating specific and effective deterrents.

I bring this up not to defend Bonds or to get into assigning blame for the outdated rules of a few years ago. Instead, I mean to illustrate the extent to which the lessons of the Bonds case do not seem to have sunk in. The rule-makers (in baseball’s case, the players’ union and the owners, perhaps in that order) still don't seem interested in writing the toughest possible rules.

Here's my proposal: define banned substances, test aggressively when reliable tests are available, and save samples in the care of a neutral, confidential agent. Then test retroactively as new procedures become available so that players can't get away with using HGH, for instance, by taking advantage of the fact that the tests haven't caught up to the drug. Then enact this rule: if reliable tests from two separate samples EVER show you were juicing, your very existence is stripped from the official records of baseball. No asterisks, no nothing. If we catch your HGH use in 2018, you never played.

Don't you think that would get in players' heads a little?

Sunday, April 30, 2006

Substantial Nits

I hereby define a Substantial Nit as a small matter of usage that has large consequences for a reader or listener's understanding of a significant point. In my judgment, most common nits, however worth picking they may be for other reasons, are not SNs. To qualify as an SN, a common mistake must routinely lead to significant misunderstandings; for example, I'm not interested in the stray case where the needless apostrophe in "The Simpson's" on the decorative boulder in front of my house might cause a space shuttle to crash. By way of positive example, I begin the list with two charter members:

1. Percent vs. percentage points. For example, I recently heard an NPR story about the incidence of a disease rising by two percent. The story went on for a while, and I couldn't believe a two percent increase had created such a big story. At the end, the reporter finally mentioned the numbers: an increase from something like 4% to 6%. That's two percentage points, but about fifty percent. No wonder it was a big deal! I find this to be a fairly common error that almost always makes a big difference in meaning.

2. Disinterested vs. uninterested. In a strict, old-fashioned sense, these are not synonyms. Uninterest is lack of interest in the sense of willingness to pay attention. If you stopped reading this entry before now, you were probably uninterested (as well as gravely misguided, of course). Disinterestedness means lack of personal or financial interest for the purposes of fair judging; Consumer Reports accepts no advertisements because it wants to make disinterested judgments. A lot of people now use "disinterested" to mean "uninterested" and "unbiased" to mean "disinterested." I think that doesn't work. For example, I generally think that buying new cars is a bad idea (in strictly financial terms), and I have good reasons for thinking so. I'm biased against buying new cars, but my bias is disinterested because I derive no significant benefit from the sale of used cars. More importantly, it's important for politicians and journalistic analysts to cultivate disinterestedness, but it's fine for them to have well-founded biases that they can explain and defend. It's hard to encourage the disinterestedness that's essential to productive discourse if we lose the word for it.

3. "Proof." Contentions and fictions do not prove things. Brad DeLong makes this point today in his memorial post about J. K. Galbraith, zinging the New York Times obituary writers' use of "proofs" to describe the arguments of some of Galbraith's detractors: "Proofs? I know many people who find Becker's and Stigler's arguments powerful ones. I know nobody who would call them 'proofs.'" I come across similarly inexact usages of "proof" and its variants frequently, often in new college students' papers, which sometimes claim that a given text "proves" something about life. For example, Pride and Prejudice might "prove" that women in Jane Austen's time could find happiness by defying social convention and holding out for true love in marriage. A novel can imagine something of that sort, and a novelist might be said to argue it, but a novel cannot prove a sociological or historical claim. To insist on more exact usage of the terms of proof is to encourage public discourse to distinguish among the kinds of proof that various situations allow or require.

4. "Beyond a shadow of a doubt" (in a legal context). This is related to and narrower than SN #3. Almost as often as not, I find, news coverage of court cases will slip at least once from the phrase "beyond a reasonable doubt" to "beyond a shadow of a doubt." Wouldn't it be lovely if reporters and analysts routinely used the correct phrase and helped their viewers or readers understand more precisely the legal standard of a "reasonable doubt" in a given case?

Tuesday, March 28, 2006

Cinderella story

Many recent news accounts have referred to the improbable presence of George Mason in the Final Four of the NCAA men's basketball tournament as a Cinderella story. In this recent story, Dan Wetzel extends the metaphor, asking readers to entertain the question, "Why not George Mason?"--that is, why couldn't GMU win two more games and become the tournament champions?

(A side note: Wetzel's piece is a beautiful example of "angle" journalism, as I call it. He seems to have nothing to say other than that George Mason has a chance, though a small one, of winning two more games. Does anyone think otherwise? Of course GMU could win two more! Of course the probability is small! All Wetzel does is call attention to his own expert angle on the issue by emphasizing an improbability. He says nothing that isn't plainly visible in the point spreads or betting odds on the upcoming games.)

But I digress. Wetzel closes his piece by extending the Cinderella metaphor: "Of course, in the original, Cinderella lived happily ever after."

Not necessarily.

(Another side note: one of the most useful and surprisingly accurate tidbits of textual analysis I've ever picked up was the notion that if you want to find a writer's most ideologically loaded and debatable point, look for whatever follows "of course" or "obviously" or "certainly." Unintentional ironies often lurk in those assumptions of consensus.)

There are three problems with this common usage of the Cinderella metaphor.

The first problem is the idea of an "original" Cinderella story. As a near-universal folk tale, Cinderella has no identifiable original version. But this is a nitpick; let's translate "original" to "standard" and use Charles Perrault's version, the basis of most English-language storybook Cinderellas.

The second is the underlying assumption that Cinderella is an underdog who achieves a social standing far beyond what she had reason to expect. Not so much: Cinderella is the daughter of a man with significant class standing--a "worthy man" who has the money and connections to get his stepdaughters to the prince's ball and have them dressed well for the occasion. Cinderella is a high-born woman with immense cultural capital, and her story is one of restoration and moderate rise in status. Arguably, from this perspective, George Mason would have the story least like Cinderella's among the four possible winners of this year's tournament because GMU is a true upstart. The other three teams all seek a restoration of former glories; UCLA's former dominance is too great to make it a true Cinderella, but the slipper fits Florida and LSU fairly well--both programs have made the Final Four, but not too recently, and neither has won a championship.

The third, and perhaps the most important, problem is the statement that "Cinderella lived happily ever after." Perrault's story says no such thing, and his ending is maintained in the modern translations I've seen. Cinderella does seem to be happy, but the narrator does not address her future. Moreover, the established marriages range from grotesquely dysfunctional (Cinderella's father and step-mother) to suggestively creepy (the prince's parents). The story seems to go out of its way to contradict the assumption that people of high station find lasting happiness automatically. Cinderella has her moment, but no more.

So let's enjoy the success of George Mason. GMU is this season's most remarkable underdog story. But even--or especially--if they win the championship, their story will not be Cinderella's.

A better March Madness pool

A friend, Doug Cutchins (co-author of this book), and I have created an auction-style pool for the NCAA men's basketball tournament. We enjoy the format and invite others to follow along. Update--from here to the end of the paragraph added. In keeping with the theme of this blog, I'll emphasize the underlying logic of the idea: most tournament pools simply reward correct picks. Some recognize the limitations of that model and reward upsets. Both of those approaches lead to insincere picks by rewarding contrarianism; if you want to win the pool, you can't simply make sensible choices and fill out your bracket. A market captures the advantages of rewarding upset picks while avoiding the incentive for insincerity. If everyone in the pool thinks St. Bonaventure will win the championship, they can all bid accordingly, according to their sense of the Bonnies' probability of winning in each round, and the market will determine how much the team's output is worth. Upset picks are rewarded automatically because underdogs command lower prices than favorites--and the underdog/favorite distinction is determined by participants rather than the selection committee.

As with all pools, it is not necessary to wager real money to enjoy the competition. Here's the way we announced the pool, slightly revised for this general context:

We like basketball, but we have tired of the standard pool format—its blunt all-or-nothing payouts, its overweighting of the final games, its inability to let participants express their degree of confidence in teams beyond simple brackets.

We think we have a better way. We propose an auction, to take place at 7:30 p.m. on March 14th (the Tuesday between the announcement of the brackets and the tourney) at a location yet to be determined. In the auction, with eight participants by way of example, each participant buys “ownership” of eight tournament teams with a fictional budget of $25, so every basketball team ends up on one of our participants’ rosters. (The winner of the play-in game would count as one team.) The teams are bought in a standard auction format, with rising bids in ten-cent increments, so players express their confidence in each team’s prospects with their bids. Then each player gets credit for each game his or her teams win. We believe that wins should increase in value in each round of the tournament, though not so much that they cheapen clever picks in the first two rounds. To that effect—after many drafts—we have come up with this payout scheme:

Each of 32 first-round winners earns 1.25% of the pot ($2.50 in an eight-person league)
Each of 16 second-round winners earns an additional 1.50% ($3.00)
Each of 8 third-round winners earns an additional 2.00% ($4.00)
Each of 4 fourth-round winners earns an additional 2.50% ($5.00)
Each of 2 fifth-round winners earns an additional 3.00% ($6.00)
The tournament champion earns an additional 4.00% ($8.00)

It’s up to each player to decide how much of his or her $25.00 to bid on each team. In an eight-person league, the tournament champion would earn its owner $28.50 (14.25% of the pot), in addition to any winnings generated by the player’s other seven teams.
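For concreteness, the payout scheme above can be sketched in a few lines of Python. The constant and function names here are mine, not part of the original announcement; this is just the arithmetic the announcement describes:

```python
# Incremental payout for each round's winners, as fractions of the pot.
# Rounds 1-6 pay 1.25%, 1.50%, 2.00%, 2.50%, 3.00%, and 4.00% per win.
ROUND_PCTS = [0.0125, 0.0150, 0.0200, 0.0250, 0.0300, 0.0400]

def round_payout(pot, rnd):
    """Amount earned by one win in the given round (1 = first round)."""
    return round(pot * ROUND_PCTS[rnd - 1], 2)

def champion_total(pot):
    """Total earned by the owner of a team that wins all six games."""
    return round(pot * sum(ROUND_PCTS), 2)

pot = 8 * 25.00            # eight players, $25 each
first_win = round_payout(pot, 1)   # $2.50 per first-round win
champ = champion_total(pot)        # $28.50, i.e. 14.25% of the pot
```

Note that the percentages exhaust the pot exactly: 32 first-round wins at 1.25% plus 16 at 1.50%, 8 at 2.00%, 4 at 2.50%, 2 at 3.00%, and the champion's 4.00% sum to 100%.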

We like this system because, unlike most common approaches, it allows players real flexibility in pursuing overall strategies, making trade-offs between, say, a #1 seed or two #3 seeds. It lets the little market of the auction determine the relative costs of teams instead of relying on the rankings of the committee. It lets participants enjoy a strong sense of identification with eight specific teams instead of the 63 picks of a traditional bracket, most of which are widely shared with other players. It creates payouts that reflect players’ performance with some subtlety; prizes will spread out rather than simply going to the luckiest player or players. And best of all, it lets us enjoy a late-evening time of sports banter and good cheer to raise our spirits before the tourney tips off.

The details:

1. Each player has the right to spend $25 of fictional money in the auction. The player may spend less than 25 imaginary dollars in the auction, but the 25 real dollars will remain in the pot.

2. The auction will proceed in a steady rotation, with players putting teams up for bid in turn. The nomination constitutes an opening bid, as in “St. Bonaventure for two dollars!” All bids must meet two conditions: the player must have space on his or her eight-team roster for a team (players who have drafted eight teams will no longer nominate teams in the auction), and the player must reserve enough money to make a minimum bid of ten cents on each of eight teams. If St. Bonaventure is nominated first, for example, everyone would be able to bid up to $24.30, the amount necessary to buy St. Bonaventure and seven teams at the minimum price of ten cents. Also, skip bids are fine; if someone nominates St. Bonaventure for two dollars to start the auction and you want to bid $24.30 right away to ensure control of the Bonnies, you are welcome to do so.
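The reserve rule above amounts to a one-line calculation. Here is a hypothetical sketch (the names are my own) of the highest bid a player may make at any point in the auction:

```python
MIN_BID = 0.10      # minimum price per team, in dollars
ROSTER_SIZE = 8     # teams each player must end up owning

def max_bid(budget_left, teams_owned):
    """Highest legal bid: the player must keep ten cents in reserve
    for every roster slot still empty after this purchase."""
    open_slots_after = ROSTER_SIZE - teams_owned - 1
    return round(budget_left - open_slots_after * MIN_BID, 2)

# A player with the full $25.00 and no teams yet may bid up to $24.30,
# matching the St. Bonaventure example above; a player filling the
# final roster slot may bid every remaining cent.
```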

3. As bidding slows down on each team, someone will count down the sale clearly, as in “St. Bonaventure to Erik for $8.60, going once . . . going twice . . . sold!” The count should give all bidders reasonable time to pipe up. Players may occasionally interrupt the countdown by asking for brief time-outs, but this privilege should not be abused, lest the abuser be subject to taunts, scorn, and mockery.

4. The statkeeper will send out an update with standings and commentary every round.

Friday, December 16, 2005

Book review: Martin Seligman et al., The Optimistic Child

The argument of The Optimistic Child: A Proven Program to Safeguard Children Against Depression and Build Lifelong Resilience--one that is both more and less powerful than Seligman claims, as I will explain--runs something like this:

  • When people encounter adversity, they are affected by it in varying ways according to the way they explain the adversity to themselves. Seligman (and his research team) calls these narratives explanatory styles.
  • People with an optimistic explanatory style characterize adversity as changeable and specific, with behavioral solutions. An optimist explains doing poorly on a math test, for instance, as only one instance with specific causes (such as not studying effectively or enough). The optimist's response to the situation is to focus on behavioral changes (such as studying differently or more) that will produce better results. The pessimist, on the other hand, sees adversity as permanent and personal and responds passively. To a pessimist, the same poor math score will prompt a response such as "I can't do math"; the explanation attributes adversity to an aspect of the person's static character and leaves no room for productive responses.
  • As he explains the previous point, Seligman is careful to contrast his approach with the "self-esteem" movement, which he accuses of fostering falsely happy explanations that actually encourage children to develop pessimism as they perceive the insincerity of their parents and teachers.
  • Returning to the linear argument: depression is marked by a pessimistic explanatory style.
  • People can learn to recognize their own explanations, check their pessimistic self-commentary against reality, and develop a reality-based optimism.
  • When children practiced changing their explanatory styles in studies developed by Seligman and his team in Philadelphia-area schools, those children appeared to enjoy long-term increases in optimism and therefore decreased risk of depression.


This seems to me a compelling argument in many ways. Most importantly, the opening chapters of the book give parents concrete, readily applicable advice about how to speak to children who have just experienced disappointment or failure. My guess is that most parents and other people who work with young children will find those early chapters well worth the small amount of time they require to read.

Most of the latter part of the book describes the Philadelphia experiments in detail and offers many practical materials for helping children identify and modify their explanatory styles. Readers other than parents and child-care workers will probably find themselves skimming or skipping much of this material; the meat of the argument is clearly stated early on.

I said above that I find this argument both less and more powerful than it claims to be. It is less powerful because Seligman does not articulate the extent to which the book itself is a piece of optimistic explanation of an adverse phenomenon, adolescent depression, that lends itself only partly to that optimism. Seligman only occasionally acknowledges the biological components of depression--precisely the permanent, personal factors that would emerge in a pessimistic explanatory style. The data presented in the book is impressive, but Seligman does very little to help parents and teachers cope with the significant incidence of pessimism and depression that persists even in the groups trained by Seligman's team itself. As this book points out (very late), there is some evidence that mild pessimists perceive situations more accurately than optimists, and the book does little to counter suspicions that much pessimism and depression lurks just to the side of its explanatory model. This is not to take anything away from the remarkable achievement of offering any demonstrably effective techniques for countering adolescent depression, only to say that by the authors' own account, that effectiveness is limited more than the book's tone sometimes implies.

Those concerns arise only in the specific context of depression prevention, however. Seligman's argument is more powerful than it claims to be in the generality of its application. This book, quite unintentionally, explains concretely how teachers, managers, and colleagues can be simultaneously demanding and encouraging. I have said of my own best teachers that they managed to convey great faith in their students even, or especially, when holding them strictly to high academic standards. (I have seen a highly respected professor's response to a student paper opening, "I know you can write better than this!" Note that this is an optimistic statement, harshly critical of the piece by way of praising the person.) The same is true of inspiring managers and colleagues, of course; few characteristics are as socially valuable as the ability to speak frankly about ways that work can improve in the context of sincere personal support.

I therefore come to this odd evaluation of The Optimistic Child: it is a good book about childhood depression, though not as good as it might be. It is a much better book, especially in its opening chapters, about how to talk to people, from students to friends. I recommend that section highly.

Saturday, November 19, 2005

Book review: Freakonomics

In a recent post on EconLog, Bryan Caplan wrote, "while there are many reasons why economics is the most successful social science, willingness to say what people don't want to hear is near the top." Such a willingness is certainly the chief merit of Freakonomics: A Rogue Economist Explores the Hidden Side of Everything, by Steven D. Levitt (the eponymous economist--ha!) and Stephen J. Dubner, an admiring journalist. Levitt's work addresses inflammatory issues such as abortion and urban gang culture; his conclusions are refreshing because he disdains ideological influence. His convincing argument that legal abortions have reduced crime rates, for example, will unsettle readers of all political affiliations; the hypothetical lives of aborted fetuses do not fit well into the pro-choice emphasis on pregnant women's control of their bodies, and legal abortion's power to reduce crime more than conventional measures (aside from increasing the number of police) radically upsets some conservative ideas of being "tough on crime." The abortion chapter of Freakonomics fairly begs to be misrepresented for ugly political purposes, as indeed it has been. The political risk of approaching these issues apolitically is cause for admiring the courage of the authors.

It is difficult, however, to avoid the sense that the most sympathetic reader will struggle to admire the authors as much as they appear to admire themselves. "An explanatory note" at the beginning of the book explains Dubner's view that "many economists" speak "English as if it were a fourth or fifth language" and Levitt's sense that the "thinking" of "many journalists" is not "very . . . robust, as an economist might say" (x, ellipsis in original). Happily for us, the note explains, we are reading the work of exceptions to these rules: "Levitt decided that Dubner wasn't a complete idiot. And Dubner found that Levitt wasn't a human slide rule" (x).

The authors' patting of their own backs will put off some readers, but it should not take away from the insights of the book. I mention it because this is not merely a matter of tone: in fact, authorial self-congratulation is the thesis of the book. One of the great marketing triumphs of Freakonomics is the ability of the authors to persuade many readers that the lack of an overarching argument is a feature, not a bug--the explanatory note ends with a philosopher asking about Levitt, "Why does he need to have a unifying theme? Maybe he's going to be one of these people who's so talented he doesn't need one" (2). Or as Malcolm Gladwell puts it on the back cover, "Steven Levitt has the most interesting mind in America." The same idea is reflected in the title: "freakonomics" is a nonsense word, a term used in the book and even more on the authors' blog to stand roughly for stuff that Levitt has argued or that Levitt and Dubner find interesting. Returning to the book's lack of a unifying theme in an epilogue, Levitt and Dubner do make a claim for a "common thread" of "thinking sensibly about how people behave in the real world," and they suggest that readers "might become more skeptical of conventional wisdom" for having read the book (205).

It is hard to argue with such broad claims, but it is worth pointing out that other books have similarly challenged conventional wisdom in ways that more actively assisted readers with their own investigations. Leaving aside the philosophical heavyweights in the skeptical line--Plato's Socrates and Hume, among many others--consider a few recent examples. When Bill James published his Baseball Abstract, he made claims like Levitt and Dubner's for the value of a general skepticism of conventional wisdom. Though James's works are not unified by theme, they do not take on all of everyday life, and their focus on baseball allowed a new kind of analysis to flourish after James's example--his unconventional wisdom has taken hold even as many later analysts have revised the specifics of his claims. Or when Thomas Gilovich published How We Know What Isn't So: The Fallibility of Human Reason in Everyday Life, he not only transformed his readers' understanding of many everyday phenomena, but he also explained the specific mechanisms that caused conventional wisdom to go astray. And Gladwell himself, in The Tipping Point: How Little Things Can Make a Big Difference, offered not only a theory of counterintuitive "social epidemics" but also a mechanism of transmission through people called connectors, mavens, and salesmen--a widely applicable framework through which to understand other issues.

There is no equivalent framework in Freakonomics. The authors call this a lack of a unifying theme, but lack of thematic unity is a literary defect. The limitations of Freakonomics lie rather in the absence of larger scientific ideas: Levitt's argument about the relationship between legal abortion and crime rates is an argument only about that issue. The arguments of Freakonomics apply so narrowly that, while I do recommend reading the book for its key points--especially chapters three through five, on the economics of drug dealing, abortion and crime rates, and advice given to new parents, respectively--I cannot recommend reading it before the best works of the other authors I have mentioned, from Plato to Gladwell.

Sunday, October 16, 2005

On Grade Inflation

Two unrelated events have called my attention to grade inflation recently. One was a post on Grinnell Plans from a current student who had seen a chart demonstrating the upward drift of Grinnell's grades over the last decade. Essentially, the overall mean GPA has risen from roughly a B grade to roughly a B+ grade--a large change for such a short time, and as I understand it, a fairly typical change over the same stretch in many colleges and universities. The second was a detailed post by Steven Willett on NASSR-L, an email list populated by a couple of thousand people interested in Romanticism, mostly graduate students and professors in the field. Willett is a contrarian and a traditionalist who frequently attacks the state of his profession on the list; in this post, he resisted arguments minimizing the existence and consequences of grade inflation by citing a range of studies on the issue. One of those studies caught my attention because it resisted the moralizing I find tiresome on both sides of inflation debates and offered some insight into the mechanisms of grade inflation. This is Willett's quotation of the summary of that study, by Donald G. Freeman, published in 1999 in the Journal of Economic Education:

"My hypothesis is that, given equal money prices per credit hour across disciplines, departments manage their enrollments by 'pricing' their courses with grading standards commensurate with the market benefits of their courses, as measured by expected incomes.

"I analyzed grade divergence using a cross-section of 59 fields of study from a recently published survey of college graduates by the National Center for Education Statistics, A Descriptive Summary of 1992–93 Bachelor’s Degree Recipients: 1 year later (NCES 1996). The survey tracks 1992–93 college graduates to determine outcomes from postsecondary education, including returns to investment in education. Using this sample, I found evidence consistent with the economic explanation of grade divergence: Graduates from high-grading fields of study have lower earnings than graduates from low-grading fields of study. This is true even when controlling for factors such as student ability and experience" (344-45).

Fascinating! Other bits from Willett's post (drawing on other sources) flesh out some of the details underlying this hypothesis: music and education departments tend to give particularly high grades, for instance, and the latest wave of grade inflation has affected the humanities more than the hard sciences, but English and biology in particular more than mathematics. It seems to me that the place of education among particularly high-grading disciplines deserves a good deal of consideration--and perhaps has received it in research I simply haven't read. I'll extend that disclaimer to what follows; my speculations may be supported or contradicted by research I don't know. This isn't one of the books I'm writing.

So here's a starting point. Grade inflation is real, across the board in higher education. Giving higher grades produces higher evaluations for teachers, when other factors are controlled (other studies show). Grade inflation varies by discipline. Grade inflation comes in spurts, one of which occurred roughly around 1970 and one in the last ten years.

I find Freeman's hypothesis--that departments whose majors generally earn little money compensate by awarding high grades--fascinating and largely supported by my intuitions. However, I am prompted to look for further explanations for three reasons. First, a bad reason: Freeman's hypothesis does not match how I've seen professors talk about their grading. I call this a bad reason because of the obvious potential for self-deception or deceptive self-marketing here. The second is that there are some exceptions to the rule that I know off the top of my head: when I was at Penn, the ultra-prestigious Wharton School (business) had a reputation for giving high undergraduate grades, and indeed, a web search confirms that its introductory course has a mandated median grade of B+ in each section, which is especially high for an introductory course, where grades are generally lower than in advanced courses. Similar cases abound in related areas, such as the most prestigious law schools, where students with the highest expected earnings get very high grades. The third reason is that the logic of expected earnings does not apply to institutions; the most prestigious colleges and universities, whose graduates have the highest expected salaries, have experienced grade inflation along with everyone else. For all these reasons, I suspect Freeman is largely correct but that other factors are also in play.

(Side note: I feel no professional self-interest in this issue. My grades are a little lower than average for Grinnell, as I suspect my department's are, and student comments about my grading reflect that. I am neither an apologist for today's grading levels nor an indulger of nostalgia for yesterday's lower ones. I do want to understand how and why my profession employs grades.)

I offer three hypotheses about those other factors:

1. The growing emphasis on revision allows students in some courses to receive higher grades given the same talent, application, and academic standards. I claim no original insight here, but I mention this factor because so many discussions of grade inflation assume that higher grades must imply better student work or lower academic standards. Allowing students to earn higher grades through revision, however, allows teachers to award higher grades while still feeling that students have received honest feedback on their work. Since many pedagogical studies support the learning outcomes of revision-based writing, this can produce a kind of guiltless grade inflation. I'll come back to this point.

2. Elite colleges and universities can use grade inflation to shift employers and graduate schools from statistical evaluations of transcripts to a self-serving prestige market. If every college and university enforced a strict 2.0 median grade, evaluators would compare transcripts by using implicit prestige adjustments--perhaps a 2.5 GPA from a highly selective institution would be roughly equivalent to a 3.1 at a less selective institution. I've seen the application of this kind of unofficial adjustment many times. If practically everyone graduates from Harvard with honors, however (as is the case), then Harvard has created a situation where most of its students cannot be outperformed in transcript reviews. Shifting all grades close to 4.0 forces evaluators to discount grades themselves, thus increasing the importance of the institutional reputation. Harvard has a great deal to gain from grade inflation, and less selective institutions can only play along--if UMass intentionally lowers grades as Harvard inflates them, UMass only hurts its graduates even more relative to Harvard's. Colleges and universities that have the highest stake in maintaining the importance of institutional prestige also have a strong incentive to keep overall grades high. And the least selective institutions are facing pressure to keep marginal students enrolled (to maintain government support based on enrollment levels).

3. The recent inflation of grades coincides with a significant weakening of tenure. Most college courses are now taught by people who are not tenured or tenure-track. Teachers who are untenured but on the tenure track (including me, for whatever that's worth) may feel some pressure to use high grades to raise the level of student evaluations, but that pressure is limited by the relatively large sample of evaluations and many other factors that go into tenure reviews. I would find a reputation for low standards much more dangerous to my tenure prospects than slightly lower average teaching evaluations. I know circumstances vary, but I think the key here is graduate and adjunct teachers whose piecework employment depends heavily on the student evaluations of any given semester. Such teachers often see their professional lives in the hands of administrators unconstrained by full review processes, administrators who need to care a great deal about student and parent satisfaction and not as much about teachers' other contributions to their institutions and professions. If grade levels are a small but significant factor in student evaluations of teaching, piecework teachers are extremely vulnerable to giving higher grades out of real or perceived self-preservation.

Taking all these factors into consideration, I offer my own hypothesis about the grade inflation of the last decade. We are seeing the confluence of multiple, independent incentives that all point in the direction of higher grades: a dramatic increase in reliance on teachers with tenuous employment, defensible mechanisms of raising grades without changing underlying standards, and institutional incentives for every kind of institution to keep overall grades high.

Tuesday, October 04, 2005

Writing on Research Leave

For scholars who want or need to publish their research findings, no question produces more opinions, self-doubts, and superstitions than this: given the demands of full-time teaching and personal or family life, how do you get the writing done? I imagine the same kind of question applies to anyone who tries to get long-term projects done when those projects compete for time with smaller, deadline-driven tasks.

Most commentary on this question runs along the lines of Tara Gray’s findings in Publish & Flourish: write daily for 15 to 30 minutes and share your progress with someone, says Gray. Her research (and other similar research) backs up the idea that writing a little at a time consistently will produce more than writing in isolated big blocks of time. In a non-academic context, Jeff Covey’s idea of the Progressive Dash relies on many of the same principles. Start the day with a minute or two of attention to all your priorities, says Covey, and then allocate more and more time to them as the day goes along. Both Gray and Covey depend on the notion that simply keeping some kind of momentum is more than half the battle of completing projects. My experience certainly confirms those conclusions, though I have not always been able to live out my convictions and write during heavy teaching semesters.

Right now, however, I want to answer a different question that grows out of the luxury of having a year to write without teaching: given the demonstrated effectiveness of doing a little research work every day, how can we apply the same ideas to making the most of a dedicated stretch of research time?

Early in my year of leave, I’m trying to learn from my experiences as they come to maximize my writing during the rest of my leave. The past two weeks really set me thinking. In the first, I got some work done, but my baby got sick in the middle of the week, and a few other complications (logistical and psychological) clearly set me back, and I didn’t do as much as I wanted to. In the second, I had perhaps the most productive week I’ve ever had in the U.S., working through and taking the notes I needed on about 5,000 pages of commentary. (As any researcher will understand, that doesn’t mean I read 5,000 pages carefully. I went through a stack of material, found what I needed, and paid close attention to those sections.) The work happened as I was taking care of my baby son—while he was sleeping, sometimes while he was playing happily in front of me, and while he was at day care for three or four hours a day.

I’ve worked that well in London before, spending long days at the British Library. Other researchers tell me of similar experiences, where going to a new place for a research trip allows them to overcome their usual limits. So why did the home routine suddenly become as effective as a remote archive for me last week?

My guess, oddly, is that the baby—crucially, during a relatively happy and healthful week—substituted for the plane trip to London. Archival trips work so well, I propose, because they give a scholar a sense of time being both abundant and scarce: abundant in that days are set aside entirely for research, scarce in that the scholar knows that the demands of daily life will wake from their sleep in a few days or weeks. The same combination of abundance and scarcity could explain the effectiveness of the day-by-day approaches I cited above: writing 15-30 minutes a day takes the pressure off any given writing session, since writing days become abundant when one writes every day, but the brevity of each session enforces scarcity—at the end of such a short session, a writer almost always wants to say more.

A research leave provides a sense of abundant, unbroken time for writing. That’s a luxury but a frequently overwhelming one because of the tendency to write to exhaustion. It’s easy to write until you’re sick of writing—and therefore to feel sick of writing most of the time you’re not doing it. The baby takes care of that problem by providing major and largely unpredictable interruptions to my day. When I stop writing, I stop because I have to, and I generally want to do more. I keep thinking about the work, sometimes scribbling an idea with one hand while holding the baby in the other. Even though I can end up spending 6-10 hours on my work, the baby ensures that time feels scarce anyway.

I don’t recommend baby care as a productivity enhancer, but I do think some of what happened in that exceptionally smooth week might carry over into a more general approach. Keeping a sense of other priorities, writing in short sessions, having expected but somewhat unpredictable interruptions, and generally avoiding the feeling of quitting from exhaustion might contribute to maintaining the productivity of short writing blocks in the more open-ended context of research leave. I’ll be thinking about this more as the year progresses.

Monday, October 03, 2005

Baseball MVP talk: quality, value, and chance

For a starting point, I'll take this column by Sean McAdam supporting David Ortiz over Alex Rodriguez for MVP in the American League.

Now, this is an unusually stupid column. A writer who says that "it's impossible to imagine that anyone could be more valuable to his team than David Ortiz is to the Boston Red Sox" is simply not taking language seriously. Sadly, however, the column does seem to reflect the level of thinking among most writers who explain their votes--and the writers elect the MVP.

First, I'm going to articulate what I think would be the traditional "stathead" position on McAdam's column, a position I support almost entirely, and then I'll explain a complication I've come to consider in the statheaded approach.

The most fundamental problem with McAdam's argument is that he's using statistics as an advocate rather than as an analyst. He cites a hodgepodge of stats, ranging from those that do a good job of measuring individual hitting production (slugging percentage) to traditional triple-crown stats that have long been shown to be lacking because they depend on teammates' performance (RBI) and exclude important information such as a hitter's walks and doubles. McAdam's standard is simply to cite the evidence that makes Ortiz look good. One name for that approach is intellectual dishonesty. Another is sports opinion journalism.

The problem is not that some sports opinion writers say thoughtless things or twist evidence to make their cases. They are paid to generate readership (or viewership), and partisan columns can serve that purpose well. But the need for a writer to present an original angle in a debate is directly at odds with the writer's function as a voter in the awards race. To analytical purists, the awards would ideally reflect the application of the best analytical practices we know of; thoughtful people can disagree about the details of the standards, but they must agree that an even-handed account of available evidence is the only reasonable starting point. But sports opinion writers can't do that, for reasons I'll return to.

Baseball offers analysts more objective evidence about individual performance than other sports do. In football, the performance of running backs depends on that of everyone else on the team--the rest of the offense has to create running opportunities, the coaches need to call running plays, and the defense needs to maintain control of the game to avoid a desperate pass-based comeback attempt. Baseball's pitchers and hitters, however, are almost entirely on their own, and the team-based elements of their performance are fairly easy to recognize and disregard in the data generated by baseball's uniquely long seasons. Therefore, statheads say that we can and should factor out statistics that depend on team performance (pitchers' W-L records, hitters' RBIs and runs scored) and test measures of individual performance based on their demonstrable effectiveness.

For hitters, the quick statheaded way to account for nearly all of offensive production is to add on-base percentage to slugging percentage, creating a stat called OPS, for "on-base plus slugging." As it happens, this year's MVP race is a no-brainer by that standard: Rodriguez led Ortiz easily in on-base percentage (so McAdam didn't mention that stat), and he overtook Ortiz in slugging at the very end of the season, finally leading Ortiz in OPS, 1.036 to .999. If Ortiz were a valuable defensive player, his contributions could still justify an MVP award, but, of course, defense is also in Rodriguez's favor: he played a solid third base every day while Ortiz did not take the field. Because defense hurts his argument, McAdam writes, "Defense has never been much of a factor in MVP voting. If it were, Ozzie Smith, Mark Belanger and Bill Mazeroski would have been serious contenders. They weren't."

But this is patent silliness: it's simple and accurate to say that hitting is more important than fielding, but fielding still counts for something--especially when one player plays a skill position, allowing his team to pack more offense into its lineup, and the other clogs the DH hole, robbing his team of offensive flexibility. For all these reasons, Rodriguez clearly had the better individual season, and the fact that I like the Red Sox and Ortiz better than the Yankees and Rodriguez won't change that. A good stathead applies the same standards every year and knows why those standards are better than others. By those standards, the MVP is A-Rod's, hands down. And the infuriating problem with the situation is that his case will be damaged because it's too easy to make: Rodriguez was widely considered the best player in the AL before the season started, and he played better than anybody else. Nobody's going to attract readers with that storyline. And that's why I believe that sports journalists should be stripped of their voting power; the conflict of interest is too great to overcome when voters explain their logic in print for money.
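For readers who want the definitions concrete, OPS is just arithmetic on a hitter's counting stats. Here's a minimal sketch; the season line below is invented for illustration, not Ortiz's or Rodriguez's actual 2005 numbers:

```python
def obp(h, bb, hbp, ab, sf):
    """On-base percentage: times reached base per plate appearance,
    using the standard AB + BB + HBP + SF denominator."""
    return (h + bb + hbp) / (ab + bb + hbp + sf)

def slg(singles, doubles, triples, hr, ab):
    """Slugging percentage: total bases per at-bat."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    return total_bases / ab

# Hypothetical season line (illustrative only).
ab, h, bb, hbp, sf = 500, 150, 80, 5, 5
doubles, triples, hr = 30, 2, 40
singles = h - doubles - triples - hr

ops = obp(h, bb, hbp, ab, sf) + slg(singles, doubles, triples, hr, ab)
print(round(ops, 3))
```

Note what the triple-crown stats miss: the 80 walks show up in OBP and the 30 doubles in SLG, but neither appears in batting average, homers, or RBI.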

Now here's a twist, where I'm going to diverge a little from statheaded methods. I've addressed the distinction between individual and team-dependent stats, but there's a third category: situational stats, which, for hitters, generally measure performance in "clutch" situations, variously defined: in the pennant race, at the ends of close games, with runners on, and so forth. Some such stats are easily dismissed: in a one-run game, a home run in the first inning is not less valuable than a home run in the ninth, even if the latter is more memorable. The more interesting question is how we should evaluate a single that drives in two runs versus a single with two outs and nobody on.

The statheaded approach, grounded in a lot of careful analysis, has been to contend that the two singles should count the same. At the major league level, hitters do not seem to have special "clutch" abilities; good and bad clutch performance in a given season seems to result mostly or entirely from chance variations rather than special psychological characteristics. If two hitters have similar seasons and one happens to drive in more runs (because of timely hitting rather than more opportunities), statheads say that the difference essentially doesn't count because the hitter could not control it. You shouldn't get credit for luck.

And that was my position, without reservations, for a long time. But about four years ago, in research summarized here, Voros McCracken introduced what he calls DIPS, based on a compelling thesis that pitchers can control a few factors consistently (strikeouts, walks, and home runs allowed), but the number of fair balls that drop for hits against them is largely random. The details are beside the point here; the short version is that McCracken introduced the idea that we can separate a pitcher's performance from his results: if two pitchers each allow four runs per nine innings (and all else is equal), McCracken's method might tell us that one of them was lucky and one unlucky--they had the same results, but one pitched better.
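McCracken's separation of performance from results can be sketched with the stat now usually called BABIP (batting average on balls in play), which the DIPS approach treats as largely luck for pitchers. The two pitcher lines below are invented to make the point, not real seasons:

```python
def babip(h, hr, ab, k, sf):
    """Batting average on balls in play: hits on fair, non-home-run
    contact divided by balls put in play against the pitcher."""
    return (h - hr) / (ab - k - hr + sf)

# Two hypothetical pitchers with identical DIPS skills--same strikeouts
# and homers allowed--but different luck on balls in play.
lucky   = babip(h=160, hr=25, ab=700, k=150, sf=5)
unlucky = babip(h=190, hr=25, ab=700, k=150, sf=5)

print(round(lucky, 3), round(unlucky, 3))
```

By McCracken's logic, the gap between these two figures is mostly chance: if their runs allowed differ only because of it, they pitched equally well even though one got better results.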

This insight is extremely valuable to people investing in baseball players for the future--you want the guy who really pitched better on your team next year, not the guy who got lucky. The consequences of this approach raise a troubling issue for individual awards based on the past, however: these two pitchers were, demonstrably, equally valuable to their teams, but we can reasonably say that one of them pitched better. And the logic underlying everything I said above is that being better and being more valuable are the same. By traditional stathead logic, in which we credit players only for achievements stripped of demonstrably random effects, we could give Cy Young awards based on normalized hypothetical results for pitchers rather than what opposing hitters actually did against them.

I'm not ready to do that, so to be consistent, I must entertain this question: if David Ortiz was blessed by fate in ways that enabled his performance to benefit his team because of chance, should he get a little credit for that? By McCracken's logic, I'm giving that kind of credit every time I compare pitchers by ERA.

Honestly, I still don't want to give Ortiz bonus points for pleasing Fate, and I certainly don't think such credit should overcome a clear-cut MVP choice like that of Rodriguez over Ortiz. But I do think our new insights into evaluating performances separately from the results they produce raise serious theoretical questions about statistical analysis of sports performance.

Thursday, September 01, 2005

Facing Down Katrina, Bush Declares War on Wet

September 1, 2005

WASHINGTON, D.C. -- Faced with increasingly dire conditions in the area devastated by Hurricane Katrina and mounting pressure to act decisively, President Bush Wednesday offered a sweeping initiative: a new American War on Wet.

"We have been hit again," said the President, on a podium flanked by Secretary of Homeland Security Michael Chertoff, Defense Secretary Donald Rumsfeld, Secretary of State Condaleeza Rice, and A.G. Lafley, CEO of Proctor & Gamble, which manufactures Bounty brand paper towels. "Now it is time to respond. Not just in New Orleans and its environs, but everywhere this threat can bring us harm. Today begins America's War on Wet!"

The President did not comment on the large-scale strategic details of WOW!, the acronym given in official White House press releases. He did, however, give one example of a way citizens can begin to participate symbolically. "Starting tomorrow, I hope every household in America will display their solidarity with this initiative by placing a single bowl of water in a visible place. If we all do that, then wherever you go in this great country, you will see the resolve of water cracking as it dries up, hour by hour, day by day. We will stand for no result but victory."

Pressed on the unconventionality of declaring war on an adjective, senior White House staff defended the decision. "Before President Bush took office, some thought America could only declare war on nations, or at least groups of people," said one such source, speaking on condition of anonymity because of the security issues involved. "We have by turns demonstrated that we can wage war against a tactic--terrorism--or even the feeling of terror itself. Far from overreaching, fighting an adjective is an obvious next step."

Multiple sources close to leading Democrats said that the party's leaders privately express reservations about WOW!, citing the difficulty of maintaining human life without regular water intake, but no Democrat has yet been willing to oppose the President publicly, and many have spoken out to support him. "This is not a time for politics," said 2008 Presidential hopeful and Delaware Senator Joseph Biden, holding a large bowl of water. "We may have questions about the specific strategies of the President's plan as details emerge, but for now, we stand with the Commander-in-Chief and share his resolve."

Right-wing media personalities and religious leaders immediately sought to isolate Democrats from WOW!. Syndicated talk-show host Rush Limbaugh commented, "Everyone knows that with a Democrat in the White House, we might have maintained funding for FEMA's disaster response capabilities, the levees around New Orleans, and the readiness of the National Guard. But only President Bush could have come up with WOW!, and you can see the Democrats seething about that." Later in the day, Reverend Jerry Falwell, head of Falwell Ministries, added a demographic point. "Look at any electoral map," he said. "Where do liberals live? On the coasts. In river cities. In short, wet places. What do you find in the nation's deserts? Bibles and dry, dry sand. The Democrats talk a good game now, but it's just a matter of time before their real loyalties become clear." None of the 32 million registered Democrats who live near oceans, lakes, and rivers was available to comment on Falwell's allegations.

The President and his top staff will depart Friday for a series of speaking engagements, informally dubbed the "Like WOW!" tour, designed to foster public support for the newly formed initiative.