Authors: Duncan J. Watts
That the nature of the outcome should matter is about as commonsense an observation as one can think of. If great harm is caused, great blame is called for—and conversely, if no harm is caused, we are correspondingly inclined to leniency. All’s well that ends well, is it not? Well, maybe, but maybe not. To be clear, I’m not drawing any conclusion about whether Joseph Gray got a fair trial, or whether he deserved to spend the next fifteen years of his life in prison; nor am I insisting that all drunk drivers should be treated like murderers. What I am saying, however, is that in being swayed so heavily by the outcome, our commonsense notions of justice inevitably lead us to a logical conundrum. On the one hand, it seems an outrage not to punish a man who killed four innocent people with the full force of the law. And on the other hand, it seems grossly disproportionate to treat every otherwise decent, honest person who has ever had a few too many drinks and driven home as a criminal and a killer. Yet aside from the trembling hand of fate, there is no difference between these two instances.
Quite possibly this is an inconsistency that we simply have to live with. As sociologists who study institutions have long argued, the formal rules that officially govern behavior in organizations and even societies are rarely enforced in practice, and in fact are probably impossible to enforce both consistently and comprehensively. The real world of human interactions is simply too messy and ambiguous a place ever to be governed by any predefined set of rules and regulations; thus the business of getting on with life is something that is best left to individuals exercising their common sense about what is reasonable and acceptable in a given situation. Most of the time this works fine. Problems get resolved, and people learn from their mistakes, without regulators or courts of law getting involved. But occasionally an infraction is striking or serious enough that the rules have to be invoked, and the offender dealt with officially. Looked at on a case-by-case basis, the invocation of the rules can seem arbitrary and even unfair, for exactly the reasons I have just discussed, and the person who suffers the consequences can legitimately wonder “why me?” Yet the rules nevertheless serve a larger, social purpose of providing a rough global constraint on acceptable behavior. For society to function it isn’t necessary that every case get dealt with consistently, as nice as that would be. It is enough simply to discourage certain kinds of antisocial behavior with the threat of punishment.
Seen from this sociological perspective, it makes perfect sense that even if some irresponsible people are lucky enough to get away with their actions, society still has to make examples of violators occasionally—if only to keep the rest of us in check—and the threshold for action that has been chosen is that harm is done. But just because sociological sense and common sense happen to converge on the same solution in this particular case does not mean that they are saying the same thing, or that they will always agree. The sociological argument is not claiming that the commonsense emphasis on outcomes over processes is right—just that it’s a tolerable error for the purpose of achieving certain social ends. It’s the same kind of reasoning, in fact, that Oliver Wendell Holmes used to defend freedom of speech—not because he was fighting for the rights of individuals per se, but because he believed that allowing everyone to voice their opinion served the larger interest of creating a vibrant, innovative, and self-regulating society.
So even if we end up shrugging off the logical conundrum raised by cases like Joseph Gray’s as an acceptable price to pay for a governable society, it doesn’t follow that we should overlook the role of chance in determining outcomes. And yet we do tend to overlook it. Whether we are passing judgment on a crime, weighing up a person’s career, assessing some work of art, analyzing a business strategy, or evaluating some public policy, our evaluation of the process is invariably and often heavily swayed by our knowledge of the outcome, even when that outcome may have been driven largely by chance.
This problem is related to what management scientist Phil Rosenzweig calls the Halo Effect. In social psychology, the Halo Effect refers to our tendency to extend our evaluation about one particular feature of another person—say that they’re tall or good-looking—to judgments about other features, like their intelligence or character, that aren’t necessarily related to the first feature at all. Just because someone is good-looking doesn’t mean they’re smart, for example, yet subjects in laboratory experiments consistently evaluate good-looking people as smarter than unattractive people, even when they have no reason to believe anything about either person’s intelligence. Not for no reason, it seems, did John Adams once snipe that George Washington was considered a natural choice of leader by virtue of always being the tallest man in the room.
Rosenzweig argues that the very same tendency also shows up in the supposedly dispassionate, rational evaluations of corporate strategy, leadership, and execution. Firms that are successful are consistently rated as having visionary strategies, strong leadership, and sound execution, while firms that are performing badly are described as suffering from some combination of misguided strategy, poor leadership, or shoddy execution. But as Rosenzweig shows, firms that exhibit large swings in performance over time attract equally divergent ratings, even when they have pursued exactly the same strategy, executed the same way, under the same leadership all along. Remember that Cisco Systems went from the poster child of the Internet era to a cautionary tale in a matter of a few years. Likewise, for six years before its spectacular implosion in 2001, Enron was billed by Fortune magazine as “America’s most innovative company,” while Steve & Barry’s—a now-defunct low-cost clothing retailer—was heralded by the New York Times as a game-changing business only months before it declared bankruptcy. Rosenzweig’s conclusion is that in all these cases, the way firms are rated has more to do with whether they are perceived as succeeding than with what they are actually doing.
To be fair, Enron’s appearance of success was driven in part by outright deception. If more had been known about what was really going on, it’s possible that outsiders would have been more circumspect. Better information might also have tipped people off to lurking problems at Steve & Barry’s and maybe even at Cisco. But as Rosenzweig shows, better information is not on its own any defense against the Halo Effect. In one early experiment, for example, groups of participants were told to perform a financial analysis of a fictitious firm, after which they were rated on their performance and asked to evaluate how well their team had functioned on a variety of metrics like group cohesion, communication, and motivation. Sure enough, groups that received high performance scores consistently rated themselves as more cohesive, motivated, and so on than groups that received low scores. The only problem with these assessments was that the performance scores were assigned at random by the experimenter—there was no difference in performance between the high and low scorers. Rather than highly functioning teams delivering superior results, in other words, the appearance of superior results drove the illusion of high functionality. And remember, these were not assessments made by external observers who might have lacked inside information—they were made by the very members of the teams themselves. The Halo Effect, in other words, turns conventional wisdom about performance on its head. Rather than the evaluation of the outcome being determined by the quality of the process that led to it, it is the observed nature of the outcome that determines how we evaluate the process.
Negating the Halo Effect is difficult, because if one cannot rely on the outcome to evaluate a process then it is no longer clear what to use. The problem, in fact, is not that there is anything wrong with evaluating processes in terms of outcomes—just that it is unreliable to evaluate them in terms of any single outcome. If we’re lucky enough to get to try out different plans many times each, for example, then by keeping track of all their successes and failures, we can indeed hope to determine their quality directly. But in cases where we only get to try out a plan once, the best way to avoid the Halo Effect is to focus our energies on evaluating and improving what we are doing while we are doing it. Planning techniques like scenario analysis and strategic flexibility, which I discussed earlier, can help organizations expose questionable assumptions and avoid obvious mistakes, while prediction markets and polls can exploit the collective intelligence of their employees to evaluate the quality of plans before their outcome is known. Alternatively, crowdsourcing, field experiments, and bootstrapping—discussed in the last chapter—can help organizations learn what is working and what isn’t and then adjust on the fly. By improving the way we make plans and implement them, all these methods are designed to increase the likelihood of success. But they can’t, and should not, guarantee success. In any one instance, therefore, we need to bear in mind that a good plan can fail while a bad plan can succeed—just by random chance—and therefore try to judge the plan on its own merits as well as on the known outcome.
Even when it comes to measuring individual performance, it’s easy to get tripped up by the Halo Effect—as the current outrage over compensation in the financial industry exemplifies. The source of the outrage, remember, isn’t that bankers got paid lots of money—because we always knew that—but rather that they got paid lots of money for what now seems like disastrously bad performance. Without doubt there is something particularly galling about so-called pay for failure. But really it is just a symptom of a deeper problem with the whole notion of pay for performance—a problem that revolves around the Halo Effect. Consider, for example, all the financial-sector workers who qualified for large bonuses in 2009—the year after the crisis hit—because they made money for their employers. Did they deserve to be paid bonuses? After all, it wasn’t them who screwed up, so why should they be penalized for the foolish actions of other people? As one recipient of the AIG bonuses put it, “I earned that money, and I had nothing to do with all of the bad things that happened at AIG.”
From a pragmatic perspective, moreover, it’s also entirely possible that if profit-generating employees aren’t compensated accordingly, they will leave for other firms, just as their bosses kept saying. As the same AIG employee pointed out, “They needed us to stay, because we were still making them lots of money, and we had the kind of business we could take to any competitor or, if they wanted, that we could wind down profitably.” This all sounds reasonable, but it could just be the Halo Effect again. Even as the media and the public revile one group of bankers—those who booked “bad” profits in the past—it still seems reasonable that bankers who make “good” profits deserve to be rewarded with bonuses. Yet for all we know, these two groups of bankers may be playing precisely the same game.
Imagine for a second the following thought experiment. Every year you flip a coin: If it comes up heads, you have a “good” year; and if it comes up tails, you have a “bad” year. Let’s assume that your bad years are really bad, meaning that you lose a ton of money for your employer, but that in your good years you earn an equally outsized profit. We’ll also adopt a fairly strict pay-for-performance model in which you get paid nothing in your bad years—no cheating, like guaranteed bonuses or repriced stock options allowed—but you receive a very generous bonus, say $10 million, in your good years. At first glance this arrangement seems fair—because you only get paid when you perform. But a second glance reveals that over the long run, the gains that you make for your employer are essentially canceled out by your losses; yet your compensation averages out at a very handsome $5 million per year. Presumably our friend at AIG doesn’t think that he’s flipping coins, and considers my analogy therefore fundamentally misconceived. He feels that his success is based on his skill, experience, and hard work, not luck, and that his colleagues committed errors of judgment that he has avoided. But of course, that’s precisely what his colleagues were saying a year or two earlier when they were booking those huge profits that turned out to be illusory. So why should we believe him now any more than we should have believed them? More to the point, is there a way to structure pay-for-performance schemes that only reward real performance?
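The arithmetic of the coin-flip scheme is easy to check with a short simulation. This is only a sketch of the thought experiment above: the $10 million bonus comes from the text, while the symmetric ±$20 million profit-and-loss figure is an assumption chosen for illustration.

```python
import random

random.seed(0)

YEARS = 100_000          # simulated career-years (many, to average out noise)
PROFIT = 20_000_000      # assumed profit to the employer in a "good" year
LOSS = -20_000_000       # assumed (symmetric) loss in a "bad" year
BONUS = 10_000_000       # bonus paid only in good years, per the text

employer_pnl = 0
total_pay = 0
for _ in range(YEARS):
    if random.random() < 0.5:    # heads: a "good" year
        employer_pnl += PROFIT
        total_pay += BONUS
    else:                        # tails: a "bad" year, and no bonus
        employer_pnl += LOSS

print(f"avg employer P&L per year: {employer_pnl / YEARS:,.0f}")
print(f"avg pay per year:          {total_pay / YEARS:,.0f}")
```

Over the long run the employer’s gains and losses roughly cancel, while the coin-flipping trader’s pay averages out near $5 million a year—exactly the asymmetry the thought experiment describes.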
One increasingly popular approach is to pay bonuses that are effectively withheld by the employer for a number of years. The idea is that if outcomes are really random in the sense of a simple coin toss, then basing compensation on multiyear performance ought to average out some of that randomness. For example, if I take a risky position in some asset whose value booms this year and tanks a year from now, and my bonus is based on my performance over a three-year period, I won’t receive any bonus at all. It’s a reasonable idea, but as the recent real estate bubble demonstrated, faulty assumptions can appear valid for years at a time. So although stretching out the vesting period diminishes the role of luck in determining outcomes, it certainly doesn’t eliminate it. In addition to averaging performance over an extended period, therefore, another way to try to differentiate individual talent from luck is to index performance relative to a peer group, meaning that a trader working in a particular asset class—say, interest rate swaps—should receive a bonus only for outperforming an index of all traders in that asset class. Put another way, if everybody in a particular market or industry makes money at the same time—as all the major investment banks did in the first quarter of 2010—we ought to suspect that performance is being driven by a secular trend, not individual talent.
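Extending the coin-flip simulation shows why deferral dilutes luck without eliminating it. Everything here beyond the coin flip itself is an illustrative assumption: the ±$20 million swings, the 50 percent bonus rate, and the trailing (overlapping) three-year window.

```python
import random

random.seed(1)

YEARS = 300_000          # simulated years, for stable averages
SWING = 20_000_000       # assumed symmetric profit/loss per year
RATE = 0.5               # assumed bonus: 50% of the relevant positive P&L

annual_pay = 0.0
deferred_pay = 0.0
history = []
for _ in range(YEARS):
    result = SWING if random.random() < 0.5 else -SWING
    history.append(result)
    # single-year scheme: paid on any profitable year in isolation
    if result > 0:
        annual_pay += RATE * result
    # deferred scheme: paid only when the trailing three-year P&L is positive,
    # on the annualized (one-third) share of that three-year total
    window = history[-3:]
    if len(window) == 3 and sum(window) > 0:
        deferred_pay += RATE * sum(window) / 3

print(f"avg annual-scheme pay:   {annual_pay / YEARS:,.0f}")
print(f"avg deferred-scheme pay: {deferred_pay / YEARS:,.0f}")
```

Under these assumptions, a trader with zero skill still collects a substantial average bonus under the three-year scheme—roughly half the single-year figure, but well above the zero that true performance would warrant. Multiyear averaging shrinks the luck premium; it does not remove it.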