Category: Evidence-based Management

  • Dan Pink on Why Financial Rewards Suck for Motivating Performance

    Dan Pink makes a compelling case in his TED Talk that financial rewards undermine performance in tasks that require creativity and complex problem-solving.  As he says in the talk, this strong evidence about the negative effects of financial rewards runs counter to the assumptions embedded in nearly all major economic theory.  As other research by psychologists on confirmation or "my side" bias shows, we human beings have a tough time hearing and believing evidence that runs counter to our beliefs.

    P.S. Sally, thanks for telling me about this research.

  • The Roar of Sports Car Engines: 100% of Women — But Only 50% of Men — Respond to a Maserati With Increased Testosterone Levels

    I put up what I thought was an amusing and not especially original post on Wednesday afternoon that described a study showing that men who drove a new Porsche — but not an old Camry — responded with increased testosterone levels.  It got picked up by something called Hacker News and was at the top of their list of hot items for hours (I don't really understand how this thing works). The result was that it drove 5000 or 6000 hits to my blog and generated 35 comments there. By now, after some four years of blogging, I have learned that it is impossible to know what will take off.  And although it is fun when it happens, I have learned that when I stick to what feels interesting and authentic to me, I have the most fun and learn the most.  But based on reactions to my two testosterone posts, this is clearly something people seem to be interested in, and I confess the research on it intrigues and bewilders me.

    So, to add to my posts here and here, I have one more study about cars and T levels to add to the mix. Following a link that appeared in one of the Hacker News comments, I found a related study described at Telegraph.com in the UK, under the headline "Sound of a sports car engine arouses women."  Here is how the study is described: "The 40 participants listened to the recordings of a Maserati, a Lamborghini and a Ferrari, along with a Volkswagen Polo, before having a saliva specimen collected."

    I have much less information about the nuances of this research than the other two studies, but on the face of it, the evidence seems to be that women respond more strongly than men to the sound of cars, and that the response differs across cars. Note this excerpt:

    The results found 100 per cent of female participants had a significant increase in testosterone secretion after listening to the Maserati, compared to only half for men. Men fared better at the sound of a Lamborghini, with 60 per cent showing a testosterone increase.

    Psychologist David Moxon, who conducted the study commissioned by motor insurer Hiscox, said: "We saw significant peaks, particularly in women."

    "The roar of a luxury car engine does cause a primeval physiological response." He added the sound of an average car engine actually led to a decreased level of testosterone.

    I promise this is my last post on T levels for a long time.  I just couldn't resist this one.

    P.S. Check out Ellie's comment. She raises excellent points about the legitimacy of this research. I am trying to contact David Moxon to see if he can share the original data and research report with us; I hope he answers. Once again, to be clear, the other two testosterone studies were published in a top peer-reviewed journal, and while they are imperfect, they are carefully done, the authors are careful not to overstate claims, and they acknowledge flaws and alternative explanations for their findings.

  • More on Testosterone Levels: Driving a Porsche vs. Toyota Camry

    A couple of weeks back, I put up a post on Testosterone Levels, Top Dogs, and Collective Confidence, which described a study showing that groups enjoyed more collective confidence when the people with higher levels were at the top of the pecking order and those with lower levels were at the bottom (compared to "mismatched" groups where the top dogs had low levels and the underlings had high levels).  There were some extremely thoughtful comments on that post, including a comment that "T" levels, as researchers call them, are heavily influenced by situational factors. Well, to that point, it turns out — as I learned from the always useful BPS Research Digest — that this article was one of a set published in a special issue of Organizational Behavior and Human Decision Processes on "The Biological Basis of Business." I read through the table of contents for the issue and came upon a study that just cracked me up, on the effects of driving a Porsche vs. a Camry.

    It is called "The Effects of Conspicuous Consumption on Men's Testosterone Levels" and was conducted by Gad Saad and John Vongas of Concordia University.  Here is roughly what they did (I am focusing on the first of the two studies in the article). They had 39 young heterosexual men drive both "a 2006 Porsche 911 Carrera 4S Cabriolet estimated to be worth over $150,000" and a "a dilapidated 1990 Toyota Camry wagon having over 186,000 miles,"  each for an hour,split evenly between city and and highway driving.  They randomly assigned subjects to driving either the Porsche or Camry first.  They took a total of six  "T" samples from each young man at various stages on the process. Most crucial for our purposes are the changes in "T" that occurred after driving the Porsche vs. the Camry, but also relevant are the two "baseline" samples taken before and after the experience.

    The effect was that driving the Camry did not seem to lead to a significant change in T levels, but — no doubt to the delight of many people and perhaps the disgust of many others — the young guys who drove the Porsche experienced significant and substantial increases in T levels after driving it (8 of the original 39 were excluded from the data analysis because their saliva samples were tainted by excessive blood in their mouths, leaving a final sample of 31).

    Here is the key table:

    [Table: Porsche vs. Camry]
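    For readers who like to see the mechanics, here is a minimal sketch of how a within-subject design like this is typically analyzed (this is my illustration, not the authors' actual analysis, and every number is invented): each man serves as his own control, so his T change after the Porsche drive is compared with his T change after the Camry drive using a paired test.

        # Hypothetical paired comparison for a within-subject design like Saad & Vongas's.
        # All numbers are invented for illustration; this is not the authors' data or analysis.
        import numpy as np
        from scipy import stats

        # Change in salivary T (post-drive minus baseline) for 8 hypothetical participants.
        delta_t_porsche = np.array([12.0, 8.5, 15.2, 6.7, 10.1, 9.4, 13.3, 7.8])
        delta_t_camry   = np.array([ 1.2, -0.5, 2.1, 0.3, -1.4, 0.8, 1.9, -0.2])

        # Paired test: the same men drove both cars, so we compare conditions within person.
        t_stat, p_value = stats.ttest_rel(delta_t_porsche, delta_t_camry)
        print(f"Paired t = {t_stat:.2f}, p = {p_value:.4f}")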

    I am not sure if these results are completely obvious and trivial or completely shocking and crucial.  I always had a sneaking suspicion that the "manly" feeling that comes from driving a sports car was nonsense promoted by car companies. But I guess it may have some truth.  Also, I want to commend the researchers for demonstrating a lot of creativity and for — despite the straight and serious academic writing — producing one of the most entertaining academic studies I have read in a long time. 

    P.S. Here is the citation: G. Saad, J.G. Vongas / Organizational Behavior and Human Decision Processes 110 (2009) 80–92

  • Intuition vs. Data-Driven Decision-Making: Some Rough Ideas

    A Stanford undergraduate doing a case analysis on using intuition versus systematic analysis wrote me an email last night to get my thoughts on the difference between the two, especially in light of the work that Jeff Pfeffer and I did on evidence-based management.  Below is my lightly edited response.  This is just off the top of my head (is it mostly intuition?).  I would love to hear your thoughts on this distinction — if it is useful, how the two concepts fit together, when one is more useful than the other, and so on:

    I don't think that intuition and evidence-based management are at odds. There are many times when decision-makers don't have very good data because something is new, because the situation has changed (e.g., where do you invest money right now?), or because what might seem like intuition is really mindless, well-rehearsed behavior that comes from years of experience at something, so even though people can't articulate the pattern they recognize, they are still acting on a huge body of experience and knowledge. And at the other extreme of experience, there are virtues to the gut reactions of naive people, as those who are not properly brainwashed may see things and come up with ideas that expertise drives out of their brains (e.g., that is why Jane Goodall was hired to observe chimps, in part, because she knew nothing).

    The trouble with intuition is that we now have a HUGE pile of research on cognitive biases and related flaws in decision-making showing that "gut feelings" are highly suspect.  Look up confirmation bias: people have a very hard time believing and remembering evidence that contradicts their beliefs. There is also the fallacy of centrality, a lot more obscure, but important in that people — especially those in authority — believe that if something important happens, they will know about it.

    My belief — and it is only partially evidence-based — is that intuition works best in the hands of wise people (this theme is all over Hard Facts). When people have the mindset to "act on their beliefs, while doubting what they know," so that they are always looking for contradictory evidence, encouraging those around them to challenge what they believe, and constantly updating (but always moving forward), then acting on intuition — on incomplete information, hunches, and tentative conclusions — can be right. Here is one place I've talked about it. Brad Bird of Pixar is a good example of someone with this mindset, as we learned when we interviewed him for the McKinsey Quarterly.  So is Andy Grove.  I think the most interesting cases to look at are those where people with a history of good guesses or gut decisions get it wrong: what mistakes has Steve Jobs made?  What about Google? Indeed, it is interesting that they believed they were going to crush Firefox with Chrome, but their market share remains modest a year later. My point here isn't to say anything negative about Jobs or Google — they have impressive track records, plus some history of the usual failures that all humans and human organizations suffer from.  Rather, my point is that by looking at errors made by people and firms with generally good track records, you can learn a lot about the conditions under which judgment fails, because you can rule out the explanation that they generally suffer from poor judgment.

    There is a lot written on intuition and the related topic of quick assessments — see Blink — and some supporting evidence, although Gladwell exaggerates the virtues of snap judgments, as the best are often made by people with much experience in the domain; but as always he makes wonderful points. Also see this book by David Myers for a balanced and evidence-based perspective on intuition.

    My view is that intuition and analysis are not opposing perspectives, but tag-team partners. Under the best conditions, hunches are followed and then evaluated with evidence, both quantitative and qualitative (that is another issue; qualitative data are different from intuition, and often better). Under the worst, hunches and ingrained behaviors are mindlessly followed and remain impervious to clear signs that they are failing.

    Work Matters readers: Again, I would appreciate your thoughts, as this is one of those core challenges for every boss and for a lot of behavioral scientists too!

  • I Am Just Like You

    A few days back, I wrote about David Dunning's book Self-Insight, which presents a compelling case that there are numerous impediments to self-awareness and that many of these roadblocks are mighty difficult to overcome. I am now on the last chapter, which contains some interesting ideas about how to increase our awareness of how skilled or unskilled we might be at things and our awareness of how others see us.  Dunning points out that a host of studies show that one major impediment to self-awareness is that people see themselves as unique — usually as superior to others — when they actually are not: as more ethical, emotionally complex, skilled, and so on.  Dunning proposes on page 166 that:

    "People would hold more accurate self-perceptions if they conceded that their psychology is not different from the the psychology of others, that their actions are molded by the same situational forces that govern the behavior of other people. In doing so, they could more readily learn from the experiences of others, using data about other people's outcomes to forecast their own."

    I find this quite fascinating. I believe that the average person would benefit from this perspective, but some industries would suffer — especially those that have a kind of Ponzi scheme quality where most people fail, a rare success happens now and then, but no matter what happens, the people who run the system always seem to benefit.  Both casino operators and venture capitalists come to mind.

    The implication, however, is that if we assume "I am just like you" rather than "I am special and different," or even that "we are all the same," we might make better decisions and learn at others' expense rather than our own. That strikes me as a lesson that could be quite valuable.  For example, I've been rather obsessed with the virtues and drawbacks of learning from others' mistakes rather than your own (see this post on Randy Komisar and Eleanor Roosevelt), as this question has huge implications for how to teach people new skills and the best way to develop competent and caring human beings.

  • Reducing Interruptions and Saving Lives: New Study on Drug Treatment Errors

    [Photo: the medication vest mentioned below]
    I have written here and in other places about Amy Edmondson's wonderful research on how, when nurses feel they have psychological safety, they openly talk about and try to correct drug treatment errors, but when they work in a climate of fear, they are afraid even to admit when they have made mistakes — which led to a rather bizarre finding in Amy's early research: in nursing units where people felt safe, even compelled, to talk about and learn from mistakes, they reported ten times more errors than in a nursing unit where the supervisor slammed nurses who admitted or were "caught" making mistakes.

    This morning's San Francisco Chronicle reports an equally fascinating study on reducing drug treatment errors. This one focuses on the evils of interruptions, which, as research by Gloria Mark shows, slow and undermine performance and create great job stress. As the article reports, "A UCSF program to improve accuracy in administering drugs – with particular emphasis on reducing interruptions that often lead to mistakes – resulted in a nearly 88 percent drop in errors over 36 months at the nine Bay Area hospitals, according to results being released today."  The cool thing about the article is that the nurses at different hospitals invented different local methods for reducing interruptions, from the vest you see pictured above, to covering windows so colleagues couldn't see them (and thus run in and interrupt them), to developing quiet zones or quiet times during drug administration.  Note that drug treatment errors are a huge problem, resulting in over 400,000 preventable injuries per year and $3.5 billion in costs. So an 88 percent reduction is huge.

    This research is also fascinating to me because it shows how, so often, when people say they are too busy, don't have enough money, or there will be resistance to change, these are excuses, or worse yet, negative self-fulfilling prophecies.  In particular, I think that people — especially managers — often use spending money as a substitute for thinking, when inexpensive and low-tech solutions work just fine.  I am looking forward to digging into this research further.

  • Flawed Self-Evaluations: David Dunning’s Fascinating Work

    Professor David Dunning of Cornell University, along with numerous colleagues, has done fascinating and sometimes discouraging research on self-awareness.  His most famous paper on the topic was published in 1999 with Kruger … check out the abstract of "Unskilled and Unaware of It."  I have known about it for a long time, but I have just discovered Dunning's book, Self-Insight: Roadblocks and Detours on the Path to Knowing Thyself.   This is a pretty pure academic book, but it sure is fascinating, and it should make all of us pause when we feel supremely confident about ourselves.  You can learn tidbits like these: people do a pretty bad job of guessing their own IQ scores and are downright awful at rating their ability to catch other people's lies; workers do a far worse job of assessing their own social skills than their superiors or peers do; in a survey of thousands of high school seniors, 70% of respondents rated their leadership ability as above average while only 2% rated it as below average; and — turning to my own profession — 94% of college professors say they do above-average work.

    Self-Insight also contains an update of the research from the 1999 article. The basic finding is that people with the worst skill levels at diverse tasks (ranging from debating skill to having a good sense of humor) consistently overestimate their abilities by huge amounts.  For example, people whose skill levels were at the 12th or 13th percentile usually estimated that they were in the 60th percentile of performance.  In contrast, people above the 50th percentile made far more accurate assessments — although the most skilled people tended to underestimate their relative skill a bit.

    The upshot of this rather famous work is that you should be wary of self-assessments in general, but especially wary of people who seem to be incompetent. As Dunning puts it, "The central contention guiding this research is that poor performers simply do not know — indeed cannot know — how badly they are performing.  Because they lack the skills required to produce correct answers they also lack the skills to accurately judge whether their own answers are correct." 

    The book has all sorts of great research and I found it a lot more fun to read than most academic books, but be warned that it contains a lot of studies and such.

  • Selecting Talent: The Upshot from 85 Years of Research

    I recently wrote about how the "talent wars" are likely to be returning soon to the U.S. (and indeed, there are signs they have already returned in places like China and Singapore), and how companies that have treated people well during the downturn will have an advantage in retaining the best people — and those that have not damn well better change their ways, or they will face the prospect of their best people running for the exits combined with an inability to attract the best people.  A related question has to do with the problem of determining who the best people might be: what does the best evidence say about the best way to pick new people?

    It is always dangerous to say there is one definitive paper or study on any subject, but in this case there is a candidate — a paper I have blogged about before when taking on graphology (handwriting analysis) — that just might qualify. It was published by Frank Schmidt and the late John Hunter in Psychological Bulletin in 1998. These two very skilled researchers analyzed the pattern of relationships observed in peer-reviewed journals during the prior 85 years to identify which employee selection methods were the best and worst predictors of job performance. They used a method called "meta-analysis" to do this, which they helped to develop and spread. The advantage of this method — in the hands of skilled researchers like Schmidt and Hunter — is that it reveals the overall patterns in the weight of the evidence, rather than the particular quirks of any single study.
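    To make that idea concrete, here is a minimal sketch of the "bare-bones" sample-size-weighted averaging at the heart of the Hunter and Schmidt approach; the studies and numbers below are hypothetical, purely for illustration.

        # Minimal sketch of a sample-size-weighted ("bare-bones") meta-analysis.
        # Each entry is (sample size N, observed correlation r between a selection
        # method and job performance). The studies and values are hypothetical.
        studies = [
            (120, 0.42),
            (45,  0.18),
            (300, 0.55),
            (80,  0.37),
        ]

        total_n = sum(n for n, _ in studies)

        # Weighted mean correlation: larger studies pull the estimate harder than small ones.
        r_bar = sum(n * r for n, r in studies) / total_n

        # Sample-size-weighted variance of the observed correlations around that mean.
        var_r = sum(n * (r - r_bar) ** 2 for n, r in studies) / total_n

        print(f"Weighted mean validity: {r_bar:.2f}")
        print(f"Variance across studies: {var_r:.4f}")

    The full method goes further, correcting for things like sampling error, range restriction, and measurement unreliability, but the sample-size-weighted average is the core move.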

    The upshot of this research is that work sample tests (e.g., seeing if people can actually do key elements of a job — if a secretary can type or a programmer can write code), general mental ability (IQ and related tests), and structured interviews had the highest validity of all the methods examined (Arun, thanks for the corrections). As Arun also suggests, Schmidt and Hunter point out that the three combinations of methods that were the most powerful predictors of job performance were GMA plus a work sample test (in other words, hiring someone smart and seeing if they could do the work), GMA plus an integrity test, and GMA plus a structured interview (but note that unstructured interviews, the way they are usually done, are weaker).

    Note that this information about combinations is probably more important than the pure rank ordering, as it shows which blends of methods work best. But here is also the rank order of the 19 predictors examined, sorted by validity coefficient, an indicator of how strongly each individual method is linked to performance (see the short note after the list for one way to read these coefficients):

    1. Work sample tests (.54)

    2. GMA ("general mental ability") tests (.51)

    3. Employment interviews — structured (.51) 

    4. Peer ratings (.49)

    5. Job knowledge tests (.48) Tests that assess how much employees know about specific aspects of the job

    6. T & E behavioral
    consistency method
    (.45) "
    Based
    on the principle that past behavior is the best predictor of future
    behavior. In practice, the method involves describing previous
    accomplishments gained through work, training, or other experience
    (e.g., school, community service, hobbies) and matching those
    accomplishments to the competencies required by the job.
    a method were past achievements that are thought to be important to behavior on the job are weighted and score

    7. Job tryout procedure (.44) Where employees go through a trial period of doing the entire job.


    8. Integrity tests (.41)  Designed to assess honesty … I don't like them but they do appear to work

    9. Employment interviews — unstructured (.38)

    10. Assessment centers (.37)

    11. Biographical data measures (.35)

    12. Conscientiousness tests (.31)  Essentially do people follow through on their promises, do what they say, and work doggedly and reliably to finish their work.

    13. Reference checks (.26)

    14. Job experience (years) (.18)

    15. T & E point
    method (.11)

    16. Years of education (.10)

    17. Interests (.10)

    18. Graphology (.02) i.e., handwriting analysis.

    19. Age (-.01)
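    To help read these numbers (my gloss, not Schmidt and Hunter's): each validity coefficient is a correlation, so squaring it gives a rough sense of the share of variance in job performance that a single method accounts for on its own. The sketch below just applies that arithmetic to a few entries from the list above.

        # Squaring a validity coefficient (a correlation) gives the proportion of variance
        # in job performance that a method accounts for by itself.
        # Coefficients are taken from the Schmidt & Hunter list above.
        validities = {
            "Work sample tests": 0.54,
            "GMA tests": 0.51,
            "Structured interviews": 0.51,
            "Unstructured interviews": 0.38,
            "Reference checks": 0.26,
            "Years of education": 0.10,
            "Graphology": 0.02,
        }

        for method, r in validities.items():
            print(f"{method:25s} r = {r:.2f} -> explains about {r**2:.0%} of performance variance")

    Even the best single method leaves most of the variance unexplained, which is one more reason the combinations of methods matter more than any single entry in the ranking.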

    Certainly, this rank-ordering does not apply in every setting.  It is also important to recall that there is a lot of controversy about IQ, with many researchers now arguing that it is more malleable than previously thought. But I find it interesting to see what doesn't work very well — years of education and age in particular. And note that unstructured interviews, although of some value, are not an especially powerful method, despite their widespread use. Interviews are strange in that people have excessive confidence in them, especially in their own abilities to pick winners and losers — when in fact the real explanation is that most of us have poor and extremely self-serving memories.

    Many of these methods are described in more detail here by the Society for Industrial and Organizational Psychology. Also note that I am not proposing that any boss or company just mindlessly apply this rank ordering, but I think it is useful to see the research.

    The reference for this article is:

    Schmidt, F.L. & Hunter, J.E. (1998). The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings. Psychological Bulletin, 124, 262–274.

    P.S. Note the corrections, thanks Arun! 

  • Squeaky Wheels, The Health Care Debate, and Student Complaints About Grades

    The academic year at Stanford has started and, although my main teaching isn't until next quarter, I am starting to review my courses and think about what changes I am going to make this year.  After thinking about last year, and some of the complaints I received about grades, I am thinking that I need to spell out my policy more strongly and clearly than before: if you complain about your grade on an assignment, I regrade the whole assignment, and your grade can go up or down.  This kind of policy is necessary in my classes because — especially for the engineering students I teach — doing well requires strong writing and creative skills, and grading such work is more subjective than the problem sets and other objective tests that students often get in other classes.  My final exam question, for example, is "Design the ideal organization. Use course concepts to defend your answers."  I have learned over the years that there seems to be little relationship between how much students complain and the quality of their work.  I sometimes think it is a personality characteristic.  More likely, however, there is a subset of students who have learned that the more they complain about grades, the better grades they get.

    Although I don't like student complaints, some compelling research shows there are considerable rewards for people who complain. This brings us to the health care debate, because there is good reason to believe that whatever system we end up with in the U.S., we ought to take the squeaky wheel problem into account — both to protect patients and to protect insurance companies.   There was a fascinating 2004 study published in the Annals of Emergency Medicine by Carole Roan Gresenz and David M. Studdert on the outcomes of approximately 3500 disputes filed by patients over insurance payments they received for emergency room visits (here is the abstract).   These data were provided by two of the largest Health Maintenance Organizations in the United States.  The researchers found that patients who filed formal complaints through the appeals process won more than 90% of the time — and the average size of the bill disputed was $1,107, so not exactly chicken feed.  The other lesson from this research is that people who did not appeal never got a penny — so squeaking definitely paid off. The policy questions are complex and I lack the knowledge to untangle them here. Many people do not appeal, so the lesson might be that it is cheaper for HMOs and other health insurance operations to underpay consistently and just cave in quickly when people do complain.  The result may be that a lot of people are unwittingly getting worse coverage than they deserve because they don't have the time, motivation, or information about the odds of success. And a related result might be that insurance providers have a system (not entirely of their own design… they are constrained by laws and rules) that is producing a massive number of complaints.
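    To put rough numbers on why squeaking pays, here is a back-of-the-envelope calculation using only the two figures from the study (a better-than-90-percent win rate and an average disputed bill of $1,107); treat it as an illustration, not a precise estimate.

        # Back-of-the-envelope expected value of filing an appeal, using only the two
        # figures reported for the Gresenz & Studdert study cited above.
        win_rate = 0.90            # patients won "more than 90% of the time"
        avg_disputed_bill = 1107   # average size of the disputed bill, in dollars

        expected_value_of_appealing = win_rate * avg_disputed_bill   # roughly $996
        expected_value_of_staying_quiet = 0.0                        # non-appealers "never got a penny"

        print(f"Expected payoff from appealing:     ${expected_value_of_appealing:,.0f}")
        print(f"Expected payoff from not appealing: ${expected_value_of_staying_quiet:,.0f}")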

    The broader lesson, to go back to my grading and the squeaky wheel problem, is that there are probably too many incentives out there for all of us to complain… and if you are running an organization or system that you believe uses fair standards to judge people's merit, performance, or whatever — but people seem to be complaining constantly anyway — take a good look at how you respond to complaints. Do the squeaky wheels get the grease, whether they deserve it or not?

  • What are the Dumbest Practices Used By U.S. Companies?

    I am going to be in Singapore next week to do a number of things associated with the Singapore Human Capital Summit and something called the Distinguished Human Resource Visitors Programme.  I am doing an in-depth speech on innovation and implementation, and the link between the two, as part of the visit — stuff I have been writing and thinking about for a long time.  That talk is finished and I am just working on rehearsing it a bit.

    I am also part of the closing panel for the Human Capital Summit (you can see the line-up here if you are interested in the schedule — senior executives from P&G and DBS, the Minister of Trade and Industry from Singapore, and a professor from Insead).  I am developing a list of things that I might talk about, but plan to make my final decision at the last minute.  There are a few reasons for this delay.  My remarks are supposed to be informal; since it is a wrap-up for the conference, I want to be open to learning from it rather than go in knowing what I am going to say — experience be damned. And finally, to be a realist, I am going last after the first three speakers, and years of experience have taught me that, when you are in that position, you've got to be ready to cut back your remarks on the spot.  It used to upset me early in my career, but now I take it as a weirdly fun challenge (indeed, I once was asked in Dubai to cut a talk from 30 minutes to 8 with about 5 minutes' warning… I was amazed how easy it was to do and how much the audience seemed to like it. Ever since then, I've kind of enjoyed it when strange things happen during talks, so long as there isn't overt hostility or dysfunctional dynamics in the room).

    One thing I am thinking about emphasizing is that, despite the recent clear evidence that U.S. leaders and companies make flawed assumptions and use suspect management practices, some of our worst ideas seem to keep spreading to the rest of the world anyway.  I am brainstorming a list of dumb but widely used American management practices.  Here is my initial list, but I would love to hear more ideas:

    1. Dangerous Complexity.
      The assumption that when we can't understand an expert, they must be both smart and right.  This is certainly part of the Wall Street story — for years the financial wizards and economists have conveyed to the rest of us that we are far too dumb to ever understand what they are doing.  An interesting contrast, by the way, is JP Morgan CEO Jamie Dimon.  If you read Gillian Tett's Fool's Gold, you will see that one reason JP Morgan avoided the worst of the collapse was that Dimon believed that, if you were investing in something you couldn't understand, you should get out.  Clearly, most companies did not follow and are not following P&G's A.G. Lafley's advice to keep things "Sesame Street Simple."

    2. Dysfunctional Internal Competition.
      This is a big theme in The Knowing-Doing Gap and Morten Hansen's masterpiece Collaboration.  If you dig into the problems in the banks and a lot of other companies, they actually punish people who help others succeed, both via the reward systems and via who gets the most prestige.  This seems to persist even though the evidence against such assumptions and systems is so clear.

    3. Breaking Up Teams Constantly.  American companies often seem to love moving people around constantly, breaking up teams, giving people new experiences, and so on.  Certainly, there is a time for fresh blood, but if you read J. Richard Hackman's Leading Teams, you will see that the weight of the evidence is that breaking up teams less often rather than more often is linked to all sorts of effectiveness indicators.  Also, see this post about the Miracle on the Hudson where I discuss this literature.

    So, those are just three that I am toying with.  But I bet that you have a lot more ideas, and a lot of good ones. What do you think… what are the dumbest practices used by U.S. companies, the practices that unwittingly drive them to ruin or, probably more often, that they succeed DESPITE rather than BECAUSE of?

    P.S. A big thank you to everyone below for your thoughtful and wide-ranging comments. I think there is enough material here for a book, not just a short speech.  Also, I just found out this post was also put up on BNET, where there are another 35 or so comments, which are also — like those below — quite troubling, inspiring, and often funny in a twisted kind of way.