Expert Political Judgment
Philip Tetlock
This 2005 book made a splash when it was released because it tested the ability of experts to predict outcomes in various areas of finance, economics, and politics. Experts did very poorly; in fact, they did worse than chance alone. Well-informed non-experts actually did better than the experts, though still worse than chance. This book describes, in great detail, how this conclusion was reached.
In many aspects of life and business, we depend on experts to use their knowledge to help us make decisions. And in many cases, this means that we depend on their judgment. Experts want you to take their word for it, but what do you do when experts disagree? How do you evaluate their skill? This book examines these questions on a scientific foundation, and understanding that foundation can help you appreciate both the value and the limits of expertise.
If I ask you to make a prediction about tomorrow's temperature in Almaty, Kazakhstan, without any special information you could make one of three predictions: tomorrow's temperature will be definitely higher, definitely lower, or about the same. Without more information, the chance of each outcome is about 33%, but if you were going to place a bet, the safest bet would be "about the same." If you knew that it was November, you would increase the odds on definitely cooler; in April, you would be safer betting on warmer. This is a low level of expertise at work. This sort of pattern sets base-rate expectations. For many predictions (or judgments), the base rate should be the starting point.
Base rates come from historical data and developing trends. If I were going to predict the weather in Almaty tomorrow, a good starting point would be the average weather in Almaty on that day over the last 25 years. My prediction might improve if I also considered how the weather has been changing over the last 10 years. Has the high temperature been rising or falling in recent years? Has the frequency of rain been rising? Has the temperature been rising for the last few days? This sort of information gives you a base rate. Expertise begins to kick in when you use today's weather in Tashkent, Uzbekistan to predict tomorrow's weather in Almaty. There is still a degree of uncertainty in the prediction, and this is where judgment comes into play.
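To make the base-rate idea concrete, here is a minimal sketch in Python. The numbers are made up for illustration; a real version would load actual observations for the same calendar day.

```python
# A minimal sketch of a base-rate prediction for tomorrow's high temperature
# in Almaty. The data is hypothetical; a real version would load 25 years
# of observations for this calendar day.
from statistics import mean

# Hypothetical highs (°C) for this calendar day over the last 25 years, oldest first.
historical_highs = [9, 11, 8, 10, 12, 9, 10, 11, 13, 10,
                    12, 11, 10, 13, 12, 11, 14, 12, 13, 12,
                    13, 14, 12, 13, 14]

base_rate = mean(historical_highs)  # long-run average: the naive prediction

# One simple trend heuristic: shift the base rate by how much the last
# 10 years differ from the full 25-year record.
trend_adjustment = mean(historical_highs[-10:]) - base_rate

prediction = base_rate + trend_adjustment
print(f"base rate {base_rate:.1f}°C, trend-adjusted prediction {prediction:.1f}°C")
```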
Thousands of weather forecasts are made every day, and this allows forecasters to measure their success and failure rates and improve. As we all know, weather prediction has improved but is not yet perfect. Forecasting skill increases where outcomes are clear and occur soon after the forecast, so that the forecaster can compare the prediction to the outcome and learn. Many aspects of life do not lend themselves to this kind of learning, and there judgment becomes even more difficult to assess and develop. The author undertook a study of judgment in just such an area: political economy. This is the domain of economics, politics, finance, marketing, and social change. Information may be sparse or contradictory, and there may be long lags between prediction and outcome, so it is hard to build up experience with feedback. For example, it would be hard to find many precedents for something like the disintegration of the Soviet Union.
The author compares this sort of judgment to predicting the weather and asks, "How well do experts predict compared to well-informed non-experts, and compared to just picking the most likely outcome (which is often the average, or the same as current conditions)?" The author recruited about 275 people with expertise in political economy and asked them to make predictions about hundreds of issues; altogether, there were about 80,000 specific predictions. In every case, participants were asked to assign probabilities to a set of outcomes such that the total probability equaled 100%. In other words, it was not good enough to say that it would be warmer tomorrow. In the weather analogy, participants would be required to assign a probability to tomorrow's high temperature being more than 5° warmer, 2-5° warmer, within ±2°, 2-5° cooler, or more than 5° cooler. A participant might assign probabilities of 10%, 30%, 50%, 10%, and 0%, respectively. This could then be compared to the actual high temperature, and the average difference between predicted and actual recorded.
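Forecasts like these are scored with probability scoring rules in the Brier family, which is the kind of scoring the book relies on. Here is a small sketch of how one such forecast can be scored against the realized outcome, using the hypothetical temperature buckets above:

```python
# Scoring one probability forecast with a multi-category Brier score.
# Categories: >5° warmer, 2-5° warmer, within ±2°, 2-5° cooler, >5° cooler.
forecast = [0.10, 0.30, 0.50, 0.10, 0.00]   # must sum to 1.0
actual   = [0, 0, 1, 0, 0]                  # one-hot: "within ±2°" occurred

def brier(forecast, actual):
    """Squared distance between forecast and outcome; 0 is a perfect score."""
    return sum((f - a) ** 2 for f, a in zip(forecast, actual))

print(brier(forecast, actual))  # 0.36 here; lower is better
```

A forecaster who had put 100% on "within ±2°" would have scored a perfect 0 on this question, but 2.0 (the worst possible score) had any other bucket occurred, which is exactly the hedgehog's exposure.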
The overall summary is that people are not very aware of base rates, so they typically make poor predictions compared to simply picking the base rate. But people vary considerably in prediction skill: some are quite good and some are quite bad. In addition to making predictions, the participants were asked to complete numerous other questionnaires and to explain the basis for their predictions. This allowed the author to tie better skill to particular decision-making attributes or styles. The differences can be framed using the line "The fox knows many things, but the hedgehog knows one big thing," taken from an Isaiah Berlin essay. Basically, hedgehogs tend to have a grand theory of how things work and are both committed to and confident in their predictions. They tend to assign high probabilities to their predictions, often 100%, and this degree of extreme commitment makes them more likely to be wrong. Foxes tend to see the world as complex and somewhat unpredictable; small things can have a disproportionate effect on outcomes, so foxes regard many outcomes as uncertain. Foxes are much less likely to assign a 100% probability to an outcome. Because of this, foxes make better predictions than hedgehogs. To be clear, in many cases the experts agreed in their predictions and were correct; the difference becomes notable when outcomes diverge from predictions.
Because the participants represented a variety of specialties, the predictions of experts in their own domains could be compared to the predictions of experts from adjacent domains. For example, an economist's predictions about the economy could be compared to a political scientist's predictions about the economy. The author called these adjacent experts "dilettantes." Dilettantes tended to be better forecasters than hedgehogs. More precisely, a hedgehog dilettante predicting in an adjacent field did better than the hedgehog expert in that field. In an adjacent field, hedgehogs are less certain and behave more like foxes.
The concept of learning to make or recognize better decisions has been a long-running theme in psychology, but it also has many skeptics. Is there really such a thing as a good decision-making process, or is success mostly a matter of luck? If it is luck, then skill is an illusion. The book devotes many pages to this question. Condensed to its essence: in a highly uncertain domain like political science, experts make better predictions than non-experts because they know more, but the returns to expertise drop quickly. Near-experts may be better forecasters than deep experts. And moderately confident (as opposed to highly confident), non-ideological (in the sense of not being driven by a single theory) experts are much more likely to make good predictions.
In an increasingly complex world with many domains of uncertainty, we need expert opinion to guide our choices. We need to become both better consumers and better producers of expert opinion. The suggestion of this book is that extreme confidence in a prediction should be a red flag. An explanation of an opinion that rests on an untestable theory ought to be a red flag. The inability to adjust predictions when new information becomes available is a red flag.
One technique that has been proposed as a way to expand our predictions beyond our current thinking is scenario planning. The author tested the idea that scenario planning would make hedgehogs more like foxes by forcing them to consider more possible futures more explicitly. Scenario planning did open up hedgehog thinking somewhat, but it really opened up expert fox thinking: the inclination to entertain additional possibilities became greatly exaggerated, and expert fox predictions became quite poor. Dilettante foxes were barely affected. Apparently, expert foxes could construct good stories for even the most unlikely outcomes and talk themselves into assigning those possibilities probability. In effect, their assigned probabilities summed to 120-150% of the outcome space (hedgehogs also exceeded 100%, but only slightly). So another possible red flag is many possible outcomes, each supported by a detailed story. After correcting totals back to 100%, predictions either did not change (hedgehogs) or got worse (foxes). Scenario planning may not be a good tool for improving prediction quality.
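The "correction back to 100%" can be illustrated with a short sketch. Proportional rescaling is an assumption on my part; the book's exact correction method may differ.

```python
# Sketch: after scenario planning, an expert's probabilities may sum to well
# over 100%. Proportional rescaling (assumed here) brings them back to 1.0.
scenario_probs = {"status quo": 0.50, "reform": 0.40,
                  "collapse": 0.30, "coup": 0.20}   # hypothetical; sums to 1.40

total = sum(scenario_probs.values())
renormalized = {k: round(v / total, 3) for k, v in scenario_probs.items()}
print(renormalized)  # every probability scaled down so the set sums to 1.0
```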
A counterpoint to the negative effect of an overactive imagination could be the forced application of theory: if increased use of imagination can improve the accuracy of hedgehogs, increased use of theory can improve the accuracy of foxes. Both are examples of self-subversion of our preferred cognitive modes. In reality, people constantly switch between modes, sometimes using theory-based and sometimes imagination-based thinking, so the impact of hedgehog/fox styles might be tied to acting in the role of expert. Hedgehog thinking increases closure and simplicity, while fox thinking increases confusion and complexity.
The author explains that the hedgehog-fox divergence is about what happens at the edges of expert thinking. Both foxes and hedgehogs have operating theories of the world, and these are the primary cognitive tools for making predictions. Theory-driven thinking gives better results than imagination-driven thinking the vast majority of the time. The difference lies in the commitment to the theory: hedgehogs are certain that the theory fits, while foxes more easily imagine deviations from it. On questions with less uncertainty, experts of both types make the same predictions.
There is another interesting feature of looking at these large groups of people. The average forecast of the hedgehogs was better than that of about 95% of individual hedgehogs; the average forecast of the foxes was better than that of about 70% of individual foxes. This is an example of the wisdom of crowds: the extremes of individual predictions balance out, so the average is more accurate. Another red flag, then, might be over-reliance on only a few experts. The average prediction of many people of varied expertise may be better than that of a few "real" experts.
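A toy simulation (illustrative only, not the book's data) shows why averaging helps: each forecaster's individual noise partially cancels in the crowd mean, so the averaged forecast scores better than most individuals.

```python
import random

# Illustrative wisdom-of-crowds simulation: 20 forecasters each give a noisy
# probability for 200 binary events; the crowd average cancels much of the noise.
random.seed(1)
n_events, n_forecasters = 200, 20

truths = [random.uniform(0.1, 0.9) for _ in range(n_events)]
outcomes = [1 if random.random() < t else 0 for t in truths]

def noisy(t):
    """A forecaster sees the truth plus individual noise, clamped to [0, 1]."""
    return min(1.0, max(0.0, t + random.uniform(-0.3, 0.3)))

forecasts = [[noisy(t) for t in truths] for _ in range(n_forecasters)]

def avg_brier(probs):
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / n_events

individual_scores = [avg_brier(f) for f in forecasts]
crowd_forecast = [sum(col) / n_forecasters for col in zip(*forecasts)]
crowd_score = avg_brier(crowd_forecast)

beaten = sum(crowd_score < s for s in individual_scores)
print(f"crowd average beats {beaten} of {n_forecasters} individual forecasters")
```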
Comment and interpretation:
- The book is centered on political judgment, but the same issues occur in other areas where judgment is required. For example, innovators use their judgment in assessing potential new ideas or products, and business people use theirs in creating strategies, business plans, and project portfolios. This is where the tendency to be a hedgehog meets business culture. Business culture approves of hedgehogs (decisive, confident) and dislikes foxes (indecisive, equivocal). This could lead foxy people to act like hedgehogs and reinforce an overconfident, absolute perspective on judgment calls.
- The book is clearly written for a semi-specialized audience that knows some of these methods and so needs minimal explanation. There were exhaustive explanations of certain aspects of data collection, followed by skips and jumps over how the data was converted into charts. I could not follow the methods well enough to be sure the data supported the conclusions; I am in the position of trusting that others would have torn this work apart if the methods were off. As such, this is not a book for everyone (I am an experienced skimmer of content I don't really understand and a close reader of content I do).
- The author strives to take a scientific, logical approach to the analysis, but most social sciences lack the predictive power of hard sciences like chemistry and physics. It is possible that the social sciences will someday reach the stage physics is at now (physics took a few centuries to reach today's quantitative power).
- One of the more powerful ideas from modern decision-making theory is that people have a poor understanding of base rates, and this leads them to be overly optimistic or pessimistic about change. It is often the best bet to assume that nothing will really change. Applied to innovation, one of the challenges for an "objective innovator" is to understand the base rate of success. What percentage of new ideas make it to market, and what percentage succeed? The answers probably vary across industries and markets, and depend on how you measure success. But almost any way you measure innovation success, the rate is low, probably in the single digits. This creates a dilemma: more than 90% of ideas should be rejected, but which 90%? Teams go out to collect information, but should they be foxy (uncertain, which reinforces the tendency to reject) or hedgehoggy (where conviction matters more than inconsistent or indistinct evidence)? There is no coherent theory of innovation yet, so it is hard to invoke theory in support of any single choice to pursue or abandon a specific innovation. The closest you can come is a portfolio approach: make many small bets and let the survivors become clear over time (see the sketch after these notes).
- When it comes to making a forecast in an ambiguous situation, foxes clearly have an advantage in preparing for multiple possible outcomes because they are less likely to over-commit to one. However, the same trait might be detrimental in other circumstances. Foxes are less likely to be visionaries; the high commitment of a hedgehog to some theory of action lets them ignore all the ways that things can and do go wrong (https://www.npr.org/2018/04/30/606024243/the-fox-and-the-hedgehog-the-triumphs-and-perils-of-going-big). It is not hard to imagine that various prominent business leaders are hedgehogs. The history of innovation is full of stories of people who stubbornly stuck with their vision, overcame obstacles, and eventually succeeded. If you believe that the future can be changed, then it is probably stubborn people who change it most.
- It is not hard to imagine that executive training emphasizes being like a hedgehog: be firm, simple, and consistent; don't admit mistakes, but act as if you were right all along. It is easy to see how this behavior would be viewed as strong leadership, and how it could work where bets are small, the future is predictable, or others can't judge the outcomes of predictions. It also makes me wonder whether this over-commitment to one view of the future is behind the decreasing tenure of CEOs. If a CEO is committed to one view of the future and the future evolves differently, the inability to recalibrate and change might make them ineffective and in need of replacement. Boards may continue to seek the same sort of confident visionary without realizing that they are creating the problem themselves – they keep hiring hedgehogs when they need hedgefoxes.
- I can accept the argument that scenario planning is a poor way to improve prediction quality. But my previous understanding was that prediction was not the goal of scenario planning – possibility was. People could use scenario planning to think through how particular forces could combine into different outcomes, and then develop contingency plans to be activated in the appropriate situations.
- The book has some interesting quotes related to reaching decisions and making predictions. "You never know you have had enough until you've had more than enough." – William Blake. "The impossible sometimes happens and the inevitable sometimes does not." – Daniel Kahneman. "Men must use the past to prop up their prejudices." – A.J.P. Taylor. "When the facts change, I change my mind. What do you do, sir?" – John Maynard Keynes. "The test of a first-rate mind is the ability to hold opposing ideas in the mind at the same time, and still retain the ability to function." – F. Scott Fitzgerald. "I do not pretend to start with precise questions. I do not think you can start with anything precise. You have to achieve such precision as you can, as you go along." – Bertrand Russell.
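As promised in the innovation note above, here is a minimal sketch of the portfolio logic. The 5% base rate is an assumption for illustration; the real rate varies by industry and by how success is measured.

```python
# With a single-digit success rate, no individual bet looks attractive, but a
# portfolio of independent small bets does: P(at least one success) = 1-(1-p)^n.
p = 0.05  # assumed base rate of innovation success

for n in (1, 5, 10, 20, 50):
    at_least_one = 1 - (1 - p) ** n
    print(f"{n:>2} independent bets -> {at_least_one:.0%} chance of at least one success")
```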