Why do we use similarity to gauge statistical probability?


The Representativeness Heuristic, explained.

What is the representativeness heuristic?

The representativeness heuristic is a mental shortcut that we use when making judgments about probability. Specifically, when we are trying to assess how likely it is that an event or object A belongs to class B, we tend to base this judgment on how closely A resembles B (or how representative of B we believe A is).

Where this bias occurs

Let’s say you’re going to a concert with your friend, Sarah. Sarah has also invited two of her friends, whom you’ve never met before. You know that one of them is a mathematician, while the other one is a musician.

When you finally meet Sarah’s friends, John and Adam, you see that John wears glasses and is a bit shy, while Adam is more outgoing and dressed in a T-shirt and jeans. Without asking what they do for a living, you assume that John must be the mathematician and Adam must be the musician. As it turns out, you were mistaken: John is the musician and Adam is the mathematician.

Individual effects

Because we tend to rely on representativeness, we often fail to take other kinds of information into account, which can cause us to make mistakes. This heuristic is so pervasive that researchers attribute many other cognitive biases to it, including the conjunction fallacy and the gambler’s fallacy.

Systemic effects

The representativeness heuristic can contribute to prejudice and systemic discrimination. Because we rely on categories and prototypes to guide our perception of others, we can easily end up drawing on stereotypes to make judgments about other people.

Why it happens

The term “representativeness heuristic” was coined by Daniel Kahneman and Amos Tversky, two of the most influential figures in behavioral economics. The classic example used to illustrate this bias asks the reader to consider Steve, whom an acquaintance has described as “very shy and withdrawn, invariably helpful, but with little interest in people, or in the world of reality. A meek and tidy soul, he has a need for order and structure, and a passion for detail.” After reading this description, do you think it’s more likely that Steve is a librarian or a farmer?2 Intuitively, most of us feel that Steve must be a librarian because he’s more representative of our image of a librarian than of our image of a farmer.

As with all cognitive biases and heuristics, there is one main reason we rely on representativeness so often: we have limited cognitive resources. Every day, we make thousands of separate decisions, and our brains are wired to do so while conserving as much energy as possible. This means we often rely on shortcuts to make quick judgments about the world. However, there is another major reason that the representativeness heuristic happens. It is rooted in the fundamental way that we perceive and understand people and objects.

We draw on prototypes to make decisions

Grouping similar things together—that is, categorizing them—is an essential part of how we make sense of the world. This might seem like a no-brainer, but categories are more fundamental to our ability to function than many people realize. Think of all the things you are likely to encounter in a single day. Whenever we interact with people, objects, or animals, we draw on the knowledge we’ve learned about their category so that we can know what to do. When you go to a dog park, for example, you might see animals in a huge range of shapes, sizes, and colors, but because you can categorize them all as “dog,” you immediately know roughly what to expect from them: that they like to run and chase things, that they like getting treats, and that if one of them starts growling, you should probably back away.

Without categories, every time we encountered something new, we would have to learn from scratch what it was and how it worked. Storing so much information about every separate entity would also be impossible, given our limited cognitive capacity. Our ability to understand and remember things about the world relies on categorization. On the flip side, the way we have learned to categorize things can also affect how we perceive them.3 For example, in Russian, lighter and darker shades of blue have different names (“goluboy” and “siniy,” respectively), whereas, in English, both are referred to as “blue.” Research has shown that this difference in categorization affects how people see the color blue: Russian speakers are faster at discriminating between light and dark blues, compared to English speakers.4

According to one theory of categorization, known as prototype theory, people use unconscious mental statistics to figure out what the “average” member of a category looks like. When we are trying to make decisions about unfamiliar things or people, we refer to this average—the prototype—as a representative example of the entire category. There is some interesting evidence to support the idea that humans are somehow able to compute “average” category members like this. For instance, people tend to find faces more attractive the closer they are to the “average” face, as generated by a computer.5

Prototypes guide our guesses about probability, as in the example above about Steve and his profession. Our prototype for librarians is probably somebody who resembles Steve quite closely (shy, neat, and nerdy), while our prototype for farmers is likely somebody more muscular, more down-to-earth, and less timid. Intuitively, we feel that Steve must be a librarian because we are bound to think in terms of categories and averages.
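As a rough illustration of judging by prototype, the sketch below averages a few category members into a prototype and then assigns Steve to whichever prototype he most resembles. Everything here is an assumption made for illustration: the three trait scores (shyness, tidiness, physical strength), the example members, and the function names are hypothetical and do not come from the research described above. Note that the comparison uses similarity only and never asks how common each profession is.

```python
def prototype(members):
    """Average each trait across the category's members."""
    n = len(members)
    return tuple(sum(m[i] for m in members) / n for i in range(len(members[0])))


def distance(a, b):
    """Euclidean distance between two trait vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


# Hypothetical trait scores on a 0-1 scale: (shyness, tidiness, strength).
categories = {
    "librarian": [(0.8, 0.9, 0.3), (0.7, 0.8, 0.4)],
    "farmer": [(0.3, 0.5, 0.9), (0.4, 0.4, 0.8)],
}
steve = (0.9, 0.9, 0.3)

prototypes = {name: prototype(members) for name, members in categories.items()}

# Pick the category whose prototype is closest to Steve (similarity only).
nearest = min(prototypes, key=lambda name: distance(steve, prototypes[name]))
print(nearest)  # librarian
```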

We overestimate the importance of similarity

The problem with the representativeness heuristic is that representativeness doesn’t actually have anything to do with probability—and yet, we put more value on it than we do on information that is relevant. One such type of information is prior probability or base rates: how common something is in general. For instance, at least in the U.S., there are many more farmers than there are librarians. This means that in statistical terms, it would always be incorrect to say Steve is “more likely” to be a librarian, no matter what his personality is like or how he presents himself.2
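To see how a base rate can outweigh a description, here is a minimal sketch using Bayes’ rule. The numbers are hypothetical and chosen only for illustration: it assumes 20 farmers for every librarian, and that Steve’s description is four times more likely to fit a librarian than a farmer. The function name and parameters are likewise assumptions.

```python
def posterior_librarian(prior_librarian, p_desc_given_librarian, p_desc_given_farmer):
    """Probability that Steve is a librarian, given his description (Bayes' rule)."""
    prior_farmer = 1.0 - prior_librarian
    joint_librarian = prior_librarian * p_desc_given_librarian
    joint_farmer = prior_farmer * p_desc_given_farmer
    return joint_librarian / (joint_librarian + joint_farmer)


# Hypothetical inputs: 1 librarian per 20 farmers; the description fits
# 40% of librarians but only 10% of farmers.
p = posterior_librarian(
    prior_librarian=1 / 21,
    p_desc_given_librarian=0.40,
    p_desc_given_farmer=0.10,
)
print(round(p, 3))  # 0.167
```

Under these assumed numbers, Steve is still about five times more likely to be a farmer, even though the description fits librarians much better.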

Sample size is another useful type of information that we often neglect. When we are trying to make estimates about a large population, based on data from a smaller sample, we want our sample to be as large as possible, because then we have a more complete picture. But when we focus too much on representativeness, sample size can end up being crowded out.

To illustrate this, imagine a jar filled with balls. ⅔ of the balls are one color, while ⅓ are another color. Sally draws 5 balls from the jar, of which 4 are red and 1 is white. James draws 20 balls, of which 12 are red and 8 are white. Between Sally and James, who should feel more confident that the balls in the jar are ⅔ red and ⅓ white, rather than the other way around?

Most people say Sally has better odds of being right because the proportion of red balls she drew is larger than the proportion drawn by James. But this is incorrect: James drew 20 balls, four times as many as Sally’s 5, so he is in a better position to judge the contents of the jar. We are intuitively tempted to go for Sally’s 4:1 sample because it is more representative of the ratio we’re looking for than James’ 12:8, but this leads us to an error in judgment.
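The comparison between the two samples can be worked out directly. The sketch below computes the odds that the jar is mostly red rather than mostly white, given each sample. It assumes that both possibilities are equally likely beforehand and that the jar is large enough for the draws to be treated as independent; the function name is hypothetical.

```python
from fractions import Fraction


def odds_mostly_red(red, white, majority=Fraction(2, 3)):
    """Odds of a 2/3-red jar versus a 2/3-white jar, given the sample drawn."""
    minority = 1 - majority
    likelihood_if_mostly_red = majority**red * minority**white
    likelihood_if_mostly_white = minority**red * majority**white
    # With equal prior odds, the posterior odds equal the likelihood ratio.
    return likelihood_if_mostly_red / likelihood_if_mostly_white


sally = odds_mostly_red(red=4, white=1)
james = odds_mostly_red(red=12, white=8)
print(sally, james)  # 8 16
```

Sally’s sample gives odds of 8 to 1 in favor of a mostly red jar, while James’ gives 16 to 1, so his less lopsided but larger sample is the stronger evidence.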

Why it is important

The representativeness heuristic is a very pervasive bias, and many researchers believe it is the foundation of several other biases and heuristics that affect our processing. One example is the conjunction fallacy, which occurs when we assume that two things happening together is more likely than just one of those things happening on its own. Statistically speaking, this is never true.

The most famous example of the conjunction fallacy also comes from Tversky and Kahneman. In one experiment, they gave people this description:

“Linda is 31 years old, single, outspoken and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations.”

After reading this, Tversky and Kahneman had people rank several statements in order of how likely they were to be true. This list included these three: “Linda is active in the feminist movement,” “Linda is a bank teller,” and “Linda is a bank teller who is active in the feminist movement.”6 People believed that it was more likely for Linda to be a bank teller and a feminist than it was for Linda to just be a bank teller. This stems from the representativeness heuristic: the fact that Linda matches up with people’s prototypical image of a feminist skews their perception of probability.
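One way to see why the conjunction can never be more likely is to count. In the sketch below, a made-up population is generated and the bank tellers are counted, followed by the bank tellers who are also feminists. The rates used (5% bank tellers, 60% feminists, assigned independently) are hypothetical and not taken from the study; whatever rates are chosen, the second group is a subset of the first.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical population with made-up rates for each trait.
population = [
    {"bank_teller": rng.random() < 0.05, "feminist": rng.random() < 0.60}
    for _ in range(10_000)
]

tellers = sum(p["bank_teller"] for p in population)
feminist_tellers = sum(p["bank_teller"] and p["feminist"] for p in population)

p_teller = tellers / len(population)
p_both = feminist_tellers / len(population)

# Every feminist bank teller is also a bank teller, so p_both <= p_teller.
print(f"P(bank teller) = {p_teller:.3f}, P(bank teller and feminist) = {p_both:.3f}")
```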

Another bias caused by the representativeness heuristic is the gambler’s fallacy, which causes people to apply long-term odds to short-term sequences. For example, in a coin toss, there is roughly a fifty-fifty chance of getting either heads or tails, but that doesn’t mean that if you flip a coin twice, you’ll get heads one time and tails the other. The even split only emerges reliably over long sequences, like tossing a coin a hundred times. However, we believe that short-term odds should be representative of their long-term counterparts, giving rise to the gambler’s fallacy.7 As its name suggests, this bias can have serious consequences for gamblers: for example, somebody might believe that their odds of winning are better because they’ve been on a losing streak for a while.
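A small simulation can make this concrete. The sketch below flips a simulated fair coin many times and checks how often heads comes up immediately after a run of four tails. The function name, the streak length, and the number of flips are assumptions chosen for illustration.

```python
import random


def heads_rate_after_tails_streak(n_flips, streak_len, seed=0):
    """Share of heads on flips that directly follow `streak_len` tails in a row."""
    rng = random.Random(seed)
    flips = [rng.random() < 0.5 for _ in range(n_flips)]  # True means heads
    followups = [
        flips[i]
        for i in range(streak_len, n_flips)
        if not any(flips[i - streak_len:i])  # the previous flips were all tails
    ]
    if not followups:
        raise ValueError("no streak of that length occurred; use more flips")
    return sum(followups) / len(followups)


rate = heads_rate_after_tails_streak(n_flips=200_000, streak_len=4)
print(f"heads rate right after 4 tails in a row: {rate:.3f}")
```

The rate stays close to one half: a streak of tails does not make heads any more likely on the next flip.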

The representativeness heuristic and stereotypes

Our reliance on categories can easily tip over into prejudice, even if we don’t realize it. The portrayals of minority groups in the mass media often reinforce commonly held stereotypes about those groups. For instance, Black men tend to be overrepresented in coverage of crime and poverty, while they are underrepresented as “talking head” experts or users of luxury goods.9 These patterns support a narrative about Black men as being violent and lazy, which viewers (including Black viewers) can internalize and incorporate into their idea of the “prototypical” Black person, as well as the prototypical criminal.

This bias can play out through the representativeness heuristic and contributes to systemic discrimination. For example, police who are looking for a suspect in a crime might focus disproportionately on Black people in their search, because the representativeness heuristic (and the stereotypes that they are drawing on) causes them to assume that a Black person is more likely to be a criminal than somebody from another group.

How to avoid it

Because categorization is so fundamental to our perception of the world, it is very difficult to completely avoid the representativeness heuristic. However, being aware of it is a good start: research has shown that when people become aware that they are using a heuristic, they often correct their judgment.10 Pointing out other people’s reliance on representativeness, and asking them to do the same for you, provides useful feedback that might help to avoid bias.

Other researchers have tried to reduce the effects of the representativeness heuristic by encouraging people to “think like statisticians.” These nudges do seem to help, but the problem is that without an obvious cue, people don’t think to use their statistical knowledge—not even educated people, such as graduate students.10 Another strategy that might have slightly more durability is formal training in logical thinking. In one study, children were taught how to think more logically about a problem involving the conjunction fallacy, and their performance on this problem got better.10 With this in mind, learning more about statistics and critical thinking might be useful to get around the representativeness heuristic.

How it all started

While it’s a staple of modern psychology, the concept of sorting objects into categories can actually be traced all the way back to the Ancient Greek philosophers. Plato first touched on categories in his Statesman dialogue, but the idea became a philosophical mainstay of his student, Aristotle, who, in a text simply called Categories, aimed to sort every object of human apprehension into one of ten categories.

Prototype theory was coined by the psychologist Eleanor Rosch in 1974. Up until this point, categories were thought of in all-or-nothing terms: either something belonged to a category, or it did not. Rosch’s approach recognized that members of a given category often look very different from one another and that we tend to consider some things to be “better” category members than others. For example, when we think of the category of birds, penguins, while they technically belong, don’t seem to fit into this group as neatly as, say, sparrows. The idea of prototypes lets us describe how we perceive certain category members as being more representative of their category than others.

At around the same time, Daniel Kahneman and Amos Tversky introduced the concept of the representativeness heuristic as part of their research on strategies that people use to make judgments about probabilities in uncertain situations. Kahneman and Tversky played a pioneering role in behavioral economics, demonstrating how people make systematic errors in judgment because of their reliance on biased strategies, including the representativeness heuristic.

Example 1 - Representativeness, stress, and stomach ulcers

Stomach ulcers are a relatively common ailment, but they can be gravely serious if left untreated, potentially leading to deadly stomach cancer. For a long time, it was common knowledge that stomach ulcers were caused by one thing: stress. So in the 1980s, when an Australian physician named Barry Marshall suggested at a medical conference that ulcers might be caused by a kind of bacteria, his colleagues initially rejected the idea out of hand.11 After being ignored, Marshall finally proved his suspicions using the only method ethically available to him: he took some of the bacteria from the gut of a sick patient, added it to a broth, and drank it himself. He soon developed a stomach ulcer, and other doctors were finally convinced.12

Why did it take so long (and such an extreme measure) to persuade other people to consider this new possibility? According to the social psychologists Thomas Gilovich and Kenneth Savitsky, the representativeness heuristic played a role here. The physical sensations people experience because of a stomach ulcer (burning pains and the feeling of a churning stomach) are similar to what we feel when we’re experiencing stress. On an intuitive level, we feel like ulcers and stress must have some connection. In other words, stress is a representative cause of an ulcer.11 This may have been why other medical professionals were so resistant to Marshall’s proposal.

Example 2 - Astrology and representativeness

Gilovich and Savitsky also argue that the representativeness heuristic plays a role in pseudoscientific beliefs, including astrology. In astrology, the various signs are all associated with certain traits: for example, Aries, a “fire sign” symbolized by the ram, is often said to be passionate, confident, impatient, and aggressive. The fact that this personality meshes well with the prototypical ram is no coincidence: as Gilovich and Savitsky argue, the personality types that are linked to each star sign were chosen because they are representative of that sign.11 The predictions that are made by horoscopes, rather than foretelling the future, are reverse-engineered based on what best fits with our image of each sign.


What it is

The representativeness heuristic is a mental shortcut that we use when trying to decide whether object A belongs to class B. Specifically, we tend to overemphasize the similarity of A and B to help us make this estimate.

Why it happens

Our perception of people, animals, and objects relies heavily on categorization: grouping similar things together. Within each category, there is a “prototype”: the “average” member of a given category that best represents the category as a whole. When we use the representativeness heuristic, we compare something to our category prototype, and if they are similar, we instinctively believe there must be a connection.

Example 1 – Representativeness, stress, and stomach ulcers

When an Australian doctor discovered that stomach ulcers were caused by a bacterium, not stress, other medical professionals initially didn’t believe him, because the effects of an ulcer are so similar to the effects of stress. In other words, stress is a more representative cause of an ulcer than bacteria are.

Example 2 – Astrology and representativeness

The personality types associated with each star sign in astrology are chosen because they are representative of the animal or symbol of that sign.

How to avoid it

To avoid the representativeness heuristic, learn more about statistics and logical thinking, and ask others to point out instances where you might be relying too much on representativeness.

Related TDL articles

Why We See Gambles As Certainties

The representativeness heuristic drives many other biases, including the gambler’s fallacy. This article explores the problem of gambling addiction, and why it is so difficult to dissuade people from gambling.


  1. Bordalo, P., Coffman, K., Gennaioli, N., & Shleifer, A. (2016). Stereotypes. The Quarterly Journal of Economics, 131(4), 1753-1794.
  2. Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185(4157), 1124-1131.
  3. Feldman, N. H., Griffiths, T. L., & Morgan, J. L. (2009). The influence of categories on perception: Explaining the perceptual magnet effect as optimal statistical inference. Psychological Review, 116(4), 752-782. https://doi.org/10.1037/a0017196
  4. Winawer, J., Witthoft, N., Frank, M. C., Wu, L., Wade, A. R., & Boroditsky, L. (2007). Russian blues reveal effects of language on color discrimination. Proceedings of the National Academy of Sciences, 104(19), 7780-7785.
  5. Radvansky, G. A. (2011). Human memory. Prentice Hall.
  6. Tversky, A., & Kahneman, D. (1981). Judgments of and by representativeness (No. TR-3). Stanford University, Department of Psychology.
  7. Fortune, E. E., & Goodie, A. S. (2012). Cognitive distortions as a component and treatment focus of pathological gambling: A review. Psychology of Addictive Behaviors, 26(2), 298.
  8. Bordalo, P., Coffman, K., Gennaioli, N., & Shleifer, A. (2016). Stereotypes. The Quarterly Journal of Economics, 131(4), 1753-1794.
  9. Donaldson, L. (2017, December 19). When the media misrepresents Black men, the effects are felt in the real world. The Guardian. https://www.theguardian.com/commentisfree/2015/aug/12/media-misrepresents-black-men-effects-felt-real-world
  10. Kahneman, D. (2003). A perspective on judgment and choice: Mapping bounded rationality. American Psychologist, 58(9), 697.
  11. Gilovich, T., & Savitsky, K. (1996, March/April). Like goes with like: The role of representativeness in erroneous and pseudoscientific beliefs. The Skeptical Inquirer, 20(2), 34-40. https://www.researchgate.net/profile/Thomas_Gilovich/publication/288842297_Like_goes_with_like_The_role_of_representativeness_in_erroneous_and_pseudo-scientific_beliefs/links/5799542208ae33e89fb0c80c/Like-goes-with-like-The-role-of-representativeness-in-erroneous-and-pseudo-scientific-beliefs.pdf
  12. Weintraub, P. (2010, April 8). The doctor who drank infectious broth, gave himself an ulcer, and solved a medical mystery. Discover Magazine. https://www.discovermagazine.com/health/the-doctor-who-drank-infectious-broth-gave-himself-an-ulcer-and-solved-a-medical-mystery