Evidence suggests overconfidence is a robust cognitive bias (Harvey, 1997; Meikle, Tenney, & Moore, 2016). However, much of this evidence comes from studies done in the United States, and the United States has its own distinctive culture. Alexis de Tocqueville wrote about Americans’ confidence in “Democracy in America” (1889). It was again noted by the Irish philosopher Charles Handy, who retraced de Tocqueville’s steps in 2001: “Anyone visiting America from Europe cannot fail to be struck by the energy, enthusiasm, and confidence.” Is overconfidence a particularly American affliction?
Questions about the universality of overconfidence are important, given its large potential consequences. Previous research has blamed overconfidence for a host of errors and tragedies, including the nuclear accident at Chernobyl, the loss of space shuttles Challenger and Columbia, the subprime mortgage crisis of 2008 and the Great Recession that followed it, the 2010 oil spill in the Gulf of Mexico caused by the Deepwater Horizon, and many others (Labib & Read, 2013; Plous, 1993). Some have suggested that confidence can encourage entrepreneurial spirit (Koellinger, Minniti, & Schade, 2007), but overconfident leaders can get their organizations into all sorts of trouble, from risky investments to bad acquisitions (Malmendier & Tate, 2015). There are three forms of overconfidence (Moore & Healy, 2008). One is overestimation, in which people overestimate their performance, ability, or chance of success. A second is overplacement, in which individuals believe that they are better, relative to others, than they actually are. Finally, there is overprecision, which refers to an individual’s unjustifiable certainty in their beliefs. In this paper, we compare cultures on all three measures.
There are many dimensions on which cultures differ, but the one that has received the most attention is individualism vs. collectivism (Hofstede, 1980; Sinha & Kao, 1988; Triandis, 1995). Individualists tend to focus on themselves and to put personal goals ahead of group goals. They value their distinctive traits and standing out from the crowd as unique individuals. Collectivists, by contrast, view their group memberships as central to their self-identity. They make the goals of the collective a priority, whether that be at the level of the family, the organization, or the society (Klassen, 2004; Markus & Kitayama, 1991; Triandis, 1996; Triandis, 2001). People in collectivistic cultures are more likely to define themselves in terms of their group identities.
The individualism-collectivism distinction has become central to most cultural comparisons (Diener & Diener, 1995). Our study includes two cultures traditionally considered individualistic, the United States and the United Kingdom; we compare them with two collectivistic cultures, India and Hong Kong (Oyserman, Coon, & Kemmelmeier, 2002). At the same time, we acknowledge that the individualism-collectivism distinction oversimplifies important cultural differences and neglects enormous variation that exists within cultures (Göregenli, 1997; Klassen, 2004). We attempt to capture this variation by measuring individualism-collectivism at the individual level (Singelis, 1994; Triandis & Gelfand, 1998).
Some research has documented cultural differences in self-enhancement (Muthukrishna et al., 2017). The self-enhancement motive biases individuals to prefer positive self-beliefs over negative ones, which could contribute to overconfidence. There is some evidence that self-enhancement is less prevalent in East Asian collectivistic cultures (Heine & Hamamura, 2007), but this claim is controversial. Other researchers propose that East Asians do self-enhance, albeit in different, culturally-specific domains that are important to collectivists (Sedikides, Gaertner, & Toguchi, 2003; Sedikides, Gaertner, & Vevea, 2005). If individuals from individualistic cultures are also biased toward self-enhancement, this may contribute to overconfidence.
Greater self-enhancement among individualists would suggest that they might be more likely to display overestimation or overplacement. However, prior research has also examined cultural differences in overprecision; researchers have found that Chinese participants exhibit greater overconfidence in the accuracy of their knowledge than American participants (Yates, Lee, & Bush, 1997; Yates et al., 1998). Participants were invited to bet on their confidence, and researchers found that bets were consistent with their “cheap talk” claims of confidence, inviting the inference that the cultural differences were not merely attributable to superficial peculiarities in the way Chinese and American respondents used language or interpreted the response scale. Instead, the authors argued that the differences they documented represented real cultural differences in judgment and behavior, and that their Chinese respondents held beliefs that were more overprecise.
Benchmarking Overconfidence Effects
We compare cross-cultural differences in overconfidence with a well-established situational effect: task difficulty. Prior research has shown that people overestimate performance on hard tasks and underestimate it on easy tasks. This is the so-called “hard-easy” effect (Erev, Wallsten, & Budescu, 1994). At the same time, people show overplacement on easy tasks and underplacement on hard tasks (Moore & Small, 2007). We will refer to this as the under-hard/over-easy effect.
The counter-intuitive co-existence of the hard-easy and under-hard/over-easy effects is parsimoniously explained by Moore and Healy (2008). It depends in part on systematic errors people make inferring the effect of task difficulty on others’ performance (Windschitl, Rose, Stalkfleet, & Smith, 2008). To be specific, the under-hard/over-easy effect results when judgments show the actor-observer asymmetry in situational attribution, reporting that one’s own performance will be more attributable to the situation, and that others’ performance will be more attributable to their stable traits. The result is that people believe that difficult tests will be especially difficult for them and easy tests will be especially easy for them.
If East Asians are less susceptible to the actor-observer bias (Choi & Nisbett, 1998), we might expect the under-hard/over-easy effect to be moderated by cultural differences. In particular, if those from collectivistic cultures are less likely to make dispositional attributions and better able to infer how situations will affect others, then we might expect them to show a weaker under-hard/over-easy effect. If those from a collectivistic culture are better at considering the perspective of others, they may be less likely to be egocentric in their approach to comparative social judgments (Rose & Windschitl, 2008).
Our comparison between the effects associated with task difficulty and cultural differences can inform an assessment of the relative importance of situational vs. dispositional influences on overconfidence. Our manipulation of task difficulty functions as a situational manipulation and a counterpoint to our cross-cultural comparison. It is useful to calibrate our assessment of individual differences by comparing them to situational effects, in part because human intuition so often dwells on the importance of individual differences and underestimates the power of situations to affect judgments and behaviors (Gilbert, 1998; Ross & Nisbett, 1991).
In two studies, we test for cultural differences in all three forms of overconfidence. In Study 1, we examine collectivistic cultures (Hong Kong and India) and individualistic cultures (the United Kingdom and the United States). In Study 2, we use a new task to compare overconfidence in participants from a collectivistic culture (India) and an individualistic culture (the United States). We hope that by using clearly operationalized measures of overconfidence in each of its three forms, and by attempting to replicate findings across two studies and multiple different tasks, we will be able to reach a better understanding of overconfidence across cultures. Moreover, in the interests of comparing the size of these cultural differences with an established situational effect, we vary task difficulty both between-subjects and within-cultures.
We report how we selected sample size, how we recruited participants, all conditions we ran, and all variables we collected. Our pre-registered research plan, as well as data and research materials, are all available online: https://osf.io/nu6jt/. Study 1’s data on US and Hong Kong participants was collected before work on this paper began, and therefore, collected before pre-registering the study.
Participants. We analyze data from four national samples, all recruited online. Amazon Mechanical Turk made it easy to obtain samples from the United States (US) and India. Participants from Hong Kong were recruited on a Chinese platform similar to MTurk, where participants are English-speaking Hong Kong residents. We managed to recruit some United Kingdom (UK) residents via MTurk, but because MTurk participation is limited in the UK, the bulk of our British participants came from Prolific Academic, a UK-based online pool.
Sample sizes were determined ahead of any data analysis. Our US and Hong Kong sample sizes were chosen for us. The US sample size was determined prior to its usage in this paper (see Prims & Moore, 2017, Study 4). The Hong Kong sample size was determined by collecting as much data as we could in a single semester. In planning sample sizes for our other two samples, we considered effect sizes comparing overprecision between individualistic and collectivistic cultures reported by Yates et al. (1998): 1.012 and 0.604. These effect sizes suggested we could achieve 80% power with between 17 and 45 participants per country. We thought these numbers seemed low and worried that published effect sizes could overestimate true effect sizes (Open Science Collaboration, 2015). We therefore sought sample sizes of 200 Indian participants and 200 British participants.
Table 1 shows the total number of participants after exclusion (based on the pre-registered exclusion criteria), gender breakdown, and age information for each of the countries that participants were recruited from.
|Country||N||Gender||Age|
|…||…||…||M: 39, SD: 12.06|
|…||…||…||M: 40, SD: 16.11|
|Hong Kong||503||78% female||M: 22, SD: 5.65|
|…||…||…||M: 32, SD: 7.29|
Procedure. The experimental task asked participants to estimate the weights of ten people. The weight-guessing task has proven useful for measuring overconfidence and for replicating results from the literature (Moore & Klein, 2008; Prims & Moore, 2017; Sah, Moore, & MacCoun, 2013; Tenney, Logg, & Moore, 2015). After providing informed consent, participants reported which weight unit (pounds, kilograms, or stone) they preferred, and all weight estimation instructions and questions employed the preferred unit. Participants then saw photographs of ten people and estimated each person’s weight as accurately as they could. All ten photographs were of people on UC Berkeley’s campus who agreed to be photographed by research assistants. While we did not ask about the cultural backgrounds of the ten people, the set of individuals appears to be relatively culturally diverse.
In the hard condition, participants had to guess weights fairly accurately (within 2 kilograms, 4 pounds, or 0.3 stone) for their answers to count as correct; the easy condition used a more lax standard for answers to count as correct (20 kilograms, 44 pounds, or 3.1 stone). The data for participants from Hong Kong employed a survey with accuracy hurdles of 2 and 20 kilograms for the hard and easy conditions, respectively. To determine the hurdles for pounds and stones, we converted from 2 and 20 kg to their equivalents. For pounds, we rounded down to the nearest whole number. Because rounding the stones hurdles would cause the hurdles to be far less consistent, we chose to use decimals in the stones hurdles. Participants were randomly assigned to either the hard or easy condition and were informed of the accuracy hurdle for the condition they were in.
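The hurdle conversion described above can be sketched as follows. This is a minimal illustration rather than the survey's actual code: the function name `convert_hurdle` is hypothetical, and the conversion factors (about 2.20462 pounds per kilogram, 14 pounds per stone) are standard values assumed here.

```python
import math

KG_TO_LB = 2.20462   # pounds per kilogram (standard factor, assumed)
LB_PER_STONE = 14    # pounds per stone

def convert_hurdle(kg):
    """Return (pounds, stone) hurdles for a hurdle given in kilograms.

    Pounds are rounded down to a whole number and stone are kept to one
    decimal place, mirroring the rounding rules described in the text.
    """
    pounds_exact = kg * KG_TO_LB
    pounds = math.floor(pounds_exact)
    stone = round(pounds_exact / LB_PER_STONE, 1)
    return pounds, stone

hard = convert_hurdle(2)    # hard-condition hurdle
easy = convert_hurdle(20)   # easy-condition hurdle
print(hard, easy)
```

Under these assumptions the sketch reproduces the hurdles reported for each unit: 4 pounds and 0.3 stone in the hard condition, 44 pounds and 3.1 stone in the easy condition.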
For each photograph, we asked participants to indicate a point estimate of the person’s weight. After participants saw all ten photos, we collected a full probability distribution from each participant: their self-reported probability of obtaining each of the eleven possible scores (from zero items correct to all ten items correct). Participants did this twice: once for the self and once for a randomly selected other.
In order to obtain a measure of overconfidence on a different task, participants then saw an irregular shape and reported the area of the shape. Participants estimated the likelihood that this point estimate was within ten units of the actual area. They were also asked to provide their 5th and 95th percentiles for the area of the shape.
Country. Nation of origin was our key independent variable.
Task difficulty. Participants’ guesses had to be within either a narrow weight interval (hard) or a wide weight interval (easy) surrounding the right answer in order to count as correct. In the hard condition, this was 2 kilograms, 4 pounds, or 0.3 stone. In the easy condition, this was 20 kilograms, 44 pounds, or 3.1 stone. For example, a participant in the hard condition whose guess was 3 kilograms away from the right answer would have been counted as incorrect.
Individualistic and collectivistic self-construals. For participants from India and the UK, we were able to measure levels of individualism and collectivism with the Singelis Self-Construal Scale (1994). We computed an individualism score for each participant by subtracting the collectivism score from the individualism score. A positive score would suggest more individualism than collectivism, and a negative score would suggest the opposite, with a more extreme positive score signifying more individualism and a more extreme negative score signifying more collectivism. We lack individual-level self-construal data for our American and Hong Kong samples.
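The difference score can be illustrated with a short sketch. It assumes that each subscale score is the mean of its item ratings and that items are rated on a 7-point scale; the responses below are invented for illustration, and `individualism_score` is a hypothetical helper name.

```python
from statistics import mean

def individualism_score(individualism_items, collectivism_items):
    """Mean individualism rating minus mean collectivism rating.

    Positive values indicate relatively more individualism; negative
    values indicate relatively more collectivism.
    """
    return mean(individualism_items) - mean(collectivism_items)

# Hypothetical ratings for one participant (fifteen items per subscale).
ind = [5, 6, 4, 5, 5, 6, 5, 4, 5, 6, 5, 5, 4, 6, 5]
col = [6, 6, 5, 6, 5, 6, 6, 5, 6, 6, 5, 6, 5, 6, 6]

score = individualism_score(ind, col)
print(round(score, 2))
```

A negative result, as in this made-up case, would place the participant on the collectivistic side of the difference score.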
Participants’ estimated scores for themselves were calculated by multiplying each of the eleven possible scores with its reported subjective probability and summing these to yield the estimated score. We applied the same method to calculate participants’ estimated scores for others.
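The calculation just described is an expected value over the eleven possible scores. The sketch below is illustrative only: the probabilities are invented, `expected_score` is a hypothetical name, and the normalization step is an added safeguard (an assumption, not something the text reports) for reports that do not sum exactly to 1.

```python
def expected_score(probabilities):
    """Expected score under a subjective probability distribution (SPD).

    probabilities[k] is the reported probability of getting exactly k
    items correct, for k = 0..10.
    """
    if len(probabilities) != 11:
        raise ValueError("expected eleven probabilities (scores 0-10)")
    total = sum(probabilities)  # assumption: renormalize if the sum is not 1
    return sum(k * p for k, p in enumerate(probabilities)) / total

# Hypothetical participant who thinks six correct is most likely.
spd_self = [0, 0, 0, 0.05, 0.15, 0.25, 0.30, 0.15, 0.10, 0, 0]
estimate = expected_score(spd_self)
print(round(estimate, 2))
```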
Overestimation. Each participant’s actual score was subtracted from their estimated score to yield an estimate of the degree to which participants overestimated how well they had done.
Overplacement. The participant’s estimated score for a randomly selected other was subtracted from their estimated score for themselves and corrected for the actual degree to which the participant’s score was better than others’ scores.
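The two measures above can be sketched in a few lines. This is an illustration under stated assumptions: the "actual degree to which the participant's score was better than others' scores" is taken here to be the participant's actual score minus the mean actual score of a comparison group, the function names are hypothetical, and all numbers are invented.

```python
from statistics import mean

def overestimation(estimated_self, actual_self):
    """Estimated own score minus actual own score."""
    return estimated_self - actual_self

def overplacement(estimated_self, estimated_other, actual_self, actual_others):
    """Believed advantage over others minus actual advantage over others."""
    believed_advantage = estimated_self - estimated_other
    # Assumption: the benchmark is the mean actual score of other participants.
    actual_advantage = actual_self - mean(actual_others)
    return believed_advantage - actual_advantage

# Hypothetical participant and comparison group.
est_self, est_other, actual = 6.0, 5.0, 4
others = [3, 5, 4, 6, 2]

over_est = overestimation(est_self, actual)
over_place = overplacement(est_self, est_other, actual, others)
print(over_est, over_place)
```

In this invented case the participant overestimates by two items and overplaces by one, since they believe they beat others by a point when they actually scored at the group mean.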
Overprecision. One measure of overprecision is the variance in participants’ subjective probability distributions of others’ scores. We call this SPD variance. If the participants’ SPD variance was smaller than the actual variance in others’ scores, that is evidence of overprecision. If participants’ SPD variance was greater than the actual variance in others’ scores, this indicates underprecision.
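As a rough sketch of this comparison, the code below computes the variance of a reported distribution over scores 0 to 10 and subtracts it from the variance of others' actual scores, so that positive values indicate overprecision. The use of population variance for the actual scores, the function names, and the example data are all assumptions made for illustration.

```python
from statistics import pvariance

def spd_variance(probabilities):
    """Variance of a subjective probability distribution over scores 0..10."""
    total = sum(probabilities)
    probs = [p / total for p in probabilities]  # renormalize defensively
    mu = sum(k * p for k, p in enumerate(probs))
    return sum(p * (k - mu) ** 2 for k, p in enumerate(probs))

def spd_overprecision(probabilities_other, actual_other_scores):
    """Actual variance in others' scores minus SPD variance.

    Positive values: the reported distribution is too narrow (overprecision).
    Negative values: it is too wide (underprecision).
    """
    return pvariance(actual_other_scores) - spd_variance(probabilities_other)

# Hypothetical report about a randomly selected other, and hypothetical scores.
spd_other = [0, 0, 0, 0, 0.25, 0.5, 0.25, 0, 0, 0, 0]
actual_scores = [3, 4, 5, 6, 7]

result = spd_overprecision(spd_other, actual_scores)
print(result)
```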
A second measure of overprecision, which we will call item confidence, is based on the one score each participant reported being most likely. We compare item confidence to the rate at which participants actually get that score correct to get a second measure of overprecision on the weight-guessing task.
Another measure of overprecision is participants’ average confidence that they correctly estimated the irregular shape’s area within 10 units, compared with the rate at which they actually did. This is item confidence for the shape task. A fourth measure is the rate at which the shape’s area actually falls inside the 90% confidence interval participants stated. This is CI confidence.
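The CI confidence measure reduces to a hit rate that can be compared against the nominal 90%. A minimal sketch, with invented intervals, an invented true area, and a hypothetical function name:

```python
def ci_hit_rate(intervals, true_value):
    """Share of (5th percentile, 95th percentile) intervals containing the truth."""
    hits = [low <= true_value <= high for low, high in intervals]
    return sum(hits) / len(hits)

# Hypothetical intervals from five participants for a shape whose area is 110.
intervals = [(80, 120), (95, 105), (150, 300), (40, 90), (100, 260)]
rate = ci_hit_rate(intervals, true_value=110)

# Well-calibrated 90% intervals should contain the truth about 90% of the time.
print(rate, rate < 0.90)
```

A hit rate well below 0.90, as in this made-up sample, would indicate intervals that are too narrow, which is the signature of overprecision.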
Planned Analyses. First, we investigate the reliability of the Singelis Self-Construal Scale, as well as comparing Singelis Self-Construal Scale scores between our participants from collectivistic countries and our participants from individualistic countries. Then, for each form of overconfidence, we conduct a multiple regression analysis to test for significant predictors of overconfidence. We then perform an ANCOVA examining the experimental manipulation of condition on overconfidence and controlling for country, age, and gender, to ensure that the effect of difficulty condition exists independently of country, age, and gender differences. We also conduct an ANOVA to rule out the possibility of an interaction effect of condition and country on overconfidence. The latter two tests help to confirm that these different forms of overconfidence can be viably compared across cultures. Finally, we investigate the relationship between overconfidence and Singelis individualism score, overconfidence and age, and overconfidence and gender.
Singelis Self-Construal Scale
Scale Reliability. The Singelis self-construal scales show satisfactory reliabilities. The fifteen individualistic items had a Cronbach’s alpha of 0.80. The fifteen collectivistic items had a Cronbach’s alpha of 0.83.
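For readers unfamiliar with the statistic, Cronbach's alpha is k / (k - 1) multiplied by one minus the ratio of summed item variances to the variance of total scores, where k is the number of items. The sketch below implements that textbook formula on a tiny invented data set; it is not the analysis script used for the study.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows (one row of item ratings per person)."""
    k = len(responses[0])
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical ratings: four participants, three items.
data = [
    [5, 6, 5],
    [3, 4, 3],
    [6, 6, 7],
    [2, 3, 2],
]
alpha = cronbach_alpha(data)
print(round(alpha, 3))
```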
Comparing India and the UK. The difference in the average individualism score between participants in India (M = –0.10, SD = 0.38) and the UK (M = –0.02, SD = 0.44) barely attained statistical significance (t (364.42) = 1.96, p = 0.05, d = 0.20), with Indian participants scoring as slightly more collectivistic. Looking strictly at items tapping collectivism, Indian participants reported higher levels of collectivism (M = 5.38, SD = 0.65) than UK participants (M = 4.60, SD = 0.64), t (381.91) = 12.012, p < .001, d = 1.22. Interestingly, Indian participants also reported higher levels of individualism (M = 5.28, SD = 0.64) than UK participants (M = 4.58, SD = 0.66) when examining only individualism items, t (376.72) = 10.57, p < .001, d = 1.08.
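The effect sizes above can be approximated from the reported means and standard deviations alone. The sketch below assumes a pooled standard deviation that weights the two groups equally, because group sizes are not restated in this paragraph; the published values may rest on a slightly different pooling and on unrounded inputs, so small discrepancies are expected.

```python
import math

def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d using an equal-weight pooled SD (an assumption)."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd

# Means and SDs as reported in the text (India first, UK second).
d_collectivism = cohens_d(5.38, 0.65, 4.60, 0.64)
d_individualism = cohens_d(5.28, 0.64, 4.58, 0.66)

print(round(d_collectivism, 2), round(d_individualism, 2))
```

Both values land within about 0.01 of the reported d = 1.22 and d = 1.08.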
Overestimation. Overestimation is observed when individuals overestimate their performance, ability, or chance of success. A multiple regression analysis predicting overestimation revealed significant effects of country, condition, age, and gender. See Table 2.
|Country (Hong Kong)||0.231||0.157||0.031||1.470|
An ANCOVA examining overestimation found significant effects of country (F (3, 1693) = 26.64, p < .001), age (F (1, 1693) = 21.57, p < .001), and gender (F (1, 1693) = 6.639, p = .01). However, condition remained a highly significant predictor of overestimation, F (1, 1693) = 1881.43, p < .001.
Finally, in an ANOVA examining the interaction of condition and country on overestimation, the interaction effect between condition and country was not significant, F (3, 1694) = 0.18, p = 0.91.
Condition. Consistent with our expectations, overestimation was greater for participants in the hard condition (M = 2.16, SD = 2.18) than participants in the easy condition, who showed underestimation (M = –2.64, SD = 2.51), t (1700) = –42.20, p < .001, d = 2.04.
Country. Figure 1 shows overestimation for participants in each country. Participants from the UK and the US showed greater underestimation (M = –0.47, SD = 3.28) than participants from Hong Kong and India (M = –0.04, SD = 3.47), t (1700) = –2.56, p = 0.01, d = 0.13. However, as Figure 1 shows, overestimation was only present in our Indian sample. Elsewhere, underestimation prevailed.
In order to provide a sense of the magnitude of this country effect, we compared the proportion of the variance attributable to country (2.14%) with the proportion of the variance attributable to task difficulty (50.99%). The variance attributable to task difficulty was roughly 24 times larger.
Singelis Self-Construal Scale (IN and UK participants only). The bivariate correlation between individualism score on the Singelis Self-Construal Scale and overestimation was not significant, r (384) = –0.05, p = 0.25.
Age. The bivariate correlation between age and overestimation was not significant, r (1698) = 0.045, p = 0.06.
Gender. On average, female participants showed greater underestimation (M = –0.47, SD = 3.48) than male participants (M = –0.07, SD = 3.20), t (1698) = –2.50, p = 0.01, d = 0.12.
Overplacement. Overplacement is observed when individuals believe that they are better, relative to others, than they actually are. A multiple regression analysis tested the degree to which country, condition, age, and gender predict overplacement.
Difficulty and gender emerge as significant predictors of overplacement in a linear regression. See Table 3. Dropping age and gender from the model did not result in a dramatically different R2: F (4, 1697) = 10.68, p < .001, R2 = 0.02.
|Country (Hong Kong)||0.061||0.123||0.015||0.491|
In an ANCOVA examining the experimental manipulation of difficulty on overplacement and controlling for country, country was a significant predictor of overplacement (F (3, 1693) = 3.75, p < .01), as was age (F (1, 1693) = 4.44, p = 0.04), and gender (F (1, 1693) = 8.54, p = 0.004). However, condition remained a stronger predictor of overplacement, F (1, 1693) = 31.60, p < .001.
Finally, examining the interaction of condition and country on overplacement, the interaction effect between condition and country was not significant, F (3, 1694) = 1.059, p = 0.37.
Condition. As expected, overplacement was greater for participants in the easy condition (M = 0.38, SD = 1.81) than for participants in the hard condition (M = –0.11, SD = 1.81), t (1700) = 5.60, p < .001, d = 0.27.
Country. See Figure 2 for a bar chart of mean overplacement for participants in each country. Overplacement did not differ between participants from the UK and the US (M = 0.15, SD = 1.69) and participants from Hong Kong and India (M = 0.14, SD = 2.00), t (1700) = 0.10, p = 0.92, d = 0.004. As Figure 2 demonstrates, only participants from the US and India showed overplacement.
In order to provide a sense of the magnitude of this country effect, we compared the proportion of the variance attributable to country (0.45%) with the proportion of the variance attributable to task difficulty (1.78%). The variance attributable to task difficulty was roughly 4 times larger.
Singelis Self-Construal Scale (IN and UK participants only). The correlation between individualism score on the Singelis Self-Construal Scale and overplacement was not significant, r (384) = –0.02, p = 0.72.
Age. The correlation between age and overplacement was small but significant, r (1698) = 0.06, p = 0.01.
Gender. Male participants showed greater overplacement (M = 0.31, SD = 1.74) than female participants (M = 0.01, SD = 1.87), t (1698) = –3.40, p < 0.001, d = 0.16. This replicates high-profile claims of male overplacement (Niederle & Vesterlund, 2007), a finding that does not always replicate (Moore & Dev, 2017).
Overprecision (SPD variance). Overprecision is observed when individuals are unjustifiably certain in their beliefs. Our first measure of overprecision takes the actual variance in scores and subtracts from it the variance of the individual’s reported probability distribution for others’ scores. Positive numbers indicate overprecision, whereas negative numbers indicate underprecision (since smaller SPD variance means less underprecision). Table 4 presents the results of a multiple regression analysis testing the effects of country, condition, age, and gender on overprecision, as measured by SPD variance.
|Country (Hong Kong)||8.482||1.035||0.251||8.194||***|
In an ANCOVA examining the effect of difficulty on overprecision and controlling for country, country was a significant predictor of overprecision (F (3, 1695) = 15.69, p < .001), as was age (F (1, 1695) = 24.99, p < .001), and gender (F (1, 1695) = 18.27, p < .001). Difficulty was also a significant predictor of overprecision, F (1, 1693) = 5.27, p = .021.
An ANOVA examining the interaction of condition and country on overprecision revealed that the interaction effect between condition and country was not significant, F (3, 1696) = 2.53, p = 0.06.
Condition. Participants in both the easy and the hard condition showed underprecision. Underprecision was greater for participants in the hard condition (M = –23.43, SD = 14.42) than for participants in the easy condition (M = –21.82, SD = 16.76), t (1702) = 2.14, p = .03, d = 0.10.
Country. See Figure 3 for a bar chart of mean SPD overprecision for participants in each country. Participants from the UK and the US showed greater underprecision (M = –23.53, SD = 15.46) than participants from Hong Kong and India (M = –21.30, SD = 15.71), t (1702) = –2.91, p = .004, d = 0.15. Recall that SPD overprecision is calculated by subtracting SPD variance from actual variance, so in the case of underprecision, less negative numbers mean smaller SPD variance and reflect greater precision. In other words, participants from Hong Kong and India showed less underprecision than participants from the UK and the US. We note that it appears this difference is primarily driven by the better calibration seen in participants from Hong Kong; participants in India were the most underprecise of all groups.
Singelis Self-Construal Scale (IN and UK participants only). The correlation between individualism score on the Singelis Self-Construal Scale and overprecision was not significant, r (384) = 0.04, p = 0.45.
Age. Overall, the correlation between age and overprecision was not significant, r (1700) = 0.04, p = 0.11. However, because prior overconfidence research using a US sample identifies a relationship between age and overprecision (Prims & Moore, 2017), we broke down the correlation by country. The correlation between age and overprecision was not significant among the British (r (181) = 0.08, p = 0.27), Hong Kong (r (501) = 0.05, p = 0.22), or Indian (r (200) = 0.03, p = 0.71) participants, but it was significant in US participants, r (812) = 0.165, p < 0.001.
Gender. Male participants did not differ in overprecision (M = –21.81, SD = 16.07) from female participants (M = –22.28, SD = 15.24), t (1700) = –0.62, p = 0.54, d = 0.03.
Overprecision (weight guessing item confidence). Participants’ confidence in the score they thought most likely (M = 33%) exceeded the rate at which that score was actually correct (M = 21% hit rate), t (1701) = 17.25, p < .001, d = 0.42. This is evidence of overprecision. However, hit rates were higher among participants in the UK and the US (M = 26%) than among participants in Hong Kong and India (M = 16%), t (1700) = 5.18, p < 0.01, d = 0.25. Participants in Hong Kong and India were just as confident as participants in the UK and the US, but less accurate, and so their scores indicate greater overprecision.
Overprecision (90% Confidence Intervals). Participants estimated the area of an irregular shape by providing their 5th and 95th percentile estimates of the shape’s area; these estimates define a 90% confidence interval. If participants were well calibrated, roughly 90% of these confidence intervals should have contained the correct answer. Confidence intervals contained the right answer only 30.52% of the time, significantly less than 90%, t (1703) = –53.308, p < .001, d = 1.29; this is evidence of overprecision.
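The test against the 90% benchmark can be reconstructed approximately from the summary figures. The sketch below treats each interval as a 0/1 hit and runs a one-sample t-test against 0.90. The count of 520 hits out of 1,704 is an assumption inferred from the reported 30.52% and the 1,703 degrees of freedom; it is not a figure stated in the text, and the function name is hypothetical.

```python
import math

def one_sample_t_from_hits(hits, n, benchmark):
    """One-sample t statistic and Cohen's d for a hit rate versus a benchmark."""
    p = hits / n
    # Sample standard deviation of a 0/1 variable with mean p.
    sd = math.sqrt(p * (1 - p) * n / (n - 1))
    t = (p - benchmark) / (sd / math.sqrt(n))
    d = abs(p - benchmark) / sd
    return p, t, d

# Assumed counts: 520 hits out of 1,704 intervals (about 30.52%).
p, t, d = one_sample_t_from_hits(520, 1704, 0.90)
print(round(p, 4), round(t, 2), round(d, 2))
```

Under that assumption the sketch gives t of about -53.31 and d of about 1.29, in line with the reported statistics.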
In order to test our hypothesis of lower overprecision among those from individualistic cultures, we compared the rate at which participants’ 90% confidence intervals included the right answer. Participants from the UK and the US hit more often (M = 0.40, SD = 0.49) than did participants from Hong Kong and India (M = 0.18, SD = 0.39), t (1702) = 10.38, p < .001, d = 0.49. Specifically, 40% of participants from the US and the UK got the right answer within their confidence intervals, compared to 18% of participants from Hong Kong and India, which is evidence of greater overprecision in participants from Hong Kong and India. See Figure 4.
The 90% confidence intervals of participants from the UK and US were wider (M = 128.50, SD = 109.73) than the 90% confidence intervals of participants from Hong Kong and India (M = 65.13, SD = 96.73), t (1702) = 12.60, p < .001, d = 0.61. This is further evidence of greater overprecision in participants from Hong Kong and India. Their 90% confidence intervals were narrower than those of participants from the UK and US, leading to a lower hit rate. Figure 5 shows that this result is primarily attributable to Chinese respondents.
Overprecision (Shape item confidence). We also asked participants for a point estimate of the shape’s area and the likelihood that their point estimate was close to the right answer. Participants’ self-reported likelihood that their point estimate was within ten units of the right answer (M = 0.51, SD = 0.25) was significantly larger than the actual percentage of point estimates that were within ten units of the right answer (M = 0.03, SD = 0.17), t (1702) = 83.484, p < 0.001, d = 2.86. There was no significant difference in hit rates between participants from the US and the UK (M = 0.03, SD = 0.18) and those from Hong Kong and India (M = 0.02, SD = 0.15), t (1702) = 1.12, p = 0.26, d = 0.05. However, participants from Hong Kong and India reported greater confidence (M = 0.56, SD = 0.24) than did participants from the US and the UK (M = 0.47, SD = 0.26), t (1702) = –6.91, p < .001, d = 0.34. Figure 6 shows that this result is primarily attributable to greater confidence reported among Indian respondents.
Different Measures of Overprecision. The differences between the different measures of overprecision invite the question of the degree to which these different measures correlate with one another. The answer, as shown in Table 5, is: not much.
|Measure||1||2||3||4|
|1. Weight Guessing SPD Overprecisionᵃ||–|
|2. Weight Guessing Item Confidence||0.65***||–|
|3. Shape Item Confidence||–0.01||0.08**||–|
|4. Shape 90% Confidence Intervals||–0.06*||–0.03||–0.19**||–|
We report how we selected sample size, how we recruited participants, all conditions we ran, and all variables we collected. Our pre-registered research plan, as well as data and research materials, is available online: https://osf.io/z7htw/.
Participants. Participants from India and the United States were recruited on Amazon Mechanical Turk using the built-in country qualification condition.
In Study 1, we compared SPD overprecision in Indian and US participants and found an effect size of 0.20. Using an effect size of 0.2, an a priori power analysis suggested that to achieve 80% power, we should recruit 310 participants per country group. We initially recruited 610 participants. However, our pre-registered exclusion criteria called for us to exclude all participants who failed a comprehension check, which ended up being far more difficult than expected: our final sample size, after exclusions, included only 136 Indian participants and 233 US participants. Because the sample size dropped by nearly half after exclusions, we report results using the full dataset (without exclusions) in the online supplement.
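The sample-size figure can be checked with a standard normal-approximation formula for comparing two independent groups: n per group = 2 (z_alpha + z_power)^2 / d^2. The text does not say whether the power analysis assumed a one- or two-sided test, or which software was used, so both variants are shown; the function name and defaults are assumptions for illustration.

```python
import math
from statistics import NormalDist

def n_per_group(d, power=0.80, alpha=0.05, two_sided=True):
    """Approximate per-group n for a two-sample comparison (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2) if two_sided else z.inv_cdf(1 - alpha)
    z_power = z.inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)

n_one_sided = n_per_group(0.2, two_sided=False)
n_two_sided = n_per_group(0.2, two_sided=True)
print(n_one_sided, n_two_sided)
```

Under this approximation, d = 0.2 with 80% power requires 310 participants per group for a one-sided test at alpha = .05 and 393 for a two-sided test, so the reported figure of 310 is consistent with a one-sided test. Exact t-based calculations typically add about one participant per group.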
Procedure. The study’s primary experimental task involved estimating the number of marbles in a glass container. After providing informed consent, participants saw 10 photographs of glass receptacles containing different numbers of marbles, and were asked to guess the number of marbles in each container as accurately as possible.
For each photograph, we first asked for a point estimate for the number of marbles. After participants saw all ten photos, we collected a full probability distribution from each participant—the probabilities for each of the eleven possible scores (from zero correct to ten correct), both for the self and for a randomly selected other.
Participants were rewarded for accuracy in score estimates for both the self and the other: each accurate score estimate increased their chance of winning a lottery for a monetary bonus. Additionally, participants were rewarded for accuracy in guessing the number of marbles, again with increased chances of winning the bonus lottery. Each correct guess earned two tickets for a lottery whose prize was $50, so perfect performance could earn up to twenty lottery tickets. Moreover, accurate estimates of their scores could earn them up to one lottery ticket for the same $50 prize. Participants learned of all rewards prior to starting the task.
Country. Culture of origin was our key independent variable.
Task difficulty. Participants’ guesses had to be within either a narrow interval (hard) or a wide interval (easy) surrounding the correct answer (the correct number of marbles) in order to count as correct. In the hard condition, for an answer to count as correct it had to be within 26% of the right answer. In the easy condition, it had to be within 78%.
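The scoring rule can be made concrete with a short sketch. The function names, the inclusive boundary, and the example guesses are hypothetical; only the 26% and 78% tolerances come from the description above.

```python
# Tolerances from the text, expressed as proportions of the correct answer.
TOLERANCE = {"hard": 0.26, "easy": 0.78}


def is_correct(guess, truth, tolerance):
    """True if the guess falls within tolerance * truth of the true count.

    Treating the boundary as inclusive is an assumption.
    """
    return abs(guess - truth) <= tolerance * truth


def score(guesses, truths, tolerance):
    """Number of guesses counted as correct."""
    return sum(is_correct(g, t, tolerance) for g, t in zip(guesses, truths))


# Hypothetical guesses and true marble counts for three photographs.
guesses = [120, 80, 300]
truths = [100, 100, 200]

hard_score = score(guesses, truths, TOLERANCE["hard"])
easy_score = score(guesses, truths, TOLERANCE["easy"])
print(hard_score, easy_score)
```

The same set of guesses earns a higher score under the easy criterion, which is how the manipulation varies difficulty without changing the stimuli.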
Individualism and Collectivism. We measured levels of individualism and collectivism with the Singelis Self-Construal Scale and the Triandis and Gelfand Culture Orientation Scale. The Singelis scale asks questions tapping individualism and collectivism (e.g., I enjoy being unique and different from others in many respects, I feel good when I cooperate with others), yielding two scores: one for the level of individualism, and one for the level of collectivism. We also computed an individualism score for each participant by subtracting the collectivism score from the individualism score. A positive score would suggest more individualism than collectivism, and a negative score would suggest the opposite, with a more extreme positive score signifying more individualism and a more extreme negative score signifying more collectivism.
The Triandis Scale consists of items that measure four dimensions of collectivism and individualism, yielding four scores: two for the level of individualism (horizontal and vertical), and two for the level of collectivism (horizontal and vertical).
Vertical Collectivism: seeing the self as a part of a collective and being willing to accept hierarchy and inequality within that collective.
Vertical Individualism: seeing the self as fully autonomous, but recognizing that inequality will exist among individuals and accepting this inequality.
Horizontal Collectivism: seeing the self as part of a collective but perceiving all members of that collective as equal.
Horizontal Individualism: seeing the self as fully autonomous, and believing that equality between individuals is the ideal.
We compared scores by countries of origin (India or the United States) to test for construct validity.
The dependent variables were similar to those in Study 1, except that Study 2 did not include the shape area guessing task and therefore had none of its associated overprecision measures.
Planned Analyses. As noted, these results exclude data from participants who failed our comprehension check. Consistent with our pre-registered plan, we report results with the full dataset in the online supplement, but we note that the results were not materially different in the dataset without exclusions.
Singelis Self-Construal Scale
Scale Reliability. The Singelis Self-Construal Scale showed satisfactory reliabilities (see Table 6).
| Cronbach’s alpha | United States | India |
Comparing India and the US. Consistent with expectations, US participants had higher Singelis difference scores (individualism minus collectivism) than Indian participants, t (367) = –6.77, p < .001, d = 0.73. For items assessing collectivism, Indian participants reported higher levels of collectivism than US participants, t (367) = 10.20, p < .001, d = 1.10. However, mirroring our results from Study 1, on the items assessing individualism alone, Indian participants also reported higher levels of individualism than US participants, t (367) = 2.63, p < .01, d = 0.28. See Table 6.
Triandis and Gelfand Culture Orientation Scale
Scale Reliability. The Triandis and Gelfand Cultural Orientation Scale displayed satisfactory reliabilities (see Table 6).
Comparing India and the US. Participants from India scored higher on horizontal collectivism than participants from the US, t (332) = 5.23, p < .001, d = 0.53. Indian participants also scored higher on vertical collectivism than US participants, t (367) = 8.93, p < .001, d = 0.96. While the collectivism results are in line with expectations, individualism results were less so: Indian participants also scored higher on vertical individualism than US participants, t (358) = 12.88, p < .001, d = 1.26. There was no significant difference in horizontal individualism between participants from India and the US, t (312) = 0.44, p = 0.661, d = 0.05. The inclusion of the Triandis scale was unique to Study 2, but results are not inconsistent with the Singelis scale results in Studies 1 and 2, which find that our Indian participants generally score higher on both collectivism and individualism items. See Table 6.
Overestimation. Overestimation is observed when individuals overestimate their performance, ability, or chance of success. A multiple regression analysis tested the degree to which condition, country, age, gender, numeracy, education, MacArthur ladder SES, Singelis individualism, Singelis collectivism, and the four Triandis measures predict overestimation (Table 7).
We also performed a multiple regression analysis identical to the one above, with the exception that it excluded the Singelis and Triandis variables (see Table 8). The model including the Singelis and Triandis variables explained 17% of the variance (F (13, 355) = 6.66, p < .01), while the model excluding the Singelis and Triandis variables explained a slightly smaller 15% of the variance (F (7, 361) = 10.21, p < .01).
Condition. On average, participants underestimated their scores. Consistent with our expectations, overestimation (in the form of less underestimation) was greater for participants in the hard condition (M = –0.51, SD = 2.50) than participants in the easy condition (M = –2.73, SD = 3.96), t (367) = –6.48, p < .001, d = 0.67.
Country. Figure 7 shows overestimation for participants in each country. Consistent with Study 1, participants from the US showed greater underestimation (M = –1.94, SD = 3.40) than participants from India (M = –1.04, SD = 3.56), t (367) = –2.40, p = 0.017, d = 0.27.
In order to provide a sense of the magnitude of this country effect, we compared the proportion of the variance attributable to country (1.55%) with the proportion of the variance attributable to task difficulty (10.17%). The variance attributable to task difficulty was roughly 6.5 times larger.
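For readers who want to check these figures, the country comparison can be approximately reconstructed from the reported summary statistics. This is a sketch, not the authors' analysis script: it assumes a pooled-variance (Student's) t-test, uses the post-exclusion group sizes (233 US, 136 Indian), and converts t to a share of variance with the two-group identity t² / (t² + df), which the text does not confirm as the method used.

```python
import math


def pooled_t_and_d(m1, sd1, n1, m2, sd2, n2):
    """Student's t, Cohen's d, and df from group means, SDs, and sizes."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    d = (m1 - m2) / math.sqrt(pooled_var)
    t = d / math.sqrt(1 / n1 + 1 / n2)
    return t, d, df


def eta_squared(t, df):
    """Share of variance explained by a two-group factor (assumed method)."""
    return t ** 2 / (t ** 2 + df)


# US vs. India overestimation, using the rounded values reported above.
t, d, df = pooled_t_and_d(-1.94, 3.40, 233, -1.04, 3.56, 136)
share_country = eta_squared(t, df)
print(round(t, 2), round(abs(d), 2), round(100 * share_country, 2))
```

Because the inputs are rounded to two decimals, the outputs agree with the reported t, d, and variance share only approximately.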
Overplacement. Overplacement is observed when individuals believe that they are better, relative to others, than they actually are. A multiple regression analysis tested the degree to which condition, country, age, gender, numeracy, education, MacArthur ladder SES, Singelis individualism, Singelis collectivism, and the four Triandis measures predict overplacement (Table 9).
We also performed a multiple regression analysis identical to the one above, with the exception that it excluded the Singelis and Triandis variables (see Table 10). The model including the Singelis and Triandis variables explained 26.9% of the variance (F (13, 355) = 11.42, p < .001), while the model excluding the Singelis and Triandis variables explained 25.6% of the variance (F (7, 361) = 19.04, p < .001).
Condition. On average, participants showed underplacement. Consistent with expectations, underplacement was greater for participants in the hard condition (M = –3.81, SD = 2.38) than participants in the easy condition (M = –0.56, SD = 3.65), t (367) = 10.16, p < .001, d = 1.06.
Country. Figure 8 shows overplacement for participants in each country. Overplacement did not differ between participants from India (M = –2.29, SD = 3.84) and participants from the US (M = –2.16, SD = 3.25), t (367) = 0.34, p = 0.73, d = 0.04.
Overprecision (SPD variance). Overprecision is observed when individuals are unjustifiably certain in their beliefs. As previously detailed, our SPD variance measure of overprecision subtracts the variance of the reported probability distribution for others’ scores from the actual variance in individuals’ scores. Positive numbers indicate overprecision, whereas negative numbers indicate underprecision (and for underprecision, less negative numbers mean less underprecision). Table 11 presents the results of a multiple regression analysis testing the effects of country, condition, age, gender, numeracy, MacArthur ladder SES, Singelis individualism, Singelis collectivism, and the four Triandis measures on overprecision, as measured by SPD variance. Consistent with Study 1, the results show that Indian respondents exhibited more underprecision than did American respondents by this measure.
We also performed a multiple regression analysis identical to the one above, with the exception that it excluded the Singelis and Triandis variables (see Table 12). The model including the Singelis and Triandis variables explained 19.5% of the variance (F (13, 355) = 7.86, p < .001), while the model excluding the Singelis and Triandis variables explained a slightly smaller 15.3% of the variance (F (4, 364) = 17.66, p < .001).
Condition. Underprecision was greater for participants in the hard condition (M = –23.77, SD = 13.92) than for participants in the easy condition (M = –15.84, SD = 15.04), t (367) = 5.26, p < .001, d = 0.55.
Country. Figure 9 shows overprecision for participants in each country. As in Study 1, participants from both India and the US showed overall underprecision. Participants from India showed more underprecision (M = –25.55, SD = 14.91) than participants from the US (M = –16.53, SD = 14.05), t (367) = 5.81, p < .001, d = 0.63. A less negative number corresponds to smaller SPD variance and reflects greater precision. As in Study 1, then, our Indian participants show the least overprecision (in the form of greater underprecision).
In order to provide a sense of the magnitude of this country effect, we compared the proportion of the variance attributable to country (8.4%) with the proportion of the variance attributable to task difficulty (6.8%). The variance attributable to country was roughly 1.25 times larger.
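The sign convention for the SPD variance measure, in which a narrower reported distribution yields a less negative number, can be illustrated as follows. The function names and the example respondents are hypothetical, and the benchmark variance of actual scores is passed in as a parameter because its exact computation is not restated in this section.

```python
def spd_variance(probs):
    """Variance of a subjective probability distribution over scores 0..n."""
    total = sum(probs)
    weights = [p / total for p in probs]  # renormalize in case of rounding
    mean = sum(s * w for s, w in enumerate(weights))
    return sum(w * (s - mean) ** 2 for s, w in enumerate(weights))


def precision_index(actual_variance, probs):
    """Actual variance minus SPD variance.

    Positive values indicate overprecision, negative values underprecision.
    """
    return actual_variance - spd_variance(probs)


# Hypothetical respondent spreading belief evenly over all 11 scores.
flat = [1 / 11] * 11
# Hypothetical respondent certain that the score is exactly 3.
spike = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]

assumed_actual_variance = 4.0  # illustrative benchmark, not from the data
print(precision_index(assumed_actual_variance, flat))
print(precision_index(assumed_actual_variance, spike))
```

A flat distribution is wider than the benchmark and so scores as underprecise, while a distribution concentrated on a single score is narrower than the benchmark and scores as overprecise.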
Overprecision (marble-guessing item confidence). As in Study 1, item confidence measures revealed that participants had, on average, higher confidence (M = 25%) than accuracy (M = 2.79%) in predicting the score they thought most likely, t (368) = 22.94, p < .001, d = 1.19. This is evidence of overprecision. Similar to Study 1, accuracy in predicting their own score did not differ between participants from India (M = 4.41%) and participants from the US (M = 1.72%), t (367) = –1.54, p = 0.12, d = 0.17. However, in this study, US participants had higher peak confidence (M = 27.2%) than Indian participants (M = 21.6%), t (367) = 2.81, p < .01, d = 0.30. In other words, although US participants reported higher item confidence, they were not more accurate in estimating their own scores. This means that participants from the US were more overprecise than Indian participants, as measured by item confidence.
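The item-confidence comparison can be sketched in a few lines: peak confidence is the probability a respondent assigns to the score they consider most likely, and accuracy is whether that modal score matches the score actually obtained. The function names and the two respondents below are hypothetical, and breaking ties in favor of the lowest score is an assumption.

```python
def peak_confidence(probs):
    """Return (modal score, probability assigned to that score)."""
    mode = max(range(len(probs)), key=lambda s: probs[s])  # ties: lowest score
    return mode, probs[mode]


def item_confidence_gap(spds, actual_scores):
    """Mean peak confidence, hit rate of the modal score, and their gap."""
    peaks = [peak_confidence(p) for p in spds]
    mean_conf = sum(conf for _, conf in peaks) / len(peaks)
    hits = sum(mode == actual for (mode, _), actual in zip(peaks, actual_scores))
    hit_rate = hits / len(peaks)
    return mean_conf, hit_rate, mean_conf - hit_rate


# Two hypothetical respondents, each with probabilities for scores 0..10.
a = [0, 0, 0, 0.2, 0.5, 0.3, 0, 0, 0, 0, 0]
b = [0, 0, 0, 0, 0, 0, 0.3, 0.4, 0.3, 0, 0]

mean_conf, hit_rate, gap = item_confidence_gap([a, b], actual_scores=[4, 2])
print(mean_conf, hit_rate, gap)
```

A positive gap, with mean peak confidence exceeding the hit rate, is the pattern the text describes as overprecision by this measure.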
We set out to explore cultural differences in overconfidence using clearly defined measures of overconfidence. While we are not the first to study overconfidence across cultures, prior studies have not measured all three distinct forms (overestimation, overplacement, and overprecision). We can boast the successful cross-cultural replication of hard-easy and under-hard/over-easy effects: in both of our studies, hard tasks produced more overestimation and less overplacement than easy tasks. The successful replication of the hard-easy and under-hard/over-easy effects provides some comfort that the measures we employ operate similarly across cultures. The size of the situational effect also provides a useful benchmark against which we can compare the size of the cultural effects we document: For overestimation, the effect of task difficulty is 23 times larger than the effect of culture in Study 1 and 6.5 times larger in Study 2. For overplacement, the effect of task difficulty was 4 times larger than the effect of culture in Study 1, and there was no effect of culture at all in Study 2.
Based on prior literature, we hypothesized that collectivistic cultures (Hong Kong and India) would be more overprecise than individualistic cultures (the United States and the United Kingdom). Here, our study’s results are more mixed. We do replicate the results of Yates et al. (1998), finding greater overprecision in collectivistic countries, and in Hong Kong especially, in the form of narrower 90% confidence intervals and, in Study 1, lower hit rates. In Study 1, we also find that collectivist participants report greater confidence in the accuracy of their point estimates. But this effect is reversed in Study 2, where it is individualistic participants who report greater confidence in the accuracy of their point estimates, and without the higher hit rates to back it up. And, as measured by the width of their subjective probability distributions, we observe general underprecision in all of our participants; respondents from Hong Kong were least underprecise and those from India most underprecise. This inconsistency underscores the fact that different ways of measuring overprecision differ dramatically and are routinely inconsistent with one another (Moore, Tenney, & Haran, 2016). Indeed, these differences are large enough that it is worth asking whether it makes any sense to treat them as different measures of the same underlying psychological construct. We suspect not.
Measuring Cultural Differences
We measured individualistic and collectivistic self-construals using the Singelis Self-Construal Scale (1994), which generated scores for both individualism and collectivism among our Indian and British respondents in Study 1, and our Indian and US respondents in Study 2. In Study 1, participants from both the UK and India skewed more collectivistic than individualistic, and while Indian participants did show higher levels of collectivism than UK participants, this effect was modest and barely statistically significant in our sample. Results were more consistent with expectations in Study 2: Indian participants again skewed more collectivistic, and US participants skewed more individualistic. However, in both studies, Indian participants had higher individualism and collectivism scores than UK or US participants. This is mirrored in our results for our measure of horizontal and vertical individualism and collectivism (Triandis & Gelfand, 1998). Consistent with expectations, Indians scored higher than Americans on horizontal and vertical collectivism; however, they also scored higher on vertical individualism, and we saw no differences in horizontal individualism.
These results could suggest that our Indian participants simply answered more extremely on most questions, which would obscure meaningful differences between the participant groups. It is also plausible that our somewhat mixed bag of results is a product of the large-scale recruitment sites we used to recruit our Indian, US, and UK participants: perhaps the users of these sites are more similar to each other and less representative of the populations of their respective countries, disguising any marked cross-cultural differences in individualism and collectivism.
Alternatively, it is possible that these cultures are not actually as different as one might predict based on stereotypes that attempt to distinguish them from one another. It may be that in a modern technological age where exchange of people, information, and entertainment occurs at ever-increasing rates, cultural distinctions are simply blurring and shrinking. This might reduce the value of attempting to distinguish cultures on dimensions like individualism-collectivism. Some researchers defend the construct, citing the fact that results from fifty separate studies using the Singelis Scale (and two other similar scales) identify consistent differences, an unlikely result if the scales were measuring a nonexistent construct (Gudykunst & Lee, 2003). Other researchers propose that there are a variety of studies employing self-construal scales that do not converge with the other studies, and note that the Singelis Self Construal Scale is problematic for its lack of negative correlation between individualistic and collectivistic items and poor reliability (Levine et al., 2003; Miramontes, 2011). Still others assert that simply looking at individualism as a whole is too broad, and that the measurement of individualism is worthwhile, but should be conducted across a variety of different but related cultural constructs (Schimmack, Oishi, & Diener, 2005). But our attempt to diversify our measurement of cultural constructs by adding the Triandis and Gelfand Cultural Orientation Scale did not do much to clarify results. Our own surprising results contribute to our concerns about challenges associated with measuring cross-cultural differences in individualism and collectivism.
National Differences in Overconfidence
Because nations differ in innumerable ways from one another, no correlational study can prove the cause of these differences. While we take steps to control for additional potential confounding variables in Study 2 (including numeracy, education, and SES), our study still cannot address whether the differences we observe are due to diet, genetics, climate, culture, or something else entirely. We measured a prominent cultural difference, but individualism-collectivism fails to account for the differences we observe: in our first study, only participants from the US and India showed overplacement and only Indian participants showed overestimation, but in our second study, we observed underestimation and underplacement across the board, and found no cross-cultural differences in overplacement. Results from Study 1 would suggest that Indian participants were consistently the most extreme in confidence; this mirrors Lee et al.’s (1995) finding that Indian participants are particularly overconfident when compared to Taiwanese, Japanese, Singaporean, and American participants. Still, the fact that country of origin was not a significant predictor of overestimation or overplacement in Study 2 discourages us from drawing any real conclusions about cross-cultural differences in overestimation and overplacement.
While we do see stronger cultural effects on overprecision, we largely failed to replicate previous work that finds greater overprecision in collectivistic participants (Acker & Duck, 2008; Fang & Li, 2004; Li-Jun & Kaulius, 2013; Yates, Lee, & Bush, 1997; Yates, Lee, Shinotsuka, Patalano, & Sieck, 1998). In Study 1, we find evidence of higher item-confidence overprecision in collectivistic participants (India and Hong Kong) than individualistic participants (UK and US), in both our weight-guessing and shape area tasks. But in Study 2, it was US participants who showed higher item-confidence overprecision than Indian participants. Overall, this study offers a clarification to previous research that would assert robust cross-cultural differences in overconfidence. While we may observe greater cultural effects in overprecision than in overestimation or overplacement, the lack of clear directionality in our overprecision results would suggest that cross-cultural differences in overprecision are not as straightforward as previously reported.
While our results may offer some insight into cross-cultural overconfidence, they leave at least as many questions unanswered. First, our inability to verify our Indian and UK participants as truly collectivistic and individualistic, via the Singelis Self Construal Scale or the Triandis and Gelfand Cultural Orientation Scale, is problematic. Another weakness of this study was that we could not administer the Singelis Self Construal Scale to Study 1’s Hong Kong and US participants. Future research should not only investigate the discrepancy between our expected and actual cultural scale results, but should also employ additional measurements of culture. We focused on measures surrounding individualism and collectivism, but there may be other cultural differences that do a better job of explaining differences in overconfidence. Indeed, we find that while overconfidence does vary cross-culturally, levels of individualism and collectivism do not separately predict overconfidence. This suggests some other cultural difference is at play.
It is worth mentioning that some of our overconfidence results suggest differences in overconfidence based on gender and age. In Study 1, we found that female participants showed less overestimation and less overplacement than male participants; in Study 2, we found that female participants showed greater overprecision than male participants. While some past research claims to find gender differences in overconfidence, the empirical record is varied and spotty (Moore & Dev, 2017). We are skeptical that the gender differences we found are durable or general. Study 1 also finds that age and overprecision are positively correlated, but this result is restricted to our American sample (Prims & Moore, 2017). There was no such correlation between age and overprecision in the participants from Hong Kong, India, or the United Kingdom. Further research should explore whether the correlation between age and overprecision replicates across various different cultural samples, and why cross-cultural differences might arise in the effect of age on overconfidence.
This study is necessarily correlational. We could not randomly assign subjects to cultures. Consequently, there are other differences between our samples that could account for the differences we observe, including language, wealth, and governmental system, among many others. However, it is theoretically possible to manipulate cultural mindset. This might be possible by priming bicultural participants with the symbols of one culture or another (Hong, Morris, Chiu, & Benet-Martinez, 2000). Moreover, it might be possible to specifically investigate the role of individualism/collectivism by priming participants to think of themselves in either an individualistic or collectivistic way (Goncalo & Staw, 2006; Kastenmüller, Greitemeyer, Jonas, Fischer, & Frey, 2010; Matsumoto & Yoo, 2006).
Taken as a whole, we believe that this research provides a glimmer of insight into cultural differences in overconfidence. Using three clearly defined and easily operationalized forms of overconfidence, we find differences in overconfidence between Hong Kong, India, the US, and the UK. Our results suggest that these differences may be highly task dependent, and that prior findings of greater overconfidence in collectivistic cultures should be taken with a grain of salt. While we cannot claim to have nailed down the precise mechanisms by which cultures give rise to differences in overconfidence, the existence of systematic differences between countries is both provocative and important. We hope that future research can investigate the underlying cultural processes that give rise to differences in confidence.
Data Accessibility Statement
All the stimuli, presentation materials, participant data, and analysis scripts can be found on this paper’s project page on Open Science Framework: https://osf.io/nu6jt/.
The additional file for this article can be found as follows: Supplementary Study 2 Analyses
This supplement contains a variety of supplementary Study 2 Analyses: (1) We compare effort across our US and Indian participants. (2) We report the results from Study 2 without any data exclusions (the main manuscript reports results with exclusions). (3) We check for a relationship between race, caste, and overconfidence. DOI: https://doi.org/10.1525/collabra.153.s1