Introduction

Numeracy — the ability to understand and use numbers — matters for medical, financial and legal decisions1,2. Numeracy is more than mathematical skills because it involves practical applications of such skills and associated reasoning. Frequencies (counts), fractions and proportions such as probabilities are all examples of numbers. Numbers dictate life and death, as when the frequency of cases of a deadly contagious disease explodes exponentially3. Understanding these numbers helps laypeople and professionals reduce risk. Numbers drive government investments (for instance, in construction of levees as sea levels rise) and personal choices (such as vaccination or travel to war zones). In a world increasingly awash in numerical information, numeracy offers advantages in health, wealth and well-being4.

Overall, research has shown that low proficiency with numbers is pervasive and is generally associated with adverse life outcomes such as death, disability and lost educational and career potential. In medicine, numeracy is robustly related to accurate perceptions of health benefits and health risks in patients; the quality of medical decision making and shared decision making between doctors and patients; and health outcomes in patients5. For example, patients with low numeracy were less able to manage their diabetes, which involves monitoring and comparing blood glucose numbers6,7. In economics, performance on numeracy tests significantly predicts employment, retirement savings and overall wealth1,8,9. For example, simple numerical tasks about finances predicted national wealth — the per capita gross domestic product — explaining from 16% to 27% of the variance in gross domestic product in nationally representative samples from 31 countries10. In law, low numeracy compromises the ability of a judge or jury in criminal cases to make reliable sentencing decisions or to appreciate conditional probabilities, such as those involved in DNA tests, and in civil cases to formulate reliable dollar damage awards11,12,13. For example, highly educated judges were subject to a host of biases that are similar to those exhibited in less educated people, including imposing shorter sentences when assigning sentences in months rather than in years14. Thus, education does not equate to numeracy.

Despite the need for numerical skills in highly industrialized societies, standardized tests of representative samples of individuals indicate that numeracy rates declined significantly from 2003–2008 to 2012–2017 in the United States, Canada, Hungary, the Netherlands and Norway. Among countries tested, only New Zealand showed an increase during this recent period of testing15. Low performance and lack of progress among developed nations do not augur well for the world’s ability to cope with social, economic and health challenges that require understanding the importance of numbers, such as the COVID-19 pandemic16,17.

In summary, many people do not possess basic, practical mathematical skills that are often essential to judge risks, probabilities and outcomes and to make adaptive decisions4. These judgements and decisions cannot easily be outsourced to experts with high numeracy. For example, the US jury system relies on the participation of ordinary citizens — a jury of one’s peers — to uphold common-sense community standards, as opposed to reflecting only the values and perspectives of the elite. More than 40 nations use a jury system to accomplish this goal. In medicine, a movement towards patient-centred decision making has shifted the responsibility for decisions from trained clinicians (some with specialized statistical training that improves numeracy) to patients, who typically lack both medical and statistical training. COVID-19 has brought into sharp relief that interpreting numbers and applying them to oneself or to family can be a matter of survival17,18. Therefore, the onus on the average person is greater than ever before to make sense of large amounts of readily accessible quantitative information (such as that accessible via social media or websites), but the ability and confidence to do so are frequently lacking19,20.

In this Review, we first discuss the psychology of how people think about numbers and then the most widely studied forms of numeracy, objective and subjective numeracy, and their connections to several other cognitive and metacognitive abilities. One conclusion we reach is that subjective numeracy is not a form of numeracy, despite its name; instead, it is a metacognitive self-assessment of numerical ability and preference. We next consider evidence that non-human animals and infant humans possess innate numerical abilities, and we examine how mental representations of number build on that foundation. The Review concludes with research that aims to improve numerical cognition and its practical applications through training programmes.

Why psychology is needed to solve the problem of low numeracy

The potential for numeracy to improve human outcomes does not simply rest on better knowledge of numbers and numerical operations, although some rudimentary knowledge is widely lacking and essential for many decisions21. Numeracy’s potential for improving outcomes is also limited by the psychology of how people think about numbers. This psychology encompasses distortions in the perception of frequencies and probabilities, biases in quantitative judgements and decision making, and poorly calibrated confidence about and persistence in numerical tasks22,23. Importantly, the psychology of numbers extends beyond deviations from objective precision to mental representations of the meaning of numbers.

Mental representations of numbers range from precise and literal verbatim to vague but meaningful gist that interprets information in ways that deviate from its surface form24 (Box 1). The following are examples of literal numerical differences versus gist differences (categorical gist examples are used to highlight that gist captures qualitative distinctions, but there are other kinds of gist representations). For vehicle operators, numerical increases from 0.00 to 0.02 to 0.04 to 0.06 to 0.08 in blood alcohol levels are all literally equivalent. However, the 0.00 to 0.02 increase is a categorical gist shift from sober to not sober, and the 0.06 to 0.08 increase is a categorical gist shift from not sober to criminally impaired in many US jurisdictions. The fact that jurisdiction matters shows that literal differences, say between 0.04 and 0.06, are not necessarily meaningful differences; the same literal differences turn out to be meaningful in some jurisdictions but meaningless in others. Declines in personal cancer risk from 5% to 4% to 3% to 2% to 1% to 0% are also all literally equivalent. However, the 1% to 0% decline is a categorical gist shift from low risk of cancer to no risk. Similarly, smaller amounts of bearable pain experienced per unit time during an invasive medical procedure, as judged on a 0–10 pain scale, do not ‘add up’ to unbearable ‘peak’ pain25. Instead, unbearable pain is a categorical gist shift from qualitatively different bearable pain26.

More generally, judgements, decisions and behaviours depend on the mental representations that are extracted rather than the objective information that is presented25,27. Regardless of whether 0.06 is technically sober, 1% cancer risk is technically low or unbearable pain is technically brief, judgements, decisions and behaviours depend on whether these numbers are mentally represented as low or high. Thus, to help people take advantage of numerical information, it is crucial to understand how that information is mentally represented and processed.

Defining and measuring numeracy

In this section, we describe how objective and subjective numeracy are typically measured but disentangle objective performance, underlying competence and subjective self-assessment, among other factors. We also examine how metacognitive abilities modulate the manifestation of underlying numerical abilities drawn upon in numeracy tasks. The analytical and meta-analytical processes described in this section contrast with the intuitive processes discussed in the subsequent section. Understanding all of these processes is required to design training that improves numerical judgements and decisions, which will be discussed in the penultimate section of the Review.

Objective and subjective numeracy

Objective numeracy involves performance on numerical problems that can be scored as correct or incorrect28. Brief objective numeracy assessments continue to be the focus of current research. Schwartz et al.’s29 original assessment contained three items mainly concerning probabilities and proportions, with similar items added later to create 11-item and 15-item versions30,31,32,33. Because they focus on probabilities and statistical computation, such tests are sometimes referred to as ‘statistical numeracy’ or ‘risk literacy’ assessments34,35. These numerical competencies predict informed and accurate risky decision making in business and engineering36,37,38, medicine and health communication39,40,41 and civil and criminal law11,13. For example, it makes sense that people who are better able to order a 1%, 10% and 5% risk of side effects from different medications would be better able to choose among treatment options42.

Objective numeracy tests play an important role in public policy because they can be used to gauge the numeracy skills of the workforce and electorate (as an example, see https://nces.ed.gov/surveys/piaac/measure.asp?section=2&sub_section=3). However, comprehensive performance tests of numerical skills can be difficult to administer because respondents must solve mathematical problems, which usually takes longer and is harder for them than providing a self-assessed rating of their ability. Subjective numeracy scales measure self-assessed numerical ability and preference for numerical information — for example, Q: how good are you at working with fractions? A: rated on a scale from not at all good (1) to extremely good (6) — and are less burdensome to administer to respondents compared with objective numeracy tasks43,44. Self-assessed numerical ability, preference for numbers and cognitive reflection (see below) each tap metacognition (cognitions about cognition) rather than directly tapping cognition (Fig. 1). Naturally, these metacognitions differ in details, as discussed below.

Fig. 1: Types of numerical cognition and metacognition, and relationships between them.

Three types of metacognition (cognitions about cognition) are shown (upper part): self-assessed numerical ability (domain-specific because it is about numbers, measured by the SNSa), self-assessed preference for numbers (domain-specific because it is about numbers, measured by the SNSp) and cognitive reflection (a domain-general tendency to reflect on one’s cognition, monitoring cognitions and inhibiting thoughts and responses that, on reflection, seem wrong, measured by the CRT). Two types of cognition are shown (lower part) and include objectively assessed numerical ability (domain-specific, as measured by the ONS) and objectively assessed overall cognitive ability (domain-general cognition, as measured by general intelligence tests, especially executive processes). Arrows indicate strength and direction of influences. For example, individuals subjectively assess their numerical ability based, in part, on evidence of their numerical ability such as observing their own performance on mathematics tests: high cognitive ability produces high performance which, in turn, influences metacognitive self-assessments of high ability. However, stereotypes — not based on cognitive ability — also influence self-assessments of numerical ability (not shown). Numerical ability also influences self-assessed subjective preferences for thinking using numbers (SNSp) because those with higher numerical ability generally find using numbers easier and more enjoyable than those with lower numerical ability. Self-assessments of preference for thinking using numbers can also be biased by stereotypes (not shown). Influences also flow, as the arrows indicate, from metacognitions to cognitions. For example, higher self-assessed numerical ability (mathematical confidence) encourages individuals to attempt to solve more problems on objective tests of numerical ability, which can increase opportunities to learn about mathematics and can increase objective test scores.
CRT, cognitive reflection test; ONS, objective numeracy scale; SNSa, subjective numeracy subscale for ability; SNSp, subjective numeracy subscale for preference.

Subjective numeracy correlates moderately with objective numeracy32,45,46, but it also reflects biases inherent in self-assessments and other metacognitions. One such bias is the Dunning–Kruger effect in which those of lower ability have higher confidence than is warranted by the accuracy of their performance on a task, an effect that has been shown in many tasks for many abilities47,48. The Dunning–Kruger effect implies that those low in numerical ability will be overconfident about their ability, hence showing poorly calibrated confidence relative to their performance level on tasks such as solving mathematical problems. Poorly calibrated confidence is a problem when it curtails persistence on tasks that might ultimately be solved (underconfidence) or when it interferes with sufficient deliberation in a task to correct detectable errors (overconfidence)1,23,49.

Nevertheless, because subjective numeracy is correlated with objective numeracy, it is a useful proxy measure and exhibits relationships with other variables that are similar to those observed for objective numeracy. Subjective numeracy also relates, as might be expected, to other self-assessments involving numbers, such as mathematical anxiety, the self-reported anxiety about using mathematics — for example, Q: how much anxiety does working with percentages make you feel? A: rated on a scale from low (1) to high (5)50. Thus, subjective numeracy and mathematical anxiety are not forms of numeracy per se but, rather, are perceptions of one’s ability and comfort with numbers, which reflect multiple indicators that inform self-judgements such as self-observation of task performance, differential opportunities to learn mathematics and stereotypes about numerical ability51,52,53.

In summary, the key points are that objective numeracy scales measure a type of cognition; subjective numeracy scales measure a type of metacognition; and the subjective numeracy scale draws heavily on observed and inferred evidence of domain-specific cognitive ability (Fig. 1). As described in this section, influences flow in many directions, from subjective numeracy (self-perceptions of ability, such as mathematical confidence) to objective numeracy (mathematical ability and performance on mathematical tests) and vice versa (Fig. 1; not all influences shown). For example, subjective numeracy judgements draw on personal knowledge about objective numeracy such as memories of successful or unsuccessful mathematical performance on tests. Conversely, performance on objective numeracy tests relies on aspects of subjective numeracy such as confidence; those who are confident in their mathematical ability attempt more problems and thus, can attain objectively higher scores20,54. Confidence provides but one example of the ways in which metacognitive processes scaffold the relationship between ability and performance. Confidence enables people to take advantage of ability and, without confidence, ability is sometimes not enough to achieve good life outcomes (such as financial success)55. Because of the intertwining influences, some have argued that the theoretical construct of numeracy is multifactorial, consisting of both understanding of numbers and mathematical operations together with metacognition (applied to numeracy) and other self-regulation skills23. However, despite multiple influences that produce correlations between subjective and objective numeracy measures, research has shown that they are distinct — self-perception is not the same thing as objective ability (Fig. 1). In addition, subjective and objective numeracy differ from domain-general metacognition — or reflective processes — which we discuss below.

Metacognition

We next describe other types of metacognitive processes beyond subjective numeracy, how they influence the manifestation of numerical competence (ability) and how measures that have been characterized as objective numeracy — for instance the cognitive reflection test (CRT) — reflect both numeracy and metacognition.

The relationship between objective and subjective numeracy is not unlike the relationship between cognitive abilities generally and thinking dispositions (or cognitive styles) found using measurement scales of self-assessed cognitive style preference such as need for cognition — which measures the tendency to engage in and enjoy effortful cognitive activities56. That is, thinking dispositions such as need for cognition operate at the ‘reflective level’, as do other metacognitive processes that override fast, intuitive responses and that guide algorithmic operations57,58,59 (for contrary evidence and perspectives about the sufficiency of this specific dual-process approach, see refs. 24,60,61,62). Naturally, reflection occurs in other types of metacognition, not just on the CRT, and thinking disposition refers to an individual’s tendency to engage in such processes as reflection. Thus, just as higher need for cognition increases the tendency to engage in cognition generally, higher subjective numeracy increases the tendency to engage in numerical cognition and, as with other forms of metacognition (for example through checking answers to problems and correcting errors or inconsistencies), influences algorithmic processing (one example of which is objective numeracy) (Fig. 1).

Algorithmic operations consist of domain-general cognitive abilities, such as executive processes reflected in measures of general intelligence, as well as domain-specific strategies, rules and procedures. In theory, numeracy, as measured in objective numeracy tasks and when distinguished from general intelligence33,45, would be an example of domain-specific algorithmic processing (Fig. 1).

Hence, as suggested above in our discussion about confidence, the implementation of the domain-specific competence of numeracy would depend to some extent on metacognitive thinking dispositions21,23,63. Researchers tested this hypothesis, distinguishing general ability (fluid intelligence), numeracy and thinking disposition (reflective versus impulsive) to study their relationships to probabilistic reasoning (an example of numerical processing)64. They found that, although individual differences in thinking disposition (metacognition) did not moderate the relation between numeracy and probabilistic reasoning, an experimentally induced thinking ‘disposition’ did moderate it. Disposition to think was manipulated experimentally by instructing some participants to reflect on their analytical reasoning. Instructions to reflect analytically facilitated reasoning such that numeracy predicted probabilistic reasoning when general ability was high. In other words, general intelligence allowed those high in numeracy to take advantage of the instructions to reflect so that they could implement their numerical ability to solve probabilistic problems. Thus, encouraging metacognitive reflection by itself, without general and domain-specific ability (numeracy), does not necessarily yield insight into how to solve numerical problems, illustrating distinctions we have discussed above (Fig. 1).

In a similar vein, the CRT65 has been argued to draw on both numeracy — it contains mathematical problems — and reflection skills, especially the metacognitive ability to monitor for and then inhibit fast intuitive system 1 (or type 1) responses — gut responses — that are wrong. For example, $0.10 is the intuitive (but wrong) response to the following: “a bat and a ball cost $1.10 in total, the bat costs $1 more than the ball, how much does the ball cost?” The correct answer is that the ball costs $0.05, because the bat then costs $1.05 ($1.00 more than the ball) and $0.05 plus $1.05 equals $1.10. Researchers argue that those high in reflective ability are more likely to reflect on their answer (a system 2 response), that is, check their intuitive fast response ($0.10), realize that the total would be $1.20 (wrong) and recalculate their answer using algorithmic processes. In support of the argument that the CRT draws on numeracy, studies have shown that the CRT loaded together with numeracy measures on the same dimension in exploratory or confirmatory factor analyses32,66,67,68. However, some evidence supports the idea that the CRT measures distinct faculties, as numeracy was less related to decision making than were measures of executive functioning or cognitive impulsivity measured by the CRT69 (but see ref. 70). In addition, although the CRT and numeracy tests correlated with one another, the CRT also accounted for unique variance beyond numeracy in predicting risk and ratio judgements33,45 (see also refs. 59,64). In a study in which subjects described their thinking aloud, it was similarly concluded that the CRT is a multi-faceted construct rather than a single dimension, as both numeracy and reflectivity accounted for performance71.
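
The arithmetic behind the bat-and-ball item can be checked with a short sketch. The function name and the use of exact fractions are illustrative assumptions, not part of the CRT itself; the code only solves the two stated constraints (ball + bat = total; bat = ball + difference) and then tests the intuitive answer against them.

```python
from fractions import Fraction

def solve_bat_and_ball(total, difference):
    """Solve ball + bat = total, where bat = ball + difference.

    Substituting gives 2 * ball + difference = total.
    (Hypothetical helper for illustration only.)
    """
    ball = (total - difference) / 2
    bat = ball + difference
    return ball, bat

# Values from the problem as stated in the text, kept exact with Fraction.
total = Fraction(110, 100)   # $1.10
difference = Fraction(1)     # the bat costs $1.00 more than the ball

ball, bat = solve_bat_and_ball(total, difference)
print(f"ball = ${float(ball):.2f}, bat = ${float(bat):.2f}")

# Checking the intuitive answer: a $0.10 ball implies a $1.10 bat.
intuitive_ball = Fraction(10, 100)
intuitive_total = intuitive_ball + (intuitive_ball + difference)
print(f"total implied by the intuitive answer = ${float(intuitive_total):.2f}")
```

The check on the intuitive answer mirrors the reflective step described above: the implied total exceeds the stated $1.10, which signals that the fast response should be recalculated.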

In our view, the question is not whether the CRT is a measure of either numeracy or of reflection1, as the CRT design does not allow numeracy and reflection to be easily extricated from one another, as in a mathematical model (but see ref. 72 for an approach). However, mathematical and psychometric models have been used to extricate reflective from other cognitive processes, including numerical ones, indicating that reflection (metacognitive monitoring and inhibition) constitutes a distinct mental faculty73,74,75. These models have been applied to decision-making, memory and judgement tasks involving numbers and this observed separation (taking into account other CRT evidence that also links to these models, such as in ref. 45) applies directly to the mechanisms purportedly tapped in the CRT. In addition, other researchers have developed a verbal alternative to the traditional CRT that helps disentangle contributions of reflection and numeracy from one another on the CRT76.

The relationships among objective numeracy, subjective numeracy and reflection — such as metacognitive monitoring for and inhibiting of intuitive responses — that we have discussed have a general architecture (Fig. 1). When numbers are processed in everyday life to make judgements or decisions, ability (objective numeracy), comfort with numbers (subjective numeracy) and strategic metacognitive engagement (reflection) often go hand in hand.

As discussed below, our view differs from the standard view of system 1 and system 2 processing in several ways. First, based on research, we disagree with the claim that intuitive processing is often low-level impulsive responding that must be overridden. Second, metacognitive reflection does not necessarily provide cognitive insights about the central meaning (gist) of numbers. Nevertheless, we agree with the remaining distinctions between cognitive ability and metacognition so have integrated these concepts into our account of numerical cognition. To preview, we next discuss two kinds of intuitive processing that complement objective and subjective numeracy that are both strengths of human cognition but are not the same thing: the psychophysical perceptions of number (the approximate number system) and gist mental representations. A critical demonstration that these two kinds of intuitive processing are not the same thing is that psychophysical distortions of number (non-linearities in perception) are more pronounced in children than in adults. By contrast, we then discuss gist-based distortions (for example, the framing illusion and semantic false memories), which increase from childhood to adulthood as the tendency to rely on meaningful gist representations increases.

Intuitive processing of number

Given the widespread (and worsening) difficulty processing numbers in humans, it might be surprising to realize that people have an innate ability to process number. (What is innate is processing frequencies or magnitudes, not processing numerals.) In this section, we discuss the way in which this innate ability to process number — the approximate number system — is reflected in psychophysical laws77. These psychophysical laws apply to perceptions of counts of discrete objects, such as the number of cups on a table, and to continuous magnitudes, such as the amount of coffee in those cups. Some evidence suggests that the latter magnitudes are more automatic and basic than the former numerosities. The approximate number system enables non-human animals and human infants to judge differences in frequencies (perceiving discrete numbers of objects) and magnitudes (perceiving continuous amounts) without being able to count or do explicit calculations78. In non-human animals this innate ability aids foraging decisions and in human infants the approximate number system lays the groundwork for later acquisition of formal mathematical ability. In contrast to the difficulties achieving adequate numeracy, this innate ability to process number implies humans are born as intuitive mathematicians.

For example, a meta-analysis showed that performance on mental number line tasks, such as locating 72 on a line whose end points are labelled 0 and 100, consistently correlated with formal mathematics competence79 (see ref. 80 for evidence linking the acuity of the approximate number system in 6-month-old infants to standardized mathematics scores 3 years later). Another meta-analysis of 26 studies by Christodoulou et al.81 upheld the once controversial finding that infants are capable of simple arithmetic with small quantities82.

The approximate number system is often characterized as a mental number line because it represents relative magnitudes of number in a left-to-right spatial orientation (although specific mappings between number and space seem to differ across cultures)83,84. However, an association between the magnitude of a number and the spatial location of a response generalized to close versus far (small numbers were associated with close and large numbers were associated with far, rather than with left to right)85. Thus, a general number–spatial location association seems to be represented in the brain, in part, in the posterior parietal cortex86,87, which is also associated with processing fundamental numerical concepts such as risk and probability88,89. Such results (and others) speak to whether brain activation during symbolic number, arithmetic and spatial processing (mental rotation) tasks is consistent with shared processing accounts (in other words, number processing and spatial processing overlap)90.

Although the approximate number system orders quantities, perceptions of quantities are subject to several psychophysical distortions91,92. First, relative magnitude is governed by Weber’s law, that perceived differences in quantities vary as a function of the ratio of quantities rather than their absolute differences. For instance, a difference in quantity between piles of 8 rocks and 16 rocks is perceived as similar to the difference between piles of 80 rocks and 160 rocks. Second, discriminability varies with the distance between quantities, with smaller differences (8 versus 10) judged more slowly than larger differences (8 versus 20). Last, discriminability decreases as the magnitude of numbers increases. For example, the difference between $0 and $100 seems bigger than the same objective difference between $100,000 and $100,100. Thus, these three psychophysical results provide evidence about the nature of the psychological function that translates actual physical quantities into mental representations of quantities.
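
The three results above can be illustrated numerically. The sketch below is a minimal illustration under stated assumptions: the ratio function expresses Weber's law directly, and `compressed` uses log(1 + x) as one possible negatively accelerated code (chosen here only because it is defined at zero). Neither function is taken from the studies cited, and all names are hypothetical.

```python
import math

def quantity_ratio(a, b):
    """Ratio of the larger to the smaller quantity.

    Under Weber's law, pairs with equal ratios are about equally discriminable.
    """
    return max(a, b) / min(a, b)

def compressed(x):
    """Assumed compressive code for magnitude: log(1 + x), defined at x = 0."""
    return math.log1p(x)

def perceived_gap(a, b):
    """Difference between two quantities after the assumed compression."""
    return abs(compressed(a) - compressed(b))

# Ratio effect: 8 vs 16 rocks and 80 vs 160 rocks share the same ratio.
same_ratio = (quantity_ratio(8, 16), quantity_ratio(80, 160))

# Distance effect: 8 vs 10 has a smaller ratio than 8 vs 20, so it is harder to judge.
near_vs_far = (quantity_ratio(8, 10), quantity_ratio(8, 20))

# Size effect: the same $100 difference shrinks as the magnitudes grow.
small_gap = perceived_gap(0, 100)
large_gap = perceived_gap(100_000, 100_100)

print("ratio effect:", same_ratio)
print("distance effect:", near_vs_far)
print("size effect:", small_gap, "versus", large_gap)
```

Any other negatively accelerated function, such as a power function with an exponent below 1, would give the same qualitative ordering for the size effect.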

Multiple negatively accelerated functions have been proposed to account for how the psychophysics of number changes as numerosity increases, the third result discussed above, including logarithmic93 and power functions94,95,96. Yet extensive cognitive developmental research has supported a representational change account, the logarithmic to linear shift in number perception78,97,98,99,100,101,102,103. In short, children’s number representations are more distorted and less differentiated (a flatter function of objective quantity), whereas adults’ perceptions of number track objective quantity (are linear functions of objective quantity). However, the ratio, distance and size effects discussed above imply psychophysical distortions that were also observed in adults and each of these effects violates assumptions of linearity.

Moreover, major theories of adult decision making — expected utility theory and prospect theory among them — posit non-linear functions of quantities, outcome values (dollars) and outcome probabilities. These are representational accounts of the psychophysics of quantities that are used to explain decision making. However, the adult decision theories make the opposite assumption about adults’ perceptions (non-linearity) from the one that the cognitive developmental theories make (linearity). As we discuss below, their predictions for decisions depend crucially on assuming non-linearity104,105. This contradiction between adult studies in decision making and developmental studies that include adults is somewhat reconciled by noting that non-linearity varies with numerical range and that individual adults differ in their psychophysical ‘acuity’87,106. For example, adults demonstrated a logarithmic estimation pattern when the count range was increased from 0–100 to 0–100,000 (ref. 107) (see also ref. 108, which demonstrates the implications of this distortion of large numbers for laypeople’s understanding of government expenditures). In addition, adults with little formal education map symbolic and non-symbolic numbers onto a logarithmic scale, whereas formally educated adults (often including college student samples) use linear mapping with small or symbolic numbers and logarithmic mapping when large numbers are presented non-symbolically under conditions that discourage counting109 (Fig. 2) (see also ref. 110 for a review of research on intuitive representations of probabilities and relationships to counting).

Fig. 2: Psychophysical relations posited between actual and perceived magnitudes for small and large numbers.

Each curve illustrates the relationship between objective numbers of discrete objects or magnitudes of quantities as presented in studies and their subjective perceived numbers or magnitudes as inferred from judgements of intensities, similarities or differences. For example, individuals might be asked to place a mark on a continuous unnotched line with labelled end points to indicate the magnitude of a presented number (for example, place 57 on a scale with end points labelled as 0 and 1000). Many such judgements for a range of numbers are elicited from each individual. Each mark is converted to millimetres along the line to derive the perception of each number (and can be plotted with x indicating the objective value of the number and y indicating its subjective perception). The absolute deviation from the objectively correct placement of that mark can also be calculated. As the figure indicates, the perception of small numbers generally tracks objective values linearly, whereas the perception of large numbers bends as numbers increase, indicating that each additional unit of objective value is not perceived as increasing in equal intervals subjectively. Thus, absolute deviations from objective values of numbers increase as numbers increase.
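
The scoring procedure described in this caption can be sketched as follows. This is a minimal illustration under stated assumptions: the 200-mm line length, the three target numbers and the mark positions are invented for the example and do not come from any study cited here, and the function names are hypothetical.

```python
def mark_to_estimate(mark_mm, line_length_mm, low, high):
    """Convert a mark's distance from the left end point (in mm) into a number on the scale."""
    return low + mark_mm * (high - low) / line_length_mm

def absolute_deviation(estimate, target):
    """Absolute distance between the perceived value and the objective value."""
    return abs(estimate - target)

# Hypothetical task: a 200-mm line with end points labelled 0 and 1000.
LINE_LENGTH_MM = 200
LOW, HIGH = 0, 1000

# Hypothetical responses: (objective number shown, where the mark was placed in mm).
responses = [(57, 20.0), (250, 70.0), (800, 150.0)]

# x is the objective value and y is the subjective estimate, as in the plotted curves.
points = [
    (target, mark_to_estimate(mark, LINE_LENGTH_MM, LOW, HIGH))
    for target, mark in responses
]
deviations = [absolute_deviation(estimate, target) for target, estimate in points]
mean_absolute_deviation = sum(deviations) / len(deviations)

for (target, estimate), deviation in zip(points, deviations):
    print(f"shown {target}: perceived {estimate:.1f}, off by {deviation:.1f}")
print(f"mean absolute deviation: {mean_absolute_deviation:.1f}")
```

Plotting the resulting points for many targets per individual would give one curve of the kind the figure describes; the shape of that curve, not this scoring step, is what distinguishes linear from compressive representations.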

In summary, the mapping of number onto a spatial array in the mind is a widespread human intuition because judgements about number have some similarities to judgements about space. In addition, intuitions about number display features of a logarithmic function (or similar non-linear functions) when the crutch of counting is discarded (when the task makes it difficult to count objects). Thus, formal education layers a more literal representation — a linear representation that faithfully captures actual differences in magnitude — on top of an intuitive representation, the latter revealed by task modifications as in changing the range of judged numbers from small to large; large numbers are distorted in perception more than small numbers. As we discuss in the next section, although psychophysical intuitions about number using the approximate number system have been related to other numerical judgements and decisions, multiple levels of numerical representations must be assumed to fully account for the gamut of numerical cognition111,112,113.

Precision and gist in decision making

Numeracy relies not only on an intuitive sense of number but also on judgements and decision-making processes. In particular, numeracy often has a role in situations involving risks (variable outcomes with known probabilities) and uncertain outcomes (variable outcomes with unknown probabilities).

Leading theories of decision making have built on psychophysical properties of quantity to represent the value of outcomes of sure or risky options such as winning or potentially winning $1,000 (refs. 114,115), assuming that as outcome quantities increase the perceived differences between outcomes diminish (as discussed above, the psychophysical function that translates the actual physical quantity into a mental representation of quantity bends as quantity increases)25,104,105,116. These theories predict risk aversion when choosing between a sure gain (winning $1,000 guaranteed) and a gamble of equal expected value (for example, flipping a coin and winning $2,000 if heads and $0 if tails; 0.5 × $2,000 + 0.5 × $0 = $1,000 expected value). Because the sure gain outcome, being a fraction of the gamble outcome, is valued closer to its objective value ($1,000), the sure gain is worth more overall than the gamble. In other words, in the gamble, a fraction (0.5) of a smaller number (a discounted $2,000) is a smaller overall expected value compared with the expected value of the sure option, predicting risk aversion. Risk aversion means avoiding variability in outcomes: less variability is preferred over more variability (that is, the less variable sure gain option is preferred over the more variable riskier gamble above). Thus, the psychophysical theories of decision making suggest that risk aversion (or risk avoidance) falls out of the perception of numbers such as outcomes.
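The arithmetic behind this prediction can be made concrete with a short Python sketch. The sure option and the coin-flip gamble are taken from the example above; the square-root value function (a power function with exponent 0.5) is an assumption chosen purely for illustration, standing in for any function that bends as quantity increases, and is not the specific function of any theory cited here.

```python
def expected_value(option):
    """Probability-weighted sum of objective outcomes."""
    return sum(p * x for p, x in option)


def subjective_value(x, alpha=0.5):
    """Illustrative bending value function (assumed exponent alpha < 1)."""
    return x ** alpha


def expected_subjective_value(option):
    """Probability-weighted sum of subjectively valued outcomes."""
    return sum(p * subjective_value(x) for p, x in option)


# Each option is a list of (probability, outcome in dollars) pairs.
sure_gain = [(1.0, 1000)]
gamble = [(0.5, 2000), (0.5, 0)]

# Objectively the options are equal in expected value...
print(expected_value(sure_gain), expected_value(gamble))

# ...but the bent value function shrinks $2,000 proportionally more than
# $1,000, so the sure gain comes out ahead.
print(expected_subjective_value(sure_gain) > expected_subjective_value(gamble))
```

With a linear value function (alpha = 1) the two options would be valued equally, which is why the predicted risk aversion depends on the assumption of non-linearity.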

The role of psychophysical perception in predicting such risk aversion has been demonstrated in a series of experiments with ants, underscoring the primitive origins of some kind of approximate number system117. Based on the psychophysical Weber–Fechner law, experimenters manipulated the relative differences between risky and safe options to determine whether ants’ evaluation of resources, such as magnitudes of food, depended on logarithmic rather than linear differences117 (see also ref. 118, which identifies similarities and differences across species in risky choices). The authors found that ants evaluated resources using logarithmic mapping and that this created risk avoidance. Indeed, ants were extremely risk-averse, with 91% choosing the safe option, demonstrating that the ants’ choices violated strong rationality — to choose options based on expected value — as assumed in optimal foraging theory. Thus, foraging behaviour in ants exhibits both a keen sensitivity to quantities, such as outcome magnitudes and their probabilities, and irrational biases that resemble those observed in humans119. That is, both species are risk-averse for gains.

Risk according to prospect theory

Prospect theory further predicts that risk preferences shift towards riskier gambles when outcomes are framed as losses (they shift as compared with when outcomes are framed as gains) — even when the net outcomes are gains (Box 2). The psychophysics of outcome values as described in prospect theory are again sufficient to predict the shift towards risk-taking for losses (although the psychophysics of probabilities also supports this finding104,120,121). Opposite risk preferences for outcomes when framed as gains or losses, despite identical consequences, violate even weak rationality — it is irrational to have opposite risk preferences when consequences are identical — yet this effect of gain–loss ‘framing’ is among the most replicable effects in psychology122.

There are good reasons to believe that outcome values and probabilities obey non-linear functions that reflect perceptions of quantity123. Yet a substantial literature points to alternative explanations of framing effects (the shift from risk aversion for gains to risk-seeking for losses) and other classic paradoxes112,119,124,125,126,127,128,129 (see also refs. 24,73 for reviews of evidence and theories of framing effects and other classic paradoxes). The question is not whether the psychophysics of numbers influences gist representations of numbers (which it does) but whether psychophysics, as contrasted with gist representations, explains framing effects; as we discuss, psychophysical explanations are disconfirmed whereas the gist representational explanation is confirmed. Clearly, translating an objective number to a subjective value using a power function or a logarithmic function (simply as a description of magnitude perception) does not specify the gist interpretation of that number. Framing effects and their variations provide one demonstration of these distinctions (for other demonstrations, see refs. 21,130,131).

As noted, framing effects include risk aversion for gains. A major argument against the psychophysical explanations of risk aversion for gains is that risk aversion in classic paradoxes can be made to appear and disappear by focusing processing on the gist of decision options without changing the numbers (Box 2). Fuzzy-trace theory provides an alternative explanation of risk aversion for gains (and for risk-seeking for losses) that hinges on the categorical qualitative difference between zero and non-zero outcomes.

Risk according to fuzzy-trace theory

Fuzzy-trace theory is an account of numerical gist that is extracted from objective numbers, in parallel with verbatim representations. Foundational evidence for fuzzy-trace theory encompasses memory for numbers, probability judgements, magnitude estimation, multiplicative processes, transitive inference and mental arithmetic132. Fuzzy-trace theory predicts risk aversion for gains by distinguishing between mental representations of gist — fuzzy but meaningful representations with content that distil the essence of information — and verbatim representations — also symbolic representations but capturing the literal surface form (for example, capturing exact words and exact numbers as presented) (Box 3). Objective numbers are important inputs to both gist and verbatim representations. Thus, we are referring to how numbers are mentally represented and how they are thought about and not to the physical inputs when we argue, as we do below, that numbers should not be reified or should not be processed literally.

Fuzzy-trace theory builds on prospect theory but differs from it in critical tests (experiments that pit alternative predictions of fuzzy-trace theory and prospect theory against one another to determine which theory is consistent with results or is ruled out). For example, using common consumer financial decisions, research showed that the likelihood of choosing a certain reward over a risky or uncertain reward with a greater expected value was affected by manipulating gist processing of choice options, as opposed to the problem’s verbatim details133. Five especially rigorous and painstaking experiments testing risk aversion for gains revealed that focusing an individual’s attention on the gist of choice options accentuated the preference for certainty and, conversely, focusing an individual’s attention on the details of the choice options attenuated the preference for certainty133. For instance, individuals were randomly assigned to two instructional conditions — to make an intuitive decision or to elaborate details about reasons for their decisions — and had to choose between a sure option versus a numerically superior risky option (that is, options of unequal certainty and expected value). Individuals in both conditions also had to rate the degree to which their decision strategies were qualitative (‘I saw it more as a choice between a prize for sure and an uncertain prize’) and whether their strategies focused on numbers. Individuals assigned to the gist condition of making an intuitive decision chose the sure option more133. Furthermore, individuals who preferred the sure option when making a decision were more likely to indicate that they used a qualitative gist strategy to make their decision, such as the strategy quoted above, which mediated the effect of instructional condition on choice preference. 
Moreover, those who received instructions to focus on details and who reported less qualitative gist thinking were less likely to favour the sure option, shifting towards the quantitatively superior (higher expected value) risky option.

Mental representations of qualitative gist — such as winning some money for sure versus possibly winning some money or winning nothing — can account for risk aversion for gains, risk-seeking for losses and variations on these gain and loss framing effects, and under stringently controlled conditions exact numbers are neither necessary nor sufficient to observe gain and loss framing effects24,73,119,125,129. As examples, the numbers can be removed entirely and framing effects are still obtained as long as the gist is conveyed and, in other experiments, the numbers can be emphasized but the effects disappear when the gist is removed. To illustrate, in head-to-head critical tests, such as truncation experiments (Box 2) with all the verbatim numbers that should elicit gain and loss framing effects according to psychophysical theories (such as prospect theory) still present in the problems, framing effects were eliminated as per the predictions of fuzzy-trace theory because gist differences between options were removed. According to fuzzy-trace theory, eliminating the zero part of the risky options — the truncation of zero — eliminated the categorical gist that distinguished options, which was the cause of framing effects. Fuzzy-trace theory predicts that individuals extract both verbatim and gist representations but emphasize the simplest gist (categorical level: some quantity or no quantity) in their judgements and decisions. For example, gaining some money for sure is preferred to possibly gaining some money or gaining nothing (because gaining some quantity of money is better than gaining no quantity); the same some–none categorical level of gist produces risk-seeking for losses. Furthermore, in expected-value equal framing problems, verbatim representations — which include exact numbers and rote computations performed on those numbers — yield indifference between options whereas gist representations yield the framing effect. 
Verbatim representations include exact numbers and the computations that research shows are performed automatically on those numbers; something akin to expected value is computed automatically from early childhood110. Removing gist differences by the truncation of zero produces reliance on the already encoded verbatim representations whose effects are masked when categorical gist differences between options are present. Thus, the truncation of the zero option, which amounts to subtracting literally nothing in other theories, produces indifference between options and eliminates framing effects: neither risk aversion for gains nor risk-seeking for losses is produced.

Conversely, accentuating the gist augmented framing effects, again per predictions of fuzzy-trace theory. That is, truncating the non-zero part of the gamble while leaving the zero part produces starker categorical gist contrasts between options: gaining some money compared with gaining none and losing some money compared with losing none, resulting in larger preferences for the sure gain and for the risky loss (enhanced framing effects). These preferences turn on the simplest gist representation of quantity, its categorical presence or absence (zero). Thus, in many qualitative gist representations in judgement and decision making, these and other results point to the crucial nature of ‘zero’ (Box 2) despite the fact that zero literally contributes nothing quantitatively17,134.

Effects in truncation experiments that focused processing on gist or verbatim representations were not due to ambiguity because all of the information was presented in all of the conditions, as further demonstrated by ambiguity tests given to participants, and these truncation effects have been replicated for large numbers of decision problems in different experiments125,127,128,129. Using different truncations, framing effects were made to appear and disappear for the same people and for problems with identical (non-zero) quantities, ruling out psychophysical and inter-individual differences as explanations for the effects. These effects of truncation on framing effects have been extended from choices to ratings of each decision option in both within-subjects and between-subjects designs119 (complementing the results of refs. 135,136).

Although results from truncation experiments confirm fuzzy-trace theory’s predictions and disconfirm psychophysical explanations assumed in prospect theory for risk aversion for gains (or for framing effects), they do not imply that psychophysical representations of number do not exist. On the contrary, there is good evidence that psychophysical representations of number are extracted, and they undergird similarity judgements that influence decisions137,138. However, they are distinct from categorical gist representations of quantity as shown for judgements of ‘approximately equal’131 (see ref. 139 for an independent corroboration). Crucially, combining gist representations with literal verbatim representations can mimic the standard negatively accelerated psychophysical functions (see discussion of psychophysical numbing in ref. 17 and explicit quantitative models of this process in refs. 73,140).

Risk, numeracy and decision making

When separate measures of numeracy and psychophysics (such as approximate number sense) — and, in some studies, measures of gist — are included as predictors of judgements and decision biases, each accounts for variance in responses45,111,112,113,141. Although numeracy has been identified with objective processing of quantities (objective numeracy tests have objectively right and wrong numerical answers) and related to reduced levels of psychophysical distortion and cognitive biases, these connections can be tenuous or even reversed142 (see refs. 1,23 for extensive discussion). Higher numeracy has been associated with rating numerically inferior gambles as more attractive than superior gambles; more numerate individuals, including professional accountants, were more prone to numerical comparisons that produced judgement biases143. In other problems, more numerate people emphasized harder-to-compute ratios rather than differences in numbers of lives saved for different charities, thereby choosing programmes that saved many thousand fewer lives144. Although competing gist representations about which option is affectively superior (good or bad) also support choosing the wrong answer, it is the relationship to numeracy and its promotion of the wrong answer through mechanically computing ratios that is the central consideration here. Individuals who focus on computing ratios (a harder computation than simply noting that more lives are saved in one of the options) miss the forest for the trees — they miss the bottom-line gist that the point is to save lives, and computing ratios is beside the point in this problem. This kind of thinking focuses on surface details rather than simpler, deeper meaning.

Note that gist thinking does not depend on ‘verbatim understanding’ because gist representations are not derived from verbatim representations (gist representations such as which is more or which is most are encoded independently from encoding of verbatim numbers); and, also, there is no such thing as verbatim understanding because verbatim means without meaning or understanding by definition112,145. Problem solving can require multiple levels of precision, including performing exact computations, but gist guides the selection and deployment of computations, and exact answers are not usually what is relied on in judgements and decisions. It is the meaningful interpretation of that number in context — the gist — that matters. As examples, a numerically small prevalence rate of an epidemic infection can be a huge risk and doses of a poison can be expressed in numbers that are psychophysically similar but boil down to different categorical gists of lethal versus non-lethal doses. Objective and precise representations of numbers by themselves (prevalence rate, dose) do not deliver insight into the bottom-line qualitative gist of the numerical information (huge risk, lethal dose). In fact, literal thinking about numbers can be misleading26 (Box 1). Thus, the aforementioned results in which more numerate individuals who deploy more precise numerical processing do worse are instructive143,144. The results show that mechanically processing numbers without sufficient attention to meaning — literal (verbatim) thinking — that reifies numbers and number crunching can lead to inferior judgements and decisions (Box 1).

Indeed, verbatim representations of exact numbers or words hold less sway over memory, judgement and decision making as human development progresses from childhood to adulthood or from novice to expert. Not only can numerical problems sometimes be solved non-numerically (Box 4) but also the tendency to rely on qualitative gist, as opposed to verbatim representations of number, increases with age and experience alongside advancing computational abilities129,146,147,148. Under theoretically predicted conditions, increasing emphasis on verbatim representations of literal numbers can drive up errors74,132,149. For example, putting numerical information out of sight — thereby reducing accurate verbatim memory for number — improves accuracy in class-inclusion problems (Box 4). In the class-inclusion problem asking ‘are there more roses or more flowers’, correct answers increase when a visual display with eight roses and two tulips is removed, which decreases memory for number132. Although questions in class-inclusion problems ask about which classes of objects are more numerous or more probable, they are more accurately answered by ignoring numerical information132. That is, the number of roses and tulips is irrelevant to the question of whether there are more roses or more flowers (Box 4).

Similarly, the individual numerical risks of contracting COVID-19 or HIV/AIDS from a single encounter are low, but these accurate numbers can be misleading; the gist of these risks is arguably high119. Therefore, it is not that people necessarily ignore numerical risks, as researchers frequently assume, but rather that they temper their appreciation of the level of objective risk with qualitative considerations about what those numbers mean in context. For example, although ‘it only takes once’ is a categorical representation of risk that violates traditional approaches to probability training150, providing training that included that representation, along with numerical probabilities for both treatment and control groups, produced long-lasting changes in risk attitudes, intentions and self-reported behaviours, compared with the control group151.

Note that training was not ‘numerical’ in the sense of being solely about numbers but, instead, was about both presenting numbers (verbatim risks and probabilities) and also educating individuals about how best to understand the categorical gist — the simple bottom line — of those numbers151. Understanding information is a process through which information is always mentally represented; there is no such thing as understanding without a mental representation. Multiple gist representations of numbers are formed in the minds of individuals as part of the process of understanding numbers, in parallel with forming a literal verbatim representation which is not part of understanding (as verbatim is without meaning by definition). An individual can memorize a number verbatim (without meaning) and perform rote (without meaning) calculations on that number (0.01 = 1% = 1/100), then take that number to be the answer (literally) to such questions as ‘what is the risk of unprotected sex’ because someone said that number was the probability of HIV infection. Literal thinking promotes risk-taking in this example because the number is objectively small but this literal thinking misses the point that public health experts make, namely, that the risk is substantial and thus protective measures are warranted152.
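One way to see why a literally small number can carry a gist of substantial risk is to compound it over repeated exposures. The sketch below is an illustration only and is not drawn from the studies cited: it uses the 0.01 figure from the example above and assumes, for simplicity, that encounters are independent and carry the same probability each time.

```python
def cumulative_risk(p_single, n_encounters):
    """Probability of at least one bad outcome across n independent
    encounters, each with the same single-encounter probability."""
    return 1 - (1 - p_single) ** n_encounters


# Illustrative single-encounter probability from the example above (0.01 = 1%).
for n in (1, 10, 100):
    print(n, round(cumulative_risk(0.01, n), 3))
```

Under these assumptions the risk after 100 encounters is roughly 0.63, far from the 'objectively small' single number taken literally.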

The distinction between verbatim (literal) and gist mental representations of numbers and associated processing has implications for long-term retention in memory and transfer to additional contexts. Gist is not only encoded into working memory as problems are solved or decisions are made but also is the residue of numerical information that is retained in memory long-term153. In addition, the fuzziness and simplicity of gist allow it to be more flexibly applied in real-world contexts that differ superficially from training, thereby facilitating transfer to new contexts post training17.

Moreover, numerical superiority is not the same thing as decision superiority, in the sense of promoting health, wealth and well-being154,155. For example, calculating the expected value of having unprotected sex can yield a numerical answer that favours unprotected sex (low objective risk of bad outcomes such as HIV but high rewards)151,156. Similarly, buying home insurance has lower expected value than not buying home insurance because premiums take into account risks and outcomes, and provide a profit beyond expected value to insurance companies; on average, homeowners will come out ahead financially if they do not buy insurance and risk catastrophic total loss. Research has shown that choosing the numerically superior option in these types of decisions, rather than choosing on the basis of categorical gist, is associated with decision ‘inferiority’ — bad outcomes for individuals in terms of health, wealth and well-being130,151,152,155,157. Therefore, training programmes that aim to improve life outcomes should distinguish verbatim from gist assessments of knowledge about risks and outcomes158,159.
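The home insurance comparison can be worked through numerically. All figures in this sketch (the premium, the annual probability of total loss and the size of the loss) are hypothetical values chosen for illustration, and the variable names are likewise assumptions.

```python
# Hypothetical figures for one year of home ownership.
premium = 500.0            # assumed annual insurance premium
p_total_loss = 0.001       # assumed annual probability of total loss
loss = 300_000.0           # assumed cost of a total loss

# Expected annual cost of each option (lower is numerically better).
expected_cost_insured = premium
expected_cost_uninsured = p_total_loss * loss

# Worst case the homeowner can face under each option.
worst_case_insured = premium
worst_case_uninsured = loss

print(expected_cost_insured, expected_cost_uninsured)
print(worst_case_insured, worst_case_uninsured)
```

With these assumed numbers, going uninsured has the lower expected cost (about 300 versus 500 per year), yet it is the only option that leaves the catastrophic loss on the table, which is the distinction between numerical superiority and decision superiority drawn above.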

Improving numeracy

Most training programmes to improve numeracy per se have not built on many of the key research findings that we have sketched thus far. In this section, we summarize these numeracy training programmes with the goal of providing a foundation for future training research that takes these findings into account. Most numeracy training programmes contrast with risk-communication programmes or decision-training programmes; the latter have drawn on these key research findings as briefly discussed in this section17,130,151,160. Building on prior sections, we next discuss how future research that focuses on numeracy training ought to distinguish the following goals: instilling purely mechanical skills without understanding (literal verbatim thinking, which is not useless but is different from gist); helping learners get the bottom-line gist of numbers in context; and encouraging confidence that is calibrated to objective numerical skills (ideally, that subjective numeracy would be high because objective numeracy was high).

Note that, in our approach (as described in Fig. 1), subjective numeracy is a type of metacognition (it is a self-assessment of numerical ability and preference, what individuals think about their thinking) that usually indirectly reflects numeracy (numerical ability) because ability creates bits of evidence that people use to self-assess their abilities and preferences. However, there are many other sources of metacognitive self-assessments of numeracy that do not reflect objective ability; for example, subjective numeracy is likely to be biased by ethnic and gender stereotypes about which types of people are good at mathematics that individuals apply to themselves. These extraneous influences on subjective numeracy are why we emphasize fostering a veridical link between subjective numeracy and objective skills (calibration) as part of training.

In addition, other metacognitions (beyond subjective numeracy), such as reflecting about numerical processing, must be taught beyond teaching numerical skills because increased cognition (skills) does not necessarily translate into increased metacognition (reflecting about numerical processing) (Fig. 1). Moreover, the innate intuitive appreciation of number reflected in the approximate number system should be built on as an asset in training with less emphasis on its deviations from literal linearity. Highly sophisticated numerical inferences, such as conditional probabilities, can be derived using that approximate number system161. Finally, we describe how connections between numbers and gist intuitions about those numbers should be inculcated to improve the application of numerical information to judgements, decisions and behaviours in the real world.

Training risk, probability and magnitude

Effective training programmes applying the concept of gist have been developed to help patients understand the risks and probabilities of diseases151,162 (see also ref. 130 for a review of such programmes). Web-based interactive tutorials have been developed to communicate probabilities of cancer given genetic risks160 and to train more domain-general quantitative reasoning163. Training programmes have also been effective in building on approximate number sense, especially with children (Box 5). For instance, teaching mental number line skills with linear board games transfers to mathematical tasks; such games might include rolling dice and then moving a specific number of steps (counting steps based on the outcome of the dice throw) along a path (line) in the board game78,164,165,166 (see discussions of related concepts in refs. 167,168,169,170,171).

In addition, although hypotheses that frequency formats — which use counts of discrete objects rather than ‘normed’ quantities such as probabilities, proportions or percentages — improve accuracy have not been borne out40,172, hypotheses about disentangling numerators from denominators have been borne out132 (Box 4). These effects have been demonstrated with and without frequencies, which do not bear on the effects. Instead, separating events into non-overlapping classes (such as using two-by-two tables of probabilities) reduces a host of biases that can be traced to denominator neglect rather than lack of understanding of probabilities173. Despite lacking formal knowledge of marks such as slashes and decimal points and exhibiting biases, individuals can manifest an intuitive appreciation of probability early in development and without formal education161,174. Consistent with fuzzy-trace theory, biases traced to part–whole inclusion confusion — base-rate neglect and fallacies involving combining probabilities — are reduced considerably by making classes of events discrete160,175,176. Training interventions adopting this approach have been used effectively in law, medicine and public health160,175,176. Thus, ‘visual aids’ that do not separate event classes as theoretically indicated are less effective in reducing part–whole inclusion kinds of errors132,177.
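A two-by-two table of probabilities of the kind mentioned above can be sketched as follows. The base rate, hit rate and false-positive rate are hypothetical values chosen for illustration, and the cell labels are assumptions; the point is only that each cell is a separate, non-overlapping class, so the denominator for a conditional probability is read off explicitly rather than neglected.

```python
# Hypothetical inputs for a diagnostic test.
base_rate = 0.01        # assumed probability of having the condition
hit_rate = 0.90         # assumed P(positive | condition)
false_positive = 0.10   # assumed P(positive | no condition)

# Four non-overlapping cells of the two-by-two table (joint probabilities).
table = {
    ("condition", "positive"): base_rate * hit_rate,
    ("condition", "negative"): base_rate * (1 - hit_rate),
    ("no condition", "positive"): (1 - base_rate) * false_positive,
    ("no condition", "negative"): (1 - base_rate) * (1 - false_positive),
}

# The denominator is the whole 'positive' column, not just its top cell.
positive_total = table[("condition", "positive")] + table[("no condition", "positive")]
p_condition_given_positive = table[("condition", "positive")] / positive_total

print(round(p_condition_given_positive, 3))
```

With these assumed inputs the probability of the condition given a positive result is about 0.083, much lower than the 0.90 hit rate that is often mistaken for it when the base rate is neglected.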

Training numeracy

Objective numeracy correlates with differences in target abilities (the abilities that training is aimed at improving, such as mathematical skills or financial skills), even quantitative target abilities such as estimating how many questions were answered correctly on an examination, but this does not show that numeracy causes differences in target abilities. That is, objective numeracy might correlate with target abilities because people higher in numeracy happened to be wealthier or higher in intelligence overall (raising all scores due to a third variable), not because numeracy caused these outcomes (correlation is not causation). To establish causal connections, randomized control training experiments are required to determine whether subjects who received numeracy training have improved target abilities compared with untrained control subjects. Further, assessing improvement in such numeracy training experiments requires a theoretical understanding of three classical criteria of effectiveness: whether training effects are durable, transfer to untrained abilities and are large enough to be of practical importance. A curious feature of the numeracy literature is that there are few training experiments, as compared with many studies demonstrating that numeracy is a reliable predictor of many forms of decision making, judgement, problem solving and reasoning177. Indeed, only one corner of the literature, financial decision making, contains enough experiments to support some preliminary conclusions about causal connections between numeracy training and improvements in target abilities178. Depending on the stringency of selection criteria, roughly 20 randomized control experiments have been published.

As a group, these training experiments have three notable features: the subjects sampled, the nature of the training interventions and the presence of transfer tests. The subjects were overwhelmingly samples of convenience. Specifically, there were some target groups such as high school students or low-income people in financial distress that were already scheduled to receive financial numeracy training, such as a unit in an economics course or government-mandated financial education179. Regarding interventions, the focus was not on theory-driven numeracy training but, rather, on providing facts and tools thought to be crucial for individuals to manage their personal finances. The training interventions fell into two main categories: learning financial facts (credit, interest, investment, loans, planning and savings); and learning and practising numerical computations (credit scores, annual interest on loans and savings, and investment growth)180. The transfer tests were personal financial behaviours that the experimenters aimed to increase, which were measured over time in trained and control subjects. The outcomes assessed included savings, planning for retirement, stock ownership and investments, management of cash flow, absence of debt, contributing to retirement plans and financial inertia (paying unnecessary fees, passive acceptance of default options).

The general picture that has emerged from financial numeracy training experiments has four broad elements. First, these training programmes have been successful in the simplest sense because they produced substantial improvements in the target abilities that were taught, as periodic assessment revealed that trained subjects displayed superior knowledge of financial facts and superior accuracy in numerical computations relative to controls181 (for reviews of financial training programmes, see refs. 1,177). Second, training produced transfer to real-world behaviour. Trained subjects exhibited higher levels of subsequent savings, of planning for retirement, of investments and of management of cash flow relative to control subjects179,182,183,184. Third, training had durability, as trained subjects displayed better transfer performance weeks and months after training compared with control subjects185. Fourth, despite these positive outcomes, financial numeracy training had only small effects on transfer and limited durability177,178. For instance, effect sizes indicated that differences between trained and control subjects on the various transfer measures averaged 1–2% (refs. 177,178). Moreover, durability differences between trained and control subjects were detected weeks and months after training, but these differences were undetectable after 6 months.

Two straightforward conclusions emerge from randomized control financial numeracy training experiments. On the one hand, there is overwhelming evidence that such training produces reliable learning that transfers to personal financial behaviours and is stable over short periods of time. On the other hand, the practical goals of financial numeracy training are to produce large changes in personal financial behaviour that last for years (or perhaps decades in the case of saving for retirement). To achieve these types of practical goals, training programmes must somehow be improved, but the question remains how to best accomplish this.

More broadly, some hints for how to move forward are provided by correlational studies finding that level of schooling was associated with numeracy and numeracy, in turn, was associated with wealth, suggesting that protracted education might have long-term effects on life outcomes186,187 (see also refs. 188,189). Neuroimaging, event-related potentials and other neuroscience studies are also instructive but similarly correlational190,191,192. However, theories that identify underlying mental processes can be used to create experimental manipulations that mimic effects of individual and developmental differences to test causal links64,129. Although general brain training programmes are not necessarily effective193, a randomized control approximate arithmetic training experiment improved the consistency of risk judgements in trained versus control subjects, the first such causal experiment194 (but see ref. 195, which failed to observe a causal link between approximate training and symbolic arithmetic).

A framework for training

Training numeracy is often assumed to require extensive practice over a long period of time72. However, brief numeracy training can instil statistical concepts, such as the law of large numbers, that endure after a delay, and that transfer to new instances of the statistical concept not directly trained and to reasoning about everyday life196,197 (see also ref. 198, which reviews literature beyond statistical concepts, including whole numbers, operations, word problems, fractions and algebra). The most effective approach to numeracy training seems to be to combine instruction about general principles with specific examples. For example, the broad general numeracy principle that larger samples are more likely to capture population statistics than smaller samples was explained verbally and this general principle was demonstrated with balls in urns (namely, abstractly) and with specific examples (beyond urns). Importantly, the training with balls in urns was not about balls and urns specifically; instead, the balls and urns were explained as representing any elements in any sets. That numeracy training then transferred to instances of the general principle that were not trained, such as problems about slot machines, sports and social inferences199. Thus, what is learned from effective numeracy training is not completely abstract (that is, content-free rules or structures), because concrete examples helped, but neither is it completely concrete (dependent on surface features explicitly presented in training)200.
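The principle that larger samples are more likely to capture population statistics can be illustrated with a short calculation. The sketch below is only an illustration and is not taken from the training studies cited above: the urn composition (60% red balls), the tolerance (0.10), the sample sizes and the function name prob_sample_close are all assumptions chosen for the example, and draws are assumed to be made with replacement.

```python
from fractions import Fraction
from math import comb


def prob_sample_close(n, red_share=Fraction(3, 5), tolerance=Fraction(1, 10)):
    """Exact probability that the share of red balls in n draws (with
    replacement) falls within `tolerance` of the urn's true share.

    The default urn (60% red) and tolerance (0.10) are illustrative
    assumptions, not values from the cited studies.
    """
    p = float(red_share)
    total = 0.0
    for k in range(n + 1):
        # Fractions keep the boundary comparison exact (no float rounding).
        if abs(Fraction(k, n) - red_share) <= tolerance:
            # Binomial probability of drawing exactly k red balls.
            total += comb(n, k) * p**k * (1 - p) ** (n - k)
    return total


# Hypothetical sample sizes, small to large.
SAMPLE_SIZES = (10, 50, 250)
RESULTS = {n: prob_sample_close(n) for n in SAMPLE_SIZES}

if __name__ == "__main__":
    for n, prob in RESULTS.items():
        print(f"n = {n:>3}: P(sample share within 0.10 of true share) = {prob:.3f}")
```

Under these assumptions, the probability that a sample resembles the urn rises with sample size. Nothing in the calculation depends on the elements being balls, which is the sense in which the urn stands for any elements in any sets.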

The construct of gist in fuzzy-trace theory occupies this intermediate territory between content-free rules or structures abstracted from experience and content-specific concrete examples of experience (for reviews about definitions and examples of gist, see refs. 24,26,158). Gist representations of experience are general but they have content (they are not abstract structures) and they contrast with literal (verbatim) representations of reality. Gist also differs in important ways from abstract schemas (generalized rules derived from past experience; see ref. 24). Although conceptual content has been mentioned in learning approaches, it is often lumped in with abstract rule learning (as contrasted with concrete exemplar learning201), neglecting the rich literature identifying special properties of gist memories that capture essential meaning.

Combining these training approaches and material covered in earlier sections on intuitive processing provides a plausible framework for research on numeracy training (Fig. 3). Individuals are born with an innate and intuitive approximate number system that facilitates the acquisition of rudimentary number skills if individuals are exposed to some formal schooling or relevant experience (for a review, see ref. 202). Training can then most effectively build on that numeracy foundation by targeting gist representations and processing. More general forms of training have been successful (for example, teaching the law of large numbers196,197; see also ref. 173 for training conjunctive and disjunctive probability), as have more specific forms of training (for example, teaching cumulative probability as it applies to how a small probability of pregnancy for a single instance of unprotected sex rapidly approaches certainty with repeated instances151; see also ref. 160 for training conditional probability as it applies to breast cancer and genetic mutation) (Fig. 3). Some studies have trained individuals to extract gist in general (for example, understanding the gist of whole narratives with varying content203) but most studies highlight gist to improve an individual’s understanding of presented information about a specific domain204. For example, one study assessed the effect of a gist-based intervention on 27 psychosocial and self-reported behavioural outcomes related to adolescents’ sexual behaviour, such as age of initiation151. Training the gist had reliable effects that were of sufficient magnitude to matter practically, and some effects endured over 6 and 12 months in follow-up assessments12 (see ref. 130 for a review of gist training in health and medical decision making that identified 94 studies). 
Gist training is likely to transfer to examples that are conceptually similar to the specific trained examples but it is possible for an even more general principle to be induced in some learners (of cumulative probability, for example) (Fig. 3). Gist training also usually transfers to more specific examples of the concepts that were taught; successful training — getting the gist — means that learners understand underlying meaning and thus can deduce that specific examples instantiate the taught gist205. However, verbatim training (rote memorization) with highly concrete content is difficult to transfer to new instances. In addition, most environments rarely offer cues to completely abstract learning and they rarely offer cues to exact numbers and words that were learned (literal copies of concrete examples that were taught). Therefore, environmental cues are most likely to remind learners of gist stored at the time of learning, scaffolding their ability to solve problems that are not literal copies of what they learned.
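The cumulative-probability example mentioned above (a small single-instance probability approaching certainty with repetition) follows from the standard formula 1 - (1 - p)^n for independent events. The sketch below is a minimal illustration under stated assumptions: the per-instance probability of 0.05, the assumption that instances are independent and identically likely, and the function name cumulative_probability are all hypothetical and are not values or methods from the cited studies.

```python
def cumulative_probability(p_single, n_repeats):
    """Probability of at least one occurrence across n independent repeats.

    Uses 1 - (1 - p)^n, which assumes every repeat has the same
    probability and that repeats are independent (a simplification).
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if n_repeats < 0:
        raise ValueError("n_repeats must be non-negative")
    return 1.0 - (1.0 - p_single) ** n_repeats


# Hypothetical per-instance probability, chosen only for illustration.
P_SINGLE = 0.05

if __name__ == "__main__":
    for n in (1, 10, 50, 100):
        risk = cumulative_probability(P_SINGLE, n)
        print(f"{n:>3} repeats: cumulative probability = {risk:.3f}")
```

The verbatim quantity (0.05) stays the same on every line of output, whereas the gist shifts from "unlikely" for a single instance to "nearly certain" after many repeats.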

Fig. 3: A framework for training and transfer of numeracy.

The flow of this figure is that individuals bring the innate approximate number system with them, after which numerical cognition is enriched by schooling and experience and may also be enriched by formal numeracy training. That training can range from completely concrete to completely abstract, with the optimum level being one that is intermediate between these extremes that conveys the meaning of numerical information. That intermediate level maximizes the critical goal of transfer of training which involves being able to solve problems accurately that were not directly taught. Note that training involving purely abstract rules is not ideal for transfer because the lack of content in training fails to convey the exceptions to rules when they are applied in context (as with moral development, rigid rule following is not the most advanced form of cognition). Thus, building on the approximate number system and rudimentary skills, training is predicted to be most effective when it conveys the gist of numbers and numerical principles — neither content-free nor limited to literal thinking about concrete examples. Such training can still vary in scope from more domain-general to more domain-specific. For instance, more domain-general training in the law of large numbers196,197 and conjunctive or disjunctive probability173 or more domain-specific training in cumulative probability of pregnancy151 and conditional probability of breast cancer given genetic mutation160 all conveyed gist representations (and processing) resulting in effective training. Transfer also depends on the retrieval cues that people receive in the later transfer environment: completely general, gist or specific verbatim. ‘Like cues like’ means that the specificity of the cue determines which kinds of memories (general cues bring general memories to mind, gist cues bring gist memories to mind and verbatim cues bring verbatim memories to mind) are remembered from training and, thus, transferred. 
Therefore, real-world judgements, decisions and behaviours depend on the prepared mind of individuals (approximate number system and rudimentary skills), the representations encoded during training (abstract, gist and verbatim) and the cues provided in the transfer environment (abstract, gist and verbatim). However, gist representations are generally of greater utility because they are easier to learn than completely abstract rules, endure longer than verbatim representations and are more applicable to a wider array of situations than either completely abstract or completely concrete learning24,74.

In this framework, it is important for training to organize numerical information meaningfully and to distil information to its gist, especially when a specific training domain is unfamiliar to learners. For example, merely knowing how to convert a 0.05 probability to a 5% chance because of a memorized rule is not enough; that quantity might be a minuscule amount in one context (such as the probability of rain) and might be a huge amount in another context (such as the base rate of infection of a novel and deadly virus). Knowing where 0.05 is placed on a number line spanning 0 to 1.0 is not irrelevant, but it does not convey the gist of 0.05 in context needed to make personal, professional and public policy decisions. Focusing on the literal magnitude can be misleading in common contexts in which small probabilities signal large risks (and vice versa), a dissociation that fuzzy-trace theory predicts151. Thus, training content-free general mathematical skills is not sufficient for improving decision making involving numbers, but neither should training be mired in superficial details of specific problems (the degree of content specificity could apply within or across topic domains). Training the gist means finding a sweet spot between these two extremes (Fig. 3), to provide learners with substantive content that captures general principles, including insight into exceptions, as all general principles have contexts in which the rules make no sense to apply and need to be discarded.

The effectiveness of training depends on the information that was presented, but, ultimately, training success rests upon the gist that the learner derives from what is presented. For best results from training, learning is not passively accepting pre-digested gist but, rather, the learner gets insight from understanding why the facts have been organized and interpreted as they have been; what is learned must be meaningful to be a gist representation (by definition). Ideally, the gist in information — the intended message — and the gist of information — the learner’s mental representation — align16. Once learning has been acquired, environmental cues can later trigger retrieval of gist and verbatim representations stored at the time of training independently (Fig. 3). Thus, successful training depends on what learners store in memory (based on the training effectiveness), the form of the stored representations (verbatim and gist) and the cues that are present in the environment that can trigger subsequent retrieval of what was stored.

Summary and future directions

Numeracy predicts an impressive array of life outcomes in such critical areas as health (perceptions of risk, quality of health decisions, patient outcomes), finance (employment, retirement savings, wealth) and law (sentencing decisions, civil damage awards). Objective numeracy measures focus on individuals’ facility with basic calculation, especially ratios and proportions, and subjective numeracy measures focus on their personal perceptions of numerical competence and confidence. Measures of objective numeracy have revealed declines in numeracy in Western industrialized countries in recent years. Measures of subjective numeracy are convenient proxies for objective numeracy because they correlate moderately with objective measures and are easier to administer. Although subjective numeracy draws on evidence of objective numeracy (ability), it is essentially metacognitive (it involves thinking about thinking, including assessing preferred styles of thinking) and thus is subject to biases in self-assessment. Thus, objective numeracy and subjective numeracy are correlated measures but they are distinct abilities. The approximate number system is an innate intuitive system for processing number that lays the groundwork for developing objective numeracy. However, an intuitive appreciation of magnitude derived from the approximate system shared with infants and animals is not the same thing as an intuitive appreciation of the gist of mathematical content derived from meaningfully interpreted experience. Although research on the approximate number system has focused on acuity (minimizing deviations from objective quantities), numerical problems can often be solved with fuzzy qualitative gist. Paradoxically, those gist-based solutions can be more accurate and they become more common with increases in education and experience. Despite recent declines in numeracy, training programmes that yield durable and generalizable improvements have been developed.

Some core questions remain to be resolved, of which we note prominent theoretical and practical examples. The first theoretical question to resolve is how to best map the relations between metacognitive abilities and objective numeracy (Fig. 1). In particular, researchers need to better understand why performance often falls short of competence without eliciting corrective recognition (reflection) on the part of the reasoner. The second theoretical issue is to better understand the relations between intuitive representations of number (gist and the approximate number system) and decision making. In order to achieve this, theory-motivated process analyses of decision-making tasks are required. For example, tasks that require exact numerical responses, such as certainty equivalents206, might draw upon mental representations at different levels of granularity than dot discrimination tasks207. The challenge will be to move beyond emphasizing literal thinking, including acuity and linearity, to harness the strengths of human intuition: both the approximate number system, which provides impressions of numerosity and magnitude, and gist representations that capture the bottom-line meaning of numbers in context. Verbatim representations of numbers do not capture meaning and conflating ‘numerical’ with ‘verbatim’ is a mistake. The practical example is current numeracy training, specifically the fact that durable and generalizable numeracy improvements produced by training methods are not very large or very long-lasting. Considering the importance of reversing declines in numeracy, we close with some observations about how to devise more powerful training regimens.

Based on fuzzy-trace theory’s training research in domains that involve processing numerical information130, a promising approach is to enrich current training on numeracy with a focus on the gist of numbers and numerical principles. Thus, training methods should aim for deep learning that transfers beyond the literal examples that were explicitly taught (Fig. 3). Deep learning is not achieved simply through trial-and-error practice with concrete learning, by extracting surface form from training sets as machines do208. Machine learning algorithms do not understand what they have learned; they simulate deep learning but do not achieve it. When machines appear to transfer their so-called deep learning, they nevertheless remain constrained by the training set used to train them. Human intelligence is strong where machines are weak because it takes meaning and context into account, not just the literal data. When humans are at their best cognitively, they engage in the transfer of deep learning needed to meet ever-changing (and unpredictable) demands from the environment. A key challenge will be how to best train individuals to exploit human strengths: to train humans to extract gist information to create substantial and long-lasting improvements to their numeracy abilities that transfer across contexts.

A recipe for deciding what the gists of a content domain are when designing training is provided by research on complicated medication regimes in arthritis162 (see also Table 1 in ref. 155). That recipe is aimed at generating the bottom-line, categorical pivot points in decisions, as follows: gather together experts and experienced stakeholders, then ask them what are the categorically (for example, incurable disease, financial ruin, inescapable debt) and ordinally important gists that need to be conveyed (and why), and ask them what few details must be memorized or communicated by rote. Note that we do not advocate merely presenting the gist that experts understand without helping learners (who are not experts) understand that gist and, moreover, learn to extrapolate beyond what was taught to transfer their learning to other contexts. Deeply understanding what the facts mean and how to distil what is important from those facts is an initial step informed by expertise and experience in designing training for others who lack that expertise and experience. Thus, our message is not to avoid presenting numbers or numerical representations such as graphs. To the contrary, we advocate presenting numbers and numerical representations but we encourage avoiding confusing objectivity and precision with accurate meaning and usefulness. However, knowledgeable individuals need to digest the information to ascertain what the gist of that information is so that it can be conveyed properly to others. For example, the design of tables or graphs should depend on what the gist of the information is: the bottom-line meaning should shape the graph, not the other way around. Arbitrary rules about objectively graphing data that obscure important differences or emphasize trivial differences should be rejected. Putting the qualitative gist first and the graphs and numbers second, in service of the gist, changes the message and how it is conveyed.
The goal is no longer to simply rid judgements and decisions of non-linearity and biases, but to connect the meaning of messages to the values represented by numbers209.