Abstract
Metamemory, or the ability to understand the capacities of one’s own memory, is important for learning. To investigate questions surrounding metamemory, researchers commonly have participants make judgments of learning (JOLs) at encoding, in which participants rate their likelihood of recalling the target in a cue–target word pair when shown only the cue at test. However, the associative direction of cue–target pairs can affect the calibration of JOLs. Unlike forward associates (e.g., credit–card), in which JOLs often accurately predict recall, an illusion of competence has been reported for backward associates (e.g., card–credit), symmetrical associates (e.g., salt–pepper), and unrelated cue–target pairs (e.g., artery–bronze) such that JOLs overestimate later recall. The present study evaluates whether the illusion of competence can be reduced when participants apply deep item-specific or relational encoding tasks relative to silent reading. Across two experiments, we show that both item-specific and relational encoding strategies reduce the illusion of competence for backward associates and unrelated pairs while improving the calibration between JOLs and recall. Our findings suggest that these encoding strategies are effective at reducing the illusion of competence, with increased calibration primarily reflecting improved recall. Thus, item-specific and relational encoding strategies primarily affect retrieval processes rather than metacognitive processes that participants engage in at encoding.
Data availability
Study materials and analyzed data are available via OSF (https://osf.io/x9n4f/). Supplemental Materials have been made available at https://osf.io/svzg8/. This study was completed as part of the Honors Thesis requirements for EEC. NPM is now at Midwestern State University.
Notes
JOL accuracy can also be assessed in terms of resolution or the relative accuracy between JOLs and recall (see Rhodes, 2016 for a comparison of calibration and resolution). However, in the present study, we focus on calibration, given that the illusion of competence has often been framed as miscalibration between JOLs and recall (e.g., Koriat & Bjork, 2005, 2006).
References
Arbuckle, T. Y., & Cuddy, L. L. (1969). Discrimination of item strength at time of presentation. Journal of Experimental Psychology, 81(1), 126–131.
Balota, D. A., Yap, M. J., Hutchison, K. A., Cortese, M. J., Kessler, B., Loftis, B., Neely, J. H., Nelson, D. L., Simpson, G. B., & Treiman, R. (2007). The English lexicon project. Behavior Research Methods, 39(3), 445–459.
Brysbaert, M., & New, B. (2009). Moving beyond Kučera and Francis: A critical evaluation of current word frequency norms and the introduction of a new and improved word frequency measure for American English. Behavior Research Methods, 41(4), 977–990.
Castel, A. D., McCabe, D. P., & Roediger, H. L. (2007). Illusions of competence and overestimation of associative memory for identical items: Evidence from judgments of learning. Psychonomic Bulletin and Review, 14(1), 107–111.
Craik, F. I. M. (2002). Levels of processing: Past, present … and future? Memory, 10(5–6), 305–318.
Craik, F. I. M., & Lockhart, R. S. (1972). Levels of processing: A framework for memory research. Journal of Verbal Learning and Verbal Behavior, 11(6), 671–684.
De Deyne, S., Navarro, D. J., Perfors, A., Brysbaert, M., & Storms, G. (2019). The “Small World of Words” English word association norms for over 12,000 cue words. Behavior Research Methods, 51(3), 987–1006.
Dunlosky, J., & Nelson, T. O. (1992). Importance of the kind of cue for judgments of learning (JOL) and the delayed-JOL effect. Memory and Cognition, 20(4), 374–380.
Dunlosky, J., & Nelson, T. O. (1994). Does the sensitivity of judgments of learning (JOLs) to the effects of various study activities depend on when the JOLs occur? Journal of Memory and Language, 33, 545–565.
Einstein, G. O., & Hunt, R. R. (1980). Levels of processing and organization: Additive effects of individual-item and relational processing. Journal of Experimental Psychology, 6(5), 588–598.
Faul, F., Erdfelder, E., Buchner, A., & Lang, A.-G. (2009). Statistical power analyses using G*Power 3.1: Tests for correlation and regression analyses. Behavior Research Methods, 41(4), 1149–1160.
Hanczakowski, M., Zawadzka, K., Pasek, T., & Higham, P. A. (2013). Calibration of metacognitive judgments: Insights from the underconfidence-with-practice effect. Journal of Memory and Language, 69(3), 429–444.
Huff, M. J., & Bodner, G. E. (2013). When does memory monitoring succeed versus fail? Comparing item-specific and relational encoding in the DRM paradigm. Journal of Experimental Psychology: Learning, Memory, and Cognition, 39(4), 1246–1256. https://doi.org/10.1037/a0031338.
Huff, M. J., & Bodner, G. E. (2014). All varieties of encoding variability are not created equal: Separating variable processing from variable tasks. Journal of Memory and Language, 73, 43–58.
Hunt, R. R., & Einstein, G. O. (1981). Relational and item-specific information in memory. Journal of Verbal Learning and Verbal Behavior, 20(5), 497–514.
Koriat, A. (1997). Monitoring one’s own knowledge during study: A cue-utilization approach to judgments of learning. Journal of Experimental Psychology, 126(4), 349–370.
Koriat, A., & Bjork, R. A. (2005). Illusions of competence in monitoring one’s knowledge during study. Journal of Experimental Psychology, 31(2), 187–194.
Koriat, A., & Bjork, R. A. (2006). Illusions of competence during study can be remedied by manipulations that enhance learners’ sensitivity to retrieval conditions at test. Memory & Cognition, 34, 927–959.
Koriat, A., & Ma’Ayan, H. (2005). The effects of encoding fluency and retrieval fluency on judgments of learning. Journal of Memory and Language, 52(4), 478–492.
Macleod, C. M., Gopie, N., Hourihan, K. L., Neary, K. R., & Ozubko, J. D. (2010). The production effect: Delineation of a phenomenon. Journal of Experimental Psychology, 36(3), 671–685.
Masson, M. E. (2011). A tutorial on a practical Bayesian alternative to null-hypothesis significance testing. Behavior Research Methods, 43(3), 679–690.
Maxwell, N. P., & Huff, M. J. (2021). The deceptive nature of associative word pairs: Effects of associative direction on judgments of learning. Psychological Research Psychologische Forschung, 85(4), 1757–1775.
Maxwell, N. P., & Huff, M. J. (2022). Reactivity from judgments of learning is not only due to memory forecasting: Evidence from associative memory and frequency judgments. Metacognition and Learning, 17, 589–625.
McCurdy, M. P., Sklenar, A. M., Frankenstein, A. N., & Leshikar, E. D. (2020). Fewer generation constraints increase the generation effect for item and source memory through enhanced relational processing. Memory, 28(5), 598–616.
Mueller, M. L., Dunlosky, J., & Tauber, S. K. (2016). The effect of identical word pairs on people’s metamemory judgments: What are the contributions of processing fluency and beliefs about memory? The Quarterly Journal of Experimental Psychology, 69(4), 781–799.
Mulligan, N. W. (2011). Generation disrupts memory for intrinsic context but not extrinsic context. The Quarterly Journal of Experimental Psychology, 64(8), 1543–1562.
Nairne, J. S., Thompson, S. R., & Pandeirada, J. N. (2007). Adaptive memory: Survival processing enhances retention. Journal of Experimental Psychology, 33(2), 263–273.
Nelson, D. L., McEvoy, C. L., & Dennis, S. (2000). What is free association and what does it measure? Memory and Cognition, 28(6), 887–899.
Nelson, D. L., McEvoy, C. L., & Schreiber, T. A. (2004). The University of South Florida free association, rhyme, and word fragment norms. Behavior Research Methods, Instruments, and Computers, 36(3), 402–407.
Nelson, T. O. (1984). A comparison of current measures of the accuracy of feeling-of-knowing predictions. Psychological Bulletin, 95(1), 109–133.
Nelson, T. O., & Dunlosky, J. (1991). When people’s judgments of learning (JOLs) are extremely accurate at predicting subsequent recall: The “delayed-JOL effect.” Psychological Science, 2, 267–270.
Nelson, T. O., & Narens, L. (1990). Metamemory: A theoretical framework and new findings. Psychology of Learning and Motivation, 26, 125–173.
Psychology Software Tools, Inc. [E-Prime 3.0]. (2016). Retrieved from https://www.pstnet.com
Rhodes, M. G. (2016). Judgments of learning: Methods, data, and theory. In J. Dunlosky & S. K. Tauber (Eds.), The Oxford handbook of metamemory (pp. 90–117). Oxford.
Rhodes, M. G., & Castel, A. D. (2008). Memory predictions are influenced by perceptual information: Evidence for metacognitive illusions. Journal of Experimental Psychology, 137(4), 615–625.
Rivers, M. L., Janes, J. L., & Dunlosky, J. (2021). Investigating memory reactivity with a within-participant manipulation of judgments of learning: Support for the cue-strengthening hypothesis. Memory, 29(10), 1342–1353.
Senkova, O., & Otani, H. (2021). Making judgments of learning enhances memory by inducing item-specific processing. Memory and Cognition, 49, 955–967.
Slamecka, N. J., & Graf, P. (1978). The generation effect: Delineation of a phenomenon. Journal of Experimental Psychology, 4(6), 592–604.
Soderstrom, N. C., Clark, C. T., Halamish, V., & Bjork, E. L. (2015). Judgments of learning as memory modifiers. Journal of Experimental Psychology, 41(2), 553–558.
Tekin, E., & Roediger, H. L. (2020). Reactivity of judgments of learning in a levels-of-processing paradigm. Zeitschrift Für Psychologie, 228(4), 278–290.
Wagenmakers, E. (2007). A practical solution to the pervasive problems of p values. Psychonomic Bulletin and Review, 14(5), 779–804.
Author information
Contributions
Study design and conceptualization were completed by NPM and MJH. NPM completed all analyses and prepared figures. NPM, EEC, and MJH all contributed to writing the manuscript. EEC wrote the first draft and NPM and MJH provided revisions.
Ethics declarations
Conflict of interest
The studies reported were approved by the University of Southern Mississippi Institutional Review Board (Protocol #IRB-18-15) and found to be in accordance with the 1964 Helsinki Declaration ethical principles. Informed consent was obtained from all individuals who participated in this study. The authors report no competing interests.
Open practices statement
The data for all experiments have been made available at https://osf.io/x9n4f/. Neither experiment was pre-registered.
Appendices
Appendix
For both experiments, we assessed whether item-specific or relational encoding instructions affected the resolution between JOLs and recall. Relative accuracy or resolution refers to the degree to which a person’s JOL rating discriminates between what is and what is not remembered (Rhodes, 2016). Unlike calibration, which can be assessed through plots, resolution is commonly assessed via Goodman–Kruskal gamma correlations. The gamma coefficient represents a measure of association between −1 and +1, with resolution decreasing as gamma approaches zero. Positive values denote the degree to which remembered items were given high JOLs and non-remembered items low JOLs, while negative gamma values denote the inverse of this pattern (Nelson, 1984). While the illusion of competence is generally assessed in terms of calibration (e.g., Koriat & Bjork, 2005), we note that item-specific and relational encoding strategies may additionally improve resolution, given that resolution is affected whenever an encoding task affords participants an opportunity to adjust their JOLs (i.e., modifying JOLs based on previous trials). Thus, for completeness, we report a series of analyses assessing changes in resolution for each experiment.
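As an illustration of the gamma coefficient described above, the following minimal sketch computes it for a single participant from item-level JOLs and recall outcomes. It uses the standard definition, (C − D) / (C + D), where C and D are the numbers of concordant and discordant item pairs and tied pairs are ignored. The function name and the example data are hypothetical and are not taken from the study's materials or analysis scripts.

```python
from itertools import combinations


def goodman_kruskal_gamma(jols, recalled):
    """Return (C - D) / (C + D), or None when every item pair is tied.

    jols:     JOL ratings, one per studied pair (hypothetical 0-100 scale)
    recalled: 1 if the target was later recalled, 0 otherwise
    """
    concordant = 0
    discordant = 0
    for (jol_a, rec_a), (jol_b, rec_b) in combinations(zip(jols, recalled), 2):
        # The sign of the product shows whether the two items are ordered
        # the same way on JOL and on recall; a zero product is a tie.
        product = (jol_a - jol_b) * (rec_a - rec_b)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    untied = concordant + discordant
    if untied == 0:
        return None  # gamma is undefined, e.g., when all items were recalled
    return (concordant - discordant) / untied


# Hypothetical participant: every recalled item received a higher JOL
# than every non-recalled item, so resolution is perfect.
example_jols = [80, 60, 40, 20, 90, 10]
example_recalled = [1, 1, 0, 0, 1, 0]
print(goodman_kruskal_gamma(example_jols, example_recalled))  # 1.0
```

Returning None for the undefined case is a design choice of this sketch; how such participants were handled in the reported analyses is not stated here.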
Experiment 1: resolution
Following the procedure used by Nelson and colleagues (Dunlosky & Nelson, 1992, 1994; Nelson, 1984), we computed Goodman–Kruskal gamma correlations (G) between JOLs and recall for each participant for each of the four pair types (forward, backward, symmetrical, and unrelated; Table 5 reports mean Gs and 95% CIs as functions of pair type and encoding group). To test for changes in resolution, we assessed differences in mean G using a 3 (Encoding Group: Item-Specific vs. Relational vs. Read) × 4 (Pair Type: Forward vs. Backward vs. Symmetrical vs. Unrelated) mixed ANOVA. Overall, the main effects and interaction were only marginally significant, Fs ≥ 1.94, ps ≤ 0.07, pBICs > 0.99; however, planned follow-up analyses were still carried out.
For forward pairs, both item-specific and relational encoding resulted in reduced resolution compared to silent reading (0.10 vs. 0.13 vs. 0.35, respectively). All comparisons differed significantly (ts ≥ 2.56, ds ≥ 0.64), except for the comparison between item-specific and relational encoding, t < 1, p = 0.97, pBIC = 0.88. This pattern extended to backward pairs (0.12 vs. 0.07 vs. 0.24), though only the comparison between the relational encoding and read groups was significant, t(57) = 2.34, SEM = 0.07, d = 0.60; all other comparisons for backward pairs were non-significant, ts ≤ 1.63, ps ≥ 0.11, pBICs ≥ 0.67. For symmetrical pairs, G was again lower for item-specific and relational encoding relative to the read group (0.15 vs. 0.13 vs. 0.23); however, all comparisons failed to reach conventional significance, ts ≤ 1.53, ps ≥ 0.13, pBICs ≥ 0.70. Finally, for unrelated pairs, resolution was numerically greater for participants who completed item-specific (0.26) and relational encoding tasks (0.33) relative to participants in the read group (0.20). However, again, all comparisons failed to reach significance, ts ≤ 1.06, ps ≥ 0.29, pBICs ≥ 0.81. Thus, while item-specific and relational encoding strategies are effective at reducing the illusion of competence, this reduction appears to occur primarily due to changes in calibration rather than resolution.
Experiment 2: resolution
Next, we assessed whether item-specific or relational encoding instructions influenced the resolution between JOLs and recall (see Table 6 for mean Gs and 95% CIs for all comparisons). A 3 (Encoding Group: Item-Specific vs. Relational vs. Read) × 4 (Pair Type: Forward vs. Backward vs. Symmetrical vs. Unrelated) mixed ANOVA was used to test for differences in resolution as functions of encoding group and pair type. Overall, this analysis yielded a significant main effect of encoding group, F(2, 99) = 3.59, MSE = 0.24, ηp² = 0.07. Collapsed across pair types, resolution was greater for participants in the read group (0.19) relative to the item-specific (0.10) and relational encoding groups (0.03). All comparisons were non-significant, ts ≤ 1.04, ps ≥ 0.30, pBICs ≥ 0.82, except for the comparison between the read and relational groups, t(66) = 3.01, SEM = 0.05, d = 0.74. Additionally, this analysis revealed a significant effect of pair type, F(3, 297) = 4.29, MSE = 0.19, ηp² = 0.04. Post hoc testing indicated that resolution was greatest for unrelated pairs (0.19), followed by symmetrical pairs (0.17), forward pairs (0.08), and backward pairs (0.01). Resolution for backward pairs was significantly lower relative to symmetrical and unrelated pairs, ts ≥ 3.22, ds ≥ 0.37, though comparisons between all other pair types were non-significant, ts ≤ 1.49, ps ≥ 0.14, pBICs ≥ 0.72. Additionally, the Encoding Group × Pair Type interaction was non-significant, F(6, 297) = 1.69, MSE = 0.19, p = 0.12, pBIC > 0.99. Thus, as in Experiment 1, item-specific and relational encoding reduced the illusion of competence primarily through improved calibration rather than resolution.
Cross-experimental analysis
Because participants in the item-specific and relational encoding groups in Experiment 2 were required to verbalize their encoding processes, it is possible that this procedure affected the magnitude of their JOLs and/or their recall performance. We tested this possibility using a 2 (Experiment) × 2 (Measure: JOL vs. Recall) × 3 (Encoding Group: Item-Specific vs. Relational vs. Read) × 4 (Pair Type: Forward vs. Backward vs. Symmetrical vs. Unrelated) mixed ANOVA. The only reliable interaction that emerged was the Experiment × Measure × Direction interaction, F(3, 552) = 3.94, MSE = 128.35, ηp² = 0.02. All other interactions with Experiment, including the four-way interaction, were non-significant, Fs ≤ 2.02, ps ≥ 0.06, pBICs ≥ 0.64.
Overall, collapsed across encoding groups, mean JOL ratings did not differ between Experiments 1 and 2 for forward pairs (70.23 vs. 66.58, respectively), t(188) = 1.67, SEM = 2.23, p = 0.10, pBIC = 0.77, or backward pairs (69.26 vs. 66.55), t(188) = 1.19, SEM = 2.29, p = 0.24, pBIC = 0.87. For symmetrical pairs, JOLs in Experiment 1 were marginally greater than in Experiment 2 (75.35 vs. 71.22), t(188) = 1.81, SEM = 2.32, p = 0.07, pBIC = 0.73, while JOLs for unrelated pairs were marginally lower in Experiment 1 relative to Experiment 2 (33.69 vs. 39.01), t(188) = 1.81, SEM = 2.94, p = 0.07, pBIC = 0.72. Thus, across pair types, having participants engage in the think-aloud procedure in Experiment 2 did not reliably affect their JOLs.
Regarding recall, no differences emerged between experiments for forward pairs (73.92 vs. 73.72), t < 1, SEM = 2.87, p = 0.92, pBIC = 0.93, or symmetrical pairs (72.70 vs. 75.99), t(188) = 1.22, SEM = 2.64, p = 0.22, pBIC = 0.87. However, recall was lower in Experiment 1 than in Experiment 2 for backward pairs (49.27 vs. 59.16), t(188) = 3.01, SEM = 3.33, d = 0.44, and unrelated pairs (20.91 vs. 28.64), t(188) = 2.27, SEM = 3.41, d = 0.33. Thus, the additional encoding afforded by the think-aloud task boosted recall, but only for the more challenging backward and unrelated pairs. Importantly, however, item-specific and relational encoding produced similar reductions in the illusion of competence in both experiments, demonstrating that participants were indeed applying the item-specific and relational processing tasks effectively in Experiment 1, when encoding was completed silently.
Additionally, we examined differences between experiments in calibration plots and resolution. First, cross-experimental differences in calibration plots were assessed via a 2 (Experiment) × 3 (Encoding Group: Item-Specific vs. Relational vs. Read) × 4 (Pair Type: Forward vs. Backward vs. Symmetrical vs. Unrelated) × 11 (JOL Increment) mixed ANOVA. Overall, this analysis yielded a significant Experiment × Pair Type interaction, F(3, 546) = 12.57, MSE = 1640.37, ηp² = 0.12. However, all other interactions, including the four-way interaction, failed to reach significance, Fs ≤ 1.69, ps ≥ 0.08, pBICs > 0.99. Regarding resolution, a 2 (Experiment) × 3 (Encoding Group: Item-Specific vs. Relational vs. Read) × 4 (Pair Type: Forward vs. Backward vs. Symmetrical vs. Unrelated) mixed ANOVA confirmed that mean G did not differ as a function of experiment, as no interactions with Experiment were detected, Fs ≤ 1.72, ps ≥ 0.16, pBICs > 0.99. Thus, changes in calibration and resolution across pair types/encoding groups did not differ between experiments.
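To make the calibration measures concrete, the sketch below shows one way the data behind a calibration plot could be tabulated: the percentage of items recalled at each JOL increment, plus a signed overall difference between mean JOL and percent recalled. It assumes JOLs fall on a 0–100 scale in steps of 10 (consistent with the 11 JOL increments above, but an assumption nonetheless). The function names and data are hypothetical and do not reproduce the analysis code used for the reported results.

```python
from collections import defaultdict


def calibration_by_increment(jols, recalled):
    """Map each JOL increment to the percentage of its items later recalled.

    Assumes every JOL already sits on a scale increment (e.g., 0, 10, ..., 100).
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for jol, was_recalled in zip(jols, recalled):
        totals[jol] += 1
        hits[jol] += was_recalled
    return {jol: 100.0 * hits[jol] / totals[jol] for jol in sorted(totals)}


def overconfidence(jols, recalled):
    """Mean JOL minus percent recalled; positive values mean JOLs exceed recall."""
    mean_jol = sum(jols) / len(jols)
    percent_recalled = 100.0 * sum(recalled) / len(recalled)
    return mean_jol - percent_recalled


# Hypothetical participant with six studied pairs.
example_jols = [100, 100, 50, 50, 0, 0]
example_recalled = [1, 0, 1, 0, 0, 0]
print(calibration_by_increment(example_jols, example_recalled))
# {0: 0.0, 50: 50.0, 100: 50.0}
print(round(overconfidence(example_jols, example_recalled), 2))  # 16.67
```

In this hypothetical case, items rated 100 were recalled only half the time, and the positive overall difference corresponds to the pattern the illusion of competence describes: JOLs that overestimate later recall.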
About this article
Cite this article
Maxwell, N.P., Cates, E.E. & Huff, M.J. Item-specific and relational encoding are effective at reducing the illusion of competence. Psychological Research 88, 1023–1044 (2024). https://doi.org/10.1007/s00426-023-01891-z
DOI: https://doi.org/10.1007/s00426-023-01891-z