Introduction

Over four decades ago, James J. Gibson presented the seminal concept of affordances to describe the relationships that exist between organisms and their environments, indicating that “the affordances of the environment are what it offers the animal” (Gibson, 1979, p. 127). According to this view, common manipulable objects, such as tools, handles, or kitchenware, automatically trigger responses that have acquired a strong association with them, resulting in automatic and specific motor plans for interacting with them (Makris et al., 2013; Proverbio et al., 2011; Tucker & Ellis, 2001). In the classic version of the affordances task, participants classify images of manipulable objects according to a certain rule (e.g., natural vs. manufactured; upright vs. inverted) by responding with their right or left hand (Tucker & Ellis, 1998, 2004; Wilf et al., 2013). Typically, the objects have a prominent handle and thus trigger an automatic grasping response in one hand (e.g., a cup with the handle facing right or left will trigger a grasping response in the corresponding hand). Responses are slower and more error-prone when the relevant response (classifying the object) and the irrelevant, stimulus-driven grasping response activate different hands (incongruent condition) than when they activate the same hand (congruent condition). Recent studies have elaborated on this finding by adding a neutral condition to the task and demonstrated that two cognitive conflicts exist in the affordances task: a response conflict between responding with the relevant versus the irrelevant hand, and a task conflict between the goal-directed classification task and the stimulus-driven grasping task (Littman & Kalanthroff, 2021, 2022). While response conflict manifests only in incongruent trials, task conflict exists in both incongruent and congruent trials.
Thus, typical results indicate a congruency effect (longer reaction time [RT] to incongruent than to congruent trials, indicating a response conflict), a reversed facilitation effect (congruent RT > neutral RT, indicating task conflict), and an interference effect (incongruent RT > neutral RT, which encompasses both task and response conflicts; Littman & Kalanthroff, 2022). Since its presentation in the seminal work by Tucker and Ellis (1998), the affordances task has been employed in a variety of studies and in various iterations to promote our understanding of human cognition, attention, and visuomotor functioning. However, despite its importance in experimental science, an evaluation of the task’s psychometric properties, including its test–retest reliability, has not been undertaken. Critically, the lack of reliability measures poses a significant limitation to our ability to infer valid conclusions regarding aspects of individual differences measured by the task. Thus, our primary goal here was to establish the test–retest reliability of the affordances task.
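In RT terms, the three effects reduce to simple differences between condition means. A minimal sketch with hypothetical RT values (the numbers are illustrative only, not data from the study):

```python
# Hypothetical per-condition mean RTs (ms) of correct responses for one
# participant; values chosen only to illustrate the typical ordering
# (neutral < congruent < incongruent).
rt_neutral, rt_congruent, rt_incongruent = 510.0, 530.0, 575.0

# Congruency effect: response conflict (incongruent vs. congruent)
congruency = rt_incongruent - rt_congruent           # 45.0 ms
# Reversed facilitation: task conflict (congruent vs. neutral)
reversed_facilitation = rt_congruent - rt_neutral    # 20.0 ms
# Interference: task + response conflict combined (incongruent vs. neutral)
interference = rt_incongruent - rt_neutral           # 65.0 ms

# The interference effect decomposes exactly into the two conflicts
assert interference == congruency + reversed_facilitation
```

This decomposition is why the neutral condition is needed: without it, only the congruency effect (response conflict) is measurable, and the task-conflict component stays hidden.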

For cognitive tasks, test–retest reliability is often assessed by correlating RT performance across occasions of assessment (Enkavi et al., 2019). However, such efforts often result in low test–retest measures, falling short of the minimal satisfactory value of 0.7 (Barch et al., 2008), even with the most well-established tasks (Draheim et al., 2021; von Bastian et al., 2022).

Neutral trials presented large objects, which do not afford grasping tendencies (Chao & Martin, 2000). Within each presented orientation (upright vs. inverted), the trials were equally divided into the neutral, congruent, and incongruent conditions and presented in random order. As tools were previously shown to evoke affordances effects when presented in their functional orientation, but not in other, nonfunctional orientations (Bub et al., 2018; Iani et al., 2019; Littman & Kalanthroff, 2022; Masson et al., 2011), we focused our analyses on the upright trials only (and indeed, the inverted trials did not produce an affordances effect).

Statistical analysis

We began by trimming RTs shorter than 150 ms (0.06% of the data). To evaluate the within-task effects, a two-way repeated-measures analysis of variance (ANOVA) was applied to the RT data of correct responses, with congruency condition (congruent vs. neutral vs. incongruent) and time of assessment (Time 1, Time 2) as within-subject factors. Next, we assessed test–retest correlations of the RT data of correct responses. First, we employed the traditional summary-statistics method and calculated Pearson’s r correlations between the mean RTs of Times 1 and 2 for the congruency conditions (congruent, incongruent, and neutral) and the congruency effects (congruency, interference, and reversed facilitation). Following this, we assessed test–retest reliability for the congruency conditions and congruency effects using a Bayesian generative model similar to the one presented by Haines et al. (2022).

The current results provide evidence of good test–retest reliability for the congruency and reversed facilitation effects, thus supporting the task’s reliability in the assessment of task and response conflicts. Importantly, while past studies mainly demonstrated the emergence of task conflict under conditions that trigger mental reactions, such as word reading in the Stroop task (Goldfarb & Henik, 2007; Parris, 2014) and object recognition in the object-interference task (La Heij et al., 2010; La Heij & Boelens, 2011; Prevor & Diamond, 2005), the affordances task is the first to demonstrate the emergence of task conflict under conditions that trigger a behavioral reaction (object grasping). As such, the affordances task serves as a nonlinguistic, behavioral measure of task conflict that is potentially closer to participants’ everyday experiences.
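The summary-statistics pipeline described in the Statistical analysis section (trim RTs below 150 ms, average per-condition RTs per participant, correlate the resulting effects across sessions) can be sketched on simulated data. All numbers and names below are our illustration, not the study’s analysis code, and accuracy filtering is omitted for brevity:

```python
import numpy as np

rng = np.random.default_rng(7)
n_sub, n_trials = 50, 60  # hypothetical sample and trial counts

# Hypothetical stable per-participant congruency effect (ms); purely simulated
true_effect = rng.normal(45, 15, n_sub)

def session_effect(trial_sd=80.0):
    """One session: per-participant congruency effect via summary statistics."""
    cong = rng.normal(530, trial_sd, (n_sub, n_trials))
    incong = rng.normal(530 + true_effect[:, None], trial_sd, (n_sub, n_trials))
    # Trim anticipatory responses (< 150 ms), as in the analysis above
    cong = np.where(cong < 150, np.nan, cong)
    incong = np.where(incong < 150, np.nan, incong)
    return np.nanmean(incong, axis=1) - np.nanmean(cong, axis=1)

effect_t1 = session_effect()  # Time 1
effect_t2 = session_effect()  # Time 2

# Summary-statistics test-retest reliability: Pearson's r between sessions
r = np.corrcoef(effect_t1, effect_t2)[0, 1]
print(f"group effect (Time 1): {effect_t1.mean():.1f} ms, test-retest r = {r:.2f}")
```

Because each session’s effect estimate carries trial-level noise, the correlation between sessions understates the stability of the latent effect; this is the attenuation that the generative model is designed to address.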
The current findings also illustrate the affordances task as a promising tool for the assessment of control over stimulus-driven habitual behaviors in healthy populations as well as in pathological populations characterized by increased reliance on stimulus-driven habitual behaviors, such as obsessive-compulsive disorder patients (Gillan et al., 2014, 2015; Kalanthroff et al., 2017, 2018b; Robbins et al., 2012), patients with substance use or behavioral addictions (Voon et al., 2015), and individuals suffering from a pre-supplementary motor area brain lesion (Haggard, 2008). Importantly, while stimulus-driven habitual behaviors have been demonstrated using various tasks, the current findings support the use of the affordances task as a unique measure of the specific cognitive control impairments that result in increased reliance on stimulus-driven habitual behaviors.

An important point regarding the affordances task needs to be acknowledged. Although many researchers attribute the affordances effect to the automatic activation of grasping responses, an alternative view has been suggested. According to this view, the affordances effect represents a spatial correspondence effect, essentially similar to the Simon effect for stimulus location, and does not reflect grasping tendencies (Proctor & Miles, 2014). By this account, the effect is triggered not by a conflict between the correct response and an incongruent, stimulus-driven grasping behavior, but by a conflict between the correct response and an incongruent spatial cue. In other words, this alternative approach suggests that the conflict would be evident regardless of the graspability of the presented stimulus, since its asymmetrical spatial form alone determines the conflict. Behavioral studies that inspected the two alternatives yielded mixed results: some concluded that the observed affordances effects may be explained by a mere spatial correspondence effect (Cho & Proctor, 2010; Proctor et al., 2017; Song et al., 2014), whereas others found affordances effects that could not be reduced to spatial correspondence (Buccino et al., 2009; Iani et al., 2019; Netelenbos & Gonzalez, 2015; Pappas, 2014; Saccone et al., 2016; Scerrati et al., 2020; Symes et al., 2005).
Importantly, a wide body of brain imaging studies has demonstrated the activation of premotor areas when participants view manipulable objects (Chao & Martin, 2000; Creem-Regehr & Lee, 2005; Grafton et al., 1997; Grezes & Decety, 2002; Proverbio et al., 2011), an activation that is absent in classic Simon tasks (e.g., Kerns, 2006), as well as unique patterns of brain activity for manipulable objects that go beyond the effects of spatial correspondence (Buccino et al., 2009; Rice et al., 2007). Nonetheless, recent studies have refined the initial notion of complete automaticity of the affordances effect, suggesting that the effect becomes more behaviorally evident when objects are presented in their functional orientation (Bub et al., 2018; Masson et al., 2011) and under conditions that emphasize the object’s graspability (Girardi et al., 2010; Lu & Cheng, 2013). To ascertain the emergence of an affordances effect, we followed the specific suggestions made by these studies. In doing so, we believe that the current study provides a reliable measure of control over stimulus-driven motor behavior.

Second, the current study also allows us to evaluate the task’s functioning and reliability under online administration conditions. In recent years, the online administration of cognitive tasks has gained popularity due to its ability to save resources, allow large sample sizes, and reach diverse populations across the globe (Feenstra et al., 2018; Gillan & Daw, 2016; Hansen et al., 2016; Haworth et al., 2007; Ruano et al., 2016). This tendency became even more prominent following the COVID-19 pandemic, when the administration of in-lab experiments became limited or impossible for periods of time. Recently, a wide body of studies has reported encouraging findings following online administrations of a variety of cognitive tasks (Anwyl-Irvine et al., 2020; Crump et al., 2013; de Leeuw & Motz, 2016; Hilbig, 2016; Ratcliff & Hendrickson, 2021; Semmelmann & Weigelt, 2017; Simcox & Fiez, 2014). Chiefly, these studies reported results that were comparable to those typically obtained under in-lab administration. The results of the current study are likewise comparable to those of previous studies that used similar task designs in a laboratory setting (e.g., Littman & Kalanthroff, 2021, 2022; Saccone et al., 2016; Tucker & Ellis, 1998). Specifically, a comparison of the current results to those reported in Experiment 1 of Littman and Kalanthroff (2022), which used an identical design but was administered in the lab, yielded very similar results, albeit with slightly shorter overall RTs in the current study. The full data are presented in section S3 of the Supplementary Material. Most importantly, the effects found in the current study were all in the same direction as those reported by Littman and Kalanthroff (2022), and all were significant, yielding medium to large effect sizes.
Furthermore, the current results provide essential data regarding the reliability of an online administration of the task, together with the application of generative modeling to behavioral data obtained online. Alongside their advantages, web-based experiments are limited in that administration may be less standardized than in-lab administration and may contain additional sources of noise. Here, the replication of the task’s effects under these noisier conditions strengthens confidence in their robustness and supports the utility of web-based administration of the affordances task.

Lastly, the inspection of test–retest reliability using a traditional method of assessment (Pearson’s r) resulted in weak test–retest correlations for the congruency, interference, and reversed facilitation effects. These findings replicate the “reliability paradox” that is often observed when summary statistics are used to assess individual differences in cognitive tasks that yield robust group-level effects (Haines et al., 2020; Rouder & Haaf, 2019), typically resulting in low reliability estimates for congruency effects (Bender et al., 2016; Hedge et al., 2018; Paap & Sawi, 2016; Soveri et al., 2018; Strauss et al., 2005). Following this, the application of the hierarchical Bayesian generative model resulted in a significant improvement in the test–retest evaluation of the congruency, interference, and reversed facilitation effects, all yielding acceptable or good test–retest reliability. These results are in line with recent findings that illustrated the utility of generative models in the assessment of individual differences (Chen et al., 2021; Haines et al., 2020; Rouder & Haaf, 2019). Haines et al. (2020) demonstrated how the employment of generative models results in richer and more accurate test–retest estimations for a variety of well-established cognitive paradigms, including the Stroop, flanker, and Posner tasks. Additionally, Chen et al. (2021) showed how the use of generative models accounts for trial-level variability and incorporates it into the model, allowing for a more precise evaluation of reliability in comparison to the summary-statistics approach, in which trial-level variability is treated as measurement error. Importantly, the employment of generative models does not automatically inflate test–retest measures, but does so only when such changes are warranted by the data (see Haines et al., 2020). The results of the current study are thus in line with the recent findings of Chen et al. (2021) and Haines et al. (2020), demonstrating the importance of employing finer, more capable tools (such as Bayesian generative models) for the psychometric assessment of cognitive tasks. Such methods may deepen our understanding of the tasks themselves, their psychometric properties, and the cognitive structures they are designed to measure.
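The reliability paradox described above has a simple statistical core: trial-level noise attenuates the correlation between summary scores, and explicitly modeling that noise recovers the latent correlation. The following is a deliberately simplified, moment-based sketch of that intuition on simulated data (a Spearman-style disattenuation, not the hierarchical Bayesian model of Haines et al.; all quantities are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(42)
n_sub, n_trials = 60, 80  # hypothetical sample and trial counts

# Latent, perfectly stable per-participant effect (ms); illustrative values
true_effect = rng.normal(45, 20, n_sub)
# Trial-level noise propagates into each summary score as measurement error:
# SE of a difference between two means of n_trials trials each (trial SD 100 ms)
se = 100.0 * np.sqrt(2.0 / n_trials)

obs1 = true_effect + rng.normal(0, se, n_sub)  # Time 1 summary scores
obs2 = true_effect + rng.normal(0, se, n_sub)  # Time 2 summary scores

# Summary-statistics estimate: attenuated even though the latent effect
# is identical across sessions (the "reliability paradox")
r_obs = np.corrcoef(obs1, obs2)[0, 1]

def reliability(obs):
    """Share of summary-score variance attributable to true-score variance."""
    total_var = np.var(obs, ddof=1)
    true_var = max(total_var - se**2, 1e-9)  # subtract known error variance
    return true_var / total_var

# Disattenuated estimate of the latent test-retest correlation (capped at 1)
r_latent = min(r_obs / np.sqrt(reliability(obs1) * reliability(obs2)), 1.0)
print(f"observed r = {r_obs:.2f}, error-corrected r = {r_latent:.2f}")
```

The hierarchical Bayesian approach achieves the same correction in a principled way, estimating the latent correlation jointly with trial-level variability rather than applying a post hoc formula.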

The findings from our study demonstrate that the affordances task can yield reliable individual-difference measures. However, this is only the first step in a broader psychometric investigation. It is crucial to further examine the variability of these individual differences for clinical use and to determine their relationships to other relevant constructs. Additionally, our study suggests that Bayesian hierarchical models are an effective method for characterizing these individual differences (Draheim et al., 2019, 2021), and we recommend continuing to use this approach to account for uncertainty in the affordances task. Further research is needed to fully understand the psychometric potential of the affordances task.

Conclusion

The affordances task can serve as an important tool for studying aspects of cognitive control and visuomotor functioning. In the current study, we assessed the task’s test–retest reliability for the first time by using a hierarchical Bayesian generative model in an online administration. The affordances task yielded good test–retest properties, supporting its applicability in the study of individual differences. The generative model replicated recent findings demonstrating its higher precision in the assessment of test–retest reliability compared with traditional methods based on summary statistics. Bayesian generative models may likewise be employed in future evaluations of individual differences and of the reliability of cognitive tasks.