Word recognition, oral reading fluency, listening comprehension and vocabulary have been consistently identified as predictors of reading comprehension across a wide range of grade levels and orthographies (Cadime et al., 2017; Fernandes et al., 2017; Lervåg et al., 2018; Little et al., 2017; Muijselaar & de Jong, 2015; Ouellette & Beers, 2010; Padeliadu & Antoniou, 2014; Tilstra et al., 2009; Tobia & Bonifacci, 2015; Wolfgramm et al., 2016). However, the relative contributions of these skills to reading comprehension seem to be influenced by the phase of reading acquisition and by the depth of the orthographies (e.g., Florit & Cain, 2011). As students advance from lower to higher grade levels, they are faced with more complex texts and greater demands for high-level reasoning and inferencing (Denton et al., 2011). Hence, beyond reading fluently, having high levels of listening comprehension and having broad and deep vocabulary knowledge, it is necessary that students efficiently apply cognitive and metacognitive reading strategies to achieve better reading comprehension performance (Botsas, 2017; Muijselaar et al., 2017).

Reading models provide conceptual frameworks about the directionality of the relations between reading-related skills. The simple view of reading (SVR) offers a relatively simple framework of reading comprehension, which has received substantial empirical support (Florit & Cain, 2011). However, the SVR is a static model, suggesting the existence of unidirectional relations between two component skills – word recognition and listening comprehension – and reading comprehension (Hoover & Gough, 1990). To explain the developmental changes observed in reading comprehension, a new model based on the SVR, called Cognitive Foundations of Reading (CFR), was recently proposed by Hoover & Tunmer (2020b). Contrary to the SVR, the CFR considers the existence of reciprocal relations between lower- and higher-order skills (Hoover & Tunmer, 2020a). For example, reading comprehension requires the application and articulation of both word recognition and listening comprehension skills, and in turn, the development of reading comprehension can help develop further word recognition and listening comprehension skills (Hoover & Tunmer, 2020c). According to the CFR, listening comprehension depends on linguistic knowledge that includes phonological, syntactic, and semantic knowledge. This latter skill (i.e., vocabulary) is one component of interest in the present study, in addition to word recognition and listening comprehension. The CFR framework addresses the importance of automaticity in word recognition for success in reading comprehension, although it does not specify the role of accuracy and speed in reading words in connected text, which can be aided by syntactic and semantic cues. In fact, in European Portuguese, oral reading fluency (i.e., accurate and quick reading of words in connected text) seems to be a larger predictor of reading comprehension than word recognition, at least in grade 4 (Cadime et al., 2017; Santos et al., 2020). 
For this reason, and given that this is a 3-year longitudinal study starting in grade 4, we address oral reading fluency instead of word recognition. Additionally, the CFR does not include other components of interest, such as reading strategy use, which is closely interrelated with reading comprehension across a wide range of grade levels (Frid & Friesen, 2020; Law, 2009; Liao et al., 2021; Muijselaar & de Jong, 2015; Roeschl-Heils et al., 2003; Samuelstuen & Bråten, 2005; Van Ammel & Keer, 2021; Van Kraayenoord et al., 2012).

The developmental changes observed in the relations between reading comprehension and other reading-related skills, namely, oral reading fluency, listening comprehension, vocabulary and the use of reading strategies, justify an analysis past the early grade levels of reading development. Indeed, empirical studies of students beyond grade 4 are scarce. This study, conducted in an orthography of intermediate depth, uses a sample of students from grades 4 to 6 and is based on the assumptions related to the bidirectionality of the relations between linguistic and reading-related skills advocated in the CFR framework. Considering research that also advocates for the role of metacognitive skills in reading comprehension, this paper aims to address the longitudinal and reciprocal interrelations between the aforementioned skills in European Portuguese.

Interrelationships between reading skills

The relation between oral reading fluency and reading comprehension seems to be stronger at earlier than at later stages of reading development (Benson, 2008; Kim & Wagner, 2015; Little et al., 2017; Padeliadu & Antoniou, 2014; Ribeiro et al., 2016; Silberglitt et al., 2006). Previous research conducted in both transparent and opaque orthographies has shown that oral reading fluency continues to be a significant predictor of reading comprehension past the first four years of schooling (Fernandes et al., 2017; Padeliadu & Antoniou, 2014; Shapiro et al., 2008; Valencia et al., 2010; Yildirim, Rasinski, et al., 2019; Yovanoff et al., 2005). Furthermore, the results from other empirical studies suggest that reading comprehension also predicts oral reading fluency, including in students from kindergarten to seventh grade across a wide range of orthographies, such as English, Korean, Turkish and German. This suggests that there is a reciprocal relationship between these skills (Berninger et al., 2010; Gebauer et al., 2013; Hudson et al., 2012; Jenkins et al., 2003; Kim, 2015; Klauda & Guthrie, 2008; Little et al., 2017; Yildirim, Ates, et al., 2019). In the case of European Portuguese, oral reading fluency and reading comprehension were reciprocally related between grades 2 and 3 but not between grades 3 and 4 (Santos et al., 2020). Highly efficient readers direct their cognitive resources to developing complex comprehension skills (Baker et al., 2011; National Institute of Child Health and Human Development, 2000). With good comprehension, readers are able to identify contextual cues that help to decipher the meaning of the phrases in a text, and they efficiently monitor comprehension, which allows fast correction of misread words during reading (Fuchs et al., 2001; Jenkins et al., 2003; Spear-Swerling, 2006).

Interrelationships between reading comprehension and linguistic skills

Listening comprehension has been identified as a good predictor of reading comprehension (Cadime et al., 2017; Kim, 2015; Lervåg et al., 2018; Tilstra et al., 2009; Tobia & Bonifacci, 2015; Torppa et al., 2016). From the early stages of reading acquisition in transparent orthographies, listening comprehension seems to have a strong influence on reading comprehension, but in deep orthographies, this effect is verified only in more advanced stages of reading acquisition (Florit & Cain, 2011). Moreover, some empirical studies also suggest a reciprocal relationship between the two comprehension skills in orthographies of varying depths (Berninger & Abbott, 2010; Santos et al., 2020; Verhoeven & Van Leeuwe, 2008, 2012; Wong, 2021). In a six-year cross-lagged panel study with Dutch students from grades 1–6, listening comprehension was reciprocally related to reading comprehension only between grades 3 and 4 (Verhoeven & Van Leeuwe, 2008). Four years later, the same authors performed a longitudinal study in which data on listening comprehension were collected in grades 1, 3, and 5, and data on reading comprehension were collected in grades 2, 4, and 6. For first language learners (Dutch), listening comprehension in grades 1, 3 and 5 predicted reading comprehension in grades 2, 4, and 6, respectively (Verhoeven & Van Leeuwe, 2012). Reading comprehension in grades 2 and 4 predicted listening comprehension in grades 3 and 5, respectively (Verhoeven & Van Leeuwe, 2012). Berninger and Abbott (2010) also observed a reciprocal interrelationship between these two comprehension skills in grades 1, 3, 5 and 7. More recently, cross-lagged relations were found between listening and reading comprehension in European Portuguese speakers in a longitudinal study involving grades 2 through 4 (Santos et al., 2020). Moreover, reciprocal relations between these two comprehension skills were found in Chinese as a second language, from grades 4 to 6 (Wong, 2021).
The skills underlying both linguistic processes (e.g., those related to phonology, lexis, grammar, syntax and semantics) justify the bidirectionality of this relation (Hogan et al., 2014; Hoover & Tunmer, 2020b; Kim & Phillips, 2014; Nation, 2005; Perfetti et al., 2005). The development of these skills seems to be highly interdependent, regardless of the grade and the orthography.

Vocabulary is another linguistic skill that has been consistently found to account for unique reading comprehension variance, even after listening comprehension is statistically controlled, particularly for students past the early stages of reading acquisition (Braze et al., 2016; Goff et al., 2005; Muijselaar & de Jong, 2015; Ouellette & Beers, 2010; Protopapas et al., 2013; Seigneuric & Ehrlich, 2005; Tilstra et al., 2009; Tunmer & Chapman, 2012; Verhoeven & Van Leeuwe, 2008). This finding is congruent with the Lexical Quality Hypothesis, in which the quality of the words’ representations, including a deep knowledge of their meanings, affects comprehension (Perfetti & Hart, 2002). Additionally, some previous studies have suggested that vocabulary depth measured by expressive measures may be a more accurate representation of readers’ linguistic skills than listening comprehension measures, which are often influenced by memory skills (Ouellette & Beers, 2010; Protopapas et al., 2013; Tilstra et al., 2009). The results from prior research also showed that the additional contribution of vocabulary to reading comprehension seems to increase as students move from lower to upper grade levels, assuming a significant relative weight in grades 4, 5 and 6 both in opaque and more transparent orthographies (Fernandes et al., 2017; Ouellette & Beers, 2010; Swart et al., 2017; Tilstra et al., 2009; Wolfgramm et al., 2016; Yovanoff et al., 2005). Equally, vocabulary has been recognized as a significant predictor of listening comprehension across the first six years of schooling, regardless of orthography depth (Hagtvet, 2003; Li et al., 2021; Verhoeven & Van Leeuwe, 2008; Wolfgramm et al., 2016). Empirical results from prior studies have also suggested that comprehension predicts vocabulary, whether assessed orally (listening comprehension) or using written material (reading comprehension) (Seigneuric & Ehrlich, 2005; Verhoeven et al., 2011; Verhoeven & Van Leeuwe, 2008). 
Given that knowledge of word meanings provides the means to comprehend oral and written language, vocabulary exerts a direct influence on listening and reading (Nagy & Scott, 2000; Webb, 2021). Students with better comprehension skills tend to be more able to deduce the meaning of new words or words that assume different meanings depending on the context, which also contributes to the expansion and enrichment of their vocabulary (Mol & Bus, 2011).

Interrelationships between reading comprehension and metacognitive skills

As children get older and become more skilled at reading, the use of reading strategies develops (Baker et al., 2015; Pintrich & Zusho, 2002; Roeschl-Heils et al., 2003). Most children begin to be able to use different reading strategies to solve breakdowns in comprehension between 8 and 10 years old (McNamara et al., 2007; Veenman et al., 2006). Skilled readers use metacognitive strategies to identify when they are not understanding some part of the text and, consequently, mobilize appropriate cognitive strategies to overcome this difficulty (Graesser, 2007; Pereira-Laird & Deane, 1997; Perfetti et al., 2005). In fact, the knowledge and use of reading strategies have been identified as significant predictors of reading comprehension (Frid & Friesen, 2020; Law, 2009; Liao et al., 2021; Muijselaar & de Jong, 2015; Roeschl-Heils et al., 2003; Samuelstuen & Bråten, 2005; Van Ammel & Keer, 2021; Van Kraayenoord et al., 2012). However, contrary to reading and linguistic skills, the existence of reciprocal relations between metacognitive skills and reading comprehension has been scarcely explored. To the best of our knowledge, only one study conducted with a sample of Dutch students from the beginning of grade 4 to the end of grade 5 found a reciprocal relationship between knowledge of reading strategies and reading comprehension (Muijselaar et al., 2017). On the one hand, students with better knowledge about the use of reading strategies tend to be more skilled in constructing the main ideas from the text. On the other hand, reading comprehension contributes to the development of the knowledge and use of reading strategies because students learn from the texts (McMaster et al., 2014; Muijselaar et al., 2017; Verhoeven & Perfetti, 2008). Students with good comprehension levels not only read more texts, but also read more challenging texts. 
Consequently, they face more frequent comprehension gaps and thus have more opportunities to test the application of a wide range of reading strategies to restore a coherent representation of the text (Muijselaar et al., 2017). Students acquire more knowledge about reading strategies and more regularly use them as they advance from lower to upper grades of primary school and have to read texts of increasing difficulty (Clemens et al., 2017; McMaster et al., 2014).

The present study

Many empirical studies have investigated the directionality of the relations between reading comprehension, oral reading fluency, listening comprehension, and vocabulary across a wide range of orthographies and grade levels (Berninger et al., 2010; Berninger & Abbott, 2010; Gebauer et al., 2013; Hudson et al., 2012; Jenkins et al., 2003; Kim, 2015; Klauda & Guthrie, 2008; Little et al., 2017; Seigneuric & Ehrlich, 2005; Verhoeven et al., 2011; Verhoeven & Van Leeuwe, 2012, 2008; Wong, 2021; Yildirim, Ates, et al., 2019), although studies in European Portuguese have focused only on the initial years of reading acquisition (Santos et al., 2020). Research on reciprocal relations between reading comprehension and reading strategies is particularly scarce. The goal of this study was to investigate the longitudinal relations between reading comprehension, oral reading fluency, listening comprehension, vocabulary and the use of reading strategies in European Portuguese from grades 4 to 6. Based on the literature review, our first hypothesis was that reading comprehension has a reciprocal relation with oral reading fluency, listening comprehension and vocabulary across all the studied grade levels. Despite the existence of scarce empirical evidence, we also hypothesized a reciprocal relation between reading strategy use and reading comprehension, as observed in a study conducted with Dutch, an orthography of intermediate depth (Muijselaar et al., 2017).

Method

Participants

The initial sample was composed of 133 students assessed at the end of grade 4. Only the data of students who completed at least two assessments were considered in the analysis (N = 110). In this case, the 110 students who compose the final sample completed the measures in the first two time points (the end of grades 4 and 5). Thus, between these time points, the attrition was 17%. Between the second and third time points (the end of grades 5 and 6, respectively), another 35 students dropped out of the study, which equals an attrition of 32%. The high attrition rate from grades 5 to 6 was due to the occurrence of the COVID-19 pandemic, which led to the closure of the schools during a significant portion of the 2020 academic year, leading to the necessity of postponing the third time point assessment and increasing the drop-off from the study. The sample of 110 students attended public (n = 95, 86.4%) and private schools (n = 15, 13.6%) in northern Portugal. This distribution is representative of the population, given that, according to national data from the General Direction of Education and Science Statistics for the 2019/2020 academic year, more than 85% of children from grades 1 to 6 attended public schools. The participants’ mean age was 9.45 years (SD = 0.54, range 9–11) in grade 4, 10.45 years (SD = 0.54, range 10–12) in grade 5 and 11.96 years (SD = 0.31, range 11–13) in grade 6. More than half of the students were female (n = 58, 52.7%). All students were fluent speakers of European Portuguese. Bilingual children and students who benefited from inclusion and learning support measures at secondary and/or tertiary levels were not included in the study (see Footnote 1).
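The attrition percentages reported above follow directly from the sample sizes. The short sketch below reproduces that arithmetic; the function and variable names are illustrative and are not taken from the study materials.

```python
def attrition_rate(n_start: int, n_dropped: int) -> float:
    """Return the percentage of participants lost between two time points."""
    if n_start <= 0:
        raise ValueError("n_start must be positive")
    return 100.0 * n_dropped / n_start


# Sample sizes as reported in the text.
initial_n = 133      # students assessed at the end of grade 4
final_n = 110        # students with grade 4 and grade 5 assessments
dropped_g5_g6 = 35   # students lost between the grade 5 and grade 6 assessments

attrition_g4_g5 = attrition_rate(initial_n, initial_n - final_n)
attrition_g5_g6 = attrition_rate(final_n, dropped_g5_g6)
remaining_g6 = final_n - dropped_g5_g6

print(f"Grade 4 to 5 attrition: {attrition_g4_g5:.1f}%")   # 17.3%
print(f"Grade 5 to 6 attrition: {attrition_g5_g6:.1f}%")   # 31.8%
print(f"Students assessed in grade 6: {remaining_g6}")     # 75
```

Rounded to whole percentages, these values match the 17% and 32% given in the text.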

Regarding mothers’ education levels, 35.5% had completed a university degree, 29.1% had completed high school, and 34.5% had a lower level of education (this information was missing for 0.9% of the mothers). Because of the low income of their families, 39.1% of the students in grade 4 received social support from the government for the acquisition of meals, school supplies and study visits.

Measures

Test of Reading Comprehension of Narrative Texts (TRC-n; Rodrigues et al., 2020; Santos et al., 2016, 2017). This is a norm-referenced test that includes five vertically scaled forms that assess reading comprehension of narrative texts in students from grades 2 to 6. In the present study, the test forms for students in grades 4, 5 and 6 were administered (TRC-n-4, TRC-n-5 and TRC-n-6). The students silently read text passages that are followed by multiple-choice questions (three options), with the response marked on an answer sheet. The test is untimed, and the responses are scored as 0 (incorrect) or 1 (correct). The total raw score of each test form is converted to a standardized score. The scale of standardized scores of the TRC-n test forms was generated based on the mean value of 100 (SD = 10). The means for the normative samples were 108 (SD = 10), 111 (SD = 10) and 120 (SD = 10) for TRC-n-4, TRC-n-5 and TRC-n-6, respectively. The reliability coefficients ranged between 0.72 and 0.94 for TRC-n-4, between 0.75 and 0.95 for TRC-n-5 and between 0.72 and 0.97 for TRC-n-6. Regarding validity, the TRC-n results were statistically correlated with scores on other tests of reading-related skills.

Test of Listening Comprehension of Narrative Texts (TLC-n; Rodrigues et al., 2020; Santos et al., 2015; Viana et al., 2015). This is also a norm-referenced test that comprises six vertically scaled forms that measure listening comprehension of narrative texts across grades 1–6. In this study, the test forms for fourth, fifth and sixth graders were administered (TLC-n-4, TLC-n-5, and TLC-n-6). The student listens to the text passages and selects one of three answers on a multiple-choice worksheet. The test is untimed. The responses are scored as 0 (incorrect) or 1 (correct), and the total raw score of each test form is converted to a standardized score. The scale of standardized scores of the TLC-n test forms was generated based on the mean value of 100 (SD = 10). The means for the normative samples were 122 (SD = 10), 124 (SD = 10) and 128 (SD = 10) for TLC-n-4, TLC-n-5 and TLC-n-6, respectively. The reliability coefficients ranged from 0.70 to 0.98 for TLC-n-4, from 0.70 to 0.95 for TLC-n-5 and from 0.70 to 0.97 for TLC-n-6. With regard to evidence of validity, statistically significant correlations were obtained between the TLC-n forms and external criteria measures.

Reading Strategy Use (RSU; Ribeiro et al., 2015). This is a scale with 22 items, each of which describes a reading strategy (cognitive or metacognitive); the student marks the frequency of its use on a 7-point Likert scale ranging from 1 (never) to 7 (always). The adaptation and validation studies of the Portuguese RSU provided evidence for a one-dimensional structure with a Cronbach’s alpha of 0.85.

Fluency Assessment Test [TAF, Teste de Avaliação da Fluência] (Rodrigues et al., 2022). The TAF evaluates oral reading fluency in students from grade 1 to grade 6. It is composed of specific test forms for each grade. Each one includes three unpublished texts (one narrative with dialog, one narrative without dialog and one expository). The students read each text aloud for one minute. The number of words read correctly per minute in the three texts of each test form was averaged and then converted to an equated score. Regarding evidence of validity, significant correlation coefficients were found between the TAF scores and those obtained in other reading tests, teachers’ ratings, and school outcomes, as well as high test-retest reliability coefficients.

Vocabulary Subtest from the Wechsler Intelligence Scale for Children – III (Wechsler, 2003). This test is composed of 30 items that demand that students orally provide the definition of a given word. The response to each item is scored with 0, 1 or 2 points. The test is ended after four consecutive failures. The total raw score was computed by adding the scores of each item. This total was then converted to a standardized score. Reliability coefficients for the Portuguese version of this subtest ranged from 0.69 to 0.89.
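The raw-score rule for the vocabulary subtest described above can be sketched as follows. This is a minimal illustration that assumes a "failure" means an item scored 0 and that items after the discontinue point do not contribute to the total; the function name `score_vocabulary` is hypothetical. The conversion of the raw total to a standardized score relies on the test's norm tables and is not reproduced here.

```python
def score_vocabulary(item_scores, stop_after=4):
    """Sum item scores (0, 1 or 2), stopping after `stop_after` consecutive zeros.

    Assumption for this sketch: a failure is an item scored 0, and
    administration ends immediately after the last of the consecutive failures.
    """
    total = 0
    consecutive_failures = 0
    for score in item_scores:
        if score not in (0, 1, 2):
            raise ValueError(f"invalid item score: {score}")
        total += score
        # Reset the failure counter whenever the student earns any credit.
        consecutive_failures = consecutive_failures + 1 if score == 0 else 0
        if consecutive_failures == stop_after:
            break
    return total


# Hypothetical response pattern: the last two items are never administered.
print(score_vocabulary([2, 1, 0, 0, 0, 0, 2, 2]))  # 3
```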

Procedure

Legal authorization for data collection was requested from the ethics committee of the University of Minho and the Portuguese Ministry of Education, as well as from the respective school boards. Parents or legal guardians were informed about the study goals and signed an informed consent form to allow the participation of the students in the study. The anonymity and confidentiality of the data were assured.

At each measurement time point, the TRC-n, the TLC-n and the RSU were administered collectively in the classroom, and the remaining tests were applied to each student individually in a quiet room at school. The test administration was performed by trained psychologists at the end of the academic year for each grade level.

Statistical analysis

We examined internal consistency by means of Cronbach’s alpha for the measures of vocabulary and reading strategy use and by Kuder-Richardson Formula 20 (KR20) for measures of listening comprehension and reading comprehension. Internal consistency is adequate when the values are higher than 0.70 (Taber, 2018). Descriptive statistics (mean and standard deviation) and correlation coefficients between oral reading fluency (ORF), listening comprehension (LC), vocabulary (VOC), reading strategy use (RSU) and reading comprehension (RC) were calculated for each grade level. The normality of the distributions was also checked considering the analysis of the Q-Q plots as well as the absolute values of skewness and kurtosis: values lower than 3.0 and 7.0, respectively, are considered adequate (Kline, 2016).
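For readers less familiar with the two internal consistency coefficients mentioned above, the sketch below computes Cronbach's alpha and KR-20 with the Python standard library. The response matrix is invented purely for illustration and is not study data; the function names are likewise illustrative. Both functions use population variances, under which KR-20 coincides with alpha for dichotomous (0/1) items.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha; `scores` is a list of respondents, each a list of item scores."""
    k = len(scores[0])
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def kr20(scores):
    """Kuder-Richardson Formula 20 for items scored 0 (incorrect) or 1 (correct)."""
    n = len(scores)
    k = len(scores[0])
    # p is the proportion of correct answers per item; p * (1 - p) is its variance.
    p = [sum(row[i] for row in scores) / n for i in range(k)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum_pq / total_var)


# Invented 0/1 responses from six students on four items (illustration only).
toy_responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]

reliability = kr20(toy_responses)
print(f"KR-20 = {reliability:.3f}")
print("adequate" if reliability > 0.70 else "below the 0.70 reference value")
```

The 0.70 cut-off in the last line is the reference value cited in the text (Taber, 2018).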

A longitudinal cross-lagged panel model design was then used to assess the interrelations between the five variables. Four reciprocal-causation models with cross-lagged paths were tested using Mplus version 7 (Muthén & Muthén, 2012). The maximum likelihood estimator with robust standard errors (MLR) was applied. The full information maximum likelihood (FIML) method was used to account for the missing data of the students who dropped out of the study between the last two assessment points.

Model 1 included ORF, RC, LC and VOC to examine the cross-lagged relations between ORF and RC, between RC and LC, and between LC and VOC. Given the poor fit of this first model, it was decomposed into Models 2 and 3. Model 2 tested the reciprocal relations between ORF and RC and between LC and RC. Model 3 was similar to Model 2, but VOC replaced LC. Model 4 comprised ORF, RC and RSU, and tested reciprocal relations between ORF and RC, as well as between RC and RSU.
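The logic of a cross-lagged path can be illustrated with a deliberately simplified two-wave example. The sketch below is not the Mplus estimation used in this study (no MLR estimator, no FIML handling of missing data, no error covariances): it assumes complete data and ordinary least squares, and it computes standardized coefficients for an outcome regressed on its own earlier score (the autoregressive path) and on the earlier score of the other variable (the cross-lagged path). All names and data values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def standardized_paths(outcome, autoregressor, cross_predictor):
    """Standardized OLS betas for: outcome ~ autoregressor + cross_predictor.

    Uses the closed-form two-predictor solution based on correlations.
    Returns (autoregressive beta, cross-lagged beta).
    """
    r_ya = pearson(outcome, autoregressor)
    r_yc = pearson(outcome, cross_predictor)
    r_ac = pearson(autoregressor, cross_predictor)
    denom = 1 - r_ac ** 2
    beta_auto = (r_ya - r_yc * r_ac) / denom
    beta_cross = (r_yc - r_ya * r_ac) / denom
    return beta_auto, beta_cross


# Invented scores for eight students (illustration only, not study data).
orf_g4 = [80, 95, 110, 70, 120, 100, 90, 105]
rc_g4 = [98, 104, 112, 95, 118, 101, 103, 110]
orf_g5 = [92, 104, 121, 85, 130, 111, 99, 117]
rc_g5 = [101, 108, 115, 97, 121, 106, 104, 113]

# Path ORF (grade 4) -> RC (grade 5), controlling for RC (grade 4).
rc_auto, orf_to_rc = standardized_paths(rc_g5, rc_g4, orf_g4)
# Path RC (grade 4) -> ORF (grade 5), controlling for ORF (grade 4).
orf_auto, rc_to_orf = standardized_paths(orf_g5, orf_g4, rc_g4)

print(f"ORF G4 -> RC G5: {orf_to_rc:.2f} (RC stability: {rc_auto:.2f})")
print(f"RC G4 -> ORF G5: {rc_to_orf:.2f} (ORF stability: {orf_auto:.2f})")
```

A relation is described as reciprocal when both cross-lagged paths are statistically significant; this sketch returns point estimates only and does not test significance.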

The following indices were used to evaluate the model fit: chi-square value (χ2), chi-square to degrees of freedom ratio (χ2/df), comparative fit index (CFI), Tucker–Lewis index (TLI), root mean square error of approximation (RMSEA) and standardized root mean square residual (SRMR). The χ2 indicates the magnitude of the discrepancy between the observed and modeled covariance matrices, testing the probability that the theoretical model fits the data. The higher this value relative to the degrees of freedom, the worse the model fit. However, χ2 is sensitive to large sample sizes, and therefore, it is advisable to consider the χ2/df ratio, for which values less than or equal to 3.00 suggest an acceptable fit and values less than or equal to 2.00 a good fit (Schermelleh-Engel et al., 2003; Schreiber et al., 2006). The CFI and the TLI indicate the relative fit of the hypothesized model when comparing it to a baseline model. For CFI and TLI, values above 0.90 indicate an acceptable fit, and those above 0.95 suggest a good fit (Byrne, 2011; Hu & Bentler, 1999). The RMSEA and the SRMR are also measures of discrepancy between the observed data and the hypothesized model: values below 0.08 and 0.10, respectively, indicate an acceptable model fit, whereas values below 0.05 indicate a good fit (Browne & Cudeck, 1993; Hu & Bentler, 1995; Schermelleh-Engel et al., 2003). The four models were compared using the Akaike information criterion (AIC) and the Bayesian information criterion (BIC): models with lower AIC and BIC values have a better fit (Raftery, 1995). The significance level was 5% for all analyses.
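The cut-off values listed above can be summarized in a small helper. The sketch below simply encodes the thresholds stated in the text; the function name `classify_fit` and the input values are hypothetical and do not correspond to any model reported in this study.

```python
def classify_fit(chi2, df, cfi, tli, rmsea, srmr):
    """Label each fit index as 'good', 'acceptable' or 'poor' using the cut-offs in the text."""

    def higher_is_better(value, good, acceptable):
        if value > good:
            return "good"
        return "acceptable" if value > acceptable else "poor"

    def lower_is_better(value, good, acceptable):
        if value < good:
            return "good"
        return "acceptable" if value < acceptable else "poor"

    ratio = chi2 / df
    # The chi-square ratio uses inclusive bounds (<= 2.00 good, <= 3.00 acceptable).
    ratio_label = "good" if ratio <= 2.0 else "acceptable" if ratio <= 3.0 else "poor"

    return {
        "chi2/df": ratio_label,
        "CFI": higher_is_better(cfi, 0.95, 0.90),
        "TLI": higher_is_better(tli, 0.95, 0.90),
        "RMSEA": lower_is_better(rmsea, 0.05, 0.08),
        "SRMR": lower_is_better(srmr, 0.05, 0.10),
    }


# Hypothetical fit statistics, for illustration only.
example = classify_fit(chi2=30.0, df=20, cfi=0.96, tli=0.93, rmsea=0.06, srmr=0.04)
for index, label in example.items():
    print(f"{index}: {label}")
```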

Results

Table 1 presents the means, standard deviations, minimum and maximum values, and reliability coefficients for all measures at the three time points. The internal consistency was adequate, with all coefficients being higher than 0.70. The Q-Q plots representing the distribution of each variable at each grade level are displayed in Appendices A–E. The inspection of the Q-Q plots suggests no substantial violations of normality in the data distributions. In addition, all variables had acceptable values of skewness and kurtosis. Taken together, these results provide empirical evidence of the normality of the distributions.

Table 1 Descriptive statistics and reliability coefficients of internal consistency for measures in grades 4, 5 and 6

Table 2 presents the correlation coefficients between all variables over time. Statistically significant correlations were found between reading comprehension and the remaining variables at each time point, except for reading strategy use in grades 4 and 6 with reading comprehension in grades 5 and 6. The lowest correlation coefficients were those between reading strategy use and reading comprehension. The highest correlation coefficients were found between listening comprehension and reading comprehension, as well as between vocabulary and reading comprehension, in every grade. Table 3 presents the fit indices for all cross-lagged models.

Table 2 Correlations for oral reading fluency, listening comprehension, reading comprehension, vocabulary, and reading strategy use in grades 4, 5 and 6
Table 3 Fit indices for the cross-lagged models

The relations tested in Model 1 are depicted in Fig. 1. Model 1 presented a poor fit, given that only χ2/df and CFI were within the acceptable range. Thus, this model was decomposed into the following two cross-lagged models.

Fig. 1 Model 1: Test of the reciprocal relations between ORF and RC, between RC and LC, and between LC and VOC. Note. G4 = grade 4; G5 = grade 5; G6 = grade 6

The standardized regression paths for Model 2 are depicted in Fig. 2. This model presented a good fit, given that the values of χ2/df and CFI were within the good range and the remaining indices were within the acceptable range, with the exception of RMSEA, which was slightly above the reference value. The relation between ORF and RC was found to be unidirectional at every measurement time point: RC was predicted by ORF, but the opposite was not confirmed. The cross-lagged paths between LC and RC indicate the existence of a reciprocal relation throughout grades 4–6.

Fig. 2 Model 2: Test of the reciprocal relations between ORF and RC and between LC and RC. Note. The model included error covariances, but they are not presented for the sake of parsimony. The standardized coefficients are depicted. G4 = grade 4; G5 = grade 5; G6 = grade 6. p < .10. *p < .05. ***p < .001

The standardized regression paths for Model 3 are depicted in Fig. 3. All indices suggested that Model 3 fit the data very well. The cross-lagged path between ORF and RC was similar to that obtained in Model 2. A reciprocal relation was found between VOC and RC in every grade.

Fig. 3 Model 3: Test of the reciprocal relations between ORF and RC and between VOC and RC. Note. The model included error covariances, but they are not presented for the sake of parsimony. The standardized coefficients are depicted. G4 = grade 4; G5 = grade 5; G6 = grade 6. *p < .05. **p < .01. ***p < .001

The standardized regression paths for Model 4 are depicted in Fig. 4. Model 4 presented a good fit, with values of χ2/df, CFI and TLI within the good range and the remaining indices within the acceptable range. The relation between ORF and RC was similar to that obtained in the previous models. Furthermore, RSU in grade 5 was predicted by RC in grade 4, but not the opposite.

Fig. 4 Model 4: Test of the reciprocal relations between ORF and RC and between RSU and RC. Note. The model included error covariances, but they are not presented for the sake of parsimony. The standardized coefficients are depicted. G4 = grade 4; G5 = grade 5; G6 = grade 6. p < .10. *p < .05. ***p < .001

Regarding model comparison, Model 3 obtained the lowest AIC and BIC values, suggesting that it fits the data better than the remaining models.

Discussion

This study aimed to explore the longitudinal interrelations between reading comprehension, oral reading fluency, listening comprehension, vocabulary, and reading strategy use in European Portuguese from grades 4 to 6.

To test the research hypotheses, four cross-lagged models were fitted. The first cross-lagged model tested the existence of reciprocal relations between reading comprehension, oral reading fluency, and listening comprehension, as well as between listening comprehension and vocabulary. Due to poor fit, this model was subsequently decomposed into two cross-lagged models, with oral reading fluency maintained as a predictor of reading comprehension across both models. Listening comprehension and vocabulary were included as predictors of reading comprehension in the second and third models, respectively. We hypothesized that reading comprehension would have a reciprocal relation with oral reading fluency, listening comprehension and vocabulary. These reciprocal relations have been suggested in orthographies of varying depth (Berninger & Abbott, 2010; Klauda & Guthrie, 2008; Little et al., 2017; Santos et al., 2020; Seigneuric & Ehrlich, 2005; Verhoeven et al., 2011; Verhoeven & van Leeuwe, 2008, 2012; Wong, 2021). Finally, a fourth cross-lagged model was fitted with oral reading fluency, reading comprehension and reading strategy use. Despite the existence of scarce empirical evidence, a reciprocal relation between reading strategy use and reading comprehension was hypothesized, as observed in a study conducted with Dutch, an orthography of intermediate depth (Muijselaar et al., 2017).

Congruent with the results of previous research (e.g., Fernandes et al., 2017; Padeliadu & Antoniou, 2014; Yildirim, Rasinski, et al., 2019), oral reading fluency predicted reading comprehension at every measurement time point. Contrary to what previous findings would suggest (e.g., Berninger et al., 2010; Little et al., 2017; Yildirim, Ates, et al., 2019), the opposite association was not observed. Research has suggested that the ability to extract meaning from text exerts an important influence on the development of oral reading fluency, particularly when the latter ability is not yet effortless and automatized (Jenkins et al., 2003). It is plausible that this compensatory mechanism is no longer needed in the upper grades of primary school, when students usually achieve a performance ceiling in oral reading fluency (Arnesen et al., 2017; Nese et al., 2012, 2013; Santos et al., 2020).

Consistent with previous research (Berninger & Abbott, 2010; Santos et al., 2020; Verhoeven & van Leeuwe, 2008, 2012; Wong, 2021), the hypothesis of a reciprocal relation between listening and reading comprehension was fully supported in Model 2, suggesting that higher levels of each skill are associated with higher levels of the other. Similar standardized coefficients were obtained in other studies conducted in intermediate-depth orthographies that examined the relation between these skills using cross-lagged models (Santos et al., 2020; Verhoeven & van Leeuwe, 2012).

The relation between vocabulary and reading comprehension was also reciprocal, as observed in Model 3, suggesting that knowledge of word meanings and comprehension of written texts influence each other from the end of grade 4 to the end of grade 6. Once again, this finding is consistent with results from prior studies (Seigneuric & Ehrlich, 2005; Verhoeven et al., 2011; Verhoeven & van Leeuwe, 2008).

Contrary to what was hypothesized, as indicated by Model 4, reading strategy use was not reciprocally related to reading comprehension at any measurement time point. Reading comprehension in grade 4 predicted reading strategy use in grade 5, but this relation was not found from grade 5 to grade 6. Reading strategy use also did not predict reading comprehension at any measurement time point. This result differs from the one obtained by Muijselaar et al. (2017) in a study with Dutch students. These differences may be related to the measurement procedures. In the study by Muijselaar et al. (2017), a questionnaire about knowledge of reading strategies was used, whereas in this study, a self-report instrument measured the frequency of use of a number of reading strategies. Furthermore, the significant impact of reading comprehension at the end of grade 4 on reading strategy use in grade 5 may be related to the school transition that occurs in the Portuguese educational system. In Portugal, the first cycle of study includes grades 1 through 4, and the second cycle of study includes grades 5 and 6. During the first four years of formal reading instruction (grades 1–4), students are expected to learn to read and write as well as to acquire knowledge in three central curricular areas (Portuguese, Mathematics and Sciences) under the guidance of a single teacher. When students move to grade 5, they begin to have lessons in approximately 10 different subject areas, including History and Geography of Portugal, Natural Sciences, Mathematics, Visual Education, and Technological Education. Each subject is taught by a different teacher. At this grade level, students are expected to comprehend increasing amounts of expository texts with specific terminologies and concepts, as well as poetic texts, which differ considerably from narrative and expository texts in terms of structure (e.g., syntax) and content (Sanacore & Palumbo, 2008).
This new school reality may lead students to face more comprehension gaps, which is particularly challenging for children who do not have high comprehension skills and have not mastered the use of a wide range of reading strategies (Gregg & Sekeres, 2006). Therefore, during grade 5, students may more often feel the need to apply a diverse range of reading strategies to resolve their comprehension difficulties, but only children with higher levels of reading comprehension may be able to do so often. Future studies should address this issue.

Models 2, 3 and 4 fit the data well, contrary to Model 1. Model 3 presented a better fit than Models 2 and 4, suggesting that reading comprehension performance is better explained by oral reading fluency and vocabulary than by oral reading fluency and listening comprehension or by oral reading fluency and reading strategy use. This finding is in accordance with prior research suggesting that measures of (expressive) vocabulary depth may be more robust indicators of students’ linguistic skill than measures of listening comprehension (Ouellette & Beers, 2010; Protopapas et al., 2013; Tilstra et al., 2009). Thus, the use of specific measures to assess linguistic skills (e.g., semantic or syntactic knowledge) is advisable to achieve a comprehensive understanding of the interindividual differences in reading comprehension performance.

Some limitations of this study should be acknowledged. The use of a self-report measure to assess reading strategy use carries a particular disadvantage: the effect of social desirability on students’ responses, i.e., students may have reported that they often use certain reading strategies when they do not. Consequently, an overestimation of reading strategy use may have occurred and contributed to the lack of statistical significance for the majority of paths in the cross-lagged model that contained reading strategy use. Thus, further research should consider other measures of metacognitive skills, such as tests in which students are required to apply specific reading strategies. The high attrition observed from grade 5 to grade 6 and the low ratio of participants to measures are two other limitations of this study, which imply that caution should be exercised in interpreting and generalizing the results. A fourth limitation is related to the way decoding was measured, i.e., using a measure of oral reading fluency and not of word recognition. When addressing word recognition, empirical studies have frequently used measures of single word and/or pseudoword reading, considering reading accuracy, fluency, or both (Florit & Cain, 2011). In empirical studies conducted in orthographies more transparent than English, such as European Portuguese, the accuracy and speed of reading words in connected text play a stronger role in explaining interindividual differences in reading comprehension performance than word recognition (Cadime et al., 2017; Santos et al., 2020). For this reason, in this study, decoding was measured through oral reading fluency, assessed as the number of words read correctly per minute in connected texts. Nevertheless, future studies should consider other decoding dimensions.

Regardless of these weaknesses, this study fills relevant research gaps with clear theoretical and practical implications. In particular, the study expands scientific knowledge about the longitudinal interrelations between reading, linguistic and metacognitive skills, namely, oral reading fluency, listening comprehension, vocabulary, reading strategy use and reading comprehension, in an intermediate-depth orthography and in the advanced grades of primary school. The findings of this study clarify two central ideas: (1) oral reading fluency, listening comprehension and vocabulary are significant predictors of reading comprehension, even after the first school years, and (2) higher levels of reading comprehension have a positive impact on listening comprehension and vocabulary. Thus, at a theoretical level, our results seem to provide empirical support for the following premise underlying the CFR framework: many of the skills and processes involved in reading occur simultaneously, facilitating each other (Hoover & Tunmer, 2020b). Furthermore, our results reinforce those of prior research suggesting that oral reading fluency (i.e., accuracy and speed of reading words in connected text) is relevant to reading comprehension in orthographies more transparent than English, and that it should be considered in frameworks of reading comprehension, such as the CFR (Cadime et al., 2017; Fernandes et al., 2017; Padeliadu & Antoniou, 2014; Santos et al., 2020). The results also open paths for the exploration of other research hypotheses in future studies. At a practical level, these results highlight that, even for children in the upper grades of primary school, oral reading fluency is still central to reading comprehension. Therefore, it can be concluded that continued attention to the accurate and fast reading of connected texts throughout the primary school years is fundamental to avoiding comprehension difficulties.
Moreover, expanding and deepening knowledge of word meanings and improving comprehension of oral and written texts should be used as strategies in school contexts to foster these reading skills in grades 5–6.