Differences in Perceived Instructional Quality of the Same Classrooms with Two Different Classroom Observation Instruments in China: Lessons Learned from Qualitative Analysis of Four Lessons Using TEACH and ICALT

Lei, Jieyan Celia; Chen, Zhijun; Ko, James

doi:10.1007/978-3-031-31678-4_7

Abstract

Accumulated research has suggested that narrowing instructional quality gaps can improve educational equity and the well-being of children across social and economic backgrounds. Considering that disparities in instructional quality may affect educational inequality across different regions in China, this study explored how teaching quality varied across 30 lessons in primary English classrooms in an economically disadvantaged province in China. This study adopted a mixed-method strategy, using quantitative classroom observation data to select four lessons contrasting in teaching quality for subsequent qualitative analysis that explored classroom processes in depth. Using two internationally validated classroom observation instruments, ICALT and TEACH, added a further dimension: examining how characteristics of instruments might influence perceived instructional quality. Results revealed that while both high-inference instruments were theoretically comparable in distinguishing teaching quality, only ICALT predicted learner engagement. While quantitative instruments could not provide detailed accounts of classroom processes, qualitative accounts of the four lessons could uncover the deep relationships between teacher-student interactions and differences in instructional quality. These findings suggest that conceptually similar instruments may vary in predictive power and that systematic qualitative analysis is indispensable in complementing high-inference instruments to provide an objective teacher evaluation.

1 Introduction

In the last decade, economically poorer regions worldwide, including inland provinces in China, have received considerable financial support from governmental and non-governmental organisations for building schools and equipping teaching and learning facilities to guarantee pupils’ schooling. Sammons (2007) identified strong links between school education effectiveness and educational equity and concluded that teachers exert a substantially more significant effect on children than schools do, and that educational effectiveness varies more at the class level.

Quite a few studies have investigated educational inequalities in China, especially in underprivileged areas, from different perspectives such as educational financing (e.g., Li et al., 2007; Tsang & Ding, 2005), gender (e.g., Hannum, 2005; Zeng et al., 2014), poverty (e.g., Heckman & Yi, 2012; Zhang, 2017; Yang et al., 2009), ethnicity (e.g., Hannum et al., 2008, 2015), and urbanisation (e.g., Qian & Smyth, 2008; Yang et al., 2014). Unsurprisingly, educational inequalities in China were found to have narrowed significantly, with the adverse effects largely mitigated. However, the influence of these factors still exists.

In addition to factors outside the classroom, classroom teaching quality directly impacts students’ learning effectiveness. Given the significant role of classroom teaching practices in achieving greater educational equity (Sammons, 2007), a research gap lies in the lack of lesson observation evidence on the quality of classroom teaching in underprivileged areas of China. Furthermore, the rapid development of Chinese society in recent years means that studies quickly become outdated. A lack of timely, updated research prevents audiences’ knowledge of the education situation from keeping pace with reality. This study explored educational inequality at the classroom teaching level from a teaching effectiveness perspective in a disadvantaged province in China. Using two classroom observation instruments, ICALT (Van de Grift, 2007) and TEACH (World Bank, 2019), we explored the instructional quality gaps between example lessons and how the perceived instruction differed in learning and teaching interactions.

2 Literature Review

2.1 Teaching Quality in Developing Countries and Underdeveloped Regions

Factors affecting students’ outcomes at the classroom level have received more attention than factors at the school level in educational effectiveness research (Muijs et al., 2014). Knowledge of effective teaching practice at the classroom level is crucial for enhancing teachers’ capability to develop agile differentiated instruction strategies for diverse learners’ needs (Edwards et al., 2006). Although strenuous efforts have been made to probe into teaching quality in classrooms, comparative studies of developed and developing countries are insufficient. The Organisation for Economic Cooperation and Development’s PISA 2018 project (OECD, 2019), which evaluated the academic performance of junior secondary students worldwide, involved only two developing countries/regions among the 30 participating countries/regions.

We generally lack knowledge of classroom-level teaching quality in developing countries/regions except for a few notable empirical studies. For example, Chiangkul (2016) claimed that insufficient capability in the knowledge and teaching skills of younger Thai teachers was evident in the Trends in International Mathematics and Science Study (TIMSS) 2015. In South Africa and Botswana, teachers were found to lack knowledge about combining practical pedagogical skills with subject content (Sapire & Sorto, 2012). In rural Guatemala, Marshall and Sorto (2012) found that teaching practice in mathematics classrooms adopted less complex pedagogical skills than in developed countries like Japan, America and Germany. Similarly, teaching quality in China varies province by province, and inland provinces are noticeably disadvantaged in recruiting talented teachers. Moreover, the teaching capability of rural schoolteachers was generally lower than that of urban teachers, resulting in a remarkable gap between rural and urban schools in West China (Wang & Li, 2009). Thus, understanding teaching effectiveness in rural regions of economically disadvantaged provinces in China would contribute to future strategies to promote educational quality and equity for children in these regions.

2.2 Classroom Observation and Comparison of Instruments

Studies of student academic outcomes significantly contribute to the understanding of classroom effectiveness, but they do not articulate the specific processes involved (Pianta et al., 2008). The invention of classroom observation instruments provides a powerful approach for probing into classroom reality. It is seen as a more just form of data collection to examine teachers’ behaviours (Pianta et al., 2008). Classroom observation used to be limited to teacher appraisal, lesson evaluation, professional development of novice teachers, and the identification of expert teachers among experienced teachers, but it has become popular as research interest in classroom-level teaching processes has increased (Wragg, 2013). Systematic classroom observation, which originated in teacher effectiveness research, allows specific predetermined and agreed categories of behaviour and practice to be compared (Muijs & Reynolds, 2005).

Lesson videos of classroom teaching practice could be another observation form that provides researchers with a window to explore what happens in classrooms (Sapire & Sorto, 2012). For teaching analysis, video data was first used in the TIMSS 1995 video study by Stigler et al. (1999). Video recordings allow raters to slow down, pause, replay and re-interpret teaching practice, and capture complex teaching paths (Erickson, 2011; Jacobs et al., 1999; Klette, 2009). Furthermore, recorded teaching practice makes visual representation possible for researchers to capture anticipated details of classrooms that may escape their gaze (Lesh & Lehrer, 2000; Tee et al., 2018).

A few observation instruments were developed to evaluate teachers’ actual teaching processes and their contribution to student achievements. For exploring the generic pedagogic capability of teachers, these observational tools include the Framework for Teaching (Danielson, 1996), the International System for Teacher Observation and Feedback (Teddlie et al., 2006), the International Comparative Analysis of Learning and Teaching (ICALT) (Van de Grift, 2007), the Classroom Assessment Scoring System (CLASS) (Pianta et al., 2008), and the TEACH (World Bank, 2019). Some assess specific competencies, such as classroom talk (Mercer, 2010) and project-based learning (Stearns et al., 2012). Instruments for subject-specific pedagogies are available to researchers as well, such as English reading (Gersten et al., 2005), mathematical instruction (Schoenfeld, 2013) and historical contextualisation (Huijgen et al., 2017).

For instrument application, scholars compared different instruments for STEM classrooms in post-secondary education (Anwar & Menekse, 2021), mathematics and science classrooms in secondary education (Boston et al., 2015; Marshall et al., 2011) and preservice teacher internships (Caughlan & Jiang, 2014; Henry et al., 2009). However, no instrument comparison study based on English as a second language classrooms in primary education was found, although such a study could contribute to essential education quality improvement in developing countries.

In the present study that compared ICALT and TEACH, we identified two issues in our careful comparisons of the two instruments. First, theoretically speaking, the two instruments are conceptually similar. The teaching behaviours under the Classroom Culture domain of TEACH are conceptually similar to the behavioural indicators of the Safe and Stimulating Learning Climate and Efficient Organisation domains of ICALT (Van de Grift, 2007). Similarly, the Socioemotional Skills domain of TEACH is conceptually comparable to the Intensive and Activating Teaching domain of ICALT. The Instruction domain of TEACH is similar to ICALT’s Clear and Structured Instructions, Adjusting Instructions and Learner Processing to Inter-Learner Differences and Teaching Learning Strategies domains.

ICALT was initially developed by inspectors to study primary classrooms in England and the Netherlands. It was then used as a research tool to compare teaching practices in developed and developing countries (Maulana et al., 2021). In contrast, TEACH was developed as a system diagnostic and monitoring tool of teaching practices at the primary school level to foster professional development in low- and middle-income countries (Molina et al., 2018). Thus, the difference in scale development would be, theoretically and methodologically, critical if TEACH is more suitable for developing regions or countries than ICALT. For example, it is unlikely that catering for learner diversity is considered essential in developing countries where access to free education is challenging. Maulana et al. (2021) have shown that teaching behaviours associated with differentiation could be country-specific rather than universal.

Second, in practice it is less difficult to conduct classroom observation with TEACH than with ICALT. ICALT was designed to observe whether teachers adjust teaching according to the level of students, but ICALT also emphasises stimulating students with weak learning abilities to build self-confidence. This teaching behaviour reflects a higher level of teaching skill. Kyriakides et al. (2009) found that teacher behaviours varied distinctively in difficulty levels, and it is not uncommon that teachers cannot master some advanced teaching skills even after professional training. Similarly, Ko et al. (2015) found that while teachers in Guangzhou performed better than Hong Kong teachers in many aspects of perceived teaching quality, Hong Kong teachers did better in catering for learner diversity because Hong Kong has practised an inclusive education policy for nearly two decades.

2.3 Qualitative In-Depth Lesson Analysis from a Dialogic Teaching Perspective

Apart from the dominant quantitative teacher effectiveness research, a consistently growing body of research has investigated learning and teaching from a qualitative perspective on dialogic teaching in the last decades (Howe & Mercer, 2017; Vrikki et al., 2019), regarding dialogic teaching as vital to student learning outcomes (Alexander, 2006; Howe et al., 2019). Alexander (2008) proposed dialogic teaching as a learning process that encourages students to develop their higher-order thinking through reasoning, discussing, arguing, and explaining. Dialogic teaching is believed to have two main types, teacher-student interaction and student-student interaction (Howe & Abedin, 2013), with five core principles: collective, reciprocal, supportive, cumulative and purposeful (Alexander, 2008).

Hennessy and colleagues (2016) introduced a coding approach, the Scheme for Educational Dialogue Analysis (SEDA), to conduct qualitative in-depth lesson analysis for characterising and analysing classroom dialogues. It is considered a practical approach to evaluate how high-quality interaction is productive for learning (Hennessy et al., 2020), and has become quite prevalent in recent years (Song et al., 2019). For example, Shi et al. (2021), informed by SEDA’s condensed version, the Cambridge Dialogue Analysis Scheme (CDAS) (Vrikki et al., 2019), successfully modified SEDA to make it more suitable for their data set.

3 Research Questions

Based on the above background and consideration, the objective of this study is to answer the following research questions:

1. How were teaching practices rated using different classroom instruments (i.e., ICALT and TEACH) in the same lessons?
  (a) In what aspects did the ratings look similar based on the two observation instruments?
  (b) How did the ratings show more variations based on the two observation instruments?
2. To what extent could the above differences be identified in an in-depth qualitative analysis of four purposively selected lessons?

4 Method

This study adopted a sequential quantitative-qualitative research strategy to probe into the links and differences between two instructional quality assessment instruments, TEACH and ICALT. Classroom observation was used to explore teachers’ teaching quality and teacher-student interactions.

4.1 Samples

This study involved 20 primary schools in an underprivileged province in China in two different districts (one city/urban and one county/rural). Among these twenty schools, eleven were from the rural area, and nine were from the urban area. Thirty English teachers (one lesson per teacher) randomly selected from the sample schools participated in this study. The data collection was conducted with a third party that targeted primary school teachers whose teaching experience was more than two years and less than eight years. Hence, we controlled the teaching experience of participants by excluding teachers with less than two years or more than eight years of experience.

Thirty lessons (one lesson per teacher) were recorded and observed by a well-trained rater with instruments to obtain quantitative data. Then, four lessons were selected for in-depth qualitative analysis.

4.2 Instruments

Classroom observation instruments are often assumed to study similar teaching characteristics, so they are expected to be comparable (Ko, 2010). ICALT (Van de Grift, 2007) and TEACH (World Bank, 2019) are two internationally validated classroom observation instruments on generic teaching behaviours. Analysis of this study focuses on high-inference indicators of these two instruments.

4.2.1 ICALT

The ICALT instrument (Van de Grift, 2007) assesses classroom teaching behaviours and is divided into three parts. The core part has 32 behavioural indicators to be evaluated on a four-point scale to determine the relative strengths and effectiveness of a teaching behaviour (i.e., 1 = mostly weak; 2 = more often weak than strong; 3 = more often strong than weak; 4 = mostly strong). Four to ten behavioural indicators are grouped in each of the six primary domains in the instrument: Safe and Stimulating Learning Climate, Efficient Organisation, Clear and Structured Instructions, Intensive and Activating Teaching, Adjusting Instructions and Learner Processing to Inter-Learner Differences, and Teaching Learning Strategies. The second part comprises 115 observable teaching behaviours, with 3–10 matching a behavioural indicator in the core part. For example, ‘The teacher lets learners finish their sentences,’ ‘The teacher listens to what learners have to say,’ and ‘The teacher does not make role stereotyping remarks’ are corresponding teaching behaviours for the first indicator, ‘The teacher shows respect for learners in his/her behaviour and language’. Before giving a score for a behavioural indicator, a rater should determine whether the associated teaching behaviours are observed during the lesson. Whenever a teaching behaviour is observed, it should be scored 1; a zero should be given if it is not observed. This part of ICALT makes the instrument quite different from many other instruments (e.g., the Classroom Assessment Scoring System by Pianta et al., 2008; Pianta & Hamre, 2009) because a rater is expected to judge the effectiveness of a teaching indicator on the grounds of a set of observed teaching behaviours. The last part of ICALT includes three behavioural indicators for learner engagement and ten associated learning behaviours, evaluated on 4-point and 2-point scales, respectively.

4.2.2 TEACH

TEACH is a validated classroom observation tool developed by the World Bank (2019), applicable for Grade 1–6 classrooms in primary schools. It aims to promote teaching quality improvement in disadvantaged nations. Raters of this instrument showed high inter-rater reliability (Molina et al., 2018). This instrument offers a unique window into some seldom investigated but weighty domains of class-level teaching and learning experiences. The Time on Task component requires observers to record in three ‘snapshots’ of 1–10 seconds whether teachers provide most students with learning activities and how many students are on task. Classroom Culture, Instruction, and Socioemotional Skills are the three domains of the Quality of Teaching Practice component, followed by nine corresponding indicators that point to 28 teaching behaviours. Based on what is observed, observers rate each behaviour item on a three-level scale, ‘high’, ‘medium’ and ‘low’, equal to ‘definitely having this behaviour’, ‘somewhat having this behaviour’ and ‘only having opposite behaviour’ respectively. It should be noted that four behaviour items can be marked as ‘N/A’ if they do not occur in the classroom. By matching its corresponding behavioural ratings, each indicator is scored on a five-point scale, ranging from 1 to 5 (‘1’ is the lowest and ‘5’ is the highest).

4.2.3 Comparison of ICALT and TEACH

Through careful comparisons at the level of behavioural indicators, it was found that the teaching behaviours under the Classroom Culture domain of TEACH correspond to the behavioural indicators of the Safe and Stimulating Learning Climate and Efficient Organisation domains of ICALT (Van de Grift, 2007). Similarly, the Socioemotional Skills domain of TEACH corresponds to the Intensive and Activating Teaching domain of ICALT. The Instruction domain of TEACH corresponds to ICALT’s Clear and Structured Instructions, Adjusting Instructions and Learner Processing to Inter-Learner Differences and Teaching Learning Strategies domains. It is less difficult to conduct classroom observation with TEACH than ICALT. As mentioned earlier, while ICALT and TEACH could be used to observe whether teachers adjust teaching according to student abilities, the Adjusting Instructions and Learner Processing to Inter-Learner Differences domain in ICALT also emphasises stimulating students with weak learning abilities to build self-confidence. This domain reflects a higher level of teaching skills of teachers.

However, as a specific classroom observation instrument for teacher evaluation in primary schools in underdeveloped countries, TEACH is a better choice for in-depth qualitative analysis on dialogic teaching with its official training manual (World Bank, 2019), providing clear definitions on teaching behaviour items and detailed guidance for observer training. All teaching behaviour indicators in TEACH have unified official inspection standards, ensuring the reliability of coding scheme building and the in-depth qualitative dialogue analysis process and results. Accordingly, a new qualitative coding scheme, TEACH Tool for Lesson Analysis (TTLA), was developed based on the TEACH manual and partially summarised in Table 7.1.

Table 7.1 TEACH tool for lesson analysis (TTLA)—A qualitative coding scheme based on the TEACH framework

4.3 Raters

The first author served as a research assistant in a commissioned impact study, in which she observed, video-recorded and rated all the lessons onsite with TEACH. She then re-rated the lesson videos with ICALT within a month. The rater held a master’s degree and had considerable lesson observation experience after taking TEACH and ICALT training workshops. In the workshops, the first author evaluated the same lesson videos with the two instruments and took part in a comparison and discussion afterwards. The raters then carried out second and third rounds of lesson video evaluation practice. An additional rater was employed to address inter-rater reliability concerns. The rater informed teachers only one night before the observation to prevent them from preparing perfect teaching in advance. All 30 classrooms were recorded with a camera to enable later transcription of teaching practice and in-depth coding of teaching behaviours.

4.4 Data Collection

4.4.1 Quantitative Rating

A total of thirty English lessons were observed. Quantitative analysis was conducted with SPSS 20 to compare the perceived instructional quality of the same classrooms across the two classroom observation instruments, TEACH and ICALT, and to determine which instrument could better predict student engagement. As z-score averages were provided in the official manual of TEACH (World Bank, 2019), selecting lessons for comparison based on those averages would provide an objective ground beyond the present study. Two ‘weak’ lessons (Lesson 1, z = −1.52; Lesson 2, z = −0.96) and two ‘strong’ lessons (Lesson 3, z = 1.24; Lesson 4, z = 2.62) were eventually selected for in-depth qualitative analyses to explore variations in the evaluations of teaching quality with different instruments (see Table 7.1).
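The selection step described above can be sketched in a few lines: standardise each lesson's overall score and take the lessons at both extremes. This is only an illustration under stated assumptions. The scores are made-up placeholders rather than the study's data, the function names are hypothetical, and the sketch standardises within the sample instead of using the reference averages from the TEACH manual.

```python
from statistics import mean, stdev


def z_scores(scores):
    """Standardise raw scores using the sample mean and sample SD."""
    m = mean(scores)
    s = stdev(scores)  # sample standard deviation (n - 1 denominator)
    return [(x - m) / s for x in scores]


def pick_contrastive(scores, n_each=2):
    """Return indices of the n_each lowest and n_each highest z-scores."""
    z = z_scores(scores)
    order = sorted(range(len(z)), key=lambda i: z[i])
    return order[:n_each], order[-n_each:]


# Hypothetical overall scores for six lessons (illustrative only).
overall = [2.1, 3.4, 2.8, 4.0, 1.9, 3.1]
weak, strong = pick_contrastive(overall)
print("weak lesson indices:", weak)
print("strong lesson indices:", strong)
```

Selecting on z-scores rather than raw means keeps the choice comparable across instruments that use different rating scales.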

4.4.2 Qualitative Coding

In-depth qualitative analyses were performed based on the teaching behaviour definitions in the TEACH manual for better validity. TTLA was employed to code the teaching behaviours of the four selected lessons. Teaching activities and interactions between teachers and students in each sample lesson illustrated teaching practices more specifically than quantitative ratings.

5 Results

5.1 Quantitative Analyses of All Lessons

All TEACH and ICALT factors were standardised for quantitative analyses because the two instruments use different scales. Due to the small sample size, only one regression model was tested using SPSS 20.0 to predict learner engagement in ICALT using the overall scores of both TEACH and ICALT.

Table 7.2 presents the mean, standard deviation, and reliability (alpha and omega) of factors in the two instruments. We include both McDonald’s omega (McDonald, 2013) and Cronbach’s alpha (Cronbach, 1951), as the former is considered more suitable regardless of the number of items within a factor. The results indicated that the two values do not show much difference. The table also shows the descriptive statistics of the overall scores and good internal consistency across all nine items in TEACH (α = 0.82) and 32 items in ICALT (α = 0.932). Due to the limited number of items in each TEACH factor, there is a low internal consistency level for Socioemotional Skills (α = 0.483). In ICALT, the Adjusting Instructions and Learner Processing to Inter-Learner Differences domain (α = 0.361) and Teaching Learning Strategies domain (α = 0.599) also show low reliabilities.
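For readers unfamiliar with the statistic, Cronbach's alpha for k items is k/(k − 1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The sketch below is a minimal illustration of that standard formula, not the SPSS procedure used in the study. The ratings are hypothetical, and McDonald's omega is not computed because it requires fitting a factor model.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha.

    items: one list of scores per item, all in the same lesson order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_var_sum = sum(variance(col) for col in items)  # sample variances
    totals = [sum(row) for row in zip(*items)]          # total score per lesson
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))


# Hypothetical ratings of five lessons on three items (illustrative only).
ratings = [
    [2, 3, 3, 4, 4],
    [1, 3, 2, 4, 5],
    [2, 2, 3, 3, 4],
]
alpha = cronbach_alpha(ratings)
print(round(alpha, 3))
```

Because alpha depends on how item variances relate to the variance of the total score, a factor with only two or three items tends to yield a lower and less stable value, which is in line with the explanation given above for Socioemotional Skills.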

Table 7.2 Mean, standard deviation and reliability of factors in TEACH and ICALT

Spearman’s rho correlation coefficients between TEACH and ICALT factors are presented in Table 7.3. There are strong positive correlations among the three TEACH factors, while the ICALT domain Adjusting Instructions and Learner Processing to Inter-Learner Differences does not significantly correlate with other ICALT domains. Learner engagement was significantly correlated with most factors in both TEACH and ICALT, except for the Adjusting Instructions and Learner Processing to Inter-Learner Differences domain in ICALT.
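Spearman's rho is the Pearson correlation computed on ranks, which suits ordinal rating scales and small samples such as the 30 lessons here. The following sketch shows the computation with tied values sharing the mean of their ranks. The domain scores are hypothetical and the helper names are assumptions made for illustration only.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical domain scores for five lessons (illustrative only).
teach = [2.0, 3.5, 3.0, 4.5, 2.5]
icalt = [1.8, 3.0, 3.2, 3.9, 2.1]
rho = spearman_rho(teach, icalt)
print(round(rho, 2))
```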

Table 7.3 Correlations (Spearman rho) between TEACH (1–3) and ICALT factors (4–9)

Given the limited number of participants, only one regression model, using the overall scores of TEACH and ICALT to predict learner engagement, could be conducted (see Table 7.4). Results show that only the ICALT score could significantly predict learner engagement, F (2, 27) = 29.92, p < .001, R² = 0.83.
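For a linear regression with k predictors and n cases, the overall F statistic and R² are linked by F = (R²/k) / ((1 − R²)/(n − k − 1)). The sketch below applies this standard identity to the values reported in the text (two predictors, 30 lessons); the function names are hypothetical. Under the identity, F (2, 27) = 29.92 corresponds to an R² of roughly 0.69, whose square root is roughly 0.83, so the reported 0.83 may be the multiple correlation R. This is an inference from the formula only and should be checked against Table 7.4.

```python
def f_from_r2(r2, k, n):
    """Overall F statistic for a linear model with k predictors and n cases."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


def r2_from_f(f, k, n):
    """Invert the identity above to recover R-squared from F."""
    df2 = n - k - 1
    return (f * k) / (f * k + df2)


# Values reported in the text: two predictors, 30 lessons, F(2, 27) = 29.92.
implied_r2 = r2_from_f(29.92, k=2, n=30)
implied_r = implied_r2 ** 0.5
print(round(implied_r2, 3), round(implied_r, 3))
```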

Table 7.4 Linear regression model using learner engagement in ICALT as a dependent variable

5.2 Comparisons of ICALT and TEACH Results of the Selected Four Lessons

As shown in Table 7.5, LESSON 1 and LESSON 2 were relatively weak in both individual and overall aspects, with lower means, while LESSON 3 and LESSON 4 were high-quality lessons. The standard deviations of the ICALT averages (Table 7.5) were observably lower than those of TEACH, indicating that variations in ratings were more considerable if TEACH was used for observation.

Table 7.5 Comparisons of four lessons in TEACH and ICALT scores

At the domain level, the TEACH results show that LESSON 1 has a much lower mean in the Instruction domain (M = 1.75) but slightly higher means in the Classroom Culture (M = 2.5) and Socioemotional Skills (M = 2.33) domains than LESSON 2 (M = 2.75, 2.0, 2.0 respectively). However, the ICALT results show that LESSON 1 scored a much higher mean in the Safe and Stimulating Learning Climate domain (M = 2.25), a slightly higher mean in the Intensive and Activating Teaching domain (M = 1.57), and a slightly lower mean in the Clear and Structured Instructions domain (M = 2.29) than LESSON 2. It is worth noting that the ICALT rankings of these two less effective lessons are higher than their TEACH rankings. Interestingly, LESSON 1 ranks last in TEACH but 22nd out of 30 in ICALT. LESSON 2 ranks higher than LESSON 1 in TEACH (28th) but lower in ICALT (26th).

Regarding the two more effective lessons, the means of LESSON 3 in the Instruction (M = 3.0) and Socioemotional Skills (M = 3.0) domains are noticeably lower than those of LESSON 4 (M = 3.75, 3.67 respectively) in TEACH. In ICALT, the means for LESSON 3 were lower in the Safe and Stimulating Learning Climate (M = 3.0), Intensive and Activating Teaching (M = 2.29), Adjusting Instructions and Learner Processing to Inter-Learner Differences (M = 1.0), and Teaching Learning Strategies (M = 1.0) domains than those for LESSON 4 (M = 3.5, 2.71, 1.5, 1.67 respectively). LESSON 3 was rated better than LESSON 4 (M = 3.5, 2.71 respectively) in two ICALT domains, Efficient Organisation (M = 3.75) and Clear and Structured Instructions (M = 3.57). Additionally, the two high-quality lessons ranked a little higher in TEACH than in ICALT. LESSON 3 ranks 3rd in TEACH but 5th in ICALT, and LESSON 4 ranks 1st in TEACH and 2nd in ICALT.

5.3 Qualitative Characteristics of Teacher-Student Interactions

Two low-quality lessons (LESSONS 1 & 2) and two high-quality lessons (LESSONS 3 & 4) were selected as mentioned above. The four lessons were transcribed verbatim, with non-verbal communication captured, and coded by two coders using the TTLA framework outlined in Table 7.1. Teaching behaviours reflected in dialogue content were assigned the corresponding codes. Multiple codes appear when more than one behaviour is reflected.

The two low-quality lessons (LESSONS 1 & 2) were unsatisfactory in teacher-student interaction. Table 7.6 shows the learning activity Reading Sentences of LESSON 1. The teacher performed well at providing students with opportunities to play a role in the classroom (S7b) and promoted students’ voluntary behaviours (S7c). Nevertheless, students were unclear about the behaviour expected in the learning activity since the teacher did not explain it beforehand. When the teacher said, ‘partner A partner B’, all students were confused and silent (Line 2). They had no idea what the teacher expected them to do until she asked who wanted to be Partner A in English and Chinese.

Table 7.6 Lesson 1 Reading sentences

The situation in LESSON 2 (Table 7.7) was also difficult. The teacher in LESSON 2 performed poorly in respecting students and even taunted them (Line 7: Aren’t you full? Can’t the brain think? [means You are a fool in Chinese culture]). On the bright side, the teacher offered students opportunities to play a role in the classroom (9 out of 10 lines of teacher talk were coded with S7b) by asking questions to check students’ level of understanding (I4a). However, he did not tell students in advance what they could refer to and where the references were, so it was hard for them to follow him. Students responded to the teacher’s questions with silence (Line 4, Line 6, Line 11, Line 13), making it difficult for the lesson to move on.

Table 7.7 Lesson 2 Learning present tense

As one of the high-quality lessons, LESSON 3 led the students to review the words learned before (Table 7.8). First, the teacher explained the expected behaviours of the learning activity, demonstrated in detail how to carry out the activity, and even conducted a simulation (Line 7, C2a, I3d; Line 9, I3d; Line 11, C2a; Line 13, C2a). In this activity, the teacher attached great importance to students’ mastery of learning content and students’ involvement in the classroom (Line 13, I4a, S7b; Line 15, I4a, S7b; Line 18, I4a, S7b). She checked students’ understanding individually. Four out of the teacher’s seven communicative behaviours were coded as C1a (Lines 7, 13, 15 and 17). This indicates that the teacher was very good at respecting students.

Table 7.8 Lesson 3 Reviewing learned vocabularies

In LESSON 4, the teacher adopted picture description as a learning activity (Table 7.9). Code I3c appeared in every line in this learning activity since the teacher utilised picture materials that connected with students’ lives. That raised students’ strong interest and initiative in this learning activity. The teacher put forward a series of questions around the given pictures to check the students’ understanding of the grammar (Line 128, I4a; Line 130, I4a; Line 132, I4a; Line 134, I4a). Questioning based on life-connected materials also promoted students’ participation and allowed them to take on a classroom role (S7b). Overall, 13 out of 14 lines were coded with two or three codes. This incident illustrates that teacher-student interaction was of high quality in this learning activity.

Table 7.9 Lesson 4 Describing pictures
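
The line-level counts reported for these excerpts (for example, the share of lines carrying a given code, or the number of lines with two or more codes) can be tallied mechanically once each transcript line is stored with its codes. The following is a minimal sketch in Python, not the authors' procedure. The function names and the `toy_excerpt` data are hypothetical and do not reproduce any of the transcripts in Tables 7.6 to 7.9.

```python
from collections import Counter


def code_counts(coded_lines):
    """Count in how many lines each code appears (once per line at most)."""
    counts = Counter()
    for codes in coded_lines.values():
        counts.update(set(codes))
    return counts


def share_with_code(coded_lines, code):
    """Return the fraction of coded lines that carry the given code."""
    if not coded_lines:
        return 0.0
    hits = sum(1 for codes in coded_lines.values() if code in codes)
    return hits / len(coded_lines)


def lines_with_at_least(coded_lines, n):
    """Return the sorted line numbers that carry at least n distinct codes."""
    return sorted(
        line for line, codes in coded_lines.items() if len(set(codes)) >= n
    )


# Hypothetical excerpt: transcript line number -> codes assigned to that line.
# Illustrative only; this is NOT data from the chapter.
toy_excerpt = {
    1: ["I3c", "I4a", "S7b"],
    2: ["I3c"],
    3: ["I3c", "I4a"],
    4: ["I3c", "S7b"],
}

print(code_counts(toy_excerpt))
print(share_with_code(toy_excerpt, "I4a"))
print(lines_with_at_least(toy_excerpt, 2))
```

Storing codes per line, rather than as free-text annotations, keeps the qualitative coding intact while making summary counts reproducible across raters.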

Teaching styles differed among the four lessons, with a large and noticeable gap between the high-quality and low-quality lessons. In the high-quality lessons, teachers respected students, articulated clear expectations, and let students play a role in classroom learning. The low-quality lessons were weak in these same respects. In LESSON 1 and LESSON 2, teachers’ behaviours did not show adequate respect, which affected students’ interest in the lesson. The teachers also did not make their expectations for classroom activities clear, which made it difficult for students to understand the teacher’s intention; in the end, the students could not give the expected responses. Moreover, having no opportunity to play a role in the classroom left students with little participation and little confidence in learning.

6 Discussion and Conclusion

6.1 Instrument Characteristics as Biases and Limitations

As shown in Table 7.5, TEACH assesses only some general teaching behaviours (I3a to I6c), which means that teachers only need to display common teaching behaviours to meet the standards and obtain higher scores.

The ‘high-quality’ lessons ranked slightly lower in ICALT than in TEACH, which indicates that ICALT sets higher overall requirements for classroom teaching than TEACH. The ‘low-quality’ lessons, in contrast, ranked higher in ICALT than in TEACH: the teachers in these two classes did not perform well in general teaching behaviours, but they displayed some deeper teaching behaviours. Nevertheless, this did not affect the final characterisation of these lessons as ‘low-quality’.

Our results indicated that TEACH is a feasible coding scheme for in-depth qualitative analysis of dialogic teaching, as it fitted our need to associate the qualitative coding with a quantitative lesson observation instrument. There is a trade-off between instrument complexity and ease of use: TEACH was developed to provide quick training for practitioners in developing countries for teacher evaluation and professional development. In contrast, ICALT was initially developed for high-stakes inspections and subsequently used for high-quality research in developed and developing countries (Maulana et al., 2021).

6.2 The Practicability of Promoting Teacher Reflections: TEACH vs ICALT

The quantitative results indicated that ICALT predicted student engagement better than TEACH. However, the Learner engagement subscale is part of ICALT, so it is not surprising that the results favour ICALT over TEACH. Both ICALT and TEACH results nevertheless showed that clear and structured instructions improve student engagement. Adequate instructions could contribute to a better and deeper understanding of classroom activities and content, resulting in higher student involvement in classroom learning (Boston & Candela, 2018).

Moreover, among the ICALT domains, the average score of Adjusting Instructions and Learner Processing to Inter-Learner Differences was lower than that of the other domains, indicating that teachers in the sample seldom provided student-centred instruction to address learner diversity. The lower rating might be caused by the limited background information about the students available to the raters. The raters did not know the students’ learning differences ahead of the class; hence, it might have been hard for them to identify students with diverse learning needs and to recognise teaching behaviours intended to address learner diversity during the classroom observation (Edwards et al., 2006). Thus, a rater may be biased against the teacher if he or she lacks an understanding of the students as learners. Among TEACH factors, teachers with better socioemotional skills, including autonomy, perseverance, and social and collaborative skills, could have engaged students better in classroom learning.

In addition to the low average score, the Adjusting Instructions and Learner Processing to Inter-Learner Differences subscale also had poor reliability. The observers’ lack of contextual information about the classroom might similarly have affected the reliability. For example, it is not easy to identify whether a student is weaker without asking the teacher. Another explanation is that, as each teacher’s teaching quality was assessed based on a single lesson, personalised instruction adjusted to inter-learner differences might not be readily recognisable in one lesson but would be more evident across lessons observed over a whole academic term. A future longitudinal study, in which teaching quality is assessed several times throughout an academic term or year, could better capture student-centred instruction.
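
To make the notion of subscale reliability concrete, the sketch below computes Cronbach's alpha (Cronbach, 1951), assuming that reliability here refers to an internal-consistency coefficient of this kind; this section does not show how the chapter's coefficient was actually computed. The function name `cronbach_alpha` and the `toy_ratings` values are hypothetical and are not data from the study. With $k$ items, item variances $s_i^2$ and total-score variance $s_T^2$, the coefficient is $\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_i s_i^2}{s_T^2}\right)$.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Compute Cronbach's alpha for one subscale.

    ratings: list of rows, one row per observed lesson; each row holds the
    scores that lesson received on the items of the subscale.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two lessons")
    n_items = len(ratings[0])
    if n_items < 2:
        raise ValueError("need at least two items")
    if any(len(row) != n_items for row in ratings):
        raise ValueError("every lesson must be rated on the same items")

    # Sample variance of each item across lessons.
    item_vars = [variance([row[i] for row in ratings]) for i in range(n_items)]
    # Sample variance of the summed subscale score across lessons.
    total_var = variance([sum(row) for row in ratings])
    if total_var == 0:
        raise ValueError("total scores do not vary; alpha is undefined")

    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings: four lessons scored on a two-item subscale.
# Illustrative only; this is NOT data from the chapter.
toy_ratings = [[1, 2], [2, 1], [3, 4], [4, 3]]
print(round(cronbach_alpha(toy_ratings), 2))  # 0.75
```

When raters cannot see the contextual cues an item depends on, item scores vary more independently of one another, the summed item variance approaches the total variance, and alpha falls, which is consistent with the explanation offered above.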

7 Conclusion

Two significant limitations of the present study were the small sample size and the sample selection. As the sampling only covered teachers whose teaching experience was more than two years and less than eight years, teachers who had taught for more than eight years or for less than two years were underrepresented. Future studies can assess and compare teaching quality among teachers with all lengths of teaching experience. For example, a study of 47 rural primary schools in Guizhou Province showed that the length of teaching experience varied across teachers, and teachers with 4–10 years of teaching experience accounted for only 27% of the population (Peng, 2015).

Teacher-student interaction is an essential factor affecting classroom teaching quality (Berlin & Cohen, 2018). The differences between high-quality and low-quality lessons were most evident in three aspects: respecting students, setting behaviour expectations for students, and letting students play a role in the classroom. If a class does not have these characteristics, it is challenging to associate students’ interests with specific teaching behaviours, which may subsequently affect student learning achievement and make it harder to reach a fair judgement on teaching quality.

There are many classroom observation tools to choose from for teacher evaluation and research. We compared two instruments designed for different purposes and probably for different audiences and contexts. When choosing such tools, researchers should first compare the lenses of different instruments (Walkington & Marder, 2018; Walkowiak et al., 2019), as we have done, to balance efficiency and exhaustiveness against the research needs. When analysing comparative results, the limitations of the observation tools should also be considered thoroughly. We also conducted in-depth qualitative analyses because high-inference classroom observation instruments like ICALT and TEACH cannot provide detailed accounts of classroom processes. Our coding strategies also provide the potential for quantifying qualitative data. We suggest that systematic in-depth qualitative analysis, with detailed contextual information provided by the teacher and a longitudinal approach, is indispensable to complement high-inference instruments in more objective research and fairer teacher evaluation.

References

Alexander, R. (2006). Towards dialogic teaching (3rd ed.). Dialogos.
Alexander, R. (2008). Towards dialogic teaching: Rethinking classroom talk (4th ed.). Dialogos.
Anwar, S., & Menekse, M. (2021). A systematic review of observation protocols used in post-secondary STEM classrooms. Review of Education, 9(1), 81–120.
Berlin, R., & Cohen, J. (2018). Understanding instructional quality through a relational lens. ZDM, 50(3), 367–379.
Boston, M. D., & Candela, A. G. (2018). The instructional quality assessment as a tool for reflecting on instructional practice. ZDM, 50(3), 427–444.
Boston, M., Bostic, J., Lesseig, K., & Sherman, M. (2015). A comparison of mathematics classroom observation protocols. Mathematics Teacher Educator, 3(2), 154–175.
Caughlan, S., & Jiang, H. (2014). Observation and teacher quality: Critical analysis of observational instruments in preservice teacher performance assessment. Journal of Teacher Education, 65(5), 375–388.
Chiangkul, W. (2016). The state of Thailand education 2014/2015 “How to reform Thailand education towards 21st century?”. Office of the Education Council.
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16(3), 297–334.
Danielson, C. (1996). Enhancing professional practice: A framework for teaching. Association for Supervision and Curriculum Development.
Edwards, C. J., Carr, S., & Siegel, W. (2006). Influences of experiences and training on effective teaching practices to meet the needs of diverse learners in schools. Education, 126(3), 580–591.
Erickson, F. (2011). Uses of video in social research: A brief history. International Journal of Social Research Methodology, 14(3), 179–189.
Gersten, R., Baker, S. K., Haager, D., & Graves, A. W. (2005). Exploring the role of teacher quality in predicting reading outcomes for first-grade English learners: An observational study. Remedial and Special Education, 26(4), 197–206.
Hannum, E. (2005). Market transition, educational disparities, and family strategies in rural China: New evidence on gender stratification and development. Demography, 42(2), 275–299.
Hannum, E., Behrman, J., Wang, M., & Liu, J. (2008). Education in the reform era. In L. Brandt & T. G. Rawski (Eds.), China’s great economic transformation (pp. 215–249). Cambridge University Press.
Hannum, E., Cherng, H. Y. S., & Wang, M. (2015). Ethnic disparities in educational attainment in China: Considering the implications of interethnic families. Eurasian Geography and Economics, 56(1), 8–23.
Heckman, J. J., & Yi, J. (2012). Human capital, economic growth, and inequality in China (No. w18100). National Bureau of Economic Research.
Hennessy, S., Rojas-Drummond, S., Higham, R., Márquez, A. M., Maine, F., Ríos, R. M., García-Carrión, R., Torreblanca, O., & Barrera, M. J. (2016). Developing a coding scheme for analysing classroom dialogue across educational contexts. Learning, Culture and Social Interaction, 9, 16–44.
Hennessy, S., Howe, C., Mercer, N., & Vrikki, M. (2020). Coding classroom dialogue: Methodological considerations for researchers. Learning, Culture and Social Interaction, 25, 100404.
Henry, M. A., Murray, K. S., Hogrebe, M., & Daab, M. (2009). Quantitative analysis of indicators on the RTOP and ITC observation instruments. MSPnet. http://mspnet-static.s3.amazonaws.com/MA_Henry_Quantitative_Analysis_RTOP_ITC_112509_FINAL.pdf
Howe, C., & Abedin, M. (2013). Classroom dialogue: A systematic review across four decades of research. Cambridge Journal of Education, 43(3), 325–356.
Howe, C., & Mercer, N. (2017). Commentary on the papers. Language and Education, 31(1), 83–92.
Howe, C., Hennessy, S., Mercer, N., Vrikki, M., & Wheatley, L. (2019). Teacher-student dialogue during classroom teaching: Does it really impact on student outcomes. Journal of the Learning Sciences, 28(4–5), 462–512. https://doi.org/10.1080/10508406.2019.1573730
Huijgen, T., van de Grift, W., Van Boxtel, C., & Holthuis, P. (2017). Teaching historical contextualisation: The construction of a reliable observation instrument. European Journal of Psychology of Education, 32(2), 159–181.
Jacobs, J. K., Kawanaka, T., & Stigler, J. W. (1999). Integrating qualitative and quantitative approaches to the analysis of video data on classroom teaching. International Journal of Educational Research, 31, 717–724.
Klette, K. (2009). Challenges in strategies for complexity reduction in video studies. Experiences from the PISA+ study: A video study of teaching and learning in Norway. In T. Janík & T. Seidel (Eds.), The power of video studies in investigating teaching and learning in the classroom (pp. 61–82). Waxmann.
Ko, J. Y. O. (2010). Consistency and variation in classroom practice: A mixed-method investigation based on case studies of four EFL teachers of a disadvantaged secondary school in Hong Kong (Publication No. 11363). Doctoral dissertation, University of Nottingham. Nottingham eThesis.
Ko, J., Ho, M., & Chen, W. (2015, July–Oct). Teacher report: Teacher effectiveness and goal orientation (Report Nos. 1–43). Hong Kong Institute of Education.
Kyriakides, L., Creemers, B. P., & Antoniou, P. (2009). Teacher behaviour and student outcomes: Suggestions for research on teacher training and professional development. Teaching and Teacher Education, 25(1), 12–23.
Lesh, R. A., & Lehrer, R. (2000). Iterative refinement cycles of videotape analyses of conceptual change. In A. E. Kelly & R. A. Lesh (Eds.), Handbook of research design in mathematics and science education (pp. 665–708). LEA.
Li, W., Park, A., Wang, S., & **, L. (2007). School equity in rural China. In E. Hannum & A. Park (Eds.), Education and reform in China (pp. 27–43). Routledge.
Marshall, J. H., & Sorto, M. A. (2012). The effects of teacher mathematics knowledge and pedagogy on student achievement in rural Guatemala. International Review of Education, 58(2), 173–197.
Marshall, J. C., Smart, J., Lotter, C., & Sirbu, C. (2011). Comparative analysis of two inquiry observational protocols: Striving to better understand the quality of teacher-facilitated inquiry-based instruction. School Science and Mathematics, 111(6), 306–315.
Maulana, R., André, S., Helms-Lorenz, M., Ko, J., Chun, S., Shahzad, A., et al. (2021). Observed teaching behaviour in secondary education across six countries: Measurement invariance and indication of cross-national variations. School Effectiveness and School Improvement, 32(1), 64–95.
McDonald, R. P. (2013). Test theory: A unified treatment. Psychology Press.
Mercer, N. (2010). The analysis of classroom talk: Methods and methodologies. British Journal of Educational Psychology, 80(1), 1–14.
Molina, E., Fatima, S. F., Ho, A. D. Y., Melo Hurtado, C. E., Wilichowski, T., & Pushparatnam, A. (2018). Measuring teaching practices at scale: Results from the development and validation of the Teach classroom observation tool (World Bank Policy Research Working Paper, No. 8653). The World Bank.
Muijs, D., & Reynolds, D. (2005). Effective teaching: Introduction & conclusion. Sage.
Muijs, D., Kyriakides, L., Van der Werf, G., Creemers, B., Timperley, H., & Earl, L. (2014). State of the art–teacher effectiveness and professional learning. School Effectiveness and School Improvement, 25(2), 231–256.
OECD. (2019, April 26). PISA 2018 assessment and analytical framework. https://www.oecd-ilibrary.org/education/pisa-2018-assessment-and-analytical-framework_b25efab8-en
Peng, Y. (2015). The recruitment and retention of teachers in rural areas of Guizhou, China. Doctoral dissertation, University of York, York, UK.
Pianta, R. C., & Hamre, B. K. (2009). Conceptualisation, measurement, and improvement of classroom processes: Standardised observation can leverage capacity. Educational Researcher, 38(2), 109–119.
Pianta, R. C., La Paro, K. M., & Hamre, B. K. (2008). Classroom assessment scoring system™: Manual K-3. Paul H Brookes Publishing.
Qian, X., & Smyth, R. (2008). Measuring regional inequality of education in China: Widening coast–inland gap or widening rural-urban gap? Journal of International Development: The Journal of the Development Studies Association, 20(2), 132–144.
Sammons, P. (2007). School effectiveness and equity: Making connections. CfBT.
Sapire, I., & Sorto, M. A. (2012). Analysing teaching quality in Botswana and South Africa. Prospects, 42(4), 433–451.
Schoenfeld, A. H. (2013). Classroom observations in theory and practice. ZDM, 45(4), 607–621.
Shi, Y., Shen, X., Wang, T., Cheng, L., & Wang, A. (2021). Dialogic teaching of controversial public issues in a Chinese middle school. Learning, Culture and Social Interaction, 30, 100533.
Song, Y., Chen, X., Hao, T., Liu, Z., & Lan, Z. (2019). Exploring two decades of research on classroom dialogue by using bibliometric analysis. Computers & Education, 137, 12–31.
Stearns, L. M., Morgan, J., Capraro, M. M., & Capraro, R. M. (2012). A teacher observation instrument for PBL classroom instruction. Journal of STEM Education: Innovations & Research, 13(3), 7.
Stigler, J. W., Gonzales, P., Kawanaka, T., Knoll, S., & Serrano, A. (1999). The TIMSS videotape classroom study: Methods and findings from an exploratory research project on eighth-grade mathematics instruction in Germany, Japan, and the United States. National Center for Education Statistics.
Teddlie, C., Creemers, B., Kyriakides, L., Muijs, D., & Yu, F. (2006). The international system for teacher observation and feedback: Evolution of an international study of teacher effectiveness constructs. Educational Research and Evaluation, 12(6), 561–582.
Tee, M. Y., Samuel, M., Nor, N. M., Sathasivam, R. V., & Zulnaidi, H. (2018). Classroom practice and the quality of teaching: Where a nation is going? Journal of International and Comparative Education, 7(1), 17–33.
Tsang, M. C., & Ding, Y. (2005). Resource utilisation and disparities in compulsory education in China. China Review, 5, 1–31.
Van de Grift, W. (2007). Quality of teaching in four European countries: A review of the literature and application of an assessment instrument. Educational Research, 49(2), 127–152.
Vrikki, M., Wheatley, L., Howe, C., Hennessy, S., & Mercer, N. (2019). Dialogic practices in primary school classrooms. Language and Education, 33(1), 85–100. https://doi.org/10.1080/09500782.2018.1509988
Walkington, C., & Marder, M. (2018). Using the UTeach Observation Protocol (UTOP) to understand the quality of mathematics instruction. ZDM, 50(3), 507–519.
Walkowiak, T. A., Adams, E. L., & Berry, R. Q. (2019). Validity arguments for instruments that measure mathematics teaching practices: Comparing the M-Scan and IPL-M. In J. D. Bostic, E. E. Krupa, & J. C. Shih (Eds.), Assessment in mathematics education contexts (pp. 90–119). Routledge.
Wang, J., & Li, Y. (2009). Research on the teaching quality of compulsory education in China’s west rural schools. Frontiers of Education in China, 4(1), 66–93.
World Bank. (2019, July 9). TEACH manual. World Bank. https://www.worldbank.org/en/topic/education/brief/teach-helping-countries-track-and-improve-teaching-quality?cid=EXT_WBEmailShare_EXT
Wragg, T. (2013). An introduction to classroom observation. Routledge.
Yang, J., Huang, X., & Li, X. (2009). Education inequality and income inequality: An empirical study on China. Frontier of Education in China, 4(3), 413–434.
Yang, J., Huang, X., & Liu, X. (2014). An analysis of education inequality in China. International Journal of Educational Development, 37, 2–10.
Zeng, J., Pang, X., Zhang, L., Medina, A., & Rozelle, S. (2014). Gender inequality in education in China: A meta-regression analysis. Contemporary Economic Policy, 32(2), 474–491.
Zhang, H. (2017). Opportunity or new poverty trap: Rural-urban education disparity and internal migration in China. China Economic Review, 44, 112–124.

Author information

Authors and Affiliations

Department of Education, Shaoyang University, Shaoyang, China
Jieyan Celia Lei
Department of Education Policy and Leadership, The Education University of Hong Kong, Tai Po, Hong Kong
Jieyan Celia Lei & James Ko
Department of Education, University of Bath, Bath, UK
Zhijun Chen

Corresponding author

Correspondence to James Ko.

Editor information

Editors and Affiliations

Department of Teacher Education, University of Groningen, Groningen, The Netherlands
Ridwan Maulana
Department of Teacher Education, University of Groningen, Groningen, The Netherlands
Michelle Helms-Lorenz
Department of Education, University of York, York, UK
Robert M. Klassen

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

About this chapter

Cite this chapter

Lei, J.C., Chen, Z., Ko, J. (2023). Differences in Perceived Instructional Quality of the Same Classrooms with Two Different Classroom Observation Instruments in China: Lessons Learned from Qualitative Analysis of Four Lessons Using TEACH and ICALT. In: Maulana, R., Helms-Lorenz, M., Klassen, R.M. (eds) Effective Teaching Around the World. Springer, Cham. https://doi.org/10.1007/978-3-031-31678-4_7

DOI: https://doi.org/10.1007/978-3-031-31678-4_7
Published: 28 June 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-31677-7
Online ISBN: 978-3-031-31678-4
eBook Packages: Education, Education (R0)
