Introduction

The rapid evolution of artificial intelligence (AI) technology has drawn considerable attention to its application in secondary education. Language analysis technology, an integral part of AI, holds substantial promise in this setting. This study assesses the efficacy of AI-based language analysis technology in secondary education, aiming to provide a scientific foundation for educational reform. Technological innovations are reshaping secondary education as online education gains popularity and evolves. Language analysis technology, leveraging techniques such as natural language processing and text analysis, can examine students’ linguistic expressions during the learning process, thereby equipping educators with a more comprehensive understanding of students’ learning dynamics. Through AI, a nuanced analysis of students’ language proficiency, expression patterns, and related aspects becomes feasible, offering precise guidance for personalized teaching and subject-specific tutoring.

In the online environment, teaching behavior can significantly affect learners’ experiences and learning outcomes. As a crucial dimension of teaching practice, it therefore plays a pivotal role in the effectiveness of instruction. Studying how teaching behavior shapes these outcomes can help promote online courses and facilitate more efficient student learning [1]. Some scholars have found a significant correlation between teaching behavior and academic emotion, arguing that teaching behavior can alleviate students’ negative emotions online, such as anxiety and loneliness [2]. Online teaching behavior is also a direct expression of educators’ teaching abilities and comprehensive skills, so educators must reflect on their teaching behaviors to enhance the effectiveness of online instruction. Therefore, building high-quality online courses should begin with online Teaching Behavior Analysis (TBA) [3].

Based on the medium used by educators, teaching behaviors can be categorized as verbal or non-verbal. Classroom discourse is fundamental to student–teacher communication, constituting approximately 80% of all teaching behaviors [4]. Makarenko, a renowned educator in the former Soviet Union, emphasized that, under the same teaching model, different classroom discourses might lead to a 200-fold difference in teaching effectiveness, underscoring the importance of classroom discourse [5]. As a crucial component of educators’ teaching behavior, classroom discourse also serves as a key indicator in evaluating the quality of online courses [6]. Therefore, focusing on online TBA and leveraging big data technologies to mine its characteristics and patterns is of great significance for enhancing the teaching quality and learning outcomes of online courses [7].

The development of big data platforms that support online courses, together with the related data processing technologies, has become a new research focus. Understanding how classroom discourse influences the learning experience and teaching effectiveness is essential to improving online educators’ core teaching skills. To this end, this work introduces big data mining technology to explore the teaching characteristics and behaviors of educators that affect the quality of online courses. It analyzes the teaching objectives, evaluates online educators’ experiences, and explores online TBA methods. Based on the research findings, implications are suggested for enhancing online educators’ teaching skills. The results provide a reference and basis for improving the online learning experience and teaching effectiveness.

Literature review

The online course-oriented data mining technology based on AI targets the unique data collected from the teaching environment, teaching objects, and teaching process in online courses. It focuses on big data in online courses, which falls into the main category of educational big data research and application

Figure 3

Descriptive statistics of grouped evaluation of different classroom discourse comprehensive scores.

In the grouped online course evaluation, the speech intelligibility groups score 97.9 points (“excellent”), 91.1 points (“middle”), and 81 points (“poor”). The speaking rate groups score 93 points (“fast”), 90 points (“middle”), and 84 points (“slow”). The content similarity groups score 93 points (“low”), 91.4 points (“middle”), and 82.8 points (“high”). The average sentence length groups score 93.2 points (“short”), 90.6 points (“medium”), and 77.8 points (“long”). The scores differ between groups for every indicator, with the widest gaps between the extreme groups for speech intelligibility (16.9 points) and average sentence length (15.4 points).

Analysis of variance

Figure 4 presents the results of an analysis of variance (ANOVA) testing whether the classroom discourse evaluation scores differ significantly between the groups of each of the four indicators.

Figure 4

ANOVA of comprehensive scores of classroom discourse evaluation in different groups.

In the ANOVA for speech intelligibility, F = 11.8 and p = 0.0009. For speaking rate, F = 2.67 and p = 0.093. For content similarity, F = 4.65 and p = 0.045. For average sentence length, F = 11.83 and p = 0.0008. At the conventional 0.05 significance level, the comprehensive scores therefore differ significantly between groups for speech intelligibility, content similarity, and average sentence length, but not for speaking rate.
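As a minimal sketch of how an F statistic of this kind is obtained, the following computes the one-way ANOVA F ratio from grouped scores using only the Python standard library. The function name `one_way_anova_f` and the example scores are hypothetical and are not the study's data; the p-value, which requires the F distribution, is left to a statistics package.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of scores per group.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more scores than groups")

    all_scores = [x for g in groups for x in g]
    grand_mean = fmean(all_scores)
    group_means = [fmean(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical scores for three groups (illustration only).
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f_stat, 2), df_b, df_w)  # prints: 13.0 2 6
```

A larger F means the group means are spread further apart relative to the scatter inside each group, which is why the indicators with F near 11.8 reach much smaller p-values than speaking rate with F = 2.67.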

Correlation analysis of the online classroom discourse indicators and course evaluation in secondary schools

Figure 5 illustrates the correlation analysis results between online classroom discourse indicators and comprehensive course evaluation scores in secondary schools.

Figure 5

Correlation analysis between classroom discourse indicators and comprehensive scores of course evaluation.

In Fig. 5, a significant negative correlation is observed between speech intelligibility and the comprehensive score of online course evaluation, with a correlation coefficient of −0.71. The speaking rate is significantly negatively correlated with the comprehensive online course evaluation score, with a correlation coefficient of −0.56. The content similarity of classroom discourse is significantly negatively correlated with the comprehensive course evaluation score, showing a correlation coefficient of −0.74. The average sentence length of classroom discourse is significantly negatively correlated with the comprehensive online course evaluation score, with a correlation coefficient of −0.71.
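The section does not state which correlation coefficient is used. Assuming it is the Pearson coefficient, a minimal sketch of the computation looks as follows. The function name `pearson_r` and the sample values are hypothetical and do not reproduce the study's data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of x and y, and variation of each on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical indicator values and course scores (illustration only):
# the score falls as the indicator rises, so r is negative.
indicator = [0.2, 0.4, 0.6, 0.8]
scores = [95, 90, 92, 80]
print(pearson_r(indicator, scores) < 0)  # prints: True
```

A coefficient close to −1, such as the −0.74 reported for content similarity, indicates that course scores tend to fall almost linearly as the indicator increases.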

Regression analysis of classroom discourse indicators in secondary school online education on course evaluation

Figure 6 presents the results of stepwise multiple regression analysis examining the impact of classroom discourse indicators on learners’ course evaluation.

Figure 6

Results of stepwise multiple regression analysis of the impact of classroom discourse indicators on comprehensive course evaluation.

In Fig. 6, the fitted model is y = −24.74 × (similarity) − 4.64 × (sentence length) + 127.44, where y is the comprehensive course evaluation score. The coefficient of determination is R² = 0.78 and the adjusted R² = 0.76, and both the model fit and the coefficients are highly significant. Similarity emerges as the strongest explanatory variable, explaining the majority of the variation in the comprehensive course score, while sentence length explains a smaller portion.
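The fitted equation can be applied directly, as in the sketch below. The coefficients are the ones reported for Fig. 6; the function names are hypothetical, and the scale of the two predictors (raw, normalized, or group-coded) is not stated in this section, so the inputs are purely illustrative. The second function shows the standard adjusted R² formula, with the sample size treated as an unknown input.

```python
# Coefficients reported for the stepwise regression in Fig. 6.
INTERCEPT = 127.44
COEF_SIMILARITY = -24.74
COEF_SENTENCE_LENGTH = -4.64


def predict_course_score(similarity, sentence_length):
    """Predicted comprehensive course score from the fitted linear model.

    Assumption: inputs are on the same (unstated) scale used to fit the model.
    """
    return (
        COEF_SIMILARITY * similarity
        + COEF_SENTENCE_LENGTH * sentence_length
        + INTERCEPT
    )


def adjusted_r2(r2, n_obs, n_predictors):
    """Standard adjusted R^2 for n_obs observations and n_predictors predictors."""
    if n_obs - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Effect of raising similarity by 0.1 while sentence length stays fixed
# (input values are hypothetical).
delta = predict_course_score(0.6, 1.0) - predict_course_score(0.5, 1.0)
print(round(delta, 3))  # prints: -2.474
```

Because the model is linear, each coefficient is the change in the predicted score per unit increase in its predictor with the other held constant, and both negative signs mean that higher similarity and longer sentences lower the predicted evaluation.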