Transforming Learning or Empty Promise? A Meta-Analysis of Generative AI in Education | Research Square
Research Article
Xiuxiu Tang, Xiyu Wang, Liu Dong, Jingxian Cecilia Zhang
This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-7577394/v1
This work is licensed under a CC BY 4.0 License
Status: Posted (Version 1)
Abstract
This meta-analysis examines the impact of generative artificial intelligence (GenAI) tools, such as ChatGPT, on students’ academic achievement. Drawing on 52 experimental and quasi-experimental studies across educational levels and domains, we synthesized evidence from interventions using GenAI to support learning. Eligible studies reported performance outcomes (e.g., test scores, grades, GPA) and met rigorous inclusion criteria.
Overall, GenAI-based instruction showed a positive effect (Hedges' g = 1.193) on academic achievement, with substantial between-study variability indicating that GenAI’s effectiveness depends on contextual and design features. Moderator analyses identified two significant moderators: instructional role and subject area. GenAI was most effective when used to support formative functions such as assessment, feedback, and tutoring, suggesting that its strength lies in providing adaptive guidance and personalized learning support. Effects also varied across subject areas. Language education showed the strongest and most consistent gains, reflecting a close alignment between GenAI’s natural language capabilities and core instructional practices. In contrast, more modest effects were observed in computer science and art education, where applications tend to be narrower in scope. Other moderators, including educational level, sample size, intervention duration, and learning domain, did not yield statistically significant differences but revealed descriptive patterns that may inform future research and implementation. These findings suggest that GenAI tools hold considerable promise for improving academic performance when thoughtfully integrated into instructional practice. Educators and policymakers should consider both the role GenAI plays and the subject context to ensure its effective use in diverse educational settings.
Keywords: generative artificial intelligence, academic achievement, learning outcomes, meta-analysis, AI in education
1. Introduction
The rapid advent of generative artificial intelligence (GenAI) tools such as ChatGPT, Gemini, and Claude is reshaping educational landscapes by offering new possibilities for supporting and enhancing learning.
Unlike earlier forms of artificial intelligence (AI), GenAI technologies generate human-like text, facilitate problem-solving, and provide interactive, context-sensitive assistance (Fui-Hoon Nah et al., 2023). These distinctive capabilities have attracted widespread attention, but empirical evidence on their educational impact is mixed, highlighting a pressing need for systematic synthesis. Meta-analyses of AI in education more broadly have consistently found positive effects. For example, Dong et al. (2025) synthesized 29 empirical studies of diverse AI technologies and reported a large overall effect (g = 0.92), with variation across educational levels and subjects. Tlili et al. (2025), analyzing 85 studies, likewise found a large average effect (g = 1.10), especially highlighting the effectiveness of chatbots. While these reviews established AI’s general potential, they combined traditional AI systems with emerging GenAI tools, leaving the unique contribution of GenAI unclear. More recently, several reviews have examined GenAI specifically (e.g., Liu et al., 2025; Sun & Zhou, 2024; Zhu et al., 2025). Although these studies provide valuable early insights, their scope remains limited. Sun and Zhou (2024), for instance, focused exclusively on college students and synthesized only 28 studies. Zhu et al. (2025) included both K–12 and higher education contexts but drew on just 26 studies and reported modest effects (g = 0.39) across varied outcomes. Liu et al. (2025) analyzed a larger set of 49 studies and found substantial effects on achievement and motivation, but their moderator analyses concentrated on media format and interface features rather than pedagogical functions.
These reviews demonstrate the potential of GenAI but leave important questions unanswered regarding its impact on academic achievement across educational levels, learning domains, and instructional designs, including how study design features (e.g., sample size) and the instructional roles assigned to GenAI influence outcomes. The present meta-analysis addresses these gaps by synthesizing recent peer-reviewed, quasi-experimental and experimental studies of GenAI-based instruction published since late 2022. The evidence base spans all educational levels, from K–12 to higher education. By focusing on achievement outcomes and analyzing a larger and more up-to-date body of studies, this study provides a comprehensive and nuanced account of GenAI’s educational effects. It also incorporates several key moderators, including participant sample size, intervention duration, learning domain, subject area, the instructional role of GenAI, and educational level. In doing so, it extends prior reviews and clarifies the conditions under which GenAI is most effective. This contributes to both theory and practice by informing pedagogical design, institutional decision-making, and policy development. Research Questions This meta-analysis synthesizes findings from 52 peer-reviewed empirical studies published between November 2022 and 2025 that met strict eligibility criteria. The study addresses two central questions: RQ1: What is the overall effect of GenAI-based instructional interventions on academic performance outcomes (e.g., test scores, grades, GPA)? RQ2: Under what conditions does GenAI most effectively improve student achievement (e.g., across educational levels, learning domains, instructional roles, subject areas, or study designs)? 2. Literature Review 2.1 Learning Domains: STEM vs non-STEM The pedagogical utility of GenAI is influenced by the academic discipline in which it is applied. 
Evidence shows that GenAI benefits both STEM and non-STEM fields, but the mechanisms through which it operates differ. In non-STEM disciplines such as language acquisition and the social sciences, GenAI tools have been especially effective for communication, critical inquiry, and affective engagement. For example, in English as a Foreign Language (EFL) instruction, an AI chatbot integrated into “think-pair-share” activities reduced students’ speaking anxiety while increasing enjoyment and oral proficiency (Wu et al., 2025). Likewise, in an undergraduate research methods course, the use of ChatGPT enhanced research competencies and fostered autonomous motivation and self-directed learning (Li et al., 2025). In contrast, within STEM fields such as computer science and engineering, GenAI's primary value appears to lie in its ability to manage cognitive load and provide scaffolded, personalized learning support. For example, a study conducted in a computer vision course found that an intelligent teaching assistant powered by a large language model not only improved academic outcomes but also helped students develop higher-order cognitive skills like active questioning and summarization (Teng et al., 2025). Furthermore, in a university programming course, an AI-agent-supported collaborative learning framework was shown to enhance learning achievement while simultaneously reducing the cognitive effort required of students and increasing their self-efficacy (Wang et al., 2025). In these settings, the AI agent acts as a dynamic, intelligent scaffold, personalizing the learning pathway and enabling students to tackle complex problems more effectively. 2.2 Educational Level Differences Beyond the subject matter, the developmental stage of the learner also plays a crucial role in determining the effectiveness and appropriate application of GenAI. 
The existing literature indicates that while GenAI is a potent tool across various age groups, its impact diverges between higher education and K-12 settings, with pedagogical strategy being a particularly critical variable for younger learners. Most existing research has been conducted in higher education, where students generally benefit from greater autonomy and digital fluency. In this context, studies have consistently found that learners using GenAI tools outperform their peers in control conditions (Lee et al., 2022; Wang, 2025). For example, in diverse courses spanning public health, engineering, and language learning, university students using GenAI tools have produced higher-quality work, demonstrated stronger comprehension, and reported greater confidence and motivation. Specifically, Lee et al. (2022) found that an AI-based chatbot used for after-class review enhanced academic performance and self-efficacy among college students. Similarly, Wang (2025) reported that undergraduates using ChatGPT-4 for English practice showed significant gains in their communication skills and a high degree of acceptance for the technology. These results suggest that university students are well-positioned to engage productively with GenAI, especially when the technology supports complex tasks like revision, self-regulation, or conceptual reasoning. In K-12 education, the results are more variable and highly dependent on pedagogical support (Alneyadi & Wardat, 2024; Jeon, 2023; Liu et al., 2024; Sapan & Uzun, 2024). For secondary students, interventions that incorporate strong instructional design tend to yield better outcomes. This is supported by Alneyadi and Wardat (2024), who demonstrated that integrating ChatGPT into a Grade 12 Quantum Theory course significantly enhanced student achievement. 
However, the importance of this scaffolding is underscored by the contrasting findings of Sapan and Uzun (2024), where traditional instruction proved more effective than a less structured ChatGPT integration for improving high school students' writing skills. Research in primary education, while rarer, echoes this theme. Early evidence suggests that GenAI can support foundational skills only when carefully scaffolded. For example, Jeon (2023) found that a chatbot employing "dynamic assessment" with graduated, interactive assistance was highly effective for elementary school students learning vocabulary, while Liu et al. (2024) showed that a teacher-guided, LLM-supported model significantly improved writing performance. Overall, these findings confirm educational level as a key moderator, with the most consistent gains observed in postsecondary contexts and an increasing need for structured, teacher-supported interventions for younger learners. 2.3 Roles of GenAI in Instructional Contexts The role that GenAI plays within the instructional design of each study also varies widely. Based on a synthesis of recent literature, GenAI interventions typically fall into at least three major categories, which are critical for understanding their impact. First, in the role of tutoring and scaffolding, many studies used GenAI to act as a conversational tutor or a tool to support student learning. These interventions aimed to replicate or enhance elements of one-on-one instruction. For example, some used GenAI as a conversational partner with human-like avatars to reduce speaking anxiety and build student confidence (Wang et al., 2024), while others used it as a creative scaffolding tool, translating students' textual descriptions of poetry into visual images to deepen their comprehension (Chu et al., 2025). Second, some studies used GenAI primarily for assessment and feedback, which emphasizes iterative improvement and reflection. 
In this capacity, GenAI was deployed to automatically evaluate student work and provide formative feedback. For instance, it has been used to provide "dialogic feedback" on complex tasks like computer programming, where the timing and context of the GenAI's interactive suggestions were found to be critical for developing students' critical thinking skills (Gong et al., 2025). Third, an increasing number of studies have explored GenAI’s capacity for personalized learning support. This role often overlaps with tutoring and feedback, as the most effective tools adapt to student input. For example, conversational tutors adjust their dialogue, and feedback systems provide suggestions tailored to specific errors. This adaptive capability, which supports self-regulated learning and engagement, is a key feature in studies showing positive outcomes across different educational levels (Jeon, 2023; Lee et al., 2022). Although these roles often overlap in practice, they reflect distinct pedagogical aims and learner interactions. Understanding how GenAI functions as a tutor, feedback provider, or personalized support tool is therefore essential for interpreting its educational impact. This body of research indicates the need for a systematic synthesis that examines not only the overall effect of GenAI on student achievement but also the conditions under which its impact varies. 3. Methodology The present meta-analysis examined the influence of GenAI on students’ learning outcomes and investigated how study characteristics such as sample size, educational level, intervention duration, role of GenAI, learning domain, and subject area moderated effect sizes. The study was conducted in accordance with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines (Moher et al., 2009).
3.1 Eligibility Criteria To be included in this meta-analysis, studies had to meet the following criteria: (1) be peer-reviewed journal articles published between November 30, 2022 and 2025; (2) be written in English and full-text accessible; (3) be empirical in nature (quantitative or mixed-method); (4) apply GenAI technologies such as ChatGPT; (5) be quasi-experimental or true-experimental studies; (6) include both an experimental group using GenAI to support learning and a control group relying on traditional instruction without GenAI; (7) report academic or learning outcomes for both groups; (8) provide sufficient statistical information to calculate the effect size. 3.2 Search Strategy and Selection Criteria We conducted a comprehensive search of published studies using a wide range of scientific databases, including Scopus, Web of Science, and EBSCO (including APA PsycINFO, Education Full Text, Education Source, ERIC, and Social Sciences Full Text). The key search terms used in the electronic database searches consisted of two categories, GenAI and learning, which were connected by Boolean AND. The GenAI-related keywords included “generative artificial intelligence” or “Gen-AI” or “GenAI” or “GAI” or “ChatGPT” or “GPT” or “agent” or “chatbot” or “large language model” or “LLM” or “OpenAI”. The learning-related keywords included “learning” or “education” or “instruction” or “curriculum” or “course” or “learning outcome” or “learning performance” or “learning effect” or “learning achievement” or “academic performance” or “academic achievement”. 3.3 Data Extraction A PRISMA flow diagram (Fig. 1) was used to illustrate the data extraction process. The search initially identified 7,667 records, which were imported for processing. After removing the duplicates, 5,302 unique studies remained for screening. The review proceeded in three rounds.
In the first round, titles and abstracts were screened for relevance, resulting in the exclusion of 5,157 records and leaving 145 full-text articles for eligibility assessment. In the second round, the full texts were reviewed in detail, and 91 studies were excluded. In the third round, statistical information and categorical variables were extracted from the remaining studies for effect size calculation. Ultimately, 54 studies met all criteria and were included in the meta-analysis. 3.4 Data Coding A structured coding form was developed to extract key information from each study, including title, publication year, research design, and statistical data (sample size, means, and standard deviations). Additionally, each study was coded on six moderator variables: (1) educational level (higher education, secondary education, primary education), (2) sample size (1–50, 51–100, 101–150, or more than 150), (3) learning domain (STEM or non-STEM), (4) subject area, (5) intervention duration, and (6) the instructional role assigned to GenAI (personalized recommendation, assessment and evaluation, tutoring, mixed, or others). By focusing on these specific variables, this meta-analysis provides insights into which aspects of GenAI interventions are most influential. Four coders were trained and coded the first ten articles together to ensure consistency in coding. Each of the remaining articles was then coded and cross-checked by at least one coder. Any discrepancies were discussed and resolved until full agreement was reached. See Appendix for coding details. 3.5 Data Analysis To ensure statistical independence, only one effect size was extracted from each of the 54 included studies. To reduce small-sample bias, Hedges’ g was used to estimate effect sizes. A random-effects model, justified by significant heterogeneity (Q test, I² statistic), was used to synthesize the overall impact of GenAI on student achievement.
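As a rough illustration of the effect-size and pooling steps just described, the Python sketch below computes Hedges' g from group means, standard deviations, and sample sizes, then pools several effects under a random-effects model. It is not the authors' procedure: the analyses in this paper were run in SPSS, and the DerSimonian-Laird estimator used here is swapped in as a simple, widely used choice that may differ from the estimator SPSS applied. The function names and the three demo studies are hypothetical.

```python
import math

def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Return (g, variance of g) for a treatment-vs-control comparison."""
    df = n1 + n2 - 2
    # Pooled standard deviation across the two groups
    sp = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)
    d = (m1 - m2) / sp                      # Cohen's d
    j = 1 - 3 / (4 * df - 1)                # small-sample correction factor
    var_d = (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))
    return j * d, j**2 * var_d

def pool_random_effects(effects):
    """DerSimonian-Laird pooling of a list of (g, variance) pairs."""
    w = [1 / v for _, v in effects]         # inverse-variance (fixed) weights
    fixed = sum(wi * g for wi, (g, _) in zip(w, effects)) / sum(w)
    q = sum(wi * (g - fixed) ** 2 for wi, (g, _) in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)           # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_star = [1 / (v + tau2) for _, v in effects]
    pooled = sum(wi * g for wi, (g, _) in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return {"g": pooled, "se": se, "q": q, "tau2": tau2, "i2": i2}

# Hypothetical demo data: (mean, SD, n) for GenAI group, then control group
demo = [
    hedges_g(78.0, 10.0, 30, 70.0, 11.0, 30),
    hedges_g(65.0, 12.0, 45, 63.0, 12.5, 44),
    hedges_g(82.0, 8.0, 25, 71.0, 9.0, 26),
]
print(pool_random_effects(demo))
```

Because only one effect size per study enters the pool, each tuple in the list corresponds to one independent comparison, mirroring the independence rule stated above.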
A forest plot was generated to display individual effect sizes and corresponding confidence intervals. Moderator analyses were conducted to examine factors influencing the effectiveness of GenAI interventions. All analyses were conducted using SPSS version 29. 3.6 Publication bias To assess the risk of publication bias, a funnel plot was generated to examine the relationship between study effect sizes and their standard errors. In the absence of bias, the funnel plot should display a symmetrical, funnel-shaped distribution, with smaller studies scattered widely at the base and larger studies clustered near the mean effect size at the top. Asymmetry in the plot, by contrast, may suggest that smaller studies with divergent or unfavorable results are underrepresented, indicating possible selective outcome reporting or publication bias. In addition to visual inspection, Egger’s regression test was conducted to statistically evaluate funnel plot asymmetry. 4. Results 4.1 Weighted Average Effect Size and Heterogeneity Of the 54 eligible studies, two were excluded prior to analysis due to extreme outlier effect sizes and inconsistencies in the original reports, to avoid undue influence on the pooled estimate. Across the 52 independent comparisons included in the analysis, the aggregated impact of GenAI tools on student academic achievement was substantial and statistically significant. The pooled standardized mean difference, expressed as Hedges’ g, was 1.193 (SE = 0.243, 95% CI [0.716, 1.669], p < 0.001), indicating a strong positive effect of GenAI-supported learning relative to conventional instructional methods. The 95% prediction interval (–2.267 to 4.639) further illustrates the extent of contextual variability, suggesting that while the average effect is robust, outcomes may differ across implementation settings and study conditions. Measures of heterogeneity confirmed considerable between-study variability. 
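The prediction interval reported above can be approximated from the summary values in this section. The sketch below assumes the common formula g ± t × sqrt(τ² + SE²) with k − 2 degrees of freedom; the paper does not state which formula the software used, so this is an assumption, and the t critical value is hardcoded from a standard table rather than computed. The τ² value of 2.98 is the between-study variance reported in this section.

```python
import math

def prediction_interval(g, se, tau2, t_crit):
    """Approximate 95% prediction interval for the effect in a new study."""
    half_width = t_crit * math.sqrt(tau2 + se**2)
    return g - half_width, g + half_width

# Two-sided 95% t critical value for df = 50 (k - 2 with k = 52), from a t table
T_CRIT_DF50 = 2.0086

low, high = prediction_interval(g=1.193, se=0.243, tau2=2.98, t_crit=T_CRIT_DF50)
print(round(low, 2), round(high, 2))
```

Under these assumptions the interval comes out at roughly −2.31 to 4.69, close to but not identical with the reported −2.267 to 4.639; small differences of this kind are expected from rounding of the inputs and from software conventions.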
The Q statistic was highly significant, Q(51) = 863.51, p < 0.05, and the I² index was 98.0%, indicating that nearly all variance observed among effect sizes reflects systematic differences across studies rather than random error. The estimated between-study variance (τ²) was 2.98. A histogram of the unweighted effect sizes (Fig. 2) shows a moderately skewed distribution, with several studies reporting large positive effects. The forest plot (Fig. 3) provides a visual summary of the individual study estimates and their confidence intervals. These results collectively justify the use of a random-effects model and underscore the value of exploring moderators to account for the observed dispersion. 4.2 Publication Bias As part of the risk-of-bias assessment, a funnel plot of effect sizes against standard errors was generated (Fig. 4). Visual asymmetry raised the possibility of small-study effects or selective reporting. To formally test this, Egger’s regression test for funnel plot asymmetry was conducted. The result was statistically significant, t(49) = 3.156, p < 0.05, providing evidence of potential bias. The extrapolated effect size as standard error approaches zero was −0.344 (95% CI: −1.093, 0.406), suggesting that smaller studies may overestimate treatment effects. While the confidence interval of the limit estimate includes zero, the significance of the test highlights the need to interpret the pooled estimate in light of potential publication or reporting bias. 4.3 Moderator Analyses To examine sources of heterogeneity, moderator analyses were conducted on study-level characteristics, including educational level, sample size, learning domain, subject area, intervention duration, and the instructional role assigned to GenAI. Summary statistics for all subgroup comparisons are reported in Table 1.
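Returning briefly to the publication-bias check in Section 4.2, the sketch below shows one classic form of Egger's regression: each study's standardized effect (effect divided by its standard error) is regressed on its precision (one over the standard error). The intercept measures funnel-plot asymmetry, and the slope estimates the effect as the standard error approaches zero. Which variant SPSS used for this paper is not stated, so this is an illustrative assumption, and the function name and returned keys are hypothetical.

```python
import math

def egger_test(effects, std_errors):
    """Classic Egger regression; returns intercept, limit effect, t, and df."""
    z = [y / s for y, s in zip(effects, std_errors)]   # standardized effects
    prec = [1 / s for s in std_errors]                  # precisions
    k = len(z)
    mean_p, mean_z = sum(prec) / k, sum(z) / k
    sxx = sum((p - mean_p) ** 2 for p in prec)
    sxy = sum((p - mean_p) * (zi - mean_z) for p, zi in zip(prec, z))
    slope = sxy / sxx                        # effect as SE approaches zero
    intercept = mean_z - slope * mean_p      # asymmetry (small-study) term
    resid = [zi - (intercept + slope * p) for p, zi in zip(prec, z)]
    s2 = sum(r**2 for r in resid) / (k - 2)  # residual variance
    se_int = math.sqrt(s2 * (1 / k + mean_p**2 / sxx))
    # Guard against a perfect fit, where the standard error is zero
    t = intercept / se_int if se_int > 0 else math.inf
    return {"intercept": intercept, "limit_effect": slope, "t": t, "df": k - 2}
```

A positive intercept with a large t statistic indicates that less precise (typically smaller) studies report larger effects, which is the pattern the significant test above points to.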
Table 1. Subgroup Analyses of Potential Moderator Variables Using Random Effects

| Variable | Category | K | Hedges' g (Mean) | Hedges' g (SD) | Q statistic | df | p | Chi-square of the moderator analysis |
|---|---|---|---|---|---|---|---|---|
| Education level | | | | | | | | χ²(2) = 1.628, p = 0.443 |
| | Primary education | 7 | 1.007 | 0.414 | 40.477 | 6 | < 0.05 | |
| | Secondary education | 11 | 0.715 | 0.418 | 246.381 | 10 | < 0.05 | |
| | Higher education | 34 | 1.397 | 0.346 | 518.676 | 33 | < 0.05 | |
| Sample size | | | | | | | | χ²(3) = 4.745, p = 0.191 |
| | 1–50 | 13 | 1.075 | 0.276 | 68.526 | 12 | < 0.05 | |
| | 51–100 | 29 | 1.517 | 0.421 | 584.256 | 28 | < 0.05 | |
| | 101–150 | 6 | 0.349 | 0.425 | 127.992 | 5 | < 0.05 | |
| | More than 150 | 4 | 0.671 | 0.318 | 50.948 | 3 | < 0.05 | |
| Learning domain | | | | | | | | χ²(1) = 0.499, p = 0.480 |
| | STEM | 20 | 1.329 | 0.365 | 321.970 | 19 | < 0.05 | |
| | Non-STEM | 32 | 1.313 | 0.352 | 541.032 | 31 | < 0.05 | |
| Subject area | | | | | | | | χ²(10) = 51.949, p < 0.001 |
| | General education | 4 | 1.423 | 0.662 | 61.847 | 3 | < 0.05 | |
| | Art education | 4 | 0.284 | 0.181 | 4.515 | 3 | 0.211 | |
| | Language education | 23 | 1.629 | 0.486 | 331.692 | 22 | < 0.05 | |
| | Mathematics education | 1 | 0.274 | 0.324 | NA | NA | NA | |
| | Physics education | 2 | 1.130 | 0.128 | 0.000 | 1 | 0.997 | |
| | Science education | 5 | 0.975 | 0.890 | 130.125 | 4 | < 0.05 | |
| | Medical education | 1 | 2.334 | 0.402 | NA | NA | NA | |
| | Nurse education | 1 | 2.084 | 0.405 | NA | NA | NA | |
| | Health education | 3 | 0.743 | 0.167 | 0.362 | 2 | 0.834 | |
| | Computer science education | 7 | 0.467 | 0.655 | 190.030 | 6 | < 0.05 | |
| | Engineering education | 1 | 1.544 | 0.207 | NA | NA | NA | |
| Intervention duration | | | | | | | | χ²(2) = 1.864, p = 0.397 |
| | 1–4 weeks | 26 | 1.159 | 0.270 | 340.446 | 25 | < 0.05 | |
| | 5–8 weeks | 18 | 0.787 | 0.342 | 355.2363 | 17 | < 0.05 | |
| | > 8 weeks | 8 | 2.332 | 1.228 | 130.434 | 7 | < 0.05 | |
| Instructional role of GenAI | | | | | | | | χ²(4) = 10.641, p = 0.031 |
| | Assessment and evaluation | 4 | 2.019 | 0.586 | 46.288 | 3 | < 0.05 | |
| | Personalized recommendation | 1 | 0.490 | 0.311 | NA | NA | NA | |
| | Tutoring | 17 | 1.410 | 0.398 | 208.508 | 16 | < 0.05 | |
| | Mixed | 29 | 1.017 | 0.367 | 532.273 | 28 | < 0.05 | |
| | Others | 1 | 0.274 | 0.324 | NA | NA | NA | |

Educational Level
Differences across educational levels were not statistically significant (χ²(2) = 1.628, p = .443).
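The between-group statistic for educational level can be approximately recovered from the subgroup rows of Table 1. The sketch below treats the column labeled SD as the standard error of each subgroup mean, which is an assumption on our part rather than something the table states, and weights each subgroup by the inverse of its squared standard error. The function name is hypothetical, and the closed-form p-value used here is exact only for two degrees of freedom.

```python
import math

def q_between(subgroups):
    """Between-group Q from (mean_g, se) pairs; returns (Q, df)."""
    w = [1 / se**2 for _, se in subgroups]              # inverse-variance weights
    grand = sum(wi * g for wi, (g, _) in zip(w, subgroups)) / sum(w)
    q = sum(wi * (g - grand) ** 2 for wi, (g, _) in zip(w, subgroups))
    return q, len(subgroups) - 1

# Primary, secondary, and higher education rows from Table 1
education = [(1.007, 0.414), (0.715, 0.418), (1.397, 0.346)]
q, df = q_between(education)

# For df == 2 only, the chi-square upper-tail probability is exp(-Q / 2)
p = math.exp(-q / 2)
print(round(q, 2), df, round(p, 2))
```

Under these assumptions the result is Q ≈ 1.63 with p ≈ 0.44, in line with the reported χ²(2) = 1.628, p = .443. Moderators with other degrees of freedom would need a general chi-square tail function, which is omitted here.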
Still, a descriptive gradient appeared: studies at the higher-education level showed the largest mean effect (g = 1.397), followed by primary education (g = 1.007), with secondary education showing the smallest (g = 0.715). Although this difference cannot be interpreted as conclusive, it hints that GenAI may yield stronger outcomes among learners in higher education, who can make more autonomous use of the tools. The smaller gains observed in secondary education may reflect differences in curricular structure or support needs. Thus, even without statistical significance, the pattern suggests that developmental stage could influence the effectiveness of GenAI integration. Sample Size Sample size also did not significantly moderate effects (χ²(3) = 4.745, p = .191). Nonetheless, a clear descriptive tendency emerged. Studies with 51–100 participants reported the strongest mean effect (g = 1.517). Smaller studies with 1–50 participants showed a positive but lower effect (g = 1.075). Very large studies with over 150 participants reported a more modest effect (g = 0.671). The lowest mean appeared in the 101–150 participant group (g = 0.349). These findings should be read cautiously, but they suggest that study scale may shape outcomes. Mid-sized studies often occur in manageable settings where GenAI can be fully embedded, while very large studies may include heterogeneous contexts that dilute effects. The unexpectedly low value in the 101–150 group likely reflects study-specific features rather than a general trend. Altogether, while not statistically significant, the pattern hints that study scale could be a practical factor in how GenAI’s impact is realized. Learning Domain The difference between STEM and non-STEM domains was not significant (χ²(1) = 0.499, p = .480). The mean effect sizes were nearly identical: g = 1.329 for STEM and g = 1.313 for non-STEM. This similarity suggests that GenAI has broad applicability across disciplinary boundaries.
Although the contrast was nonsignificant, the fact that both groups showed large positive effects indicates that the benefits of GenAI are not confined to one type of learning domain. This consistency across fields strengthens the case for generalizability. Subject Area Subject area significantly moderated effects, χ²(10) = 51.949, p < .001. The strongest average gains were observed in language education (g = 1.629), followed by engineering (g = 1.544) and general education (g = 1.423). Smaller pooled effects were reported in computer science (g = 0.467) and art education (g = 0.284). Very large mean values appeared in medical education (g = 2.334) and nursing (g = 2.084), though each was based on a single study and thus should be interpreted cautiously. The relatively modest average in computer science is notable, especially given GenAI's conceptual alignment with this field. One possible reason is that current studies in CS education often focus on narrow skillsets such as syntax correction or code generation, rather than broader applications like project-based learning, peer feedback, or generative support for computational thinking. Moreover, the small number of available studies (K = 7) limits the stability of the pooled estimate, and the diversity in study design further complicates interpretation. As a result, the generalizability of the observed effect size in this area should be regarded as limited. By contrast, the consistently strong effects observed in language education likely reflect a tighter match between GenAI’s natural language capabilities and common instructional tasks in that domain, such as writing assistance, paraphrasing, translation, and iterative feedback. The extremely large means found in medical and nursing education should be treated cautiously due to their basis in single studies, but they suggest potentially promising areas for simulation-based or decision-support applications, pending further replication.
Altogether, the evidence indicates that GenAI’s impact is not distributed evenly across disciplines. It tends to be stronger where the tools' capabilities closely map onto instructional practices and weaker where current use cases remain limited in scope or scale. Computer science education, despite its intuitive connection to AI, may require broader pedagogical integration to fully reflect the field’s potential. Intervention Duration Intervention duration did not significantly moderate the effect, χ²(2) = 1.864, p = .397. The average effect size was highest for studies lasting more than eight weeks (g = 2.332), although this strikingly high estimate rests on the smallest set of studies (K = 8). Short-term interventions of one to four weeks followed (g = 1.159), and medium-length interventions of five to eight weeks showed the lowest average (g = 0.787). While these differences were not significant, the variation raises the possibility that exposure time plays a role in GenAI's impact. Instructional Role of GenAI Instructional role of GenAI significantly moderated outcomes, χ²(4) = 10.641, p = .031. The largest effects were found in assessment and evaluation (g = 2.019) and tutoring (g = 1.410). Mixed-use studies averaged g = 1.017, while recommendation-based (g = 0.490) and miscellaneous applications (g = 0.274) were lower. The pattern suggests that GenAI is most effective when integrated directly into instructional feedback and guidance processes. While the test was significant, some categories included very few studies, so interpretations should be tempered accordingly. 5. Discussion This meta-analysis of 52 experimental and quasi-experimental studies demonstrates that GenAI interventions produce a strong positive effect on student academic achievement (Hedges’ g = 1.193).
This estimate exceeds effect sizes reported in earlier reviews of AI or GenAI in education (Dong et al., 2025; X. Liu et al., 2025; Sun & Zhou, 2024; Tlili et al., 2025; Zhu et al., 2025). Several factors likely explain this difference, including the rapid advances in large language models since 2022, which have made GenAI tools more accurate, interactive, and versatile, as well as the broader scope of this review, which encompassed multiple educational levels, learning domains, subject areas and instructional designs. At the same time, the observed high heterogeneity indicates that effectiveness varies substantially across contexts. Understanding the sources of this variation is therefore critical. The moderator analyses revealed that the instructional role of GenAI and subject area were statistically significant, which indicates conditions under which GenAI is most effective. Other moderators (educational level, intervention duration, sample size, and learning domain) did not reach significance but showed descriptive patterns that can inform future research and implementation. 5.1 Instructional Role as a Core Driver The instructional role emerged as one of the most important moderators of GenAI effectiveness. Interventions in which GenAI was used for assessment and feedback produced the strongest effects, followed by tutoring applications. By contrast, recommendation-based or loosely defined uses were less effective. These findings align with the extensive literature on formative assessment, which emphasizes that timely, targeted, and individualized feedback is among the most powerful influences on student learning (Black & Wiliam, 2009; Hattie & Timperley, 2007). GenAI tools are well-suited to this function because they can provide immediate, personalized responses at scale, supplementing teacher feedback and allowing for more frequent cycles of practice and revision.
From the perspective of scaffolding theory (Van De Pol et al., 2010; Wood et al., 1976), GenAI can function as a flexible support mechanism that adapts to learners’ needs as they progress through their zone of proximal development (Vygotskij, 1981). In addition, self-regulated learning frameworks (Panadero, 2017; Zimmerman, 2002) suggest that such adaptive feedback promotes metacognitive monitoring, error detection, and iterative improvement, all of which are processes that GenAI-based tutoring or assessment tools can enhance. The weaker effects of recommendation-based or miscellaneous applications suggest that pedagogical alignment is essential. Simply deploying GenAI as a novelty or generic recommender does not guarantee benefits; in fact, such uses may fail to engage deeper learning processes. Together, these results suggest that GenAI’s strongest contributions occur when its functions are directly tied to formative assessment and tutoring, where feedback and adaptive guidance are critical for learning. 5.2 Subject Area Differences: Affordance-Practice Alignment The second significant moderator was subject area, with striking variation in effect sizes across disciplines. The largest and most consistent gains were observed in language education, followed by strong results in engineering and general education. By contrast, more modest effects were reported in computer science and art education, while single studies in medical and nursing education reported very large effects. The success of GenAI in language education likely reflects the close match between large language models’ capabilities and the core practices of the domain. Writing, translation, paraphrasing, and revision are all central to language learning, and GenAI can provide immediate, iterative, and tailored support in these areas (Hyland & Hyland, 2006). This natural synergy between affordance and task likely explains why language education consistently produced stronger effects.
The weaker effects in computer science are notable. Despite the field’s conceptual connection to AI, many CS interventions focused on narrow skill sets such as syntax correction or code generation (Yang et al., 2025; Zhao et al., 2025). These uses may ease extraneous cognitive load (Sweller, 1988) but do not necessarily foster deeper conceptual learning or computational thinking. More integrative approaches, such as project-based learning, peer review of code, or generative support for problem decomposition, may be needed to realize GenAI’s potential in this field. Similarly, in art education, current applications may not align well with the creative, iterative, and process-driven nature of artistic practice.

The very large effects reported in medical and nursing education, though based on single studies, point to promising directions. GenAI could be particularly useful in simulation-based learning, clinical decision-making, and diagnostic reasoning, where adaptive feedback and scenario generation are valuable (Chang et al., 2022; Cook et al., 2013). However, further replication is needed before strong conclusions can be drawn.

These results suggest that GenAI’s effectiveness depends on how closely its affordances align with the instructional practices of a domain. Language tasks map naturally onto LLM capabilities, while other fields may require more thoughtful integration to achieve similar benefits.

5.3 Contextual Moderators: Descriptive but Informative Patterns

Although not statistically significant, several other moderators revealed consistent descriptive patterns that provide insight into contextual conditions shaping GenAI’s impact.

Educational level. The gradient observed (higher education showing the strongest effects, followed by primary, then secondary) suggests that learner autonomy and developmental readiness may play important roles.
University students may be better equipped to independently integrate GenAI into their work, while primary students may benefit when teachers mediate GenAI use. Secondary students, in contrast, may be at a transitional stage where metacognitive monitoring is still developing (Kuhn, 2000; Pintrich, 2000), and where curricular structures may limit opportunities to use GenAI meaningfully (Mercer & Littleton, 2007). These patterns, while not conclusive, suggest that developmental stage could influence how effectively learners benefit from GenAI.

Intervention duration. Longer interventions (> 8 weeks) showed descriptively stronger effects than shorter ones, while medium-length interventions yielded the lowest averages. This pattern may reflect novelty effects in short-term studies and the stabilizing of benefits in longer-term integrations. Motivation research supports this interpretation: self-determination theory (Ryan & Deci, 2000) emphasizes that sustained engagement depends on autonomy, competence, and relatedness, while work on achievement emotions suggests that novelty-driven excitement tends to fade unless interventions are integrated into stable learning routines (Linnenbrink-Garcia & Pekrun, 2011; Pekrun, 2006; Tsay et al., 2020). These findings point to the importance of designing interventions for sustained rather than short-term use.

Sample size. The strongest effects were observed in mid-sized studies (51–100 participants), while very large studies yielded smaller effects. One explanation is that mid-sized studies often occur in classroom-based settings where implementation fidelity is high, while large studies encompass heterogeneous learners and contexts that dilute effects. At the same time, small-study bias may inflate results in underpowered designs (Ioannidis, 2005; Valentine et al., 2010). This pattern underscores the need for larger, well-controlled trials that balance fidelity with generalizability.

STEM vs. non-STEM domains.
Finally, the near-identical effects across STEM and non-STEM fields suggest that GenAI’s benefits are broadly applicable. This finding strengthens the case that GenAI is not restricted to text-rich domains but can enhance learning across diverse disciplines, provided applications are thoughtfully designed.

5.4 Limitations and Future Research

Several limitations warrant caution. First, evidence of publication bias suggests that smaller studies may overestimate effects, and the wide prediction interval indicates that not all interventions yield positive outcomes. Second, the evidence base remains dominated by short-term studies in higher education, leaving gaps in K–12 contexts where scaffolding and access issues are more pronounced. Third, few studies examined equity implications, raising concerns that GenAI could exacerbate existing digital divides (Capraro et al., 2024; Luo & Liu, 2025). Finally, the rapid evolution of large language models raises questions about the generalizability of current findings to future iterations.

Future research should address these gaps by conducting preregistered, large-scale trials with strong implementation fidelity; longitudinal studies to examine sustainability over time; and equity-focused analyses to ensure benefits are accessible across diverse learner populations. Design-based research (Brown, 1992; Cobb et al., 2003) will be particularly valuable for refining pedagogical integration, while cross-model comparisons are needed to disentangle technology-specific from pedagogy-specific effects.

5.5 Practical Significance

This meta-analysis shows that GenAI can substantially improve student learning when used thoughtfully, but not all uses are equally effective. The strongest benefits occur when GenAI supports formative roles such as providing feedback on student work or serving as a tutoring assistant.
In these cases, students receive more personalized guidance, which helps them practice, revise, and reflect in ways that accelerate learning. Language education stands out as the field with the largest gains, reflecting a strong match between GenAI’s language-generation strengths and writing, translation, and feedback tasks. By contrast, narrower applications in fields like computer science or art show smaller benefits, suggesting that more integrative and creative uses are needed.

For educators and decision-makers, the practical takeaway is that GenAI should be viewed as a pedagogical partner rather than a novelty. Its value lies not in replacing teachers, but in scaling up feedback, tutoring, and practice opportunities that are otherwise difficult to provide in large or diverse classrooms. At the same time, successful implementation requires attention to context: younger students need structured teacher support, longer-term use appears more beneficial than one-off trials, and equitable access must be ensured to avoid widening learning gaps. With thoughtful integration, GenAI can reduce instructional burden and expand opportunities for student-centered learning across a wide range of subjects.

5.6 Implications

For educators, these findings suggest that GenAI should be deployed strategically in formative roles, such as tutoring and feedback, where its capacity for personalization can enhance learning. In K–12 contexts, teacher mediation and scaffolding are critical to ensure productive engagement with AI tools.

For institutions, professional development is essential to prepare teachers not only to operate GenAI tools but also to integrate them into pedagogy responsibly. Implementation should be guided by clear ethical standards and curricular alignment (Holmes et al., 2022).

For policymakers, equity must be a central concern. Without intentional policies to ensure access, GenAI may widen digital divides (Bentley et al., 2024).
Policies that promote responsible adoption, transparency, and research–practice partnerships will be critical to maximizing benefits and minimizing risks.

At a system level, GenAI should be understood as a pedagogical partner rather than a replacement for teachers. When embedded thoughtfully, it can reduce instructional burden, strengthen formative assessment, and expand opportunities for student-centered learning.

5.7 Conclusion

This meta-analysis provides a comprehensive synthesis of GenAI’s effects on student achievement. While overall benefits are strong, outcomes depend on how, where, and for whom GenAI is implemented. Instructional role and subject area emerged as the strongest moderators, with particularly large effects for formative assessment, tutoring, and language education. Descriptive patterns further suggest that developmental stage, exposure time, and implementation fidelity condition effectiveness.

Looking forward, GenAI has the potential to transform education, but its promise will only be realized through responsible integration, equity-focused implementation, and sustained empirical evaluation. By aligning GenAI use with pedagogical theory, developmental needs, and institutional support, educators and policymakers can move beyond novelty to unlock its long-term potential for equitable, evidence-based learning.

Declarations

Author Contribution

X.T. contributed to conceptualization, methodology, formal analysis, data curation, software, validation, supervision, and writing (original draft and review & editing). X.W. contributed to methodology, investigation, formal analysis, data curation, software, validation, and writing (review & editing). L.D. contributed to methodology, formal analysis, data curation, software, validation, visualization, and writing (review & editing). J.Z. contributed to methodology, data curation, software, validation, and writing (review & editing). All authors reviewed and approved the final manuscript.
References

Ait Baha, T., El Hajji, M., Es-Saady, Y., & Fadili, H. (2024). The impact of educational chatbot on student learning experience. Education and Information Technologies, 29(8), 10153–10176. https://doi.org/10.1007/s10639-023-12166-w

Alneyadi, S., & Wardat, Y. (2024). Integrating ChatGPT in grade 12 quantum theory education: An exploratory study at Emirate School (UAE). International Journal of Information and Education Technology, 14(3), 398–410. https://doi.org/10.18178/ijiet.2024.14.3.2061

Al Ghaithi, A., & Behforouz, B. (2024). The use of an interactive ChatBot in grammar learning. Journal of Educators Online, 21(4), n4.

Avello-Martínez, R., Gajderowicz, T., & Gómez-Rodríguez, V. G. (2024). Is ChatGPT helpful for graduate students in acquiring knowledge about digital storytelling and reducing their cognitive load? An experiment. Revista De Educación a Distancia (RED), 24(78). https://doi.org/10.6018/red.604621

Behforouz, B., & Ghaithi, A. A. (2024). Investigating the effect of an interactive educational chatbot on reading comprehension skills. International Journal of Engineering Pedagogy (iJEP), 14(4), 139–154. https://doi.org/10.3991/ijep.v14i4.48461

Beltozar-Clemente, S., & Díaz-Vega, E. (2024). Physics XP: Integration of ChatGPT and gamification to improve academic performance and motivation in physics 1 course. International Journal of Engineering Pedagogy, 14(6). https://doi.org/10.3991/ijep.v14i6.47127

Bentley, S. V., Naughtin, C. K., McGrath, M. J., Irons, J. L., & Cooper, P. S. (2024). The digital divide in action: How experiences of digital technology shape future relationships with artificial intelligence. AI and Ethics, 4(4), 901–915. https://doi.org/10.1007/s43681-024-00452-3

Black, P., & Wiliam, D. (2009). Developing the theory of formative assessment. Educational Assessment, Evaluation and Accountability, 21(1), 5–31. https://doi.org/10.1007/s11092-008-9068-5

Brown, A. L. (1992).
Design experiments: Theoretical and methodological challenges in creating complex interventions in classroom settings. Journal of the Learning Sciences, 2(2), 141–178. https://doi.org/10.1207/s15327809jls0202_2

Capraro, V., Lentsch, A., Acemoglu, D., Akgun, S., Akhmedova, A., Bilancini, E., Bonnefon, J.-F., Brañas-Garza, P., Butera, L., Douglas, K. M., Everett, J. A. C., Gigerenzer, G., Greenhow, C., Hashimoto, D. A., Holt-Lunstad, J., Jetten, J., Johnson, S., Kunz, W. H., Longoni, C., … Viale, R. (2024). The impact of generative artificial intelligence on socioeconomic inequalities and policy making. PNAS Nexus, 3(6), pgae191. https://doi.org/10.1093/pnasnexus/pgae191

Chang, C., Hwang, G., & Gau, M. (2022). Promoting students’ learning achievement and self‐efficacy: A mobile chatbot approach for nursing training. British Journal of Educational Technology, 53(1), 171–188. https://doi.org/10.1111/bjet.13158

Chen, C., & Chang, C. (2024). Effectiveness of AI-assisted game-based learning on science learning outcomes, intrinsic motivation, cognitive load, and learning behavior. Education and Information Technologies, 29(14), 18621–18642. https://doi.org/10.1007/s10639-024-12553-x

Chen, C., & Gong, Y. (2025). The role of AI-assisted learning in academic writing: A mixed-methods study on Chinese as a second language students. Education Sciences, 15(2), 141. https://doi.org/10.3390/educsci15020141

Chen, J., Mokmin, N. A. M., Shen, Q., & Su, H. (2025). Leveraging AI in design education: Exploring virtual instructors and conversational techniques in flipped classroom models. Education and Information Technologies, 1–21. https://doi.org/10.1007/s10639-025-13458-z

Chen, M. R. A. (2025). Improving English semantic learning outcomes through AI chatbot-based ARCS approach. Interactive Learning Environments, 1–16. https://doi.org/10.1080/10494820.2025.2454443

Chen, M.-R. A. (2024).
Metacognitive mastery: Transformative learning in EFL through a generative AI chatbot fueled by metalinguistic guidance. Educational Technology & Society, 27(3), 407–427. https://www.jstor.org/stable/48787038

Chen, M. R. A. (2024). The AI chatbot interaction for semantic learning: A collaborative note-taking approach with EFL students. Language Learning & Technology, 28(1), 1–25.

Chen, Y., Zhang, X., & Hu, L. (2024). A progressive prompt-based image-generative AI approach to promoting students’ achievement and perceptions in learning ancient Chinese poetry. Educational Technology & Society, 27(2), 284–305. https://hdl.handle.net/10125/73586

Chu, H.-C., Hsu, C.-Y., & Wang, C.-C. (2025). Effects of AI-generated drawing on students’ learning achievement and creativity in an ancient poetry course. Educational Technology & Society, 28(2). https://doi.org/10.30191/ETS.202504_28(2).TP03

Cobb, P., Confrey, J., diSessa, A., Lehrer, R., & Schauble, L. (2003). Design experiments in educational research. Educational Researcher, 32(1), 9–13. https://doi.org/10.3102/0013189X032001009

Cook, D. A., Hamstra, S. J., Brydges, R., Zendejas, B., Szostek, J. H., Wang, A. T., Erwin, P. J., & Hatala, R. (2013). Comparative effectiveness of instructional design features in simulation-based education: Systematic review and meta-analysis. Medical Teacher, 35(1), e867–e898. https://doi.org/10.3109/0142159X.2012.714886

Dong, L., Tang, X., & Wang, X. (2025). Examining the effect of artificial intelligence in relation to students’ academic achievement: A meta-analysis. Computers and Education: Artificial Intelligence, 8, 100400. https://doi.org/10.1016/j.caeai.2025.100400

Edwards, B. I., Olugbade, D., & Ojo, O. A. (2024). Facilitating cognitive load management and improved learning outcomes and attitudes in middle school technology and vocational education through AI chatbot. Journal of Technical Education and Training, 16(3), 114–131.
https://penerbit.uthm.edu.my/ojs/index.php/JTET/article/view/19476

Fui-Hoon Nah, F., Zheng, R., Cai, J., Siau, K., & Chen, L. (2023). Generative AI and ChatGPT: Applications, challenges, and AI-human collaboration. Journal of Information Technology Case and Application Research, 25(3), 277–304. https://doi.org/10.1080/15228053.2023.2233814

Fathi, T. E., Saad, A., Larhzil, H., Lamri, D., & Ibrahmi, E. M. A. (2025). Integrating generative AI into STEM education: Enhancing conceptual understanding, addressing misconceptions, and assessing student acceptance. Disciplinary and Interdisciplinary Science Education Research, 7(1). https://doi.org/10.1186/s43031-025-00125-z

Fidan, M., & Gencel, N. (2022). Supporting the instructional videos with chatbot and peer feedback mechanisms in online learning: The effects on learning performance and intrinsic motivation. Journal of Educational Computing Research, 60(7), 1716–1741. https://doi.org/10.1177/07356331221077901

Gasaymeh, A. M. M., & AlMohtadi, R. M. (2024). The effect of flipped interactive learning (FIL) based on ChatGPT on students’ skills in a large programming class. International Journal of Information and Education Technology, 14(11), 1516–1522. https://doi.org/10.18178/ijiet.2024.14.11.2182

Gong, X., Li, Z., & Qiao, A. (2025). Impact of generative AI dialogic feedback on different stages of programming problem solving. Education and Information Technologies, 30(7), 9689–9709. https://doi.org/10.1007/s10639-024-13173-1

Hakim, V. G. A., Paiman, N. A., & Rahman, M. H. S. (2024). Genie‐on‐demand: A custom AI chatbot for enhancing learning performance, self‐efficacy, and technology acceptance in occupational health and safety for engineering education. Computer Applications in Engineering Education, 32(6), e22800. https://doi.org/10.1002/cae.22800

Hattie, J., & Timperley, H. (2007). The Power of Feedback. Review of Educational Research, 77(1), 81–112.
https://doi.org/10.3102/003465430298487

Holmes, W., Porayska-Pomsta, K., Holstein, K., Sutherland, E., Baker, T., Shum, S. B., Santos, O. C., Rodrigo, M. T., Cukurova, M., Bittencourt, I. I., & Koedinger, K. R. (2022). Ethics of AI in Education: Towards a Community-Wide Framework. International Journal of Artificial Intelligence in Education, 32(3), 504–526. https://doi.org/10.1007/s40593-021-00239-1

Hsu, M. H. (2024). Mastering medical terminology with ChatGPT and Termbot. Health Education Journal, 83(4), 352–358. https://doi.org/10.1177/00178969231197371

Hui, Z., Zewu, Z., Jiao, H., & Yu, C. (2025). Application of ChatGPT-assisted problem-based learning teaching method in clinical medical education. BMC Medical Education, 25(1), 1–7. https://doi.org/10.1186/s12909-024-06321-1

Hwang, G. J., & Zhang, D. (2024). Effects of an adaptive computer agent-based digital game on EFL students’ English learning outcomes. Educational Technology Research and Development, 72(6), 3271–3294. https://doi.org/10.1007/s11423-024-10396-4

Hyland, K., & Hyland, F. (2006). Feedback on second language students’ writing. Language Teaching, 39(2), 83–101. https://doi.org/10.1017/S0261444806003399

Ioannidis, J. P. A. (2005). Why most published research findings are false. PLoS Medicine, 2(8), e124. https://doi.org/10.1371/journal.pmed.0020124

Jeon, J. (2023). Chatbot-assisted dynamic assessment (CA-DA) for L2 vocabulary learning and diagnosis. Computer Assisted Language Learning, 36(7), 1338–1364. https://doi.org/10.1080/09588221.2021.1987272

Ji, Y., Zhan, Z., Li, T., Zou, X., & Lyu, S. (2025). Human-machine co-creation: The effects of ChatGPT on students' learning performance, AI awareness, critical thinking, and cognitive load in a STEM course towards entrepreneurship. IEEE Transactions on Learning Technologies. https://doi.org/10.1109/TLT.2025.3554584

Karaman, M. R., & Göksu, İ. (2024). Are lesson plans created by ChatGPT more effective? An experimental study.
International Journal of Technology in Education, 7(1), 107–127. https://doi.org/10.46328/ijte.607

Kuhn, D. (2000). Metacognitive development. Current Directions in Psychological Science, 9(5), 178–181. https://doi.org/10.1111/1467-8721.00088

Lee, Y.-F., Hwang, G.-J., & Chen, P.-Y. (2022). Impacts of an AI-based chabot on college students’ after-class review, academic performance, self-efficacy, learning attitude, and motivation. Educational Technology Research and Development, 70(5), 1843–1865. https://doi.org/10.1007/s11423-022-10142-8

Lee, Y. F., Hwang, G. J., & Chen, P. Y. (2025). Technology-based interactive guidance to promote learning performance and self-regulation: A chatbot-assisted self-regulated learning approach. Educational Technology Research and Development, 1–26. https://doi.org/10.1007/s11423-025-10478-x

Li, H. (2023). Effects of a ChatGPT-based flipped learning guiding approach on learners’ courseware project performances and perceptions. Australasian Journal of Educational Technology, 39(5), 40–58. https://doi.org/10.14742/ajet.8923

Li, H., Wang, Y., Luo, S., & Huang, C. (2025). The influence of GenAI on the effectiveness of argumentative writing in higher education: Evidence from a quasi-experimental study in China. Journal of Asian Public Policy, 18(2), 405–430. https://doi.org/10.1080/17516234.2024.2363128

Li, Y., Sadiq, G., Qambar, G., & Zheng, P. (2025). The impact of students’ use of ChatGPT on their research skills: The mediating effects of autonomous motivation, engagement, and self-directed learning. Education and Information Technologies, 30(4), 4185–4216. https://doi.org/10.1007/s10639-024-12981-9

Liang, H. Y., Hwang, G. J., Hsu, T. Y., & Yeh, J. Y. (2024). Effect of an AI‐based chatbot on students' learning performance in alternate reality game‐based museum learning. British Journal of Educational Technology, 55(5), 2315–2338. https://doi.org/10.1111/bjet.13448

Linnenbrink-Garcia, L., & Pekrun, R. (2011).
Students’ emotions and academic engagement: Introduction to the special issue. Contemporary Educational Psychology, 36(1), 1–3. https://doi.org/10.1016/j.cedpsych.2010.11.004

Liu, C. C., Hwang, G. J., Yu, P., Tu, Y. F., & Wang, Y. (2025). Effects of an automated corrective feedback-based peer assessment approach on students’ learning achievement, motivation, and self-regulated learning conceptions in foreign language pronunciation. Educational Technology Research and Development, 1–22. https://doi.org/10.1007/s11423-025-10484-z

Liu, X., Guo, B., He, W., & Hu, X. (2025). Effects of generative artificial intelligence on K-12 and higher education students’ learning outcomes: A meta-analysis. Journal of Educational Computing Research, 63(5), 1249–1291. https://doi.org/10.1177/07356331251329185

Liu, Z.-M., Hwang, G.-J., Chen, C.-Q., Chen, X.-D., & Ye, X.-D. (2024). Integrating large language models into EFL writing instruction: Effects on performance, self-regulated learning strategies, and motivation. Computer Assisted Language Learning, 1–25. https://doi.org/10.1080/09588221.2024.2389923

Luo, J. (Jess), & Liu, X. (Caroline). (2025). What do we mean by digital equality in education? Toward five conceptual lenses based on a systematic review. Journal of Research on Technology in Education, 1–21. https://doi.org/10.1080/15391523.2025.2487279

Mahapatra, S. (2024). Impact of ChatGPT on ESL students’ academic writing skills: A mixed methods intervention study. Smart Learning Environments, 11(1). https://doi.org/10.1186/s40561-024-00295-9

Mercer, N., & Littleton, K. (2007). Dialogue and the Development of Children’s Thinking. Routledge. https://doi.org/10.4324/9780203946657

Moher, D., Liberati, A., Tetzlaff, J., Altman, D. G., & for the PRISMA Group. (2009). Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. BMJ, 339, b2535. https://doi.org/10.1136/bmj.b2535

Nusivera, E., & Hikmat, A. (2025).
Integration of Chat-GPT usage in language learning model to improve argumentation skills, complex comprehension skills, and critical thinking skills. International Journal of Learning, Teaching and Educational Research, 24(2), 375–390. https://doi.org/10.26803/ijlter.24.2.19

Panadero, E. (2017). A review of self-regulated learning: Six models and four directions for research. Frontiers in Psychology, 8, 422. https://doi.org/10.3389/fpsyg.2017.00422

Pekrun, R. (2006). The control-value theory of achievement emotions: Assumptions, corollaries, and implications for educational research and practice. Educational Psychology Review, 18(4), 315–341. https://doi.org/10.1007/s10648-006-9029-9

Pintrich, P. R. (2000). The role of goal orientation in self-regulated learning. In Handbook of Self-Regulation (pp. 451–502). Elsevier. https://doi.org/10.1016/B978-012109890-2/50043-3

Ryan, R. M., & Deci, E. L. (2000). Self-determination theory and the facilitation of intrinsic motivation, social development, and well-being. American Psychologist, 55(1), 68–78. https://doi.org/10.1037/0003-066X.55.1.68

Sapan, M., & Uzun, L. (2024). The effect of ChatGPT-integrated English teaching on high school EFL learners’ writing skills and vocabulary development. International Journal of Education in Mathematics, Science and Technology, 12(6), 1679–1699. https://doi.org/10.46328/ijemst.4655

Shahsavar, Z., Kafipour, R., Khojasteh, L., & Pakdel, F. (2024). Is artificial intelligence for everyone? Analyzing the role of ChatGPT as a writing assistant for medical students. Frontiers in Education, 9. https://doi.org/10.3389/feduc.2024.1457744

Shi, H., Chai, C. S., Zhou, S., & Aubrey, S. (2025). Comparing the effects of ChatGPT and automated writing evaluation on students’ writing and ideal L2 writing self. Computer Assisted Language Learning, 1–28. https://doi.org/10.1080/09588221.2025.2454541

Sun, L., & Zhou, L. (2024). Does generative artificial intelligence improve the academic achievement of college students? A meta-analysis.
Journal of Educational Computing Research, 62(7), 1676–1713. https://doi.org/10.1177/07356331241277937

Sweller, J. (1988). Cognitive load during problem solving: Effects on learning. Cognitive Science, 12(2), 257–285. https://doi.org/10.1207/s15516709cog1202_4

Teng, D., Wang, X., Xia, Y., Zhang, Y., Tang, L., Chen, Q., Zhang, R., Xie, S., & Yu, W. (2025). Investigating the utilization and impact of large language model-based intelligent teaching assistants in flipped classrooms. Education and Information Technologies, 30(8), 10777–10810. https://doi.org/10.1007/s10639-024-13264-z

Tlili, A., Saqer, K., Salha, S., & Huang, R. (2025). Investigating the effect of artificial intelligence in education (AIEd) on learning achievement: A meta-analysis and research synthesis. Information Development, 41(3), 825–842. https://doi.org/10.1177/02666669241304407

Tsay, C. H., Kofinas, A. K., Trivedi, S. K., & Yang, Y. (2020). Overcoming the novelty effect in online gamified learning systems: An empirical evaluation of student engagement and performance. Journal of Computer Assisted Learning, 36(2), 128–146. https://doi.org/10.1111/jcal.12385

Valentine, J. C., Pigott, T. D., & Rothstein, H. R. (2010). How many studies do you need?: A primer on statistical power for meta-analysis. Journal of Educational and Behavioral Statistics, 35(2), 215–247. https://doi.org/10.3102/1076998609346961

Van De Pol, J., Volman, M., & Beishuizen, J. (2010). Scaffolding in teacher–student interaction: A decade of research. Educational Psychology Review, 22(3), 271–296. https://doi.org/10.1007/s10648-010-9127-6

Vygotskij, L. S. (1981). Mind in society: The development of higher psychological processes (Reprint). Harvard University Press.

Wang, C., Zou, B., Du, Y., & Wang, Z. (2024). The impact of different conversational generative AI chatbots on EFL learners: An analysis of willingness to communicate, foreign language speaking anxiety, and self-perceived communicative competence.
System, 127, 103533. https://doi.org/10.1016/j.system.2024.103533

Wang, H., Wang, C., Chen, Z., Liu, F., Bao, C., & Xu, X. (2025). Impact of AI-agent-supported collaborative learning on the learning outcomes of University programming courses. Education and Information Technologies. https://doi.org/10.1007/s10639-025-13487-8

Wang, M., Zhang, D., Zhu, J., & Gu, H. (2025). Effects of incorporating a large language model-based adaptive mechanism into contextual games on students’ academic performance, flow experience, cognitive load and behavioral patterns. Journal of Educational Computing Research, 63(3), 662–694. https://doi.org/10.1177/07356331251321719

Wang, Y. (2025). A study on the efficacy of ChatGPT-4 in enhancing students’ English communication skills. Sage Open, 15(1), 21582440241310644. https://doi.org/10.1177/21582440241310644

Wei, X., Wang, L., Lee, L. K., & Liu, R. (2025). Multiple generative AI pedagogical agents in augmented reality environments: A study on implementing the 5E model in science education. Journal of Educational Computing Research, 63(2), 336–371. https://doi.org/10.1177/07356331241305519

Wood, D., Bruner, J. S., & Ross, G. (1976). The role of tutoring in problem solving. Journal of Child Psychology and Psychiatry, 17(2), 89–100. https://doi.org/10.1111/j.1469-7610.1976.tb00381.x

Wu, T.-T., Hapsari, I. P., & Huang, Y.-M. (2025). Effects of incorporating AI chatbots into think–pair–share activities on EFL speaking anxiety, language enjoyment, and speaking performance. Computer Assisted Language Learning, 1–39. https://doi.org/10.1080/09588221.2025.2478271

Yang, T.-C., Hsu, Y.-C., & Wu, J.-Y. (2025). The effectiveness of ChatGPT in assisting high school students in programming learning: Evidence from a quasi-experimental research. Interactive Learning Environments, 1–18. https://doi.org/10.1080/10494820.2025.2450659

Zahran, F. A. (2025).
The Impact of using Poe ChatGPT-based TPACK model on English as a foreign language teachers' performance and their students' vocabulary learning. Higher Learning Research Communications, 15(1), n1.

Zhao, G., Yang, L., Hu, B., & Wang, J. (2025). A Generative artificial intelligence (AI)-based human-computer collaborative programming learning method to improve computational thinking, learning attitudes, and learning achievement. Journal of Educational Computing Research, 63(5), 1059–1087. https://doi.org/10.1177/07356331251336154

Zhou, Q., Hashim, H., & Sulaiman, N. A. (2025). Supporting English speaking practice in higher education: The impact of AI chatbot-integrated mobile-assisted blended learning framework. Education and Information Technologies, 1–32.

Zhu, Y., Liu, Q., & Zhao, L. (2025). Exploring the impact of Generative artificial intelligence on students’ learning outcomes: A meta-analysis. Education and Information Technologies, 30(11), 16211–16239. https://doi.org/10.1007/s10639-025-13420-z

Zimmerman, B. J. (2002). Becoming a self-regulated learner: An overview. Theory Into Practice, 41(2), 64–70. https://doi.org/10.1207/s15430421tip4102_2

Additional Declarations

No competing interests reported.

Supplementary Files

Appendix.docx
21:05:35","extension":"png","order_by":7,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":21353,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-7577394/v1/27fb9d32ec560a14285a49dd.png"},{"id":92290697,"identity":"a3f9d7b7-ec8b-4d68-88af-c9c5ba2cde37","added_by":"auto","created_at":"2025-09-26 21:05:41","extension":"png","order_by":8,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":10551,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage2.png","url":"https://assets-eu.researchsquare.com/files/rs-7577394/v1/03203e0f046e7df21bf52e56.png"},{"id":92290707,"identity":"7872467c-6cde-43fa-b102-ff06af2c4fa0","added_by":"auto","created_at":"2025-09-26 21:05:43","extension":"png","order_by":9,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":33700,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage3.png","url":"https://assets-eu.researchsquare.com/files/rs-7577394/v1/ddb9243566dbd0cf3f059c9d.png"},{"id":92290684,"identity":"3dbe2ebd-8d15-437d-a34d-de83e9d9f208","added_by":"auto","created_at":"2025-09-26 21:05:40","extension":"png","order_by":10,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":9085,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage4.png","url":"https://assets-eu.researchsquare.com/files/rs-7577394/v1/da01761f12da6e8dfc0b63cb.png"},{"id":92290655,"identity":"e88e09e3-eeeb-4d3d-bc9e-c6b9e42abbbf","added_by":"auto","created_at":"2025-09-26 
21:05:36","extension":"xml","order_by":11,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":242840,"visible":true,"origin":"","legend":"","description":"","filename":"f23469bb0c884faf8acf079e62657ea01structuring.xml","url":"https://assets-eu.researchsquare.com/files/rs-7577394/v1/5b227b14fa259374ec560b29.xml"},{"id":92290683,"identity":"dba90cfd-ba40-4a82-ac53-e647d6fcb078","added_by":"auto","created_at":"2025-09-26 21:05:40","extension":"html","order_by":12,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":255197,"visible":true,"origin":"","legend":"","description":"","filename":"earlyproof.html","url":"https://assets-eu.researchsquare.com/files/rs-7577394/v1/15335396298c3f818f0fbef8.html"},{"id":92290629,"identity":"9accfbfb-685a-45af-bac0-30a7bbc16e9d","added_by":"auto","created_at":"2025-09-26 21:05:33","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":126722,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cem\u003eFlow Diagram\u003c/em\u003e\u003c/p\u003e","description":"","filename":"1.png","url":"https://assets-eu.researchsquare.com/files/rs-7577394/v1/2094e2a7d27c1585426631eb.png"},{"id":92290693,"identity":"2fbfe898-ab85-4103-9c34-941caea263f1","added_by":"auto","created_at":"2025-09-26 21:05:41","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":32390,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cem\u003eDistribution of 52 Unweighted Hedge’s g Effect Sizes\u003c/em\u003e\u003c/p\u003e","description":"","filename":"2.png","url":"https://assets-eu.researchsquare.com/files/rs-7577394/v1/8000f0befae2590fd9fbfe7a.png"},{"id":92291331,"identity":"06ebacfe-297d-49eb-bdab-7d27c7b1c3b7","added_by":"auto","created_at":"2025-09-26 21:27:13","extension":"png","order_by":3,"title":"Figure 
3","display":"","copyAsset":false,"role":"figure","size":164129,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cem\u003eDistribution of the Weighted Effect Sizes and Their CI\u003c/em\u003e\u003c/p\u003e","description":"","filename":"3.png","url":"https://assets-eu.researchsquare.com/files/rs-7577394/v1/b6de3c26fbe92c4eb1a5082b.png"},{"id":92290641,"identity":"faa01c49-644a-46b4-9260-8472fcb6eec6","added_by":"auto","created_at":"2025-09-26 21:05:35","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":33916,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cem\u003eFunnel Plot for Publication Bias Assessment\u003c/em\u003e\u003c/p\u003e","description":"","filename":"4.png","url":"https://assets-eu.researchsquare.com/files/rs-7577394/v1/efd38958e620f9bc55e7e683.png"},{"id":95534887,"identity":"398bf360-ecf6-496e-9b27-7a54cea2c557","added_by":"auto","created_at":"2025-11-10 10:29:57","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1404794,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7577394/v1/3dbaa090-db2a-4c58-8a30-4b421839505e.pdf"},{"id":92290635,"identity":"d7d112d2-954f-40eb-bd59-7faee38dc9b5","added_by":"auto","created_at":"2025-09-26 21:05:33","extension":"docx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":22802,"visible":true,"origin":"","legend":"","description":"","filename":"Appendix.docx","url":"https://assets-eu.researchsquare.com/files/rs-7577394/v1/53c6085a17769cc5de4c7b85.docx"}],"financialInterests":"No competing interests reported.","formattedTitle":"Transforming Learning or Empty Promise? A Meta-Analysis of Generative AI in Education","fulltext":[{"header":"1. 
Introduction","content":"\u003cp\u003eThe rapid advent of generative artificial intelligence (GenAI) tools such as ChatGPT, Gemini, and Claude is reshaping educational landscapes by offering new possibilities for supporting and enhancing learning. Unlike earlier forms of artificial intelligence (AI), GenAI technologies generate human-like text, facilitate problem-solving, and provide interactive, context-sensitive assistance (Fui-Hoon Nah et al., \u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e2023\u003c/span\u003e). These distinctive capabilities have attracted widespread attention, but empirical evidence on their educational impact is mixed, highlighting a pressing need for systematic synthesis.\u003c/p\u003e\u003cp\u003eMeta-analyses of AI in education more broadly have consistently found positive effects. For example, Dong et al. (\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e2025\u003c/span\u003e) synthesized 29 empirical studies of diverse AI technologies and reported a large overall effect (g\u0026thinsp;=\u0026thinsp;0.92), with variation across educational levels and subjects. Tlili et al. (\u003cspan citationid=\"CR66\" class=\"CitationRef\"\u003e2025\u003c/span\u003e), analyzing 85 studies, likewise found a large average effect (g\u0026thinsp;=\u0026thinsp;1.10), especially highlighting the effectiveness of chatbots. While these reviews established AI\u0026rsquo;s general potential, they combined traditional AI systems with emerging GenAI tools, leaving the unique contribution of GenAI unclear.\u003c/p\u003e\u003cp\u003eMore recently, several reviews have examined GenAI specifically (e.g., Liu et al., \u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e2025\u003c/span\u003e; Sun \u0026amp; Zhou, \u003cspan citationid=\"CR63\" class=\"CitationRef\"\u003e2024\u003c/span\u003e; Zhu et al., \u003cspan citationid=\"CR82\" class=\"CitationRef\"\u003e2025\u003c/span\u003e). 
Although these studies provide valuable early insights, their scope remains limited. Sun and Zhou (\u003cspan citationid=\"CR63\" class=\"CitationRef\"\u003e2024\u003c/span\u003e), for instance, focused exclusively on college students and synthesized only 28 studies. Zhu et al. (\u003cspan citationid=\"CR82\" class=\"CitationRef\"\u003e2025\u003c/span\u003e) included both K\u0026ndash;12 and higher education contexts but drew on just 26 studies and reported modest effects (g\u0026thinsp;=\u0026thinsp;0.39) across varied outcomes. Liu et al. (\u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e2025\u003c/span\u003e) analyzed a larger set of 49 studies and found substantial effects on achievement and motivation, but their moderator analyses concentrated on media format and interface features rather than pedagogical functions. These reviews demonstrate the potential of GenAI but leave important questions unanswered regarding its impact on academic achievement across educational levels, learning domains, and instructional designs, including how study design features (e.g., sample size) and the instructional roles assigned to GenAI influence outcomes.\u003c/p\u003e\u003cp\u003eThe present meta-analysis addresses these gaps by synthesizing recent peer-reviewed, quasi-experimental and experimental studies of GenAI-based instruction published since late 2022. The evidence base spans all educational levels, from K\u0026ndash;12 to higher education. By focusing on achievement outcomes and analyzing a larger and more up-to-date body of studies, this study provides a comprehensive and nuanced account of GenAI\u0026rsquo;s educational effects. It also incorporates several key moderators, including participant sample size, intervention duration, learning domain, subject area, the instructional role of GenAI, and educational level. In doing so, it extends prior reviews and clarifies the conditions under which GenAI is most effective. 
This contributes to both theory and practice by informing pedagogical design, institutional decision-making, and policy development.\u003c/p\u003e\u003cp\u003e\u003cb\u003eResearch Questions\u003c/b\u003e\u003c/p\u003e\u003cp\u003eThis meta-analysis synthesizes findings from 52 peer-reviewed empirical studies published between November 2022 and 2025 that met strict eligibility criteria. The study addresses two central questions:\u003c/p\u003e\u003cp\u003eRQ1: What is the overall effect of GenAI-based instructional interventions on academic performance outcomes (e.g., test scores, grades, GPA)?\u003c/p\u003e\u003cp\u003eRQ2: Under what conditions does GenAI most effectively improve student achievement (e.g., across educational levels, learning domains, instructional roles, subject areas, or study designs)?\u003c/p\u003e"},{"header":"2. Literature Review","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e\u003ch2\u003e2.1 Learning Domains: STEM vs non-STEM\u003c/h2\u003e\u003cp\u003eThe pedagogical utility of GenAI is influenced by the academic discipline in which it is applied. Evidence shows that GenAI benefits both STEM and non-STEM fields, but the mechanisms through which it operates differ. In non-STEM disciplines such as language acquisition and the social sciences, GenAI tools have been especially effective for communication, critical inquiry, and affective engagement. For example, in English as a Foreign Language (EFL) instruction, an AI chatbot integrated into \u0026ldquo;think-pair-share\u0026rdquo; activities reduced students\u0026rsquo; speaking anxiety while increasing enjoyment and oral proficiency (Wu et al., 2025). 
Likewise, in an undergraduate research methods course, the use of ChatGPT enhanced research competencies and fostered autonomous motivation and self-directed learning (Li et al., 2025).\u003c/p\u003e\u003cp\u003eIn contrast, within STEM fields such as computer science and engineering, GenAI's primary value appears to lie in its ability to manage cognitive load and provide scaffolded, personalized learning support. For example, a study conducted in a computer vision course found that an intelligent teaching assistant powered by a large language model not only improved academic outcomes but also helped students develop higher-order cognitive skills like active questioning and summarization (Teng et al., 2025). Furthermore, in a university programming course, an AI-agent-supported collaborative learning framework was shown to enhance learning achievement while simultaneously reducing the cognitive effort required of students and increasing their self-efficacy (Wang et al., 2025). In these settings, the AI agent acts as a dynamic, intelligent scaffold, personalizing the learning pathway and enabling students to tackle complex problems more effectively.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec4\" class=\"Section2\"\u003e\u003ch2\u003e2.2 Educational Level Differences\u003c/h2\u003e\u003cp\u003eBeyond the subject matter, the developmental stage of the learner also plays a crucial role in determining the effectiveness and appropriate application of GenAI. The existing literature indicates that while GenAI is a potent tool across various age groups, its impact diverges between higher education and K-12 settings, with pedagogical strategy being a particularly critical variable for younger learners.\u003c/p\u003e\u003cp\u003eMost existing research has been conducted in higher education, where students generally benefit from greater autonomy and digital fluency. 
In this context, studies have consistently found that learners using GenAI tools outperform their peers in control conditions (Lee et al., 2022; Wang, 2025). For example, in diverse courses spanning public health, engineering, and language learning, university students using GenAI tools have produced higher-quality work, demonstrated stronger comprehension, and reported greater confidence and motivation. Specifically, Lee et al. (2022) found that an AI-based chatbot used for after-class review enhanced academic performance and self-efficacy among college students. Similarly, Wang (2025) reported that undergraduates using ChatGPT-4 for English practice showed significant gains in their communication skills and a high degree of acceptance for the technology. These results suggest that university students are well-positioned to engage productively with GenAI, especially when the technology supports complex tasks like revision, self-regulation, or conceptual reasoning.\u003c/p\u003e\u003cp\u003eIn K-12 education, the results are more variable and highly dependent on pedagogical support (Alneyadi \u0026amp; Wardat, 2024; Jeon, 2023; Liu et al., 2024; Sapan \u0026amp; Uzun, 2024). For secondary students, interventions that incorporate strong instructional design tend to yield better outcomes. This is supported by Alneyadi and Wardat (2024), who demonstrated that integrating ChatGPT into a Grade 12 Quantum Theory course significantly enhanced student achievement. However, the importance of this scaffolding is underscored by the contrasting findings of Sapan and Uzun (2024), where traditional instruction proved more effective than a less structured ChatGPT integration for improving high school students' writing skills. Research in primary education, while rarer, echoes this theme. Early evidence suggests that GenAI can support foundational skills only when carefully scaffolded. 
For example, Jeon (2023) found that a chatbot employing \"dynamic assessment\" with graduated, interactive assistance was highly effective for elementary school students learning vocabulary, while Liu et al. (2024) showed that a teacher-guided, LLM-supported model significantly improved writing performance. Overall, these findings confirm educational level as a key moderator, with the most consistent gains observed in postsecondary contexts and an increasing need for structured, teacher-supported interventions for younger learners.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec5\" class=\"Section2\"\u003e\u003ch2\u003e2.3 Roles of GenAI in Instructional Contexts\u003c/h2\u003e\u003cp\u003eThe role that GenAI plays within the instructional design of each study also varies widely. Based on a synthesis of recent literature, GenAI interventions typically fall into at least three major categories, which are critical for understanding their impact. First, in the role of tutoring and scaffolding, many studies used GenAI to act as a conversational tutor or a tool to support student learning. These interventions aimed to replicate or enhance elements of one-on-one instruction. For example, some used GenAI as a conversational partner with human-like avatars to reduce speaking anxiety and build student confidence (Wang et al., 2024), while others used it as a creative scaffolding tool, translating students' textual descriptions of poetry into visual images to deepen their comprehension (Chu et al., 2025).\u003c/p\u003e\u003cp\u003eSecond, some studies used GenAI primarily for assessment and feedback, which emphasizes iterative improvement and reflection. In this capacity, GenAI was deployed to automatically evaluate student work and provide formative feedback. 
For instance, it has been used to provide \"dialogic feedback\" on complex tasks like computer programming, where the timing and context of GenAI's interactive suggestions were found to be crucial for developing students' critical thinking skills (Gong et al., 2025).\u003c/p\u003e\u003cp\u003eThird, an increasing number of studies have explored GenAI\u0026rsquo;s capacity for personalized learning support. This role often overlaps with tutoring and feedback, as the most effective tools adapt to student input. For example, conversational tutors adjust their dialogue, and feedback systems provide suggestions tailored to specific errors. This adaptive capability, which supports self-regulated learning and engagement, is a key feature in studies showing positive outcomes across different educational levels (Jeon, 2023; Lee et al., 2022).\u003c/p\u003e\u003cp\u003eAlthough these roles often overlap in practice, they reflect distinct pedagogical aims and learner interactions. Understanding how GenAI functions as a tutor, feedback provider, or personalized support tool is therefore essential for interpreting its educational impact. This body of research indicates the need for a systematic synthesis that examines not only the overall effect of GenAI on student achievement but also the conditions under which its impact varies.\u003c/p\u003e\u003c/div\u003e"},{"header":"3. Methodology","content":"\u003cp\u003eThe present meta-analysis examined the influence of GenAI on students\u0026rsquo; learning outcomes and investigated how study characteristics such as sample size, educational level, intervention duration, role of GenAI, learning domain, and subject area moderated effect sizes. 
The study was conducted in accordance with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines (Moher et al., \u003cspan citationid=\"CR54\" class=\"CitationRef\"\u003e2009\u003c/span\u003e).\u003c/p\u003e\u003cdiv id=\"Sec7\" class=\"Section2\"\u003e\u003ch2\u003e3.1 Eligibility Criteria\u003c/h2\u003e\u003cp\u003eTo be included in this meta-analysis, studies had to meet the following criteria: (1) be peer-reviewed journal articles published between November 30, 2022 and 2025; (2) be written in English and full-text accessible; (3) be empirical in nature (quantitative or mixed-method); (4) apply GenAI technologies such as ChatGPT; (5) be quasi-experimental or true-experimental studies; (6) include both an experimental group using GenAI to support learning and a control group relying on traditional instruction without GenAI; (7) report academic or learning outcomes for both groups; (8) provide sufficient statistical information to calculate the effect size.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec8\" class=\"Section2\"\u003e\u003ch2\u003e3.2 Search Strategy and Selection Criteria\u003c/h2\u003e\u003cp\u003eWe conducted a comprehensive search of published studies using a wide range of scientific databases, including Scopus, Web of Science, and EBSCO (including APA PsycINFO, Education Full Text; Education Source, ERIC, Social Sciences Full Text). The key search terms used in the electronic database searches consisted of two categories: GenAI and learning, which were connected by Boolean AND. 
The GenAI-related keywords included \u0026ldquo;generative artificial intelligence\u0026rdquo; or \u0026ldquo;Gen-AI\u0026rdquo; or \u0026ldquo;GenAI\u0026rdquo; or \u0026ldquo;GAI\u0026rdquo; or \u0026ldquo;ChatGPT\u0026rdquo; or \u0026ldquo;GPT\u0026rdquo; or \u0026ldquo;agent\u0026rdquo; or \u0026ldquo;chatbot\u0026rdquo; or \u0026ldquo;large language model\u0026rdquo; or \u0026ldquo;LLM\u0026rdquo; or \u0026ldquo;OpenAI\u0026rdquo;. The learning-related keywords included: \u0026ldquo;Learning\u0026rdquo; or \u0026ldquo;education\u0026rdquo; or \u0026ldquo;instruction\u0026rdquo; or \u0026ldquo;curriculum\u0026rdquo; or \u0026ldquo;course\u0026rdquo; or \u0026ldquo;learning outcome\u0026rdquo; or \u0026ldquo;learning performance\u0026rdquo; or \u0026ldquo;learning effect\u0026rdquo; or \u0026ldquo;learning achievement\u0026rdquo; or \u0026ldquo;academic performance\u0026rdquo; or \u0026ldquo;academic achievement\u0026rdquo;.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec9\" class=\"Section2\"\u003e\u003ch2\u003e3.3 Data Extraction\u003c/h2\u003e\u003cp\u003eA PRISMA flow diagram (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e) was used to illustrate the data extraction process. The search initially identified 7,667 records, which were imported for processing. After removing the duplicates, 5,302 unique studies remained for screening. The review proceeded in three rounds. In the first round, titles and abstracts were screened for relevance, resulting in the exclusion of 5,157 records and leaving 145 full-text articles for eligibility assessment. In the second round, the full texts were reviewed in detail, and 91 studies were excluded. In the third round, statistical information and categorical variables were extracted from the remaining studies for effect size calculation. 
Ultimately, 54 studies met all criteria and were included in the meta-analysis.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec10\" class=\"Section2\"\u003e\u003ch2\u003e3.4 Data Coding\u003c/h2\u003e\u003cp\u003eA structured coding form was developed to extract key information from each study, including title, publication year, research design, and statistical data (sample size, means, and standard deviations). Additionally, each study was coded on six moderator descriptors: (1) educational level (higher education, secondary education, primary education), (2) sample size (1\u0026ndash;50, 51\u0026ndash;100, 101\u0026ndash;150, or more than 150), (3) learning domain (STEM or Non-STEM), (4) subject area, (5) intervention duration, and (6) the instructional role assigned to GenAI (personalized recommendation, assessment and evaluation, tutoring, mixed, or others). By focusing on these specific variables, this meta-analysis provides insights into which aspects of GenAI interventions are most influential. Four coders were trained and coded the first ten articles together to ensure consistency in coding. The remaining articles were then coded and cross-checked by at least one coder. Any discrepancies were discussed and resolved until full agreement was reached. See the Appendix for coding details.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec11\" class=\"Section2\"\u003e\u003ch2\u003e3.5 Data Analysis\u003c/h2\u003e\u003cp\u003eTo ensure statistical independence, only one effect size was extracted from each of the 54 included studies. To reduce small-sample bias, Hedges\u0026rsquo; g was used to estimate effect sizes. A random-effects model, justified by significant heterogeneity (Q test, I\u0026sup2; statistic), was used to synthesize the overall impact of GenAI on student achievement. A forest plot was generated to display individual effect sizes and corresponding confidence intervals. 
Moderator analyses were conducted to examine factors influencing the effectiveness of GenAI interventions. All analyses were conducted using SPSS version 29.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec12\" class=\"Section2\"\u003e\u003ch2\u003e3.6 Publication Bias\u003c/h2\u003e\u003cp\u003eTo assess the risk of publication bias, a funnel plot was generated to examine the relationship between study effect sizes and their standard errors. In the absence of bias, the funnel plot should display a symmetrical, funnel-shaped distribution, with smaller studies scattered widely at the base and larger studies clustered near the mean effect size at the top. Asymmetry in the plot, by contrast, may suggest that smaller studies with divergent or unfavorable results are underrepresented, indicating possible selective outcome reporting or publication bias. In addition to visual inspection, Egger\u0026rsquo;s regression test was conducted to statistically evaluate funnel plot asymmetry.\u003c/p\u003e\u003c/div\u003e"},{"header":"4. Results","content":"\u003cdiv id=\"Sec14\" class=\"Section2\"\u003e\n \u003ch2\u003e4.1 Weighted Average Effect Size and Heterogeneity\u003c/h2\u003e\n \u003cp\u003eOf the 54 eligible studies, two were excluded prior to analysis due to extreme outlier effect sizes and inconsistencies in the original reports, to avoid undue influence on the pooled estimate. Across the 52 independent comparisons included in the analysis, the aggregated impact of GenAI tools on student academic achievement was substantial and statistically significant. The pooled standardized mean difference, expressed as Hedges\u0026rsquo; g, was 1.193 (SE\u0026thinsp;=\u0026thinsp;0.243, 95% CI [0.716, 1.669], p\u0026thinsp;\u0026lt;\u0026thinsp;0.001), indicating a strong positive effect of GenAI-supported learning relative to conventional instructional methods. 
The 95% prediction interval (\u0026ndash;2.267 to 4.639) further illustrates the extent of contextual variability, suggesting that while the average effect is robust, outcomes may differ across implementation settings and study conditions.\u003c/p\u003e\n \u003cp\u003eMeasures of heterogeneity confirmed considerable between-study variability. The Q statistic was highly significant, Q(51)\u0026thinsp;=\u0026thinsp;863.51, p\u0026thinsp;\u0026lt;\u0026thinsp;0.05, and the I\u0026sup2; index was 98.0%, indicating that nearly all variance observed among effect sizes reflects systematic differences across studies rather than random error. The estimated between-study variance (\u0026tau;\u0026sup2;) was 2.98. A histogram of the unweighted effect sizes (Fig. \u003cspan class=\"InternalRef\"\u003e2\u003c/span\u003e) shows a moderately skewed distribution, with several studies reporting large positive effects. The forest plot (Fig. \u003cspan class=\"InternalRef\"\u003e3\u003c/span\u003e) provides a visual summary of the individual study estimates and their confidence intervals. These results collectively justify the use of a random-effects model and underscore the value of exploring moderators to account for the observed dispersion.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec15\" class=\"Section2\"\u003e\n \u003ch2\u003e4.2 Publication Bias\u003c/h2\u003e\n \u003cp\u003eAs part of the risk-of-bias assessment, a funnel plot of effect sizes against standard errors was generated (Fig. 4). Visual asymmetry raised the possibility of small-study effects or selective reporting. To formally test this, Egger\u0026rsquo;s regression test for funnel plot asymmetry was conducted. The result was statistically significant, \u003cem\u003et\u003c/em\u003e(49)\u0026thinsp;=\u0026thinsp;3.156, \u003cem\u003ep\u003c/em\u003e\u0026thinsp;\u0026lt;\u0026thinsp;0.05, providing evidence of potential bias. 
The extrapolated effect size as standard error approaches zero was \u0026minus;0.344 (95% CI: \u0026minus;1.093, 0.406), suggesting that smaller studies may overestimate treatment effects. While the confidence interval of the limit estimate includes zero, the significance of the test highlights the need to interpret the pooled estimate in light of potential publication or reporting bias.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec16\" class=\"Section2\"\u003e\n \u003ch2\u003e4.3 Moderator Analyses\u003c/h2\u003e\n \u003cp\u003eTo examine sources of heterogeneity, moderator analyses were conducted on study-level characteristics, including educational level, sample size, learning domain, subject area, intervention duration, and the instructional role assigned to GenAI. Summary statistics for all subgroup comparisons are reported in Table \u003cspan class=\"InternalRef\"\u003e1\u003c/span\u003e.\u003c/p\u003e\n \u003cdiv class=\"gridtable\"\u003e\n \u003ctable id=\"Tab1\" border=\"1\"\u003e\n \u003ccaption language=\"En\"\u003e\n \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e\n \u003cdiv class=\"CaptionContent\"\u003e\n \u003cp\u003e\u003cem\u003eSubgroup Analyses of Potential Moderator Variables Using Random Effects\u003c/em\u003e\u003c/p\u003e\n \u003c/div\u003e\n \u003c/caption\u003e\n \u003cthead\u003e\n \u003ctr\u003e\n \u003cth align=\"left\" rowspan=\"2\"\u003e\n \u003cp\u003eVariable\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\" rowspan=\"2\"\u003e\n \u003cp\u003eCategory\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\" rowspan=\"2\"\u003e\n \u003cp\u003e\u003cem\u003eK\u003c/em\u003e\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\" colspan=\"2\"\u003e\n \u003cp\u003eHedges\u0026apos; g\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\u0026nbsp;\u003c/th\u003e\n \u003cth align=\"left\" rowspan=\"2\"\u003e\n \u003cp\u003e\u003cem\u003eQ 
statistics\u003c/em\u003e\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\" rowspan=\"2\"\u003e\n \u003cp\u003e\u003cem\u003edf\u003c/em\u003e\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\" rowspan=\"2\"\u003e\n \u003cp\u003e\u003cem\u003ep\u003c/em\u003e\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\u0026nbsp;\u003c/th\u003e\n \u003cth align=\"left\" rowspan=\"2\"\u003e\n \u003cp\u003eChi-square of the moderator analysis\u003c/p\u003e\n \u003c/th\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eMean\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eSD\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\u0026nbsp;\u003c/th\u003e\n \u003cth align=\"left\"\u003e\u0026nbsp;\u003c/th\u003e\n \u003c/tr\u003e\n \u003c/thead\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colspan=\"2\"\u003e\n \u003cp\u003eEducation level\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026chi;\u0026sup2;(2)\u0026thinsp;=\u0026thinsp;1.628, p\u0026thinsp;=\u0026thinsp;0.443\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003ePrimary education\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e7\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1.007\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd 
align=\"char\"\u003e\n \u003cp\u003e0.414\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e40.477\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026lt;\u0026thinsp;0.05\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eSecondary education\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e11\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.715\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.418\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e246.381\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e10\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026lt;\u0026thinsp;0.05\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eHigher education\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e34\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1.397\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.346\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e518.676\u003c/p\u003e\n 
\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e33\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026lt;\u0026thinsp;0.05\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eSample size\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026chi;\u0026sup2;(3)\u0026thinsp;=\u0026thinsp;4.745, p\u0026thinsp;=\u0026thinsp;0.191\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e1\u0026ndash;50\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e13\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1.075\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.276\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e68.526\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e12\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026lt;\u0026thinsp;0.05\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd 
align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e51\u0026ndash;100\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e29\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1.517\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.421\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e584.256\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e28\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026lt;\u0026thinsp;0.05\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e101\u0026ndash;150\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.349\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.425\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e127.992\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e5\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026lt;\u0026thinsp;0.05\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eMore than 150\u003c/p\u003e\n 
\u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e4\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.671\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.318\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e50.948\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e3\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026lt;\u0026thinsp;0.05\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colspan=\"2\"\u003e\n \u003cp\u003eLearning domain\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026chi;\u0026sup2;(1)\u0026thinsp;=\u0026thinsp;0.499, p\u0026thinsp;=\u0026thinsp;0.480\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eSTEM\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e20\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1.329\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.365\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd 
align=\"left\"\u003e\n \u003cp\u003e321.970\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e19\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026lt;\u0026thinsp;0.05\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNon-STEM\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e32\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1.313\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.352\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e541.032\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e31\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026lt;\u0026thinsp;0.05\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eSubject area\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n 
\u003cp\u003e\u0026chi;\u0026sup2;(10)\u0026thinsp;=\u0026thinsp;51.949, p\u0026thinsp;\u0026lt;\u0026thinsp;0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eGeneral education\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e4\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1.423\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.662\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e61.847\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e3\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026lt;\u0026thinsp;0.05\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eArt education\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e4\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.284\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.181\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e4.515\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e3\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.211\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n 
\u003ctd align=\"left\"\u003e\n \u003cp\u003eLanguage education\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e23\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1.629\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.486\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e331.692\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e22\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026lt;\u0026thinsp;0.05\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eMathematics education\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.274\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.324\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003ePhysics education\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n 
\u003cp\u003e1.130\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.128\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.000\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.997\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eScience education\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e5\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.975\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.890\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e130.125\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e4\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026lt;\u0026thinsp;0.05\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eMedical education\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e2.334\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.402\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n 
\u003cp\u003eNA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNurse education\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e2.084\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.405\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eHealth education\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e3\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.743\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.167\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.362\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.834\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n 
\u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eComputer science education\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e7\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.467\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.655\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e190.030\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026lt;\u0026thinsp;0.05\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eEngineering education\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1.544\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.207\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colspan=\"2\"\u003e\n \u003cp\u003eIntervention duration\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd 
align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026chi;\u0026sup2;(2)\u0026thinsp;=\u0026thinsp;1.864, p\u0026thinsp;=\u0026thinsp;0.397\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e1\u0026ndash;4 weeks\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e26\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1.159\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.270\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e340.446\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e25\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026lt;\u0026thinsp;0.05\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e5\u0026ndash;8 weeks\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e18\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.787\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.342\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd 
align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e355.236\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e17\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026lt;\u0026thinsp;0.05\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026gt;\u0026thinsp;8 weeks\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e8\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e2.332\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1.228\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e130.434\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e7\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026lt;\u0026thinsp;0.05\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colspan=\"2\"\u003e\n \u003cp\u003eInstructional role of GenAI\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n 
\u003cp\u003e\u0026chi;\u0026sup2;(4)\u0026thinsp;=\u0026thinsp;10.641, p\u0026thinsp;=\u0026thinsp;0.031\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eAssessment and evaluation\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e4\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e2.019\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.586\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e46.288\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e3\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026lt;\u0026thinsp;0.05\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003ePersonalized recommendation\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.490\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.311\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n 
\u003ctd align=\"left\"\u003e\n \u003cp\u003eTutoring\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e17\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1.410\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.398\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e208.508\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e16\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026lt;\u0026thinsp;0.05\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eMixed\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e29\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1.017\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.367\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e532.273\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e28\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026lt;\u0026thinsp;0.05\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eOthers\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.274\u003c/p\u003e\n 
\u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.324\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n \u003c/table\u003e\n \u003c/div\u003e\n \u003cp\u003e\u003cstrong\u003eEducational Level\u003c/strong\u003e\u003c/p\u003e\n \u003cp\u003eDifferences across educational levels were not statistically significant (\u0026chi;\u0026sup2;(2)\u0026thinsp;=\u0026thinsp;1.628, p\u0026thinsp;=\u0026thinsp;.443). Descriptively, studies at the higher-education level showed the largest mean effect (g\u0026thinsp;=\u0026thinsp;1.397), followed by primary education (g\u0026thinsp;=\u0026thinsp;1.007), with secondary education showing the smallest (g\u0026thinsp;=\u0026thinsp;0.715). Although this difference cannot be interpreted as conclusive, it hints that GenAI may yield stronger outcomes among learners in higher education, who may be able to make more autonomous use of the tools. The smaller gains observed in secondary education may reflect differences in curricular structure or support needs. Thus, even without statistical significance, the pattern is compatible with developmental stage influencing the effectiveness of GenAI integration.\u003c/p\u003e\n \u003cp\u003e\u003cstrong\u003eSample Size\u003c/strong\u003e\u003c/p\u003e\n \u003cp\u003eSample size also did not significantly moderate effects (\u0026chi;\u0026sup2;(3)\u0026thinsp;=\u0026thinsp;4.745, p\u0026thinsp;=\u0026thinsp;.191). Nonetheless, the subgroup means differed descriptively. 
Studies with 51\u0026ndash;100 participants reported the strongest mean effect (g\u0026thinsp;=\u0026thinsp;1.517). Smaller studies with 1\u0026ndash;50 participants showed a positive but lower effect (g\u0026thinsp;=\u0026thinsp;1.075). The largest studies, with more than 150 participants, reported a more modest effect (g\u0026thinsp;=\u0026thinsp;0.671). The lowest mean appeared in the 101\u0026ndash;150 participant group (g\u0026thinsp;=\u0026thinsp;0.349).\u003c/p\u003e\n \u003cp\u003eThese findings should be read cautiously, but they suggest that study scale may shape outcomes. Mid-sized studies often occur in manageable settings where GenAI can be fully embedded, while the largest studies may include heterogeneous contexts that dilute effects. The unexpectedly low value in the 101\u0026ndash;150 group may reflect study-specific features rather than a general trend. Altogether, although the moderator test was not statistically significant, the pattern hints that study scale could be a practical factor in how GenAI\u0026rsquo;s impact is realized.\u003c/p\u003e\n \u003cp\u003e\u003cstrong\u003eLearning Domain\u003c/strong\u003e\u003c/p\u003e\n \u003cp\u003eThe difference between STEM and non-STEM domains was not significant (\u0026chi;\u0026sup2;(1)\u0026thinsp;=\u0026thinsp;0.499, p\u0026thinsp;=\u0026thinsp;.480). The mean effect sizes were nearly identical: g\u0026thinsp;=\u0026thinsp;1.329 for STEM and g\u0026thinsp;=\u0026thinsp;1.313 for non-STEM. This similarity suggests that GenAI has broad applicability across disciplinary boundaries. Although the contrast was nonsignificant, the fact that both groups showed large positive effects indicates that the benefits of GenAI are not confined to one type of learning domain. 
This consistency across fields strengthens the case for generalizability.\u003c/p\u003e\n \u003cp\u003e\u003cstrong\u003eSubject Area\u003c/strong\u003e\u003c/p\u003e\n \u003cp\u003eSubject area significantly moderated effects, \u0026chi;\u0026sup2;(10)\u0026thinsp;=\u0026thinsp;51.949, p\u0026thinsp;\u0026lt;\u0026thinsp;.001. The strongest average gains were observed in language education (g\u0026thinsp;=\u0026thinsp;1.629), followed by engineering (g\u0026thinsp;=\u0026thinsp;1.544) and general education (g\u0026thinsp;=\u0026thinsp;1.423). Smaller pooled effects were reported in computer science (g\u0026thinsp;=\u0026thinsp;0.467) and art education (g\u0026thinsp;=\u0026thinsp;0.284). Very large mean values appeared in medical education (g\u0026thinsp;=\u0026thinsp;2.3341) and nursing (g\u0026thinsp;=\u0026thinsp;2.084), though each was based on a single study and thus should be interpreted cautiously.\u003c/p\u003e\n \u003cp\u003eThe relatively modest average in computer science is notable, especially given GenAI\u0026apos;s conceptual alignment with this field. One possible reason is that current studies in CS education often focus on narrow skillsets such as syntax correction or code generation, rather than broader applications like project-based learning, peer feedback, or generative support for computational thinking. Moreover, the small number of available studies (K\u0026thinsp;=\u0026thinsp;7) limits the stability of the pooled estimate, and the diversity in study design further complicates interpretation. As a result, the generalizability of the observed effect size in this area should be regarded as limited. By contrast, the consistently strong effects observed in language education likely reflect a tighter match between GenAI\u0026rsquo;s natural language capabilities and common instructional tasks in that domain, such as writing assistance, paraphrasing, translation, and iterative feedback. 
The extremely large means found in medical and nursing education should be treated cautiously because each rests on a single study, but they suggest potentially promising areas for simulation-based or decision-support applications, pending further replication.\u003c/p\u003e\n \u003cp\u003eAltogether, the evidence indicates that GenAI\u0026rsquo;s impact is not distributed evenly across disciplines. It tends to be stronger where the tools\u0026apos; capabilities closely map onto instructional practices and weaker where current use cases remain limited in scope or scale. Computer science education, despite its intuitive connection to AI, may require broader pedagogical integration to fully reflect the field\u0026rsquo;s potential.\u003c/p\u003e\n \u003cp\u003e\u003cstrong\u003eIntervention Duration\u003c/strong\u003e\u003c/p\u003e\n \u003cp\u003eIntervention duration did not significantly moderate the effect, \u0026chi;\u0026sup2;(2)\u0026thinsp;=\u0026thinsp;1.864, p\u0026thinsp;=\u0026thinsp;.397. The average effect size was highest for studies lasting more than eight weeks (g\u0026thinsp;=\u0026thinsp;2.332), a strikingly high value that rests on a smaller set of studies. Short-term interventions of one to four weeks followed (g\u0026thinsp;=\u0026thinsp;1.159), and medium-length interventions of five to eight weeks showed the lowest average (g\u0026thinsp;=\u0026thinsp;0.787). While these differences were not significant, the variation raises the possibility that exposure time plays a role in GenAI\u0026apos;s impact.\u003c/p\u003e\n \u003cp\u003e\u003cstrong\u003eInstructional Role of GenAI\u003c/strong\u003e\u003c/p\u003e\n \u003cp\u003eThe instructional role of GenAI significantly moderated outcomes, \u0026chi;\u0026sup2;(4)\u0026thinsp;=\u0026thinsp;10.641, p\u0026thinsp;=\u0026thinsp;.031. 
The largest effects were found in assessment and evaluation (g\u0026thinsp;=\u0026thinsp;2.019) and tutoring (g\u0026thinsp;=\u0026thinsp;1.410). Mixed-use studies averaged g\u0026thinsp;=\u0026thinsp;1.017, while recommendation-based (g\u0026thinsp;=\u0026thinsp;0.490) and miscellaneous applications (g\u0026thinsp;=\u0026thinsp;0.274) were lower. The pattern suggests that GenAI is most effective when integrated directly into instructional feedback and guidance processes. While the test was significant, some categories included very few studies, so interpretations should be tempered accordingly.\u003c/p\u003e\n\u003c/div\u003e"},{"header":"5. Discussion","content":"\u003cp\u003eThis meta-analysis of 52 experimental and quasi-experimental studies demonstrates that GenAI interventions produce a strong positive effect on student academic achievement (Hedges\u0026rsquo; \u003cem\u003eg\u003c/em\u003e\u0026thinsp;=\u0026thinsp;1.193). This estimate exceeds effect sizes reported in earlier reviews of AI or GenAI in education (Dong et al., \u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e2025\u003c/span\u003e; X. Liu et al., \u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e2025\u003c/span\u003e; Sun \u0026amp; Zhou, \u003cspan citationid=\"CR63\" class=\"CitationRef\"\u003e2024\u003c/span\u003e; Tlili et al., \u003cspan citationid=\"CR66\" class=\"CitationRef\"\u003e2025\u003c/span\u003e; Zhu et al., \u003cspan citationid=\"CR82\" class=\"CitationRef\"\u003e2025\u003c/span\u003e). Several factors likely explain this difference, including the rapid advances in large language models since 2022, which have made GenAI tools more accurate, interactive, and versatile, as well as the broader scope of this review, which encompassed multiple educational levels, learning domains, subject areas and instructional designs.\u003c/p\u003e\u003cp\u003eAt the same time, the observed high heterogeneity indicates that effectiveness varies substantially across contexts. 
Understanding the sources of this variation is therefore critical. The moderator analyses revealed that the instructional role of GenAI and subject area were statistically significant, which indicates conditions under which GenAI is most effective. Other moderators (educational level, intervention duration, sample size, and learning domain) did not reach significance but showed descriptive patterns that can inform future research and implementation.\u003c/p\u003e\u003cdiv id=\"Sec18\" class=\"Section2\"\u003e\u003ch2\u003e5.1 Instructional Role as a Core Driver\u003c/h2\u003e\u003cp\u003eThe instructional role emerged as one of the most important moderators of GenAI effectiveness. Interventions in which GenAI was used for assessment and feedback produced the strongest effects, followed by tutoring applications. By contrast, recommendation-based or loosely defined uses were less effective. These findings align with the extensive literature on formative assessment, which emphasizes that timely, targeted, and individualized feedback is among the most powerful influences on student learning (Black \u0026amp; Wiliam, \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e2009\u003c/span\u003e; Hattie \u0026amp; Timperley, \u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e2007\u003c/span\u003e). 
GenAI tools are well-suited to this function because they can provide immediate, personalized responses at scale, supplementing teacher feedback and allowing for more frequent cycles of practice and revision.\u003c/p\u003e\u003cp\u003eFrom the perspective of scaffolding theory (Van De Pol et al., \u003cspan citationid=\"CR69\" class=\"CitationRef\"\u003e2010\u003c/span\u003e; Wood et al., \u003cspan citationid=\"CR76\" class=\"CitationRef\"\u003e1976\u003c/span\u003e), GenAI can function as a flexible support mechanism that adapts to learners\u0026rsquo; needs as they progress through their zone of proximal development (Vygotskij, \u003cspan citationid=\"CR70\" class=\"CitationRef\"\u003e1981\u003c/span\u003e). In addition, self-regulated learning frameworks (Panadero, \u003cspan citationid=\"CR56\" class=\"CitationRef\"\u003e2017\u003c/span\u003e; Zimmerman, \u003cspan citationid=\"CR83\" class=\"CitationRef\"\u003e2002\u003c/span\u003e) suggest that such adaptive feedback promotes metacognitive monitoring, error detection, and iterative improvement\u0026mdash;all of which are processes that GenAI-based tutoring or assessment tools can enhance.\u003c/p\u003e\u003cp\u003eThe weaker effects of recommendation-based or miscellaneous applications suggest that pedagogical alignment is essential. Simply deploying GenAI as a novelty or generic recommender does not guarantee benefits; in fact, such uses may fail to engage deeper learning processes. Together, these results suggest that GenAI\u0026rsquo;s strongest contributions occur when its functions are directly tied to formative assessment and tutoring, where feedback and adaptive guidance are critical for learning.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec19\" class=\"Section2\"\u003e\u003ch2\u003e5.2 Subject Area Differences: Affordance-Practice Alignment\u003c/h2\u003e\u003cp\u003eThe second significant moderator was subject area, with striking variation in effect sizes across disciplines. 
The largest and most consistent gains were observed in language education, followed by strong results in engineering and general education. By contrast, more modest effects were reported in computer science and art education, while single studies in medical and nursing education reported very large effects. The success of GenAI in language education likely reflects the close match between large language models\u0026rsquo; capabilities and the core practices of the domain. Writing, translation, paraphrasing, and revision are all central to language learning, and GenAI can provide immediate, iterative, and tailored support in these areas (Hyland \u0026amp; Hyland, \u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e2006\u003c/span\u003e). This natural synergy between affordance and task likely explains why language education consistently produced stronger effects.\u003c/p\u003e\u003cp\u003eThe weaker effects in computer science are notable. Despite the field\u0026rsquo;s conceptual connection to AI, many CS interventions focused on narrow skill sets such as syntax correction or code generation (Yang et al., 2025; Zhao et al., 2025). These uses may ease extraneous cognitive load (Sweller, \u003cspan citationid=\"CR64\" class=\"CitationRef\"\u003e1988\u003c/span\u003e) but do not necessarily foster deeper conceptual learning or computational thinking. More integrative approaches, such as project-based learning, peer review of code, or generative support for problem decomposition, may be needed to realize GenAI\u0026rsquo;s potential in this field. Similarly, in art education, current applications may not align well with the creative, iterative, and process-driven nature of artistic practice.\u003c/p\u003e\u003cp\u003eThe very large effects reported in medical and nursing education, though based on single studies, point to promising directions. 
GenAI could be particularly useful in simulation-based learning, clinical decision-making, and diagnostic reasoning, where adaptive feedback and scenario generation are valuable (Chang et al., 2022; Cook et al., \u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e2013\u003c/span\u003e). However, further replication is needed before strong conclusions can be drawn.\u003c/p\u003e\u003cp\u003eThese results suggest that GenAI\u0026rsquo;s effectiveness depends on how closely its affordances align with the instructional practices of a domain. Language tasks map naturally onto LLM capabilities, while other fields may require more thoughtful integration to achieve similar benefits.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec20\" class=\"Section2\"\u003e\u003ch2\u003e5.3 Contextual Moderators: Descriptive but Informative Patterns\u003c/h2\u003e\u003cp\u003eAlthough not statistically significant, several other moderators revealed consistent descriptive patterns that provide insight into contextual conditions shaping GenAI\u0026rsquo;s impact.\u003c/p\u003e\u003cp\u003e\u003cem\u003eEducational level.\u003c/em\u003e The observed gradient (higher education showing the strongest effects, followed by primary, then secondary) suggests that learner autonomy and developmental readiness may play important roles. University students may be better equipped to independently integrate GenAI into their work, while primary students may benefit when teachers mediate GenAI use. Secondary students, in contrast, may be at a transitional stage where metacognitive monitoring is still developing (Kuhn, \u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e2000\u003c/span\u003e; Pintrich, \u003cspan citationid=\"CR58\" class=\"CitationRef\"\u003e2000\u003c/span\u003e), and where curricular structures may limit opportunities to use GenAI meaningfully (Mercer \u0026amp; Littleton, \u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e2007\u003c/span\u003e). 
These patterns, while not conclusive, suggest that developmental stage could influence how effectively learners benefit from GenAI.\u003c/p\u003e\u003cp\u003e\u003cem\u003eIntervention duration.\u003c/em\u003e Longer interventions (\u0026gt;\u0026thinsp;8 weeks) showed descriptively stronger effects than shorter ones, while medium-length interventions yielded the lowest averages. This pattern may reflect novelty effects in short-term studies and the stabilizing of benefits in longer-term integrations. Motivation research supports this interpretation: self-determination theory (Ryan \u0026amp; Deci, \u003cspan citationid=\"CR59\" class=\"CitationRef\"\u003e2000\u003c/span\u003e) emphasizes that sustained engagement depends on autonomy, competence, and relatedness, while work on achievement emotions suggests that novelty-driven excitement tends to fade unless interventions are integrated into stable learning routines (Linnenbrink-Garcia \u0026amp; Pekrun, \u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e2011\u003c/span\u003e; Pekrun, \u003cspan citationid=\"CR57\" class=\"CitationRef\"\u003e2006\u003c/span\u003e; Tsay et al., \u003cspan citationid=\"CR67\" class=\"CitationRef\"\u003e2020\u003c/span\u003e). These findings point to the importance of designing interventions for sustained rather than short-term use.\u003c/p\u003e\u003cp\u003e\u003cem\u003eSample size.\u003c/em\u003e The strongest effects were observed in mid-sized studies (51\u0026ndash;100 participants), while very large studies yielded smaller effects. One explanation is that mid-sized studies often occur in classroom-based settings where implementation fidelity is high, while large studies encompass heterogeneous learners and contexts that dilute effects. 
At the same time, small-study bias may inflate results in underpowered designs (Ioannidis, \u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e2005\u003c/span\u003e; Valentine et al., \u003cspan citationid=\"CR68\" class=\"CitationRef\"\u003e2010\u003c/span\u003e). This pattern underscores the need for larger, well-controlled trials that balance fidelity with generalizability.\u003c/p\u003e\u003cp\u003e\u003cem\u003eSTEM vs. non-STEM domains.\u003c/em\u003e Finally, the near-identical effects across STEM and non-STEM fields suggest that GenAI\u0026rsquo;s benefits are broadly applicable. This finding strengthens the case that GenAI is not restricted to text-rich domains but can enhance learning across diverse disciplines, provided applications are thoughtfully designed.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec21\" class=\"Section2\"\u003e\u003ch2\u003e5.4 Limitations and Future Research\u003c/h2\u003e\u003cp\u003eSeveral limitations warrant caution. First, evidence of publication bias suggests that smaller studies may overestimate effects, and the wide prediction interval indicates that not all interventions yield positive outcomes. Second, the evidence base remains dominated by short-term studies in higher education, leaving gaps in K\u0026ndash;12 contexts where scaffolding and access issues are more pronounced. Third, few studies examined equity implications, raising concerns that GenAI could exacerbate existing digital divides (Capraro et al., \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e2024\u003c/span\u003e; Luo \u0026amp; Liu, \u003cspan citationid=\"CR51\" class=\"CitationRef\"\u003e2025\u003c/span\u003e). 
Finally, the rapid evolution of large language models raises questions about the generalizability of current findings to future iterations.\u003c/p\u003e\u003cp\u003eFuture research should address these gaps by conducting preregistered, large-scale trials with strong implementation fidelity; longitudinal studies to examine sustainability over time; and equity-focused analyses to ensure benefits are accessible across diverse learner populations. Design-based research (Brown, \u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e1992\u003c/span\u003e; Cobb et al., \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e2003\u003c/span\u003e) will be particularly valuable for refining pedagogical integration, while cross-model comparisons are needed to disentangle technology-specific from pedagogy-specific effects.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec22\" class=\"Section2\"\u003e\u003ch2\u003e5.5 Practical Significance\u003c/h2\u003e\u003cp\u003eThis meta-analysis shows that GenAI can substantially improve student learning when used thoughtfully, but not all uses are equally effective. The strongest benefits occur when GenAI supports formative roles such as providing feedback on student work or serving as a tutoring assistant. In these cases, students receive more personalized guidance, which helps them practice, revise, and reflect in ways that accelerate learning. Language education stands out as the field with the largest gains, reflecting a strong match between GenAI\u0026rsquo;s language-generation strengths and writing, translation, and feedback tasks. By contrast, narrower applications in fields like computer science or art show smaller benefits, suggesting that more integrative and creative uses are needed.\u003c/p\u003e\u003cp\u003eFor educators and decision-makers, the practical takeaway is that GenAI should be viewed as a pedagogical partner rather than a novelty. 
Its value lies not in replacing teachers, but in scaling up feedback, tutoring, and practice opportunities that are otherwise difficult to provide in large or diverse classrooms. At the same time, successful implementation requires attention to context: younger students need structured teacher support, longer-term use appears more beneficial than one-off trials, and equitable access must be ensured to avoid widening learning gaps. With thoughtful integration, GenAI can reduce instructional burden and expand opportunities for student-centered learning across a wide range of subjects.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec23\" class=\"Section2\"\u003e\u003ch2\u003e5.6 Implications\u003c/h2\u003e\u003cp\u003eFor educators, these findings suggest that GenAI should be deployed strategically in formative roles, such as tutoring and feedback, where its capacity for personalization can enhance learning. In K\u0026ndash;12 contexts, teacher mediation and scaffolding are critical to ensure productive engagement with AI tools. For institutions, professional development is essential to prepare teachers not only to operate GenAI tools but to integrate them into pedagogy responsibly. Implementation should be guided by clear ethical standards and curricular alignment (Holmes et al., \u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e2022\u003c/span\u003e). For policymakers, equity must be a central concern. Without intentional policies to ensure access, GenAI may widen digital divides (Bentley et al., \u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e2024\u003c/span\u003e). Policies that promote responsible adoption, transparency, and research\u0026ndash;practice partnerships will be critical to maximizing benefits and minimizing risks. At a system level, GenAI should be understood as a pedagogical partner rather than a replacement for teachers. 
When embedded thoughtfully, it can reduce instructional burden, strengthen formative assessment, and expand opportunities for student-centered learning.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec24\" class=\"Section2\"\u003e\u003ch2\u003e5.7 Conclusion\u003c/h2\u003e\u003cp\u003eThis meta-analysis provides a comprehensive synthesis of GenAI\u0026rsquo;s effects on student achievement. While overall benefits are strong, outcomes depend on how, where, and for whom GenAI is implemented. Instructional role and subject area emerged as the strongest moderators, with particularly large effects for formative assessment, tutoring, and language education. Descriptive patterns further suggest that developmental stage, exposure time, and implementation fidelity condition effectiveness.\u003c/p\u003e\u003cp\u003eLooking forward, GenAI has the potential to transform education, but its promise will only be realized through responsible integration, equity-focused implementation, and sustained empirical evaluation. By aligning GenAI use with pedagogical theory, developmental needs, and institutional support, educators and policymakers can move beyond novelty to unlock its long-term potential for equitable, evidence-based learning.\u003c/p\u003e\u003c/div\u003e"},{"header":"Declarations","content":"\u003ch2\u003eAuthor Contribution\u003c/h2\u003e\u003cp\u003eX.T. contributed to conceptualization, methodology, formal analysis, data curation, software, validation, supervision, and writing (original draft and review \u0026amp; editing). X.W. contributed to methodology, investigation, formal analysis, data curation, software, validation, and writing (review \u0026amp; editing). L.D. contributed to methodology, formal analysis, data curation, software, validation, visualization, and writing (review \u0026amp; editing). J.Z. contributed to methodology, data curation, software, validation, and writing (review \u0026amp; editing). 
All authors reviewed and approved the final manuscript.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n \u003cli\u003eAit Baha, T., El Hajji, M., Es-Saady, Y., \u0026amp; Fadili, H. (2024). The impact of educational chatbot on student learning experience. \u003cem\u003eEducation and Information Technologies\u003c/em\u003e, \u003cem\u003e29\u003c/em\u003e(8), 10153-10176. https://doi.org/10.1007/s10639-023-12166-w\u003c/li\u003e\n \u003cli\u003eAlneyadi, S., \u0026amp; Wardat, Y. (2024). Integrating ChatGPT in grade 12 quantum theory education: An exploratory study at Emirate School (UAE). \u003cem\u003eInternational Journal of Information and Education Technology\u003c/em\u003e, \u003cem\u003e14\u003c/em\u003e(3), 398\u0026ndash;410. https://doi.org/10.18178/ijiet.2024.14.3.2061\u003c/li\u003e\n \u003cli\u003eAl Ghaithi, A., \u0026amp; Behforouz, B. (2024). The use of an interactive ChatBot in grammar learning. \u003cem\u003eJournal of Educators Online\u003c/em\u003e, \u003cem\u003e21\u003c/em\u003e(4), n4.\u003c/li\u003e\n \u003cli\u003eAvello-Mart\u0026iacute;nez, R., Gajderowicz, T., \u0026amp; G\u0026oacute;mez-Rodr\u0026iacute;guez, V. G. (2024). Is ChatGPT helpful for graduate students in acquiring knowledge about digital storytelling and reducing their cognitive load? An experiment. \u003cem\u003eRevista De Educaci\u0026oacute;n a Distancia (RED)\u003c/em\u003e, \u003cem\u003e24\u003c/em\u003e(78). https://doi.org/10.6018/red.604621\u003c/li\u003e\n \u003cli\u003eBehforouz, B., \u0026amp; Ghaithi, A. A. (2024). Investigating the effect of an interactive educational chatbot on reading comprehension skills. \u003cem\u003eInternational Journal of Engineering Pedagogy (iJEP)\u003c/em\u003e, \u003cem\u003e14\u003c/em\u003e(4), 139\u0026ndash;154. https://doi.org/10.3991/ijep.v14i4.48461\u003c/li\u003e\n \u003cli\u003eBeltozar-Clemente, S., \u0026amp; D\u0026iacute;az-Vega, E. (2024). 
Physics XP: Integration of ChatGPT and gamification to improve academic performance and motivation in physics 1 course. \u003cem\u003eInternational Journal of Engineering Pedagogy\u003c/em\u003e, \u003cem\u003e14\u003c/em\u003e(6). https://doi.org/10.3991/ijep.v14i6.47127\u003c/li\u003e\n \u003cli\u003eBentley, S. V., Naughtin, C. K., McGrath, M. J., Irons, J. L., \u0026amp; Cooper, P. S. (2024). The digital divide in action: How experiences of digital technology shape future relationships with artificial intelligence. \u003cem\u003eAI and Ethics\u003c/em\u003e, \u003cem\u003e4\u003c/em\u003e(4), 901\u0026ndash;915. https://doi.org/10.1007/s43681-024-00452-3\u003c/li\u003e\n \u003cli\u003eBlack, P., \u0026amp; Wiliam, D. (2009). Developing the theory of formative assessment. \u003cem\u003eEducational Assessment, Evaluation and Accountability\u003c/em\u003e, \u003cem\u003e21\u003c/em\u003e(1), 5\u0026ndash;31. https://doi.org/10.1007/s11092-008-9068-5\u003c/li\u003e\n \u003cli\u003eBrown, A. L. (1992). Design experiments: Theoretical and methodological challenges in creating complex interventions in classroom settings. \u003cem\u003eJournal of the Learning Sciences\u003c/em\u003e, \u003cem\u003e2\u003c/em\u003e(2), 141\u0026ndash;178. https://doi.org/10.1207/s15327809jls0202_2\u003c/li\u003e\n \u003cli\u003eCapraro, V., Lentsch, A., Acemoglu, D., Akgun, S., Akhmedova, A., Bilancini, E., Bonnefon, J.-F., Bra\u0026ntilde;as-Garza, P., Butera, L., Douglas, K. M., Everett, J. A. C., Gigerenzer, G., Greenhow, C., Hashimoto, D. A., Holt-Lunstad, J., Jetten, J., Johnson, S., Kunz, W. H., Longoni, C., \u0026hellip; Viale, R. (2024). The impact of generative artificial intelligence on socioeconomic inequalities and policy making. \u003cem\u003ePNAS Nexus\u003c/em\u003e, \u003cem\u003e3\u003c/em\u003e(6), 191. https://doi.org/10.1093/pnasnexus/pgae191\u003c/li\u003e\n \u003cli\u003eChang, C., Hwang, G., \u0026amp; Gau, M. (2022). 
Promoting students\u0026rsquo; learning achievement and self‐efficacy: A mobile chatbot approach for nursing training. \u003cem\u003eBritish Journal of Educational Technology\u003c/em\u003e, \u003cem\u003e53\u003c/em\u003e(1), 171\u0026ndash;188. https://doi.org/10.1111/bjet.13158\u003c/li\u003e\n \u003cli\u003eChen, C., \u0026amp; Chang, C. (2024). Effectiveness of AI-assisted game-based learning on science learning outcomes, intrinsic motivation, cognitive load, and learning behavior. \u003cem\u003eEducation and Information Technologies\u003c/em\u003e, \u003cem\u003e29\u003c/em\u003e(14), 18621\u0026ndash;18642. https://doi.org/10.1007/s10639-024-12553-x\u003c/li\u003e\n \u003cli\u003eChen, C., \u0026amp; Gong, Y. (2025). The role of AI-assisted learning in academic writing: A mixed-methods study on Chinese as a second language students. \u003cem\u003eEducation Sciences\u003c/em\u003e, \u003cem\u003e15\u003c/em\u003e(2), 141. https://doi.org/10.3390/educsci15020141\u003c/li\u003e\n \u003cli\u003eChen, J., Mokmin, N. A. M., Shen, Q., \u0026amp; Su, H. (2025). Leveraging AI in design education: exploring virtual instructors and conversational techniques in flipped classroom models. \u003cem\u003eEducation and Information Technologies\u003c/em\u003e, 1-21. https://doi.org/10.1007/s10639-025-13458-z\u003c/li\u003e\n \u003cli\u003eChen, M. R. A. (2025). Improving English semantic learning outcomes through AI chatbot-based ARCS approach. \u003cem\u003eInteractive Learning Environments\u003c/em\u003e, 1-16. https://doi.org/10.1080/10494820.2025.2454443\u003c/li\u003e\n \u003cli\u003eChen, M.-R. A. (2024). Metacognitive mastery: Transformative learning in EFL through a generative AI chatbot fueled by metalinguistic guidance. \u003cem\u003eEducational Technology \u0026amp; Society\u003c/em\u003e, \u003cem\u003e27\u003c/em\u003e(3), 407\u0026ndash;427. https://www.jstor.org/stable/48787038\u003c/li\u003e\n \u003cli\u003eChen, M. R. A. (2024). 
The AI chatbot interaction for semantic learning: A collaborative note-taking approach with EFL students. \u003cem\u003eLanguage Learning \u0026amp; Technology\u003c/em\u003e, \u003cem\u003e28\u003c/em\u003e(1), 1-25.\u003c/li\u003e\n \u003cli\u003eChen, Y., Zhang, X., \u0026amp; Hu, L. (2024). A progressive prompt-based image-generative AI approach to promoting students\u0026rsquo; achievement and perceptions in learning ancient Chinese poetry. \u003cem\u003eEducational Technology \u0026amp; Society\u003c/em\u003e, \u003cem\u003e27\u003c/em\u003e(2), 284-305. https://hdl.handle.net/10125/73586\u003c/li\u003e\n \u003cli\u003eChu, H.-C., Hsu, C.-Y., \u0026amp; Wang, C.-C. (2025). Effects of AI-generated drawing on students\u0026rsquo; learning achievement and creativity in an ancient poetry course. \u003cem\u003eEducational Technology \u0026amp; Society\u003c/em\u003e, \u003cem\u003e28\u003c/em\u003e(2). https://doi.org/10.30191/ETS.202504_28(2).TP03\u003c/li\u003e\n \u003cli\u003eCobb, P., Confrey, J., diSessa, A., Lehrer, R., \u0026amp; Schauble, L. (2003). Design experiments in educational research. \u003cem\u003eEducational Researcher\u003c/em\u003e, \u003cem\u003e32\u003c/em\u003e(1), 9\u0026ndash;13. https://doi.org/10.3102/0013189X032001009\u003c/li\u003e\n \u003cli\u003eCook, D. A., Hamstra, S. J., Brydges, R., Zendejas, B., Szostek, J. H., Wang, A. T., Erwin, P. J., \u0026amp; Hatala, R. (2013). Comparative effectiveness of instructional design features in simulation-based education: Systematic review and meta-analysis. \u003cem\u003eMedical Teacher\u003c/em\u003e, \u003cem\u003e35\u003c/em\u003e(1), e867\u0026ndash;e898. https://doi.org/10.3109/0142159X.2012.714886\u003c/li\u003e\n \u003cli\u003eDong, L., Tang, X., \u0026amp; Wang, X. (2025). Examining the effect of artificial intelligence in relation to students\u0026rsquo; academic achievement: A meta-analysis. 
\u003cem\u003eComputers and Education: Artificial Intelligence\u003c/em\u003e, \u003cem\u003e8\u003c/em\u003e, 100400. https://doi.org/10.1016/j.caeai.2025.100400\u003c/li\u003e\n \u003cli\u003eEdwards, B. I., Olugbade, D., \u0026amp; Ojo, O. A. (2024). Facilitating cognitive load management and improved learning outcomes and attitudes in middle school technology and vocational education through AI chatbot. \u003cem\u003eJournal of Technical Education and Training\u003c/em\u003e, \u003cem\u003e16\u003c/em\u003e(3), 114\u0026ndash;131. https://penerbit.uthm.edu.my/ojs/index.php/JTET/article/view/19476\u003c/li\u003e\n \u003cli\u003eFui-Hoon Nah, F., Zheng, R., Cai, J., Siau, K., \u0026amp; Chen, L. (2023). Generative AI and ChatGPT: Applications, challenges, and AI-human collaboration. \u003cem\u003eJournal of Information Technology Case and Application Research\u003c/em\u003e, \u003cem\u003e25\u003c/em\u003e(3), 277\u0026ndash;304. https://doi.org/10.1080/15228053.2023.2233814\u003c/li\u003e\n \u003cli\u003eFathi, T. E., Saad, A., Larhzil, H., Lamri, D., \u0026amp; Ibrahmi, E. M. A. (2025). Integrating generative AI into STEM education: enhancing conceptual understanding, addressing misconceptions, and assessing student acceptance. \u003cem\u003eDisciplinary and Interdisciplinary Science Education Research\u003c/em\u003e, \u003cem\u003e7\u003c/em\u003e(1). https://doi.org/10.1186/s43031-025-00125-z\u003c/li\u003e\n \u003cli\u003eFidan, M., \u0026amp; Gencel, N. (2022). Supporting the instructional videos with chatbot and peer feedback mechanisms in online learning: The effects on learning performance and intrinsic motivation. \u003cem\u003eJournal of Educational Computing Research\u003c/em\u003e, \u003cem\u003e60\u003c/em\u003e(7), 1716\u0026ndash;1741. https://doi.org/10.1177/07356331221077901\u003c/li\u003e\n \u003cli\u003eGasaymeh, A. M. M., \u0026amp; AlMohtadi, R. M. (2024). The effect of flipped interactive learning (FIL) based on ChatGPT on students\u0026rsquo; skills in a large programming class. 
\u003cem\u003eInternational Journal of Information and Education Technology\u003c/em\u003e, \u003cem\u003e14\u003c/em\u003e(11), 1516-1522. https://doi.org/10.18178/ijiet.2024.14.11.2182\u003c/li\u003e\n \u003cli\u003eGong, X., Li, Z., \u0026amp; Qiao, A. (2025). Impact of generative AI dialogic feedback on different stages of programming problem solving. \u003cem\u003eEducation and Information Technologies\u003c/em\u003e, \u003cem\u003e30\u003c/em\u003e(7), 9689\u0026ndash;9709. https://doi.org/10.1007/s10639-024-13173-1\u003c/li\u003e\n \u003cli\u003eHakim, V. G. A., Paiman, N. A., \u0026amp; Rahman, M. H. S. (2024). Genie‐on‐demand: A custom AI chatbot for enhancing learning performance, self‐efficacy, and technology acceptance in occupational health and safety for engineering education. \u003cem\u003eComputer Applications in Engineering Education\u003c/em\u003e, \u003cem\u003e32\u003c/em\u003e(6), e22800. https://doi.org/10.1002/cae.22800\u003c/li\u003e\n \u003cli\u003eHattie, J., \u0026amp; Timperley, H. (2007). The Power of Feedback. \u003cem\u003eReview of Educational Research\u003c/em\u003e, \u003cem\u003e77\u003c/em\u003e(1), 81\u0026ndash;112. https://doi.org/10.3102/003465430298487\u003c/li\u003e\n \u003cli\u003eHolmes, W., Porayska-Pomsta, K., Holstein, K., Sutherland, E., Baker, T., Shum, S. B., Santos, O. C., Rodrigo, M. T., Cukurova, M., Bittencourt, I. I., \u0026amp; Koedinger, K. R. (2022). Ethics of AI in Education: Towards a Community-Wide Framework. \u003cem\u003eInternational Journal of Artificial Intelligence in Education\u003c/em\u003e, \u003cem\u003e32\u003c/em\u003e(3), 504\u0026ndash;526. https://doi.org/10.1007/s40593-021-00239-1\u003c/li\u003e\n \u003cli\u003eHsu, M. H. (2024). Mastering medical terminology with ChatGPT and Termbot. \u003cem\u003eHealth Education Journal\u003c/em\u003e, \u003cem\u003e83\u003c/em\u003e(4), 352-358. 
https://doi.org/10.1177/00178969231197371\u003c/li\u003e\n \u003cli\u003eHui, Z., Zewu, Z., Jiao, H., \u0026amp; Yu, C. (2025). Application of ChatGPT-assisted problem-based learning teaching method in clinical medical education. \u003cem\u003eBMC Medical Education\u003c/em\u003e, \u003cem\u003e25\u003c/em\u003e(1), 1-7. https://doi.org/10.1186/s12909-024-06321-1\u003c/li\u003e\n \u003cli\u003eHwang, G. J., \u0026amp; Zhang, D. (2024). Effects of an adaptive computer agent-based digital game on EFL students\u0026rsquo; English learning outcomes. \u003cem\u003eEducational technology research and development\u003c/em\u003e, \u003cem\u003e72\u003c/em\u003e(6), 3271-3294. https://doi.org/10.1007/s11423-024-10396-4\u003c/li\u003e\n \u003cli\u003eHyland, K., \u0026amp; Hyland, F. (2006). Feedback on second language students\u0026rsquo; writing. \u003cem\u003eLanguage Teaching\u003c/em\u003e, \u003cem\u003e39\u003c/em\u003e(2), 83\u0026ndash;101. https://doi.org/10.1017/S0261444806003399\u003c/li\u003e\n \u003cli\u003eIoannidis, J. P. A. (2005). Why most published research findings are false. \u003cem\u003ePLoS Medicine\u003c/em\u003e, \u003cem\u003e2\u003c/em\u003e(8), e124. https://doi.org/10.1371/journal.pmed.0020124\u003c/li\u003e\n \u003cli\u003eJeon, J. (2023). Chatbot-assisted dynamic assessment (CA-DA) for L2 vocabulary learning and diagnosis. \u003cem\u003eComputer Assisted Language Learning\u003c/em\u003e, \u003cem\u003e36\u003c/em\u003e(7), 1338\u0026ndash;1364. https://doi.org/10.1080/09588221.2021.1987272\u003c/li\u003e\n \u003cli\u003eJi, Y., Zhan, Z., Li, T., Zou, X., \u0026amp; Lyu, S. (2025). Human-machine co-creation: the effects of ChatGPT on students\u0026apos; learning performance, AI awareness, critical thinking, and cognitive load in a STEM course towards entrepreneurship. \u003cem\u003eIEEE Transactions on Learning Technologies\u003c/em\u003e. https://doi.org/10.1109/TLT.2025.3554584\u003c/li\u003e\n \u003cli\u003eKaraman, M. 
R., \u0026amp; G\u0026ouml;ksu, İ. (2024). Are lesson plans created by ChatGPT more effective? an experimental study. \u003cem\u003eInternational Journal of Technology in Education\u003c/em\u003e, \u003cem\u003e7\u003c/em\u003e(1), 107\u0026ndash;127. https://doi.org/10.46328/ijte.607\u003c/li\u003e\n \u003cli\u003eKuhn, D. (2000). Metacognitive development. \u003cem\u003eCurrent Directions in Psychological Science\u003c/em\u003e, \u003cem\u003e9\u003c/em\u003e(5), 178\u0026ndash;181. https://doi.org/10.1111/1467-8721.00088\u003c/li\u003e\n \u003cli\u003eLee, Y.-F., Hwang, G.-J., \u0026amp; Chen, P.-Y. (2022). Impacts of an AI-based chabot on college students\u0026rsquo; after-class review, academic performance, self-efficacy, learning attitude, and motivation. \u003cem\u003eEducational Technology Research and Development\u003c/em\u003e, \u003cem\u003e70\u003c/em\u003e(5), 1843\u0026ndash;1865. https://doi.org/10.1007/s11423-022-10142-8\u003c/li\u003e\n \u003cli\u003eLee, Y. F., Hwang, G. J., \u0026amp; Chen, P. Y. (2025). Technology-based interactive guidance to promote learning performance and self-regulation: a chatbot-assisted self-regulated learning approach. \u003cem\u003eEducational Technology Research and Development\u003c/em\u003e, 1-26. https://doi.org/10.1007/s11423-025-10478-x\u003c/li\u003e\n \u003cli\u003eLi, H. (2023). Effects of a ChatGPT-based flipped learning guiding approach on learners\u0026rsquo; courseware project performances and perceptions. \u003cem\u003eAustralasian Journal of Educational Technology\u003c/em\u003e, \u003cem\u003e39\u003c/em\u003e(5), 40-58. https://doi.org/10.14742/ajet.8923\u003c/li\u003e\n \u003cli\u003eLi, H., Wang, Y., Luo, S., \u0026amp; Huang, C. (2025). The influence of GenAI on the effectiveness of argumentative writing in higher education: Evidence from a quasi-experimental study in China. \u003cem\u003eJournal of Asian Public Policy\u003c/em\u003e, \u003cem\u003e18\u003c/em\u003e(2), 405-430. 
https://doi.org/10.1080/17516234.2024.2363128\u003c/li\u003e\n \u003cli\u003eLi, Y., Sadiq, G., Qambar, G., \u0026amp; Zheng, P. (2025). The impact of students\u0026rsquo; use of ChatGPT on their research skills: The mediating effects of autonomous motivation, engagement, and self-directed learning. \u003cem\u003eEducation and Information Technologies\u003c/em\u003e, \u003cem\u003e30\u003c/em\u003e(4), 4185\u0026ndash;4216. https://doi.org/10.1007/s10639-024-12981-9\u003c/li\u003e\n \u003cli\u003eLiang, H. Y., Hwang, G. J., Hsu, T. Y., \u0026amp; Yeh, J. Y. (2024). Effect of an AI‐based chatbot on students\u0026apos; learning performance in alternate reality game‐based museum learning. \u003cem\u003eBritish Journal of Educational Technology\u003c/em\u003e, \u003cem\u003e55\u003c/em\u003e(5), 2315-2338. https://doi.org/10.1111/bjet.13448\u003c/li\u003e\n \u003cli\u003eLinnenbrink-Garcia, L., \u0026amp; Pekrun, R. (2011). Students\u0026rsquo; emotions and academic engagement: Introduction to the special issue. \u003cem\u003eContemporary Educational Psychology\u003c/em\u003e, \u003cem\u003e36\u003c/em\u003e(1), 1\u0026ndash;3. https://doi.org/10.1016/j.cedpsych.2010.11.004\u003c/li\u003e\n \u003cli\u003eLiu, C. C., Hwang, G. J., Yu, P., Tu, Y. F., \u0026amp; Wang, Y. (2025). Effects of an automated corrective feedback-based peer assessment approach on students\u0026rsquo; learning achievement, motivation, and self-regulated learning conceptions in foreign language pronunciation. \u003cem\u003eEducational Technology Research and Development\u003c/em\u003e, 1-22. https://doi.org/10.1007/s11423-025-10484-z\u003c/li\u003e\n \u003cli\u003eLiu, X., Guo, B., He, W., \u0026amp; Hu, X. (2025). Effects of generative artificial intelligence on k-12 and higher education students\u0026rsquo; learning outcomes: A Meta-Analysis. \u003cem\u003eJournal of Educational Computing Research\u003c/em\u003e, \u003cem\u003e63\u003c/em\u003e(5), 1249\u0026ndash;1291. 
https://doi.org/10.1177/07356331251329185\u003c/li\u003e\n \u003cli\u003eLiu, Z.-M., Hwang, G.-J., Chen, C.-Q., Chen, X.-D., \u0026amp; Ye, X.-D. (2024). Integrating large language models into EFL writing instruction: Effects on performance, self-regulated learning strategies, and motivation. \u003cem\u003eComputer Assisted Language Learning\u003c/em\u003e, 1\u0026ndash;25. https://doi.org/10.1080/09588221.2024.2389923\u003c/li\u003e\n \u003cli\u003eLuo, J. (Jess), \u0026amp; Liu, X. (Caroline). (2025). What do we mean by digital equality in education? Toward five conceptual lenses based on a systematic review. \u003cem\u003eJournal of Research on Technology in Education\u003c/em\u003e, 1\u0026ndash;21. https://doi.org/10.1080/15391523.2025.2487279\u003c/li\u003e\n \u003cli\u003eMahapatra, S. (2024). Impact of ChatGPT on ESL students\u0026rsquo; academic writing skills: A mixed methods intervention study. \u003cem\u003eSmart Learning Environments\u003c/em\u003e, \u003cem\u003e11\u003c/em\u003e(1). https://doi.org/10.1186/s40561-024-00295-9\u003c/li\u003e\n \u003cli\u003eMercer, N., \u0026amp; Littleton, K. (2007). \u003cem\u003eDialogue and the Development of Children\u0026rsquo;s Thinking\u003c/em\u003e. Routledge. https://doi.org/10.4324/9780203946657\u003c/li\u003e\n \u003cli\u003eMoher, D., Liberati, A., Tetzlaff, J., Altman, D. G., \u0026amp; the PRISMA Group. (2009). Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. \u003cem\u003eBMJ\u003c/em\u003e, \u003cem\u003e339\u003c/em\u003e, b2535. https://doi.org/10.1136/bmj.b2535\u003c/li\u003e\n \u003cli\u003eNusivera, E., \u0026amp; Hikmat, A. (2025). Integration of Chat-GPT usage in language learning model to improve argumentation skills, complex comprehension skills, and critical thinking skills. \u003cem\u003eInternational Journal of Learning, Teaching and Educational Research\u003c/em\u003e, \u003cem\u003e24\u003c/em\u003e(2), 375-390. 
https://doi.org/10.26803/ijlter.24.2.19\u003c/li\u003e\n \u003cli\u003ePanadero, E. (2017). A review of self-regulated learning: Six models and four directions for research. \u003cem\u003eFrontiers in Psychology\u003c/em\u003e, \u003cem\u003e8\u003c/em\u003e, 422. https://doi.org/10.3389/fpsyg.2017.00422\u003c/li\u003e\n \u003cli\u003ePekrun, R. (2006). The control-value theory of achievement emotions: Assumptions, corollaries, and implications for educational research and practice. \u003cem\u003eEducational Psychology Review\u003c/em\u003e, \u003cem\u003e18\u003c/em\u003e(4), 315\u0026ndash;341. https://doi.org/10.1007/s10648-006-9029-9\u003c/li\u003e\n \u003cli\u003ePintrich, P. R. (2000). The role of goal orientation in self-regulated learning. In \u003cem\u003eHandbook of Self-Regulation\u003c/em\u003e (pp. 451\u0026ndash;502). Elsevier. https://doi.org/10.1016/B978-012109890-2/50043-3\u003c/li\u003e\n \u003cli\u003eRyan, R. M., \u0026amp; Deci, E. L. (2000). Self-determination theory and the facilitation of intrinsic motivation, social development, and well-being. \u003cem\u003eAmerican Psychologist\u003c/em\u003e, \u003cem\u003e55\u003c/em\u003e(1), 68\u0026ndash;78. https://doi.org/10.1037/0003-066X.55.1.68\u003c/li\u003e\n \u003cli\u003eSapan, M., \u0026amp; Uzun, L. (2024). The effect of ChatGPT-integrated English teaching on high school EFL learners\u0026rsquo; writing skills and vocabulary development. \u003cem\u003eInternational Journal of Education in Mathematics, Science and Technology\u003c/em\u003e, \u003cem\u003e12\u003c/em\u003e(6), 1679\u0026ndash;1699. https://doi.org/10.46328/ijemst.4655\u003c/li\u003e\n \u003cli\u003eShahsavar, Z., Kafipour, R., Khojasteh, L., \u0026amp; Pakdel, F. (2024). Is artificial intelligence for everyone? Analyzing the role of ChatGPT as a writing assistant for medical students. \u003cem\u003eFrontiers in Education\u003c/em\u003e, \u003cem\u003e9\u003c/em\u003e. 
https://doi.org/10.3389/feduc.2024.1457744\u003c/li\u003e\n \u003cli\u003eShi, H., Chai, C. S., Zhou, S., \u0026amp; Aubrey, S. (2025). Comparing the effects of ChatGPT and automated writing evaluation on students\u0026rsquo; writing and ideal L2 writing self. \u003cem\u003eComputer Assisted Language Learning\u003c/em\u003e, 1-28. https://doi.org/10.1080/09588221.2025.2454541\u003c/li\u003e\n \u003cli\u003eSun, L., \u0026amp; Zhou, L. (2024). Does generative artificial intelligence improve the academic achievement of college students? A Meta-analysis. \u003cem\u003eJournal of Educational Computing Research\u003c/em\u003e, \u003cem\u003e62\u003c/em\u003e(7), 1676\u0026ndash;1713. https://doi.org/10.1177/07356331241277937\u003c/li\u003e\n \u003cli\u003eSweller, J. (1988). Cognitive load during problem solving: Effects on learning. \u003cem\u003eCognitive Science\u003c/em\u003e, \u003cem\u003e12\u003c/em\u003e(2), 257\u0026ndash;285. https://doi.org/10.1207/s15516709cog1202_4\u003c/li\u003e\n \u003cli\u003eTeng, D., Wang, X., Xia, Y., Zhang, Y., Tang, L., Chen, Q., Zhang, R., Xie, S., \u0026amp; Yu, W. (2025). Investigating the utilization and impact of large language model-based intelligent teaching assistants in flipped classrooms. \u003cem\u003eEducation and Information Technologies\u003c/em\u003e, \u003cem\u003e30\u003c/em\u003e(8), 10777\u0026ndash;10810. https://doi.org/10.1007/s10639-024-13264-z\u003c/li\u003e\n \u003cli\u003eTlili, A., Saqer, K., Salha, S., \u0026amp; Huang, R. (2025). Investigating the effect of artificial intelligence in education (AIEd) on learning achievement: A meta-analysis and research synthesis. \u003cem\u003eInformation Development\u003c/em\u003e, \u003cem\u003e41\u003c/em\u003e(3), 825\u0026ndash;842. https://doi.org/10.1177/02666669241304407\u003c/li\u003e\n \u003cli\u003eTsay, C. H., Kofinas, A. K., Trivedi, S. K., \u0026amp; Yang, Y. (2020). 
Overcoming the novelty effect in online gamified learning systems: An empirical evaluation of student engagement and performance. \u003cem\u003eJournal of Computer Assisted Learning\u003c/em\u003e, \u003cem\u003e36\u003c/em\u003e(2), 128\u0026ndash;146. https://doi.org/10.1111/jcal.12385\u003c/li\u003e\n \u003cli\u003eValentine, J. C., Pigott, T. D., \u0026amp; Rothstein, H. R. (2010). How many studies do you need?: A primer on statistical power for meta-analysis. \u003cem\u003eJournal of Educational and Behavioral Statistics\u003c/em\u003e, \u003cem\u003e35\u003c/em\u003e(2), 215\u0026ndash;247. https://doi.org/10.3102/1076998609346961\u003c/li\u003e\n \u003cli\u003eVan De Pol, J., Volman, M., \u0026amp; Beishuizen, J. (2010). Scaffolding in teacher\u0026ndash;student interaction: A decade of research. \u003cem\u003eEducational Psychology Review\u003c/em\u003e, \u003cem\u003e22\u003c/em\u003e(3), 271\u0026ndash;296. https://doi.org/10.1007/s10648-010-9127-6\u003c/li\u003e\n \u003cli\u003eVygotskij, L. S. (1981). \u003cem\u003eMind in society: The development of higher psychological processes\u003c/em\u003e (Reprint). Harvard University Press.\u003c/li\u003e\n \u003cli\u003eWang, C., Zou, B., Du, Y., \u0026amp; Wang, Z. (2024). The impact of different conversational generative AI chatbots on EFL learners: An analysis of willingness to communicate, foreign language speaking anxiety, and self-perceived communicative competence. \u003cem\u003eSystem\u003c/em\u003e, \u003cem\u003e127\u003c/em\u003e, 103533. https://doi.org/10.1016/j.system.2024.103533\u003c/li\u003e\n \u003cli\u003eWang, H., Wang, C., Chen, Z., Liu, F., Bao, C., \u0026amp; Xu, X. (2025). Impact of AI-agent-supported collaborative learning on the learning outcomes of University programming courses. \u003cem\u003eEducation and Information Technologies\u003c/em\u003e. https://doi.org/10.1007/s10639-025-13487-8\u003c/li\u003e\n \u003cli\u003eWang, M., Zhang, D., Zhu, J., \u0026amp; Gu, H. (2025). 
Effects of incorporating a large language model-based adaptive mechanism into contextual games on students\u0026rsquo; academic performance, flow experience, cognitive load and behavioral patterns. \u003cem\u003eJournal of Educational Computing Research\u003c/em\u003e, \u003cem\u003e63\u003c/em\u003e(3), 662-694. https://doi.org/10.1177/07356331251321719\u003c/li\u003e\n \u003cli\u003eWang, Y. (2025). A study on the efficacy of ChatGPT-4 in enhancing students\u0026rsquo; English communication skills. \u003cem\u003eSage Open\u003c/em\u003e, \u003cem\u003e15\u003c/em\u003e(1), 21582440241310644. https://doi.org/10.1177/21582440241310644\u003c/li\u003e\n \u003cli\u003eWei, X., Wang, L., Lee, L. K., \u0026amp; Liu, R. (2025). Multiple generative AI pedagogical agents in augmented reality environments: A study on implementing the 5E model in science education. \u003cem\u003eJournal of Educational Computing Research\u003c/em\u003e, \u003cem\u003e63\u003c/em\u003e(2), 336-371. https://doi.org/10.1177/07356331241305519\u003c/li\u003e\n \u003cli\u003eWood, D., Bruner, J. S., \u0026amp; Ross, G. (1976). The role of tutoring in problem solving. \u003cem\u003eJournal of Child Psychology and Psychiatry\u003c/em\u003e, \u003cem\u003e17\u003c/em\u003e(2), 89\u0026ndash;100. https://doi.org/10.1111/j.1469-7610.1976.tb00381.x\u003c/li\u003e\n \u003cli\u003eWu, T.-T., Hapsari, I. P., \u0026amp; Huang, Y.-M. (2025). Effects of incorporating AI chatbots into think\u0026ndash;pair\u0026ndash;share activities on EFL speaking anxiety, language enjoyment, and speaking performance. \u003cem\u003eComputer Assisted Language Learning\u003c/em\u003e, 1\u0026ndash;39. https://doi.org/10.1080/09588221.2025.2478271\u003c/li\u003e\n \u003cli\u003eYang, T.-C., Hsu, Y.-C., \u0026amp; Wu, J.-Y. (2025). The effectiveness of ChatGPT in assisting high school students in programming learning: Evidence from a quasi-experimental research. 
\u003cem\u003eInteractive Learning Environments\u003c/em\u003e, 1\u0026ndash;18. https://doi.org/10.1080/10494820.2025.2450659\u003c/li\u003e\n \u003cli\u003eZahran, F. A. (2025). The Impact of using Poe ChatGPT-based TPACK model on English as a foreign language teachers\u0026apos; performance and their students\u0026apos; vocabulary learning. \u003cem\u003eHigher Learning Research Communications\u003c/em\u003e, \u003cem\u003e15\u003c/em\u003e(1), n1.\u003c/li\u003e\n \u003cli\u003eZhao, G., Yang, L., Hu, B., \u0026amp; Wang, J. (2025). A Generative artificial intelligence (AI)-based human-computer collaborative programming learning method to improve computational thinking, learning attitudes, and learning achievement. \u003cem\u003eJournal of Educational Computing Research\u003c/em\u003e, \u003cem\u003e63\u003c/em\u003e(5), 1059\u0026ndash;1087. https://doi.org/10.1177/07356331251336154\u003c/li\u003e\n \u003cli\u003eZhou, Q., Hashim, H., \u0026amp; Sulaiman, N. A. (2025). Supporting English speaking practice in higher education: the impact of AI chatbot-integrated mobile-assisted blended learning framework. \u003cem\u003eEducation and Information Technologies\u003c/em\u003e, 1-32.\u003c/li\u003e\n \u003cli\u003eZhu, Y., Liu, Q., \u0026amp; Zhao, L. (2025). Exploring the impact of Generative artificial intelligence on students\u0026rsquo; learning outcomes: A meta-analysis. \u003cem\u003eEducation and Information Technologies\u003c/em\u003e, \u003cem\u003e30\u003c/em\u003e(11), 16211\u0026ndash;16239. https://doi.org/10.1007/s10639-025-13420-z\u003c/li\u003e\n \u003cli\u003eZimmerman, B. J. (2002). Becoming a self-regulated learner: An overview. \u003cem\u003eTheory Into Practice\u003c/em\u003e, \u003cem\u003e41\u003c/em\u003e(2), 64\u0026ndash;70. 
https://doi.org/10.1207/s15430421tip4102_2\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":true,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"Generative artificial intelligence, academic achievement, learning outcomes, meta-analysis, AI in education","lastPublishedDoi":"10.21203/rs.3.rs-7577394/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-7577394/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eThis meta-analysis examines the impact of generative artificial intelligence (GenAI) tools, such as ChatGPT, on students’ academic achievement. Drawing on 52 experimental and quasi-experimental studies across educational levels and domains, we synthesized evidence from interventions using GenAI to support learning. Eligible studies reported performance outcomes (e.g., test scores, grades, GPA) and met rigorous inclusion criteria. Overall, GenAI-based instruction showed a positive effect (Hedges' g = 1.193) on academic achievement, with substantial between-study variability indicating that GenAI’s effectiveness depends on contextual and design features. Moderator analyses identified two significant factors, which are instructional role and subject area. GenAI was most effective when used to support formative functions such as assessment, feedback, and tutoring, suggesting that its strength lies in providing adaptive guidance and personalized learning support. Effects also varied across subject areas. Language education showed the strongest and most consistent gains, reflecting a close alignment between GenAI’s natural language capabilities and core instructional practices. 
In contrast, more modest effects were observed in computer science and art education, where applications tend to be narrower in scope. Other moderators, including educational level, sample size, intervention duration, and learning domain, did not yield statistically significant differences but revealed descriptive patterns that may inform future research and implementation. These findings suggest that GenAI tools hold considerable promise for improving academic performance when thoughtfully integrated into instructional practice. Educators and policymakers should consider both the role GenAI plays and the subject context to ensure its effective use in diverse educational settings.\u003c/p\u003e","manuscriptTitle":"Transforming Learning or Empty Promise? A Meta-Analysis of Generative AI in Education","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-09-26 20:45:28","doi":"10.21203/rs.3.rs-7577394/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"475db337-1bdd-4c3f-8ea7-b3f10cae97ee","owner":[],"postedDate":"September 26th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2025-11-10T10:28:01+00:00","versionOfRecord":[],"versionCreatedAt":"2025-09-26 20:45:28","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-7577394","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-7577394","identity":"rs-7577394","version":["v1"]},"buildId":"XKTyCvWXoU3ODBz1xrDgd","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}