Machine Learning for Early Intervention: A Quantitative Systematic Review of Predictive Models for Undergraduate Mathematics Performance

doi:10.21203/rs.3.rs-7845029/v1

2025 · doi:10.21203/rs.3.rs-7845029/v1

preprint OA: closed CC-BY-4.0

Systematic Review

Moges Birhanu Haileslassie, Hilluf Reddu Tegegne

This is a preprint; it has not been peer reviewed by a journal.

https://doi.org/10.21203/rs.3.rs-7845029/v1

This work is licensed under a CC BY 4.0 License

Status: Posted (Version 1)

Abstract

This systematic review synthesizes current literature on the use of machine learning (ML) models to predict undergraduate mathematics performance and support early intervention in resource-constrained higher education environments (RCCEs). A systematic search of five academic databases was conducted in accordance with PRISMA 2020 guidelines, resulting in 19 empirical studies being included.
The findings reveal that ensemble techniques such as Random Forest and XGBoost demonstrate strong predictive performance (78–94% accuracy, with AUC values ranging from 0.84 to 0.96) in high-resource contexts, using predictors such as prior academic achievement and Learning Management System (LMS) usage data. However, the straightforward application of these models to RCCEs, such as Ethiopian universities, is challenged by infrastructural limitations and data scarcity. Effective implementation requires a context-sensitive strategy emphasizing (1) interpretable and transparent models based on readily available data, (2) substantial preliminary investment in foundational academic support systems prior to predictive analytics deployment, and (3) the adoption of a rigorous ethical framework to mitigate algorithmic bias. Overall, this review highlights the need to shift research and practice from a narrow focus on technical model performance toward contextually relevant, ethically responsible, and intervention-driven applications.

Subject: Educational Philosophy and Theory

Keywords: machine learning, predictive modeling, mathematics education, early intervention, learning analytics, higher education

1. Introduction

Underperformance in mathematics is a global phenomenon with catastrophic impacts on national development, particularly in science, technology, engineering, and mathematics (STEM) fields. The issue is deeply entrenched in Sub-Saharan Africa, where higher learning institutions experience high failure and attrition rates in foundational mathematics courses, interfering with academic success and employability (Bethell, 2016; World Bank, 2019).
In Ethiopia, and the Tigray region specifically, the problem is compounded by very large class sizes, limited resources, and the shift to English as the medium of instruction, all of which limit the scope for one-on-one student support (Gebremariam & Gedamu, 2022; Tekle, 2025). Traditionally, at-risk students have been identified through summative tests or mid-semester exams, by which time it may be too late to alter a student's learning trajectory. Colleges and universities are thus increasingly implementing evidence-based early warning systems. The emergence of Educational Data Mining (EDM) and Learning Analytics (LA) provides powerful methods for leveraging student data to predict academic achievement and inform timely, targeted interventions (Romero & Ventura, 2020). Within this domain, machine learning (ML) has been used successfully to model complex, multi-dimensional datasets, including demographic information, prior academic records, and LMS interaction metrics, in order to predict student success or failure (Baker & Inventado, 2014).

The potential for ML to transform learning support is considerable. When vulnerable students are identified early in the semester, instructors and administrators can move away from a reactive, crisis-management posture and toward a proactive position, triggering data-driven interventions such as tutoring, peer mentoring, or adaptive learning pathways to prevent failure and improve overall learning success (Siemens & Baker, 2012).

However, the existing research landscape reveals two pressing gaps. First, studies are overwhelmingly concentrated on high-income, technologically advanced systems in North America, Europe, and Asia (Olaniyan et al., 2023).
Predictive models developed in these settings are likely to rely on dense digital data streams (e.g., LMS clickstream data, digital textbook interactions, psycho-social surveys) that are not readily available in low-resource settings. Second, in Ethiopia, studies are few and fragmented. Early studies show growing interest but reveal a lack of systematic evidence regarding which algorithms perform best, which predictors are most salient, and how predictions can be feasibly and ethically translated into interventions (Woldehanna et al., 2023).

This systematic review, therefore, attempts to synthesize and critically evaluate existing evidence on ML prediction models of undergraduate mathematics performance, with particular emphasis on pragmatic applications to improve learning outcomes in RCCEs like Ethiopia. The following research questions inform this review:

RQ1. What is the relative predictive accuracy of different ML algorithms for forecasting undergraduate mathematics performance?
RQ2. What predictor variables are most salient for accurate prediction?
RQ3. What are the ways of integrating predictive model outputs within early intervention programs, and what is the level of evidence for their efficacy?
RQ4. What are the limitations and potential adaptations of using ML-based prediction models in resource-constrained higher education environments?

By answering these questions, this review aims to establish an evidence base to inform the design of contextually appropriate, data-driven interventions to enhance the teaching of mathematics and reduce student attrition in Tigray and similar environments.

2. Methodology

The present study was conducted as a quantitative systematic review according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA 2020) guideline to ensure transparency, reproducibility, and methodological rigor (Page et al., 2021).
The objective was to search, critically assess, and synthesize quantitative research that developed and tested machine learning (ML) models for predicting undergraduate math performance.

2.1. Literature Search Strategy

A systematic search plan was designed to minimize selection bias. It was created in the PICOS format (Population, Intervention, Comparator, Outcomes, Study design) to formulate eligibility criteria. The core concepts of the review, namely Machine Learning (Intervention), Undergraduate Students (Population), Mathematics Performance (Outcome), and Prediction (Study focus), were converted into a combination of controlled vocabulary and free-text terms. Boolean operators (NOT, OR, AND) were applied in order to balance sensitivity and specificity, and the search string was progressively optimized by pilot searches (Bramer et al., 2017).

2.2. Information Sources and Databases

Five major electronic databases were utilized to offer interdisciplinary coverage: Scopus, Web of Science Core Collection, IEEE Xplore, ERIC (Education Resources Information Center), and the ACM Digital Library. To supplement the database search, backward snowballing (examining reference lists of included studies) and forward snowballing (tracing citations through Google Scholar) were employed to identify additional relevant publications (Wohlin, 2014).

2.3. Eligibility Criteria

Studies were chosen based on the following pre-specified criteria, consistent with the PICOS strategy. First, population was determined to be higher education undergraduate students of math or quantitative disciplines (e.g., Calculus, Algebra, and Statistics). Second, the intervention/concept covered studies that defined, applied, and/or evaluated supervised ML or statistical models for anticipating student performance (e.g., final exam grades, pass/fail status). Third, comparisons between ML models and other ML models, traditional statistical analysis, or rule-based approaches were included.
Fourth, based on outcomes, studies that reported at least one quantitative measure of predictive performance (e.g., Accuracy, AUC-ROC, Precision, Recall, and F1-score) were included. Fifth, regarding study design, quantitative empirical studies, including experimental, quasi-experimental, and observational studies, were included, whereas review articles, editorials, opinion pieces, and qualitative-only studies were excluded. Regarding context, studies from any geographical setting were included, but data were extracted and analyzed with special attention to their applicability to resource-constrained settings.

2.4. Study Selection and Data Extraction

Study selection proceeded in two phases. In the first phase, title and abstract screening, two reviewers independently screened records against the inclusion criteria. Disagreements were resolved through discussion. In the second phase, full-text review, the full text of the potentially eligible studies was obtained and independently assessed by two reviewers. Final inclusion decisions were made through consensus. A standardized, pilot-tested data extraction form was used to collect information on: study details; population details; predictor variables; ML algorithms used; model performance metrics; key results on predictor salience; and descriptions of any related interventions.

2.5. Quality Appraisal

Methodological quality of the included studies was assessed using a checklist adapted from the TRIPOD (Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis) guidelines and the CASP (Critical Appraisal Skills Programme) criteria. Studies were evaluated on clarity of purpose, appropriateness of sample size, transparency in predictor selection, appropriateness of the validation approach (e.g., cross-validation, hold-out set), completeness of performance reporting, and consideration of limitations and sources of potential bias.

2.6. Synthesis and Analysis of Data

Given the anticipated heterogeneity in populations, predictor sets, and ML models, a narrative synthesis was the primary method of integration (Popay et al., 2006). Outcomes were structured around the review research questions. Although a meta-analysis was initially considered, substantial heterogeneity in methods and reporting across studies precluded calculation of pooled effect estimates. Performance measures and key findings are therefore reported descriptively and in summary tables. For RQ4, thematic analysis was used to identify RCCE-specific challenges and adaptations.

3. Results

This section presents the findings of the systematic review, detailing the flow of studies through the selection stages and providing a descriptive overview of the final included articles. The aim is to characterize the current evidence base on machine learning models for predicting undergraduate mathematics performance and assess its readiness for addressing the research questions.

3.1. Nature and Selection of Studies Considered

The systematic search of electronic databases identified 847 records. After removing 212 duplicates, 635 records were screened based on titles and abstracts. Following full-text assessment of 78 articles against the eligibility criteria, 19 studies were included in the final synthesis. The PRISMA 2020 flow diagram illustrating the study selection process is shown in Fig. 1.

The 19 included studies were analyzed for their key characteristics to provide a comprehensive overview of the research landscape. This analysis covers chronological trends, geographical focus, and methodological approaches.
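The selection counts reported in Section 3.1 can be checked with simple arithmetic. The sketch below uses only the four figures stated in the text; the two exclusion counts are derived here by subtraction and are not reported by the authors, and records found through snowballing are not modeled because the text does not report them separately.

```python
# PRISMA-style flow arithmetic using the counts stated in Section 3.1.
identified = 847          # records retrieved from the database search
duplicates = 212          # duplicate records removed
full_text_assessed = 78   # articles assessed in full text
included = 19             # studies in the final synthesis

# Derived quantities (the exclusion counts are NOT stated in the review;
# they follow from subtraction and serve only as a consistency check).
screened = identified - duplicates
excluded_at_screening = screened - full_text_assessed
excluded_at_full_text = full_text_assessed - included

flow = {
    "identified": identified,
    "screened": screened,
    "excluded at title/abstract": excluded_at_screening,
    "assessed in full text": full_text_assessed,
    "excluded at full text": excluded_at_full_text,
    "included": included,
}
for stage, count in flow.items():
    print(f"{stage:28s} {count:4d}")
```

The screened total produced this way matches the 635 records stated in the text.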
Table 1 Chronological Distribution of Included Studies (n = 19)

Publication Period | Number of Studies | Percentage (%)
2015–2017 | 3 | 15.8
2018–2020 | 7 | 36.8
2021–2023 | 9 | 47.4

The publication trend shows a steep and consistent rise in research interest, with 84.2% of the studies published between 2018 and 2023. This reflects the growing accessibility of machine learning tools and the increasing focus on data-driven approaches in education. The peak in 2021–2023 shows that this is a rapidly evolving area, yet the total number of studies is still relatively low, reflecting an emerging but not yet consolidated line of research.

Table 2 Geographical Distribution of Included Studies (n = 19)

Region | Number of Studies | Percentage (%)
Asia | 8 | 42.1
Europe & North America | 6 | 31.6
Middle East & North Africa | 3 | 15.8
Sub-Saharan Africa | 1 | 5.3
South America | 1 | 5.3

The geographical distribution suggests a considerable research gap. Nearly three quarters (73.7%) of studies were conducted in Asia, Europe, and North America, settings with robust technological infrastructure and widespread digital data collection (e.g., extensive LMS use). Critically, only one study (5.3%) took place in Sub-Saharan Africa, starkly highlighting the evidence gap this review seeks to address. This imbalance speaks to the urgent need for context-specific research in settings like Tigray, Ethiopia, where data availability and learning conditions differ radically.
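The percentages in Tables 1 and 2 follow directly from the study counts. A minimal sketch of that calculation, assuming one-decimal rounding (the helper name `share` is introduced here for illustration only):

```python
TOTAL_STUDIES = 19

def share(count, total=TOTAL_STUDIES):
    """Return `count` as a percentage of `total`, rounded to one decimal."""
    return round(100 * count / total, 1)

# Counts as reported in Table 1 and Table 2.
by_period = {"2015-2017": 3, "2018-2020": 7, "2021-2023": 9}
by_region = {
    "Asia": 8,
    "Europe & North America": 6,
    "Middle East & North Africa": 3,
    "Sub-Saharan Africa": 1,
    "South America": 1,
}

period_shares = {k: share(v) for k, v in by_period.items()}
region_shares = {k: share(v) for k, v in by_region.items()}

# Share of studies published from 2018 onward.
recent_share = share(by_period["2018-2020"] + by_period["2021-2023"])

print(period_shares)
print(region_shares)
print("2018-2023:", recent_share)
```

Both dictionaries sum to 19 studies, so each table accounts for every included study exactly once.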
Table 3 Educational Context and Predictive Target of Included Studies (n = 19)

Predictive Target | Number of Studies | Percentage (%) | Primary Course Context / Contents Considered
Final Grade (Regression) | 8 | 42.1 | Introductory Calculus, Algebra
Pass/Fail (Classification) | 11 | 57.9 | Introductory Programming, Statistics
LMS Data Used | 16 | 84.2 | Various
Demographic/Academic History Only | 3 | 15.8 | Various

The majority (57.9%) framed prediction as a binary classification problem (Pass/Fail), which has obvious application to early warning systems aimed at detecting at-risk students. The vast majority (84.2%) employed LMS activity data (e.g., login frequency, quiz attempts, video viewing) as main predictors, showing a reliance on digital traces that may not be available in institutions with low LMS adoption or where students have intermittent access. The focus on introductory courses is not surprising, as these are typically high-enrollment courses with high failure rates where intervention can have the broadest impact.

Table 4 Machine Learning Algorithms Employed (n = 19)

Algorithm | Number of Studies | Percentage (%)
Decision Tree / Random Forest | 15 | 78.9
Support Vector Machines (SVM) | 11 | 57.9
Logistic Regression | 10 | 52.6
Neural Networks | 8 | 42.1
Naïve Bayes | 6 | 31.6
K-Nearest Neighbors (K-NN) | 5 | 26.3

Ensemble methods like Random Forest were the most popular algorithms, likely because they are highly accurate, handle mixed data types well, and provide feature importance scores, which bear directly on RQ2 about predictor salience. Logistic Regression's popularity indicates its continued utility as a simple, interpretable baseline model. The use of more complex models like Neural Networks suggests an effort to maximize predictive ability, often at the cost of model interpretability, a key concern of stakeholders in education.
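Because most included studies framed prediction as a pass/fail classification task, accuracy and AUC-ROC are the natural yardsticks for comparing algorithms. The sketch below shows how both can be computed from scratch; the labels and risk scores are made-up illustrative values, not data from any included study, and the 0.5 decision threshold is an assumption.

```python
def accuracy(y_true, y_pred):
    """Fraction of students whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def auc_roc(y_true, scores):
    """AUC as the probability that a randomly chosen positive case gets a
    higher score than a randomly chosen negative case (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical example: 1 = failed the course (at risk), 0 = passed.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]   # predicted risk of failing
y_pred = [1 if s >= 0.5 else 0 for s in scores]      # assumed 0.5 threshold

acc = accuracy(y_true, y_pred)
auc = auc_roc(y_true, scores)
print(f"accuracy = {acc:.2f}, AUC-ROC = {auc:.4f}")
```

Accuracy depends on the chosen threshold, whereas AUC-ROC summarizes ranking quality across all thresholds, which is one reason studies tend to report both.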
This descriptive synthesis confirms that while there is an evidence base globally, it is characterized by a strong geographical bias and reliance on data sources that will not be available in resource-constrained settings like Ethiopian universities. The next synthesis will contrast the performance and outcomes of these studies with the review's research questions.

3.2. Predictive Accuracy of Machine Learning Algorithms (RQ1)

To address the first research question, the predictive performance of the most popular algorithms was extracted and compared. Performance was primarily measured by Accuracy and the Area Under the Receiver Operating Characteristic Curve (AUC-ROC) because they are the most frequently reported metrics for classification tasks (Pass/Fail) in the included studies. The results are shown in Table 5.

Table 5 Comparative Predictive Performance of Machine Learning Algorithms (n = 19 studies)

Algorithm | Reported Accuracy Range (%) | Reported AUC-ROC Range | Key Strengths (from Studies) | Key Limitations (from Studies)
Random Forest | 78–92 | 0.84–0.95 | High accuracy, robust to overfitting, provides feature importance. | Less interpretable than simpler models; can be computationally expensive.
XGBoost | 80–94 | 0.86–0.96 | Often achieved top performance; handles mixed data types well. | Requires careful hyperparameter tuning; complex to implement.
Support Vector Machine (SVM) | 75–88 | 0.79–0.90 | Effective in high-dimensional spaces. | Performance highly dependent on kernel choice; poor interpretability.
Neural Network | 77–90 | 0.81–0.93 | High model capacity; can model complex non-linear relationships. | "Black box" model; requires very large datasets; prone to overfitting on small data.
Logistic Regression | 72–85 | 0.75–0.87 | Highly interpretable; efficient to train; strong baseline. | Assumes linear relationship; often outperformed by more complex ensembles.
Decision Tree | 70–83 | 0.72–0.85 | Fully interpretable and explainable. | Highly prone to overfitting; unstable (small data changes can lead to different trees).
k-Nearest Neighbors (k-NN) | 68–80 | 0.70–0.82 | Simple to understand; no training time. | Computationally expensive at prediction time; sensitive to irrelevant features.

The evidence indicates that ensemble methods, namely Random Forest and XGBoost, achieved the highest predictive accuracy and AUC-ROC values in the majority of the studies. Their strength lies in combining multiple weak learners (decision trees) to reduce variance and limit overfitting, which makes them well suited to the heterogeneous data typical of educational settings (e.g., a mix of numerical and categorical variables) (Baker & Inventado, 2014).

There is, however, a fundamental trade-off between predictive performance and model interpretability. While Random Forest and XGBoost provide feature importance scores, they do not offer the simple explainability of a Logistic Regression model or a single Decision Tree. For educators and administrators, understanding why a student is predicted to fail is nearly as important as the prediction itself, so that an intervention can be appropriately designed. This makes Logistic Regression a good baseline model because of its high interpretability, even though its absolute performance is sometimes inferior.

Neural Networks, in particular, showed potential but were inconsistent in performance and needed very large datasets (> 10,000 instances) to train, a condition not met in most single-institution studies in this review. When used with smaller datasets, they were usually outperformed by ensemble methods and were criticized as "black boxes."

There is no single "best" algorithm. The choice is a trade-off between priorities. For the best predictive performance with reasonably good insight into predictor importance, Random Forest or XGBoost are the best choices.
For the best interpretability and transparency for stakeholders, Logistic Regression is a powerful and often sufficient tool. The use of Neural Networks cannot be universally recommended for typical institutional settings due to data size requirements and interpretability issues. This finding is especially relevant to environments like Tigray, Ethiopia, where resource constraints may favor models that are not computationally demanding and whose predictions are easy to explain to students and faculty, so as to build trust and generate actionable insights.

3.3. Salience of Predictor Variables (RQ2)

In order to identify the most robust predictors of mathematics performance, data on feature importance scores (e.g., from Random Forest or XGBoost models) and on the variables most often retained in logistic regression models were extracted and synthesized. Predictors were grouped into four general categories, and their reported significance was assessed. The results are shown in Table 6.

Table 6 Salience of Predictor Variable Categories for Predicting Mathematics Performance (n = 19 studies)

Predictor Category | Frequency of Significance | Most Common Specific Variables | Interpretation of Salience
Prior Academic Performance | 19 / 19 (100%) | High school GPA, Grade in prerequisite math course, SAT/ACT math scores. | This was the most powerful and consistent predictor across all studies. It serves as a strong proxy for a student's foundational knowledge, mathematical aptitude, and general academic preparedness, making it the single most important feature in any predictive model.
LMS Engagement & Activity | 16 / 19 (84.2%) | Number of LMS logins, Assignment submission on time, Quiz attempts, Video lecture views. | Digital footprints of engagement are highly predictive. Timely submission of work is a strong indicator of self-regulation and discipline. Repeated quiz attempts often signal persistence in mastering a concept. However, this category's utility is contingent on high LMS adoption and reliable data tracking.
Demographic & Institutional | 8 / 19 (42.1%) | Socioeconomic status, First-generation student status, Program of study. | These variables often serve as proxies for underlying challenges, such as resource constraints or lack of academic capital. However, their use raises significant ethical concerns regarding the profiling and potential bias against students from certain backgrounds, potentially reinforcing existing inequalities.
Psychosocial & Behavioral | 5 / 19 (26.3%) | Behavioral engagement surveys, Participation in class, Attendance. | While highly insightful, these data are less frequently collected at scale. Attendance is a classic, strong predictor of success. When available, self-reported measures of confidence or anxiety can add valuable context but are prone to collection bias.

The analysis reveals a clear hierarchy in the predictive power of the variable categories, led by Prior Academic Performance by some distance. This finding accords with decades of educational research emphasizing that the best predictor of future performance is past performance, since it reflects a composite of knowledge, aptitude, and developed study habits (Hijazi & Naqvi, 2019). The strong salience of LMS Engagement features points to a shift from static demographic information to dynamic, behavioral information. The single variable "assignment submission on time" was a very strong feature, likely because it reflects a combination of understanding, time management, and conscientiousness, all qualities that are fundamental to success in math.
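To make the idea of an interpretable model built on these predictors concrete, the following is a minimal sketch of logistic regression fitted by plain gradient descent on two features: a prior-performance score and an on-time submission rate. The ten records, the 0 to 1 feature scaling, the learning rate, and the function names are all hypothetical choices made for illustration; none of them come from the included studies.

```python
import math

# Hypothetical records: (prior_score, on_time_rate, passed), features scaled 0..1.
DATA = [
    (0.90, 0.95, 1), (0.80, 0.90, 1), (0.75, 0.60, 1), (0.60, 0.80, 1),
    (0.70, 0.85, 1), (0.55, 0.40, 0), (0.40, 0.50, 0), (0.35, 0.30, 0),
    (0.30, 0.70, 0), (0.45, 0.20, 0),
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, lr=0.5, epochs=2000):
    """Batch gradient descent on the mean log-loss; returns (weights, bias)."""
    w = [0.0, 0.0]
    b = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0, 0.0]
        grad_b = 0.0
        for x1, x2, y in rows:
            err = sigmoid(w[0] * x1 + w[1] * x2 + b) - y  # prediction error
            grad_w[0] += err * x1
            grad_w[1] += err * x2
            grad_b += err
        w[0] -= lr * grad_w[0] / n
        w[1] -= lr * grad_w[1] / n
        b -= lr * grad_b / n
    return w, b

def predict_pass(w, b, prior_score, on_time_rate):
    """Estimated probability of passing for one student."""
    return sigmoid(w[0] * prior_score + w[1] * on_time_rate + b)

weights, bias = fit_logistic(DATA)
p_strong = predict_pass(weights, bias, 0.85, 0.90)
p_weak = predict_pass(weights, bias, 0.35, 0.30)

# Each coefficient has a direct reading: a positive weight means the feature
# raises the estimated odds of passing.
print("weights:", [round(v, 2) for v in weights], "bias:", round(bias, 2))
print("P(pass | strong profile):", round(p_strong, 3))
print("P(pass | weak profile):  ", round(p_weak, 3))
```

The appeal of this model family for educators is that the sign and size of each weight can be explained in plain terms, and the sketch deliberately uses only academic and behavioral inputs rather than demographic ones.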
Critical Implications for Context (Ethiopia/Tigray): This hierarchy has significant implications for the implementation of predictive systems in settings like Ethiopian universities:

• Data Availability: The most robust predictor, Prior Academic Performance (e.g., Ethiopian University Entrance Examination score, grades earned in high school), is most likely to be available and must form the foundation of any local model.
• Key Limitation: The second-ranked category, LMS Engagement, represents a major challenge. As many institutions in the region suffer from low or volatile LMS use, this rich source of behavioral data can be unreliable or non-existent, which can reduce early model accuracy.
• Ethical Caution: Demographic factors must be used with extreme caution. Even when they are statistically significant, using them in a model risks building a system that systematically flags students from certain backgrounds as "at-risk," potentially creating a self-fulfilling prophecy and violating principles of educational equity.

The most robust predictive models of math achievement are built on a foundation of prior academic success, supplemented by behavioral data extracted from learning platforms. For a setting like Tigray, then, this indicates the path forward: initial models can be built with the extensive historical academic data already available, while institutions simultaneously work to upgrade digital infrastructure so that the powerful predictive dimension of engagement analytics can be incorporated later. This must be done within a strict ethical framework that excludes the use of sensitive or potentially discriminatory demographic proxies.

3.4. Intervention Strategies and Efficacy (RQ3)

There is often a critical gap between prediction and effective intervention. This synthesis explores how the included studies operationalized model outputs (e.g., flags for being at risk) into interventions and the effectiveness evidence for these strategies. The findings are reported in Table 7.
Table 7: Types of Interventions Triggered by Predictive Models and Reported Efficacy (n = 19 studies)

Intervention Category | Description & Examples | Frequency of Use | Reported Efficacy & Evidence | Key Challenges
Automated Feedback & Alerts | System-generated emails to students showing their current standing, missed activities, or personalized study tips. | 9 / 19 (47.4%) | Mixed efficacy. Some studies reported modest increases in LMS login frequency. However, most found no significant impact on final grades from alerts alone. Evidence suggests it is insufficient as a standalone strategy. | Alert fatigue; perceived as impersonal; lacks human touch and specific guidance.
Instructor-Led Interventions | Dashboards for instructors flagging at-risk students, prompting them to reach out via email, have one-on-one conversations, or offer in-class support. | 7 / 19 (36.8%) | Moderately effective. Studies reported higher student satisfaction and feelings of support. Efficacy was highly dependent on the instructor's willingness and capacity to act, leading to inconsistent outcomes. | Adds to instructor workload; requires faculty training; not scalable in large classes.
Structured Support Programs | Automatic referral to and enrollment in structured support: mandatory tutoring, peer mentoring programs, or supplemental instruction sessions. | 5 / 19 (26.3%) | Highly effective. Studies implementing this approach showed the largest statistically significant positive impacts on final course grades and pass rates. This was the only category with strong evidence for directly improving learning outcomes. | Requires significant institutional resources and infrastructure to implement and manage.
No Explicit Intervention | The study focused solely on building the predictive model without implementing or testing a linked intervention. | 6 / 19 (31.6%) | Not applicable. This highlights a significant gap in the literature, where the cycle of learning analytics is left incomplete. | Missed opportunity for impact; limits the practical contribution of the research.

The synthesis reveals a disconcerting gap between the sophistication of the predictive models and the usually under-theorized intervention strategies they trigger. A striking 31.6% of studies (6 of 19) stopped at prediction, with no suggestion as to how to act on the results, severely limiting their real-world usefulness. The evidence supports a clear hierarchy of efficacy:

1. Ineffective: Automated Alerts alone are a weak nudge and are insufficient to change student outcomes in a meaningful manner.
2. Variable Efficacy: Instructor-Led Interventions show promise but are inherently unscalable and vulnerable to inconsistency based on individual instructor participation.
3. Highly Effective: Structured Support Programs are the "gold standard." Their efficacy derives from providing students dedicated time and expert assistance to close their learning gaps, which is precisely what at-risk students need.

The "Last-Mile" Problem: The challenge of effectively connecting a prediction to an action is known as the "last-mile" problem in learning analytics. Most studies were able to identify at-risk students but did not traverse this last mile effectively. The most successful studies integrated the predictive system into an already well-developed, robust student support system.

Implications for Context (Ethiopia/Tigray): This finding is highly applicable to resource-poor contexts. It suggests that:

• Investing in the predictive model is only half the battle; the more critical investment is in the structured academic support system (i.e., tutoring centers, peer mentoring networks) into which the model can feed.
• Automated alerts, if used at all, should be used very sparingly, since they will not be effective without a deeper support system.
• Relying on instructor intervention may be challenging in settings with extremely high student-to-teacher ratios.
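One way to picture how a prediction might feed a structured support system with limited capacity is sketched below. Everything in it is hypothetical: the `Student` record, the 0.5 risk threshold, the tier names, and the seat limit are illustrative assumptions rather than a procedure described in any included study.

```python
from dataclasses import dataclass

@dataclass
class Student:
    student_id: str
    risk: float  # predicted probability of failing the course (0..1)

def assign_support(students, tutoring_seats, risk_threshold=0.5):
    """Route flagged students to support tiers.

    Students at or above the (assumed) threshold are ranked by risk; the
    highest-risk students fill the limited structured-tutoring seats and the
    remaining flagged students are referred to lower-cost peer study groups.
    Students below the threshold receive no referral.
    """
    flagged = sorted(
        (s for s in students if s.risk >= risk_threshold),
        key=lambda s: s.risk,
        reverse=True,
    )
    plan = {}
    for rank, student in enumerate(flagged):
        tier = "structured_tutoring" if rank < tutoring_seats else "peer_study_group"
        plan[student.student_id] = tier
    return plan

# Hypothetical cohort with only two tutoring seats available.
cohort = [
    Student("S1", 0.90),
    Student("S2", 0.30),
    Student("S3", 0.70),
    Student("S4", 0.55),
]
plan = assign_support(cohort, tutoring_seats=2)
print(plan)
```

The design choice worth noting is that every flagged student receives a concrete referral rather than only an alert, and scarce tutoring capacity is reserved for the highest predicted risk.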
Predictive model outputs are most readily translated into action through automatic referrals to structured, non-voluntary support programs. There is strong evidence that this approach improves final grades. Simply alerting students or instructors to risk without providing a clear, actionable, and well-resourced avenue for support is largely ineffective. The usefulness of a prediction therefore ultimately hinges on the quality and availability of the intervention it triggers.

4.5. Contextual Applicability and Challenges (RQ4)

This synthesis examines how the included studies report implementation challenges and how directly they apply to resource-constrained higher education environments (RCCEs), e.g., in Ethiopia. The results, derived from the overall literature as well as from the specific barriers identified, are summarized in Table 8.

Table 8: Challenges for Implementation in Resource-Constrained Contexts and Recommended Adaptations

| Challenge Category | Description | Frequency Reported/Implied | Recommended Adaptations for RCCEs (e.g., Ethiopia) |
| --- | --- | --- | --- |
| Data Infrastructure & Availability | Reliance on rich, digital, and structured data sources (LMS, automated assessments, student information systems). | 17 / 19 | Prioritize available data. Build initial models using universally available data: prior academic performance (e.g., entrance exam scores, high school GPA) and demographic data (e.g., program, gender). Phase in LMS data as digital infrastructure improves. |
| Model Complexity & Computational Cost | Use of computationally intensive algorithms (e.g., Neural Networks, ensemble methods) requiring significant processing power. | 11 / 19 | Favor simplicity and interpretability. Start with Logistic Regression or shallow Decision Trees. These models are typically less accurate, but they are computationally cheap, easier to implement on standard hardware, and their predictions are easier to explain to stakeholders. |
| Ethical & Cultural Considerations | Lack of discussion of the ethical risks of profiling and of using sensitive variables (e.g., socioeconomic status) in models. | 16 / 19 | Adopt a strict ethical framework. Avoid using sensitive demographic or socioeconomic predictors to prevent bias and stigmatization. Focus on academic and behavioral data. Ensure transparency with students about how predictions are made. |
| Human Capacity & Training | Need for data scientists to build models and for instructors to interpret and act on the results. | 14 / 19 | Develop local capacity. Training for academic staff should focus on interpreting simple model outputs (not building models). Collaboration with local computer science departments could provide the necessary technical expertise. |
| Intervention Infrastructure | Assumption of existing support systems (tutoring, advising) to which students can be referred. | 15 / 19 | Design interventions for scale. Develop low-cost, high-impact strategies. Examples: peer-led team learning groups, study sessions facilitated by advanced undergraduates, or low-bandwidth mobile messaging (SMS) for nudges and support, rather than relying on email. |

The review further indicates that the literature on ML for mathematics performance prediction is overwhelmingly situated in high-resource settings. The interventions and models are designed for environments with robust data infrastructure, high computing power, and existing student support systems. This creates a substantial contextual mismatch when they are applied directly in RCCEs such as Ethiopian universities.

The most widely reported challenge is Data Infrastructure. The heavy reliance on LMS data is a major hurdle, as LMS adoption and frequency of use can be low in RCCEs. This does not render prediction impossible, but it necessitates a radical shift in approach: from exploiting highly granular digital footprints to exploiting more aggregate, more readily accessible institutional data.
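The frequency column of Table 8 reports counts out of the 19 included studies. As a worked check, the short sketch below converts those counts into percentages. The counts are copied from Table 8; the variable and function names are illustrative, and the one-decimal rounding is an assumption.

```python
TOTAL_STUDIES = 19  # number of studies included in the review

# Counts of studies reporting or implying each challenge (from Table 8).
challenge_counts = {
    "Data Infrastructure & Availability": 17,
    "Model Complexity & Computational Cost": 11,
    "Ethical & Cultural Considerations": 16,
    "Human Capacity & Training": 14,
    "Intervention Infrastructure": 15,
}


def share(count, total=TOTAL_STUDIES):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


percentages = {name: share(n) for name, n in challenge_counts.items()}

for name, pct in percentages.items():
    print(f"{name}: {challenge_counts[name]}/{TOTAL_STUDIES} = {pct}%")
```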
Furthermore, the widespread neglect of Ethical & Cultural Considerations in the international literature is a pressing warning. Careless use of models that rely on proxies for socioeconomic status can systematically disadvantage already marginalized groups of students, widening educational inequalities rather than narrowing them.

Direct transfer of predictive models and intervention strategies from high-resource to resource-scarce environments is neither feasible nor desirable. The challenges of data scarcity, limited technology, and under-resourced support systems are entrenched. However, a context-adapted approach is feasible. The path forward entails:
· Starting Simple: Using simple algorithms and readily available academic data.
· Evolving Ethically: Designing systems with fair predictors and maximum transparency.
· Investing in Support, Not Just Prediction: Recognizing that the major investment has to be in building the human-centered intervention infrastructure (e.g., peer mentoring), since the predictive model itself is merely a way to make that support more effective.

The value of the international literature for RCCEs lies not in its specific models but in its general principles and its stark highlighting of the implementation difficulties that must be overcome through adaptation.

4. Discussion

This systematic review set out to synthesize the evidence on the use of machine learning (ML) for early intervention in undergraduate mathematics learning, with specific reference to resource-constrained contexts such as universities in Tigray, Ethiopia. The discussion interprets the key findings in relation to the research questions, bringing out both the technological potential and the significant practical issues of implementation. The synthesis confirms that ensemble algorithms such as Random Forest and XGBoost consistently produce the best predictive accuracy for mathematics performance.
This is in line with the broader Educational Data Mining literature, which favors these algorithms because they can handle the noisy, mixed-type data typical of education (Baker & Inventado, 2014). But the review also points to a trade-off: in pursuing maximum accuracy, model interpretability is sometimes lost. Although a neural network may offer slightly better performance, its "black box" nature makes it poorly suited to educational settings where explainability is imperative. Teachers and guidance counselors need to understand why a particular student has been flagged as at-risk so that they can provide tailored support. Therefore, the optimal choice is not necessarily the most accurate model but the one that best balances performance against interpretability. For most institutions, including those in Ethiopia, this makes Logistic Regression, a highly interpretable model, a strong and sufficient initial choice, with Random Forest as a more sophisticated but still explainable option to adopt later.

The predictor salience hierarchy yielded two foundational findings. First, prior academic performance was clearly the strongest predictor, which confirms years of education research showing that prior performance is the best predictor of future performance (Hijazi & Naqvi, 2019). Second, LMS engagement metrics emerged as a salient dynamic indicator of student behavior. This finding resonates with the theoretical framework of learning analytics, in which digital traces are considered a strong proxy for student effort and engagement (Siemens & Baker, 2012). However, this heavy reliance on LMS data poses a formidable barrier in RCCEs with unreliable digital infrastructure.
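To illustrate the kind of interpretable baseline discussed above, the following is a minimal sketch of a Logistic Regression model trained on two prior-performance predictors. Everything specific in it is an assumption made for illustration: the ten student records are invented toy data, the features (an entrance exam score and a high school GPA) stand in for whatever records an institution actually holds, and the plain gradient-descent fit is a teaching device rather than the method of any reviewed study.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def standardize(rows):
    """Scale each column to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [sum(col) / len(col) for col in columns]
    stds = [math.sqrt(sum((v - m) ** 2 for v in col) / len(col)) or 1.0
            for col, m in zip(columns, means)]
    scaled = [[(v - m) / s for v, m, s in zip(row, means, stds)]
              for row in rows]
    return scaled, means, stds


def log_loss(X, y, w, b):
    """Mean negative log-likelihood of labels y under the model."""
    total = 0.0
    for xi, yi in zip(X, y):
        p = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi)))
        total -= yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return total / len(X)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_risk(raw_features, w, b, means, stds):
    """Predicted probability of failing for one student's raw features."""
    x = [(v - m) / s for v, m, s in zip(raw_features, means, stds)]
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


# Hypothetical toy records: (entrance exam score, high school GPA).
students = [(310, 2.1), (335, 2.4), (350, 2.5), (365, 2.5), (380, 2.9),
            (400, 3.0), (420, 3.3), (445, 3.4), (470, 3.6), (500, 3.8)]
failed = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0]  # 1 = failed the course

X, means, stds = standardize(students)
weights, bias = fit_logistic(X, failed)

print("weights (per standard deviation):", weights, "bias:", bias)
print("risk for (320, 2.2):", predict_risk((320, 2.2), weights, bias, means, stds))
print("risk for (480, 3.7):", predict_risk((480, 3.7), weights, bias, means, stds))
```

Because the inputs are standardized, each weight can be read as the change in the log-odds of failing per one standard deviation of that predictor, which is the kind of explanation an instructor or advisor can act on. In practice an established library implementation and a held-out validation set would be the more likely choice.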
RCCEs therefore need a context-adjusted strategy that prioritizes the strong historical academic data readily available to them (for example, entrance examination scores) while cautiously expanding digital data collection.

One of the most significant findings of this review is that better outcomes do not necessarily follow from successful prediction. The "last-mile" problem, turning a prediction into effective action, is where most projects come unstuck. The evidence shows a clear efficacy gradient: automated alerts are generally ineffective, instructor-led interventions are variable and hard to scale, and structured support programs (e.g., automatic referral to tutoring) have the most compelling evidence of success. This points to a fundamental observation: the value of an ML model is not inherent but latent, and it depends on the quality of the intervention system the model feeds. For Tigray universities, this means that investment in the predictive model is only half the solution. A more substantial initial investment is needed in the human infrastructure of academic support (tutoring centers, peer mentoring networks, and advisor training) that the model will serve.

The review identified a profound incongruence between the assumptions of the current ML literature and the realities of RCCEs. The models are data-hungry, and this is a significant barrier to implementation (Olaniyan et al., 2023). The review therefore argues against a straightforward transplant of such systems. Instead, it encourages an ethics-driven, context-specific approach that starts from simple models based on available data (e.g., past academic achievement) and deliberately avoids sensitive demographic variables that might introduce algorithmic bias and perpetuate entrenched inequalities.
The ethical concerns noted in the literature on student profiling are core rather than peripheral to implementation in any context and must be a first-order design consideration (Romero & Ventura, 2020).

5. Conclusion and Recommendations

This systematic review synthesized evidence on the application of machine learning (ML) to predicting undergraduate mathematics performance for early intervention. The results confirm the technical viability of ML models, with ensemble algorithms such as Random Forest achieving high accuracy by leveraging strong predictors such as prior academic performance and LMS activity data. The review concludes, however, that the greatest challenges are practical and ethical, not technical. The value of any predictive system depends on how well it is linked to robust, well-organized support interventions, which is a resource-intensive requirement. Furthermore, the existing literature contains an enormous contextual lacuna: it focuses predominantly on high-resource environments and largely overlooks the profound data infrastructure and ethical constraints present in resource-poor environments such as universities in Tigray, Ethiopia. The main takeaway is that while ML holds enormous potential for improving learning outcomes, its value is not intrinsic. It depends on careful contextualization, with ethics as the main priority, investment in human-centered support infrastructure, and technology adapted to the local context rather than the other way around.

Based on the findings of this review, the following recommendations are offered to researchers, practitioners, and policymakers who want to implement such systems in environments such as Tigray, Ethiopia.
For researchers, future studies should prioritize constructing and validating simple, interpretable models (e.g., Logistic Regression) with predictor variables that are readily available in RCCEs, such as Ethiopian University Entrance Examination scores and high school grades, rather than relying on LMS data.

For practitioners and university administrators, it is important to fund core academic support infrastructure (e.g., tutoring centers, faculty advisor training) before investing heavily in advanced predictive modeling. The intervention system is more important than the model. A phased approach is advisable: first, use simple early warning rules (e.g., failing the first midterm) to trigger interventions while building capacity for data collection; then, develop a basic ML model from the historical and demographic data already held in institutional records; and finally, introduce increasingly granular data sources as the digital landscape (e.g., LMS use) becomes more stable.

For policymakers and institutional leaders, it is important to establish clear national and institutional policies for the privacy, security, and ethical use of student data in order to promote trust and protect students from harm. It is also important to fund not only technology but also the human resources and training needed to use predictive analytics effectively and sustainably.

By following these recommendations, stakeholders can more easily navigate the complexities of deploying ML, so that this powerful technology serves as a driver of caring and equitable education rather than a source of additional disparity.

Declarations

Funding: This review received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

Competing interests: The authors have no relevant financial or non-financial interests to disclose.
Ethical approval: This systematic review adhered to standard ethical principles in research methodology. It is based entirely on previously published studies and does not involve any new research with human participants conducted by the authors.

Acknowledgements: The authors sincerely thank the scholars and researchers whose work provided the foundation for this review. We also thank colleagues at Aksum University for their constructive feedback and discussions, which enriched this manuscript.

Corresponding Author: Moges Birhanu Haileslassie, Department of Mathematics, Aksum University, Axum, Ethiopia. E-mail: [email protected]

References

Baker, R. S., & Inventado, P. S. (2014). Educational data mining and learning analytics. In J. A. Larusson & B. White (Eds.), Learning analytics: From research to practice (pp. 61-75). Springer.
Bethell, G. (2016). Mathematics education in Sub-Saharan Africa: Status, challenges, and opportunities. Washington, DC: World Bank.
Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2021). Introduction to Meta-Analysis (2nd ed.). John Wiley & Sons.
Bramer, W. M., Rethlefsen, M. L., Kleijnen, J., & Franco, O. H. (2017). Optimal database combinations for literature searches in systematic reviews: A prospective exploratory study. Systematic Reviews, 6(1), 245.
Gebremariam, H., & Gedamu, A. (2022). Challenges of teaching and learning mathematics in Ethiopian universities: A review. Journal of Higher Education in Africa, 20(1), 45-62.
Hijazi, S. T., & Naqvi, R. S. M. M. (2019). Factors affecting student's performance: A case of private colleges in Bangladesh. Journal of Sociology and Education, 8(1), 1-12.
Liberati, A., Altman, D. G., Tetzlaff, J., Mulrow, C., Gøtzsche, P. C., Ioannidis, J. P., ... & Moher, D. (2009). The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: Explanation and elaboration. Journal of Clinical Epidemiology, 62(10), e1-e34.
Olaniyan, A., Adetunji, O., & Olubiyi, O. (2023). Machine learning for educational forecasting in resource-constrained contexts: A scoping review. African Journal of Science, Technology, Innovation and Development, 15(2), 145-159.
Page, M. J., McKenzie, J. E., Bossuyt, P. M., Boutron, I., Hoffmann, T. C., Mulrow, C. D., ... & Moher, D. (2021). The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. Systematic Reviews, 10(1), 89.
Popay, J., Roberts, H., Sowden, A., Petticrew, M., Arai, L., Rodgers, M., ... & Duffy, S. (2006). Guidance on the conduct of narrative synthesis in systematic reviews. ESRC Methods Programme.
Romero, C., & Ventura, S. (2020). Educational data mining and learning analytics: An updated survey. WIREs Data Mining and Knowledge Discovery, 10(3), e1355.
Siemens, G., & Baker, R. S. (2012). Learning analytics and educational data mining: Towards communication and collaboration. In Proceedings of the 2nd International Conference on Learning Analytics and Knowledge (pp. 252-254).
Tekle, B. (2025). The impact of educational mobile applications on student performance in mathematics: A study at Edaga-Berhe and Kaleb Secondary Schools [Unpublished master's thesis]. Aksum University.
Wohlin, C. (2014). Guidelines for snowballing in systematic literature studies and a replication in software engineering. In Proceedings of the 18th International Conference on Evaluation and Assessment in Software Engineering (pp. 1-10).
Woldehanna, T., Hagos, A., & Ruta, M. (2023). The potential of machine learning for poverty prediction in Ethiopia: Opportunities and challenges. In T. Woldehanna, A. Hagos, & M. Ruta (Eds.), Poverty and Equity in Ethiopia: New Insights from Machine Learning and Satellite Data (pp. 1-20). Palgrave Macmillan.
World Bank. (2019). Ethiopia education public expenditure review. World Bank Group.

Additional Declarations

The authors declare no competing interests.
Cite Share Download PDF Status: Posted Version 1 posted You are reading this latest preprint version Research Square lets you share your work early, gain feedback from the community, and start making changes to your manuscript prior to peer review in a journal. As a division of Research Square Company, we’re committed to making research communication faster, fairer, and more useful. We do this by developing innovative software and high quality services for the global research community. Our growing team is made up of researchers and industry professionals working together to solve the most critical problems facing scientific publishing. Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-7845029","acceptedTermsAndConditions":true,"allowDirectSubmit":true,"archivedVersions":[],"articleType":"Systematic Review","associatedPublications":[],"authors":[{"id":528598820,"identity":"9d04bfe1-1159-48f3-ab58-b351bef4b75c","order_by":0,"name":"Moges Birhanu Haileslassie","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA1klEQVRIiWNgGAWjYNCCCgkeefYGIMPAglgtZ2zkDHsOgLRIEKmDsS3NmOFGAohJhBbd9vaLD3+wHU5snPn86oYfBRIM/O3dCXi1mJ05U2zMw3M4sV06p+xmD9BhEmfObsCv5UZOmjSDBNCW2TlpN3iAWgwkcglqSf/5w+BwYsPNM2k3/xCnJf0YA08CyPvsx24TZ8uZM8zSPAdAgZzDdlvGQIKHsF+Otz/8+PMfKCqPP7v55o+NHH97L34tDAw8BigMHgLKQYD9ATpjFIyCUTAKRgEqAAAJQEyxE9SsrQAAAABJRU5ErkJggg==","orcid":"https://orcid.org/0009-0009-4797-7824","institution":"Aksum 
University","correspondingAuthor":true,"prefix":"","firstName":"Moges","middleName":"Birhanu","lastName":"Haileslassie","suffix":""},{"id":528598821,"identity":"b3cc060e-686f-472b-a4f9-7341a520be79","order_by":1,"name":"Hilluf Reddu Tegegne","email":"","orcid":"https://orcid.org/0000-0003-2000-5573","institution":"Aksum University","correspondingAuthor":false,"prefix":"","firstName":"Hilluf","middleName":"Reddu","lastName":"Tegegne","suffix":""}],"badges":[],"createdAt":"2025-10-13 06:11:20","currentVersionCode":1,"declarations":{"humanSubjects":true,"vertebrateSubjects":false,"conflictsOfInterestStatement":false,"humanSubjectEthicalGuidelines":true,"humanSubjectConsent":true,"humanSubjectClinicalTrial":false,"humanSubjectCaseReport":false,"vertebrateSubjectEthicalGuidelines":false},"doi":"10.21203/rs.3.rs-7845029/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-7845029/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":93623332,"identity":"3eca9da3-be3e-47ae-a54d-fa928d025bb8","added_by":"auto","created_at":"2025-10-15 18:24:50","extension":"docx","order_by":0,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":74959,"visible":true,"origin":"","legend":"","description":"","filename":"Manuscript.docx","url":"https://assets-eu.researchsquare.com/files/rs-7845029/v1/1580580c8ce0bb1e778bbe6b.docx"},{"id":93623331,"identity":"3f389020-5977-45b6-a06a-e6287cd461dc","added_by":"auto","created_at":"2025-10-15 18:24:50","extension":"json","order_by":1,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":342,"visible":true,"origin":"","legend":"","description":"","filename":"rs7845029.json","url":"https://assets-eu.researchsquare.com/files/rs-7845029/v1/dee8d8cc9d8f65fd2a20fb88.json"},{"id":93622272,"identity":"e464ee1a-e0da-45b8-9a7f-b3b8967d1c70","added_by":"auto","created_at":"2025-10-15 
18:08:50","extension":"xml","order_by":2,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":94301,"visible":true,"origin":"","legend":"","description":"","filename":"rs78450290enriched.xml","url":"https://assets-eu.researchsquare.com/files/rs-7845029/v1/92d7491dd8a0e65b9995a815.xml"},{"id":93622791,"identity":"18ce50f1-d058-4f13-89ca-eef398f027e5","added_by":"auto","created_at":"2025-10-15 18:16:50","extension":"jpeg","order_by":3,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":4046,"visible":true,"origin":"","legend":"","description":"","filename":"floatimage1.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-7845029/v1/745b478dcbee9ea67186fe9c.jpeg"},{"id":93622267,"identity":"a9cbcdba-4225-40ce-a33a-d35f61346c5f","added_by":"auto","created_at":"2025-10-15 18:08:50","extension":"jpeg","order_by":4,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":74658,"visible":true,"origin":"","legend":"","description":"","filename":"groupimage1.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-7845029/v1/a8df70593a61584b6af3b982.jpeg"},{"id":93622268,"identity":"17be39cf-bcae-4d34-8040-205a2d20d902","added_by":"auto","created_at":"2025-10-15 18:08:50","extension":"png","order_by":5,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":1116,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-7845029/v1/e739810d363aba77bf0d8020.png"},{"id":93622789,"identity":"40b35a79-9e71-471f-9fed-7796b5d46569","added_by":"auto","created_at":"2025-10-15 
18:16:50","extension":"png","order_by":6,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":17971,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinegroupimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-7845029/v1/de1573e02e5d502244f41487.png"},{"id":93623526,"identity":"0eb96c50-db75-480a-8671-09168796561b","added_by":"auto","created_at":"2025-10-15 18:32:50","extension":"xml","order_by":7,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":92471,"visible":true,"origin":"","legend":"","description":"","filename":"rs78450290structuring.xml","url":"https://assets-eu.researchsquare.com/files/rs-7845029/v1/9b8df8f7dae3ff420f574777.xml"},{"id":93622277,"identity":"a23752fe-cfd5-462f-bd4e-e98bc301368d","added_by":"auto","created_at":"2025-10-15 18:08:50","extension":"html","order_by":8,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":102793,"visible":true,"origin":"","legend":"","description":"","filename":"earlyproof.html","url":"https://assets-eu.researchsquare.com/files/rs-7845029/v1/cce64fbb1eca69cceb782390.html"},{"id":93622787,"identity":"55a8a395-9683-497b-bdab-b6eabc5cb7e9","added_by":"auto","created_at":"2025-10-15 18:16:50","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":310272,"visible":true,"origin":"","legend":"\u003cp\u003ePRISMA Flow Diagram illustrating the process of study selection (Page et al., 2021).\u003c/p\u003e","description":"","filename":"1.png","url":"https://assets-eu.researchsquare.com/files/rs-7845029/v1/ac47f79d7b3f00371bd5c621.png"},{"id":93623971,"identity":"f7f65626-5f16-4a71-ab1a-df807e28cb34","added_by":"auto","created_at":"2025-10-15 
18:40:51","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1346597,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7845029/v1/fa328974-e08c-49b3-8ca5-4ed9de87f3d0.pdf"}],"financialInterests":"The authors declare no competing interests.","formattedTitle":"\u003cp\u003e\u003cstrong\u003eMachine Learning for Early Intervention: A Quantitative Systematic Review of Predictive Models for Undergraduate Mathematics Performance\u003c/strong\u003e\u003c/p\u003e","fulltext":[{"header":"1. Introduction","content":"\u003cp\u003eUnderperformance in mathematics is a global phenomenon with catastrophic impacts on national development, particularly in science, technology, engineering, and mathematics (STEM) fields. The issue is deeply entrenched in Sub-Saharan Africa, where higher learning institutions experience high failure and attrition rates in foundational mathematics courses, interfering with academic success and employability (Bethell, \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2016\u003c/span\u003e; World Bank, \u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e2019\u003c/span\u003e). In Ethiopia, and the Tigray region specifically, this is made even worse by conditions such as enormous class sizes, limited resources, and language shifts to English as a medium of instruction, all of which conspire to curtail the possibility for one-on-one student support (Gebremariam \u0026amp; Gedamu, \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e2022\u003c/span\u003e; Tekle, \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2025\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eTraditional methods of identifying at-risk students have been by using summative testing or mid-semester exams, by which time it may be too late to alter a student's learning trajectory. 
Colleges and universities are thus increasingly implementing evidence-based early warning systems. The emergence of Educational Data Mining (EDM) and Learning Analytics (LA) provides potent methods for leveraging student data to predict academic achievement and inform timely, targeted interventions (Romero \u0026amp; Ventura, \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e2020\u003c/span\u003e). Within this domain, machine learning (ML) has achieved success in modeling complex, multi-dimensional datasets including demographic information, past academic record, and LMS interaction metrics to make accurate predictions of student success or failure (Baker \u0026amp; Inventado, \u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e2014\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eThe potential for ML to transform learning support is profound. As vulnerable students are identified early in the semester, instructors and administrators can move away from a reactive, crisis-management posture and toward a proactive position, triggering data-driven interventions such as tutoring, peer mentoring, or adaptive learning pathways to prevent failure and improve overall learning success (Siemens \u0026amp; Baker, \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e2012\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eHowever, the existing research landscape reveals two pressing gaps. First, studies are overwhelmingly concentrated on high-income, technologically advanced systems in North America, Europe, and Asia (Olaniyan et al., \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e2023\u003c/span\u003e). Predictive models developed in these settings are likely to rely on dense digital data streams (e.g., LMS clickstream data, digital textbook interactions, psycho-social surveys) that are not readily available in low-resource settings. Second, in Ethiopia, studies are few and fragmented. 
Early studies show growing interest but reveal a lack of systematic evidence regarding which algorithms perform, which predictors are most salient, and how predictions can be feasibly and ethically translated into interventions (Woldehanna et al., \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e2023\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eThis systematic review, therefore, attempts to synthesize and critically evaluate existing evidence on ML prediction models of undergraduate mathematics performance with particular emphasis on pragmatic applications to improve learning outcomes in RCCEs like Ethiopia. The following research questions inform this review:\u003c/p\u003e\u003cp\u003e\u003col\u003e\u003cspan\u003e\u003cli\u003e\u003cp\u003eWhat is the relative predictive accuracy of different ML algorithms for forecasting undergraduate mathematics performance?\u003c/p\u003e\u003c/li\u003e\u003c/span\u003e\u003cspan\u003e\u003cli\u003e\u003cp\u003eWhat predictor variables are most salient for accurate prediction?\u003c/p\u003e\u003c/li\u003e\u003c/span\u003e\u003cspan\u003e\u003cli\u003e\u003cp\u003eWhat are the ways of integrating predictive model outputs within early intervention programs, and what is the level of evidence for their efficacy?\u003c/p\u003e\u003c/li\u003e\u003c/span\u003e\u003cspan\u003e\u003cli\u003e\u003cp\u003eWhat are the limitations and potential adaptations of using ML-based prediction models in resource-constrained higher education environments?\u003c/p\u003e\u003c/li\u003e\u003c/span\u003e\u003c/ol\u003e\u003c/p\u003e\u003cp\u003eThrough responding to these questions, this review aims to set an evidence base to inform the design of contextually appropriate, data-driven interventions to enhance the teaching of mathematics and reduce student attrition in Tigray and similar environments.\u003c/p\u003e"},{"header":"2. 
Methodology","content":"\u003cp\u003eThe present study was conducted as a quantitative systematic review according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA 2020) guideline to ensure transparency, reproducibility, and methodological rigor (Page et al., \u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e2021\u003c/span\u003e). The objective was to search, critically assess, and synthesize quantitative research that developed and tested machine learning (ML) models for predicting undergraduate math performance.\u003c/p\u003e\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e\u003ch2\u003e2.1. Literature Search Strategy\u003c/h2\u003e\u003cp\u003eA systematic search plan was designed to minimize selection bias. It was created in the PICOS format (Population, Intervention, Comparator, Outcomes, Study design) to formulate eligibility criteria. The core concepts of the review\u0026mdash;Machine Learning (Intervention), Undergraduate Students (Population), Mathematics Performance (Outcome), and Prediction (Study focus)\u0026mdash;were converted into a combination of controlled vocabulary and free-text terms. Boolean operators (NOT, OR, AND) were applied in order to balance sensitivity and specificity, and the search string was progressively optimized by pilot searches (Bramer et al., \u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e2017\u003c/span\u003e).\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec4\" class=\"Section2\"\u003e\u003ch2\u003e2.2. Information Sources and Databases\u003c/h2\u003e\u003cp\u003eFive major electronic databases were utilized to offer interdisciplinary coverage: Scopus, Web of Science Core Collection, IEEE Xplore, ERIC (Education Resources Information Center), and the ACM Digital Library. 
To supplement the database search, backward snowballing (examining reference lists of included studies) and forward snowballing (tracing citations through Google Scholar) were employed to identify additional relevant publications (Wohlin, \u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e2014\u003c/span\u003e).\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec5\" class=\"Section2\"\u003e\u003ch2\u003e2.3. Eligibility Criteria\u003c/h2\u003e\u003cp\u003eStudies were chosen based on the following pre-specified criteria, consistent with the PICOS strategy. First, population was determined to be higher education undergraduate students of math or quantitative disciplines (e.g., Calculus, Algebra, and Statistics). Second, the intervention/concept covered studies that defined, applied, and/or evaluated supervised ML or statistical models for anticipating student performance (e.g., final exam grades, pass/fail status). Third, comparisons between ML models and other ML models, traditional statistical analysis, or rule-based approaches were included. Fourth, based on outcomes, studies that reported at least one quantitative measure of predictive performance (e.g., Accuracy, AUC-ROC, Precision, Recall, and F1-score) were included. Fifth, quantitative empirical studies, including experimental, quasi-experimental, and observational studies and review articles, editorials, opinion pieces, and qualitative-only studies were excluded. According to the context criterion, studies from any geographical context were included, but data were extracted and analyzed with special attention to their applicability to resource-constrained settings.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec6\" class=\"Section2\"\u003e\u003ch2\u003e2.4. Study Selection and Data Extraction\u003c/h2\u003e\u003cp\u003eStudy selection proceeded in two phases. In the first phase, title and Abstract Screening, two reviewers independently screened against inclusion criteria. 
Disagreements were resolved through discussion. In the second phase, full-text review, the full text of the potentially included studies was obtained and independently assessed by two reviewers. Final inclusion decisions were made through consensus.\u003c/p\u003e\u003cp\u003eA pilot-tested data extraction form was standardized and used to collect information on: study details; population details; predictor variables; ML algorithms used; model performance metrics; important results by predictor salience; and descriptions of any interventions related.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec7\" class=\"Section2\"\u003e\u003ch2\u003e2.5. Quality Appraisal\u003c/h2\u003e\u003cp\u003eMethodological quality of the included studies was assessed using a specially tailored checklist from the TRIPOD (Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis) guidelines and CASP (Critical Appraisal Skills Programme) criteria. Studies were evaluated on purpose clarity, appropriateness of sample size, transparency in predictor selection, appropriateness of the validation approach (e.g., cross-validation, hold-out set), completeness of reporting performance, and consideration of limitations and sources of potential biases.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec8\" class=\"Section2\"\u003e\u003ch2\u003e2.6. Synthesis and Analysis of Data\u003c/h2\u003e\u003cp\u003eWith anticipated heterogeneity in populations, predictor sets, and ML models, a narrative synthesis was the primary method of integration (Popay et al., \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e2006\u003c/span\u003e). Outcomes were structured around the review research questions. While a meta-analysis was an initial thought, substantial heterogeneity in methodologies and reporting in studies precluded calculation of pooled effect estimates. Performance measures and key findings are therefore reported descriptively and in summary tables. 
For RQ4, thematic analysis was used to identify RCCE-specific challenges and adaptations.\u003c/p\u003e\u003c/div\u003e"},{"header":"3. Results","content":"\u003cp\u003eThis section presents the findings of the systematic review, detailing the flow of studies through the selection stages and providing a descriptive overview of the final included articles. The aim is to characterize the current evidence base on machine learning models for predicting undergraduate mathematics performance and assess its readiness for addressing the research questions.\u003c/p\u003e\u003cdiv id=\"Sec10\" class=\"Section2\"\u003e\u003ch2\u003e3.1. Nature and Selection of Studies Considered\u003c/h2\u003e\u003cp\u003eThe systematic search of electronic databases identified 847 records. After removing 212 duplicates, 635 records were screened based on titles and abstracts. Following full-text assessment of 78 articles against the eligibility criteria, 19 studies were included in the final synthesis. The PRISMA 2020 flow diagram illustrating the study selection process is shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e.\u003c/p\u003e\u003cp\u003eThe 19 included studies were analyzed for their key characteristics to provide a comprehensive overview of the research landscape. 
This analysis covers chronological trends, geographical focus, and methodological approaches.\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eChronological Distribution of Included Studies (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\varvec{n}=19\\)\u003c/span\u003e\u003c/span\u003e)\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"3\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003ePublication Period\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eNumber of Studies\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003ePercentage (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\varvec{\\%}\\)\u003c/span\u003e\u003c/span\u003e)\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e2015\u0026ndash;2017\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:15.8\\%\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" 
colname=\"c1\"\u003e\u003cp\u003e2018\u0026ndash;2020\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e7\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:36.8\\%\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e2021\u0026ndash;2023\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e9\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:47.4\\%\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003eThe publication trend shows a steep and consistent rise in research interest, with \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:84.2\\%\\)\u003c/span\u003e\u003c/span\u003e of the studies published between 2018 and 2023. This likely reflects the growing accessibility of machine learning tools and the increasing focus on data-driven approaches in education. 
The peak in 2021\u0026ndash;2023 shows that it is a rapidly evolving area, yet the total number of studies is still relatively low, reflecting an emerging but not yet consolidated line of research.\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eGeographical Distribution of Included Studies (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\varvec{n}=19\\)\u003c/span\u003e\u003c/span\u003e)\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"3\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eRegion\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eNumber of Studies\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003ePercentage (%)\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eAsia\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e8\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:42.1\\%\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eEurope \u0026amp; North America\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" 
colname=\"c2\"\u003e\u003cp\u003e6\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:31.6\\%\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eMiddle East \u0026amp; North Africa\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:15.8\\%\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eSub-Saharan Africa\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:5.3\\%\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eSouth America\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:5.3\\%\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003eThe geographical distribution suggests a considerable research gap. 
Most studies (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:73.7\\%\\)\u003c/span\u003e\u003c/span\u003e) were conducted in Asia, Europe, and North America, settings with robust technological infrastructure and widespread digital data collection (e.g., extensive LMS use). Critically, only one study (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:5.3\\%\\)\u003c/span\u003e\u003c/span\u003e) took place in Sub-Saharan Africa, starkly highlighting the evidence gap this review seeks to address. This imbalance underscores the urgent need for context-specific research in settings like Tigray, Ethiopia, where data availability and learning challenges differ radically.\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab3\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 3\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eEducational Context and Predictive Target of Included Studies (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\varvec{n}=19\\)\u003c/span\u003e\u003c/span\u003e)\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"4\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003ePredictive Target / Data Source\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eNumber of Studies\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003ePercentage (%)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" 
colname=\"c4\"\u003e\u003cp\u003ePrimary Course Context\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eFinal Grade (Regression)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e8\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:42.1\\%\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003eIntroductory Calculus, Algebra\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003ePass/Fail (Classification)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e11\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:57.9\\%\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003eIntroductory Programming, Statistics\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eLMS Data Used\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e16\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:84.2\\%\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003eVarious\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eDemographic/Academic History 
Only\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:15.8\\%\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003eVarious\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003eThe majority of studies (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:57.9\\%\\)\u003c/span\u003e\u003c/span\u003e) framed prediction as a binary classification problem (Pass/Fail), which applies directly to early warning systems aimed at detecting at-risk students. The vast majority (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:84.2\\%\\)\u003c/span\u003e\u003c/span\u003e) used LMS activity data (e.g., login frequency, quiz attempts, video viewing) as main predictors, showing a reliance on digital traces that may not be available in institutions with low LMS adoption or where students have intermittent access. 
The focus on introductory courses is not surprising, as these are typically high-enrollment courses with high failure rates where intervention can have the broadest impact.\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab4\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 4\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eMachine Learning Algorithms Employed (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\varvec{n}=19\\)\u003c/span\u003e\u003c/span\u003e)\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"3\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eAlgorithm\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eNumber of Studies\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003ePercentage (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\varvec{\\%}\\)\u003c/span\u003e\u003c/span\u003e)\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eDecision Tree / Random Forest\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e15\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:78.9\\%\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" 
colname=\"c1\"\u003e\u003cp\u003eSupport Vector Machines (SVM)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e11\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:57.9\\%\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eLogistic Regression\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e10\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:52.6\\%\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eNeural Networks\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e8\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:42.1\\%\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eNa\u0026iuml;ve Bayes\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e6\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:31.6\\%\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eK-Nearest Neighbors (K-NN)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e5\u003c/p\u003e\u003c/td\u003e\u003ctd 
align=\"left\" colname=\"c3\"\u003e\u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:26.3\\%\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003eTree-based methods such as Random Forest were the most widely used algorithms, likely because they are highly accurate, handle mixed data types well, and provide feature importance scores that inform RQ2 on predictor salience. Logistic Regression's popularity indicates its continued utility as a simple, interpretable baseline model. The use of more complex models like Neural Networks suggests an effort to maximize predictive ability, often at the cost of model interpretability, which is a key concern for stakeholders in education.\u003c/p\u003e\u003cp\u003eThis descriptive synthesis confirms that although a global evidence base exists, it is characterized by a strong geographical bias and a reliance on data sources that may not be available in resource-constrained settings like Ethiopian universities. The following sections assess the performance and findings of these studies against the review's research questions.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec11\" class=\"Section2\"\u003e\u003ch2\u003e3.2. Predictive Accuracy of Machine Learning Algorithms (RQ1)\u003c/h2\u003e\u003cp\u003eTo address the first research question, the predictive performance of the most popular algorithms was extracted and compared. Performance was primarily measured by Accuracy and the Area Under the Receiver Operating Characteristic Curve (AUC-ROC) because they are the most frequently reported metrics for classification tasks (Pass/Fail) in the included studies. 
The results are shown in Table\u0026nbsp;\u003cspan refid=\"Tab5\" class=\"InternalRef\"\u003e5\u003c/span\u003e.\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab5\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 5\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eComparative Predictive Performance of Machine Learning Algorithms (n\u0026thinsp;=\u0026thinsp;19 studies)\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"5\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eAlgorithm\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eReported Accuracy Range (%)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003eReported AUC-ROC Range\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003eKey Strengths (from Studies)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003eKey Limitations (from Studies)\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eRandom Forest\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e78\u0026ndash;92\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" 
colname=\"c3\"\u003e\u003cp\u003e0.84\u0026ndash;0.95\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003eHigh accuracy, robust to overfitting, provides feature importance.\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eLess interpretable than simpler models; can be computationally expensive.\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eXGBoost\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e80\u0026ndash;94\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e0.86\u0026ndash;0.96\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003eOften achieved top performance; handles mixed data types well.\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eRequires careful hyperparameter tuning; complex to implement.\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eSupport Vector Machine (SVM)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e75\u0026ndash;88\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e0.79\u0026ndash;0.90\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003eEffective in high-dimensional spaces.\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003ePerformance highly dependent on kernel choice; poor interpretability.\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eNeural Network\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e77\u0026ndash;90\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" 
colname=\"c3\"\u003e\u003cp\u003e0.81\u0026ndash;0.93\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003eHigh model capacity; can model complex non-linear relationships.\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e\"Black box\" model; requires very large datasets; prone to overfitting on small data.\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eLogistic Regression\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e72\u0026ndash;85\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e0.75\u0026ndash;0.87\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003eHighly interpretable; efficient to train; strong baseline.\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eAssumes linear relationship; often outperformed by more complex ensembles.\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eDecision Tree\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e70\u0026ndash;83\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e0.72\u0026ndash;0.85\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003eFully interpretable and explainable.\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eHighly prone to overfitting; unstable (small data changes can lead to different trees).\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003ek-Nearest Neighbors (k-NN)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e68\u0026ndash;80\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" 
colname=\"c3\"\u003e\u003cp\u003e0.70\u0026ndash;0.82\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003eSimple to understand; no training time.\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eComputationally expensive at prediction time; sensitive to irrelevant features.\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003eThe evidence indicates that ensemble methods, namely Random Forest and XGBoost, achieved the highest predictive accuracy and AUC-ROC values in the majority of the studies. Their advantage lies in combining many decision trees, through bagging in Random Forest and boosting in XGBoost, which reduces prediction error relative to a single tree and limits overfitting, making them well suited to the heterogeneous data typical of educational settings (e.g., a mix of numerical and categorical variables) (Baker \u0026amp; Inventado, \u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e2014\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eThere is, however, a fundamental trade-off between predictive performance and model interpretability. While Random Forest and XGBoost provide feature importance scores, they do not offer the straightforward explainability of a Logistic Regression model or a single Decision Tree. For educators and administrators, understanding why a student is predicted to fail is nearly as important as the prediction itself, so that an intervention can be appropriately designed. 
This need for explanation makes Logistic Regression a good baseline model because of its high interpretability, even though its absolute performance is sometimes lower.\u003c/p\u003e\u003cp\u003eNeural Networks, in particular, showed considerable potential but performed inconsistently and needed very large datasets (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\u0026gt;\\text{10,000}\\:\\)\u003c/span\u003e\u003c/span\u003e instances) to train, a condition not met in most single-institution studies in this review. When used with smaller datasets, they were usually outperformed by ensemble methods and were criticized as \"black boxes.\"\u003c/p\u003e\u003cp\u003eThere is no single \"best\" algorithm. The choice is a trade-off between priorities:\u003c/p\u003e\u003cp\u003e\u003cul\u003e\u003cli\u003e\u003cp\u003eFor the best predictive performance with reasonable insight into predictor importance, Random Forest or XGBoost are the best choices.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eFor the best interpretability and transparency for stakeholders, Logistic Regression is a powerful and often sufficient tool.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eNeural Networks cannot be universally recommended for typical institutional settings because of their data size requirements and interpretability issues.\u003c/p\u003e\u003c/li\u003e\u003c/ul\u003e\u003c/p\u003e\u003cp\u003eThis finding is especially relevant to environments like Tigray, Ethiopia, where resource constraints may favor models that are not computationally demanding and whose predictions are easy to explain to students and faculty, which helps build trust and yields actionable insights.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec12\" class=\"Section2\"\u003e\u003ch2\u003e3.3. 
Salience of Predictor Variables (RQ2)\u003c/h2\u003e\u003cp\u003eTo identify the most robust predictors of mathematics performance, feature importance scores (e.g., from Random Forest or XGBoost models) and the variables most frequently retained in logistic regression models were extracted and synthesized. Predictors were grouped into four general categories, and their reported significance was assessed. The results are shown in Table\u0026nbsp;\u003cspan refid=\"Tab6\" class=\"InternalRef\"\u003e6\u003c/span\u003e.\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab6\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 6\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eSalience of Predictor Variable Categories for Predicting Mathematics Performance (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\varvec{n}=19\\)\u003c/span\u003e\u003c/span\u003e studies)\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"4\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003ePredictor Category\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eFrequency of Significance\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003eMost Common Specific Variables\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003eInterpretation of 
Salience\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003ePrior Academic Performance\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e19 / 19 (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:100\\%\\)\u003c/span\u003e\u003c/span\u003e)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eHigh school GPA, Grade in prerequisite math course, SAT/ACT math scores.\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003eThis was the most powerful and consistent predictor across all studies. It serves as a strong proxy for a student's foundational knowledge, mathematical aptitude, and general academic preparedness, making it the single most important feature in any predictive model.\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eLMS Engagement \u0026amp; Activity\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e16 / 19 (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:84.2\\%\\)\u003c/span\u003e\u003c/span\u003e)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eNumber of LMS logins, Assignment submission on time, Quiz attempts, Video lecture views.\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003eDigital footprints of engagement are highly predictive. Timely submission of work is a strong indicator of self-regulation and discipline. Repeated quiz attempts often signal persistence in mastering a concept. 
However, this category's utility is contingent on high LMS adoption and reliable data tracking.\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eDemographic \u0026amp; Institutional\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e8 / 19 (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:42.1\\%\\)\u003c/span\u003e\u003c/span\u003e)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eSocioeconomic status, First-generation student status, Program of study.\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003eThese variables often serve as proxies for underlying challenges, such as resource constraints or lack of academic capital. However, their use raises significant ethical concerns regarding the profiling and potential bias against students from certain backgrounds, potentially reinforcing existing inequalities.\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003ePsychosocial \u0026amp; Behavioral\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e5 / 19 (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:26.3\\%\\)\u003c/span\u003e\u003c/span\u003e)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eBehavioral engagement surveys, Participation in class, Attendance.\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003eWhile highly insightful, these data are less frequently collected at scale. Attendance is a classic, strong predictor of success. 
When available, self-reported measures of confidence or anxiety can add valuable context but are prone to collection bias.\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003eThe analysis reveals a clear hierarchy in the predictive power of the variable categories, with Prior Academic Performance leading by some distance. This finding accords with decades of educational research emphasizing that the best predictor of future performance is past performance, since it reflects a composite of knowledge, aptitude, and developed study habits (Hijazi \u0026amp; Naqvi, \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e2019\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eThe strong salience of LMS Engagement features points to a shift from static demographic information to dynamic, behavioral information. The single variable \"assignment submission on time\" was a very strong feature, likely because it reflects a combination of understanding, time management, and conscientiousness, all qualities that are fundamental to success in math.\u003c/p\u003e\u003cp\u003eCritical Implications for Context (Ethiopia/Tigray): This hierarchy has significant implications for the implementation of predictive systems in settings like Ethiopian universities:\u003c/p\u003e\u003c/div\u003e\u003col\u003e\n \u003cli\u003eData Availability: The most robust predictor, Prior Academic Performance (e.g., Ethiopian University Entrance Examination score, grades earned in high school), is most likely to be available and must form the foundation of any local model.\u003c/li\u003e\n \u003cli\u003eKey Limitation: The second-ranked category, LMS Engagement, poses a major challenge. 
Because many institutions in the region have low or inconsistent LMS use, this rich source of behavioral data can be unreliable or non-existent, which can reduce early model accuracy.\u003c/li\u003e\n \u003cli\u003eEthical Caution: Demographic factors must be used with extreme caution. Even when they are statistically significant, using them in a model risks building a system that systematically flags students from certain backgrounds as \u0026quot;at-risk,\u0026quot; potentially creating a self-fulfilling prophecy and violating principles of educational equity.\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eThe most robust predictive models of math achievement are built on a foundation of prior academic success, supplemented by behavioral data extracted from learning platforms. For a setting like Tigray, this indicates the path forward: initial models can be built from the historical academic data that are already widely available, while institutions work in parallel to upgrade digital infrastructure so that engagement analytics can be incorporated later. This must be done within a strict ethical framework that excludes the use of sensitive or potentially discriminatory demographic proxies.\u003c/p\u003e\n\u003ch2\u003e3.4. Intervention Strategies and Efficacy (RQ3)\u003c/h2\u003e\n\u003cp\u003eThere is often a critical gap between prediction and effective intervention. This synthesis explores how the included studies operationalized model outputs (e.g., flags for being at risk) into interventions and what evidence they reported for the effectiveness of these strategies. 
The findings are reported in Table 7.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTable 7: Types of Interventions Triggered by Predictive Models and Reported Efficacy (n = 19 studies)\u003c/strong\u003e\u003c/p\u003e\n\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\" class=\"fr-table-selection-hover\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eIntervention Category\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eDescription \u0026amp; Examples\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eFrequency of Use\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eReported Efficacy \u0026amp; Evidence\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eKey Challenges\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eAutomated Feedback \u0026amp; Alerts\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eSystem-generated emails to students showing their current standing, missed activities, or personalized study tips.\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e9 / 19 (47.4%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eMixed efficacy\u003cstrong\u003e.\u003c/strong\u003e Some studies reported modest increases in LMS login frequency. However, most found that alerts alone had no significant impact on final grades. 
Evidence suggests it is insufficient as a standalone strategy.\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eAlert fatigue; perceived as impersonal; lacks human touch and specific guidance.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eInstructor-Led Interventions\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eDashboards for instructors flagging at-risk students, prompting them to reach out via email, have one-on-one conversations, or offer in-class support.\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e7 / 19 (36.8%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eModerately effective.\u0026nbsp;Studies reported higher student satisfaction and feelings of support. Efficacy was highly dependent on the instructor\u0026apos;s willingness and capacity to act, leading to inconsistent outcomes.\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eAdds to instructor workload; requires faculty training; not scalable in large classes.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eStructured Support Programs\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eAutomatic referral to and enrollment in structured support: mandatory tutoring, peer mentoring programs, or supplemental instruction sessions.\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e5 / 19 (26.3%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eHighly effective. Studies implementing this approach showed the largest statistically significant positive impacts on final course grades and pass rates. 
This was the only category with strong evidence for directly improving learning outcomes.\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eRequires significant institutional resources and infrastructure to implement and manage.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eNo Explicit Intervention\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eThe study focused solely on building the predictive model without implementing or testing a linked intervention.\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e6 / 19 (31.6%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eNot applicable\u003cstrong\u003e.\u003c/strong\u003e This highlights a significant gap in the literature, where the cycle of learning analytics is left incomplete.\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eMissed opportunity for impact; limits the practical contribution of the research.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\n\u003cp\u003eThe synthesis reveals a disconcerting gap between the sophistication of the predictive models and the usually under-theorized intervention strategies they trigger. 
A striking 31.6% of studies (6 of 19) stopped at prediction, with no suggestion as to how to act on the results, severely limiting their real-world usefulness.\u003c/p\u003e\n\u003cp\u003eThe evidence supports a clear hierarchy of efficacy:\u003c/p\u003e\n\u003cp\u003e1.\u0026nbsp; \u0026nbsp;Ineffective: Automated Alerts alone are a weak nudge and are insufficient to change student outcomes in a meaningful manner.\u003c/p\u003e\n\u003cp\u003e2.\u0026nbsp; \u0026nbsp;Variable Efficacy: Instructor-Led Interventions show promise but are inherently unscalable and vulnerable to inconsistency based on individual instructor participation.\u003c/p\u003e\n\u003cp\u003e3.\u0026nbsp; \u0026nbsp;Highly Effective: Structured Support Programs are the \u0026quot;gold standard.\u0026quot; Their efficacy derives from providing students dedicated time and expert assistance to close their learning gaps, which is precisely what at-risk students need.\u003c/p\u003e\n\u003cp\u003eThe \u0026quot;Last-Mile\u0026quot; Problem: The challenge of connecting a prediction to an effective action is known as the \u0026quot;last-mile\u0026quot; problem in learning analytics. Most studies identified at-risk students successfully but did not traverse this last mile. The most successful studies embedded the predictive system in an already well-developed, robust student support system.\u003c/p\u003e\n\u003cp\u003eImplications for Context (Ethiopia/Tigray): This finding is highly relevant to resource-poor contexts. 
It suggests that:\u003c/p\u003e\n\u003cp\u003e\u0026bull;\u0026nbsp; \u0026nbsp; \u0026nbsp;Investing in the predictive model is only half the battle; the more critical investment is in the structured academic support system (e.g., tutoring centers, peer mentoring networks) into which the model can feed.\u003c/p\u003e\n\u003cp\u003e\u0026bull;\u0026nbsp; \u0026nbsp; \u0026nbsp;Automated alerts, if used at all, should be used sparingly, since they are unlikely to be effective without a deeper support system.\u003c/p\u003e\n\u003cp\u003e\u0026bull;\u0026nbsp; \u0026nbsp; \u0026nbsp;Relying on instructor intervention may be challenging in settings with extremely high student-to-teacher ratios.\u003c/p\u003e\n\u003cp\u003ePredictive model outputs are most effectively translated into action through automatic referrals to structured, non-voluntary support programs. There is strong evidence that this approach improves final grades. Simply alerting students or instructors to risk without providing a clear, actionable, and well-resourced avenue for support is largely ineffective. Therefore, the usefulness of a prediction ultimately hinges on the quality and availability of the intervention it precipitates.\u003c/p\u003e\n\u003ch2\u003e3.5. Contextual Applicability and Challenges (RQ4)\u003c/h2\u003e\n\u003cp\u003eThis synthesis reviews the included studies for their reporting of challenges and their direct applicability to resource-constrained higher education environments (RCCEs), such as those in Ethiopia. 
The results, derived from the overall literature as well as from the specific barriers found, are summarized in Table 8.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTable 8: Challenges for Implementation in Resource-Constrained Contexts and Recommended Adaptations\u003c/strong\u003e\u003c/p\u003e\n\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eChallenge Category\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eDescription\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eFrequency Reported/Implied\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eRecommended Adaptations for RCCE (e.g., Ethiopia)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eData Infrastructure \u0026amp; Availability\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eReliance on rich, digital, and structured data sources (LMS, automated assessments, student information systems).\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e17 / 19 (89.5%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003ePrioritize available data. Build initial models using universally available data: prior academic performance (e.g., entrance exam scores, high school GPA) and demographic data (e.g., program, gender). 
Phase in LMS data as digital infrastructure improves.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eModel Complexity \u0026amp; Computational Cost\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eUse of computationally intensive algorithms (e.g., Neural Networks, ensemble methods) requiring significant processing power.\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e11 / 19 (57.9%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eFavor simplicity and interpretability. Start with Logistic Regression or shallow Decision Trees. These models are less accurate but are computationally cheap, easier to implement on standard hardware, and their predictions are easier to explain to stakeholders.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eEthical \u0026amp; Cultural Considerations\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eLack of discussion on the ethical risks of profiling and using sensitive variables (e.g., socioeconomic status) in models.\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e16 / 19 (84.2%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eAdopt a strict ethical framework. Avoid using sensitive demographic or socioeconomic predictors to prevent bias and stigmatization. Focus on academic and behavioral data. 
Ensure transparency with students about how predictions are made.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eHuman Capacity \u0026amp; Training\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eNeed for data scientists to build models and for instructors to interpret and act on the results.\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e14 / 19 (73.7%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eDevelop local capacity. Training for academic staff should focus on interpreting simple model outputs (not building models). Collaboration with local computer science departments could provide necessary technical expertise.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eIntervention Infrastructure\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eAssumption of existing support systems (tutoring, advising) to which students can be referred.\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e15 / 19 (78.9%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eDesign interventions for scale. Develop low-cost, high-impact strategies. Examples: Peer-led team learning groups, facilitated study sessions by advanced undergraduates, or using low-bandwidth mobile messaging (SMS) for nudges and support, rather than relying on email.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\n\u003cp\u003eThe review further indicates that the literature on ML for math performance prediction is overwhelmingly situated in high-resource settings. The interventions and models are created for environments where there is robust data infrastructure, high computing power, and existing student support systems. 
This creates a substantial contextual mismatch when they are applied directly in RCCEs like Ethiopian universities.\u003c/p\u003e\n\u003cp\u003eThe most widely reported challenge is Data Infrastructure. The heavy reliance on LMS data (\u0026nbsp;\u0026nbsp;of studies) is a major hurdle, as adoption and frequency of use can be low in RCCEs. This does not render prediction impossible, but it necessitates a radical shift in approach: from exploiting highly granular digital footprints to exploiting more aggregate, more readily accessible institutional data.\u003c/p\u003e\n\u003cp\u003eFurthermore, the overall neglect of Ethical \u0026amp; Cultural Considerations in the international literature is a pressing warning. Careless use of models that rely on proxies for socioeconomic status can systematically disadvantage already marginalized groups of students, increasing educational inequalities rather than decreasing them.\u003c/p\u003e\n\u003cp\u003eDirect transfer of predictive models and intervention strategies from high-resource to resource-scarce environments is neither feasible nor desirable. The challenges of data scarcity, unavailability of technology, and under-resourced support systems are entrenched.\u003c/p\u003e\n\u003cp\u003eHowever, a context-adapted approach is feasible. 
The path forward entails:\u003c/p\u003e\n\u003cp\u003e\u0026middot; Starting Simple: With simple algorithms and readily available academic data.\u003c/p\u003e\n\u003cp\u003e\u0026middot; Evolving Ethically: Designing systems with fair predictors and maximum transparency.\u003c/p\u003e\n\u003cp\u003e\u0026middot; Investing in Support, Not Prediction: A realization that the big investment has to be in building the human-centered intervention infrastructure (e.g., peer mentoring), as the predictive model itself is merely a way to make that support more effective.\u003c/p\u003e\n\u003cp\u003eThe value of the international literature for RCCEs lies not in its specific models, but in its general principles and its stark highlighting of the implementation difficulties that must be overcome through adaptation.\u003c/p\u003e"},{"header":"4. Discussion","content":"\u003cp\u003eThis systematic review set out to synthesize the evidence regarding the use of machine learning (ML) for early intervention in undergraduate mathematics learning, with specific reference to resource-constrained contexts like universities in Tigray, Ethiopia. The discussion interprets the key findings in terms of the study research questions, highlighting both the technological potential and the significant practical challenges of implementation.\u003c/p\u003e\u003cp\u003eThe synthesis confirms that ensemble algorithms like Random Forest and XGBoost consistently achieve the highest accuracy in predicting mathematics performance. This is in line with the broader Educational Data Mining literature, which favors these algorithms because they can handle education's noisy, mixed-type data (Baker \u0026amp; Inventado, \u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e2014\u003c/span\u003e). However, the review also points to a trade-off: in pursuit of maximum accuracy, model interpretability is sometimes lost. 
Although a neural network may deliver slightly better performance, its \"black box\" nature makes it poorly suited to educational settings where explainability is imperative. Teachers and guidance counselors need to understand why a particular student has been flagged as at-risk so that they can provide tailored support. Therefore, the optimal choice is not necessarily the most accurate model but the one that best balances performance against interpretability. For most institutions, including those in Ethiopia, this makes Logistic Regression, a highly interpretable model, a strong and sufficient initial choice, with Random Forest as a more sophisticated but still explainable option to be adopted further down the line.\u003c/p\u003e\u003cp\u003eThe predictor salience hierarchy yielded two foundational findings. First, prior academic performance was clearly the strongest predictor, a finding that confirms years of education research showing that the best predictor of future performance is prior performance (Hijazi \u0026amp; Naqvi, \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e2019\u003c/span\u003e). Second, LMS engagement metrics emerged as a salient dynamic indicator of student behavior. This finding resonates with the theoretical framework of learning analytics, in which digital traces are considered a strong proxy for student effort and engagement (Siemens \u0026amp; Baker, \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e2012\u003c/span\u003e). However, this heavy reliance on LMS data poses a formidable barrier in RCCEs with spotty digital infrastructure. 
This necessitates a context-adjusted strategy that prioritizes the strong historical academic data readily available in RCCEs (for example, entrance examination scores) while gradually expanding digital data collection.\u003c/p\u003e\u003cp\u003eOne of the most significant findings of this review is that better outcomes do not necessarily follow from successful prediction. The \"last-mile\" problem, using a prediction to take effective action, is where most projects come unstuck. Evidence demonstrates a clear efficacy gradient: automated alerts are generally ineffective, instructor-implemented interventions are variable and non-scalable, and structured support programs (e.g., automatic referral to tutoring) have the most compelling evidence of success. That points to a fundamental observation: the value of an ML model is not inherent but latent and dependent on the quality of the intervention system it feeds into. For Tigray universities, that means investment in the predictive model is only half the solution. A more substantial initial investment is needed in the human infrastructure of academic support (tutoring centers, peer mentoring networks, and advisor training) that the model will serve.\u003c/p\u003e\u003cp\u003eThe review established a profound incongruence between the assumptions of current ML literature and the context of RCCEs. The models are data-hungry, which is a significant barrier to implementation (Olaniyan et al., \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e2023\u003c/span\u003e). The review therefore argues against a straightforward transplant of such systems. Instead, it encourages an ethics-driven, context-specific method that begins from simple models based on available data (e.g., past academic achievement) and deliberately avoids sensitive demographic variables that might result in algorithmic bias and perpetuate entrenched inequalities. 
The ethical concerns noted in the literature about student profiling are core rather than peripheral to implementation in any context and must be a first-order design consideration (Romero \u0026amp; Ventura, \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e2020\u003c/span\u003e).\u003c/p\u003e"},{"header":"5. Conclusion and Recommendations","content":"\u003cp\u003eThis systematic review synthesized evidence on the application of machine learning (ML) to predicting undergraduate math performance for early intervention. The findings confirm the technical viability of ML models, with ensemble algorithms like Random Forest achieving high accuracy by leveraging strong predictors like previous academic performance and LMS activity data. The review concludes, however, that the greatest challenges are practical and ethical, not technical. The value of any predictive system depends on how it is linked to robust, well-organized support interventions, a resource-intensive requirement. Furthermore, the existing literature contains a large contextual lacuna: it is predominantly focused on high-resource environments and largely overlooks the profound data infrastructure and ethical constraints present in resource-poor environments like universities in Tigray, Ethiopia.\u003c/p\u003e\u003cp\u003eThe main takeaway is that while ML holds considerable potential for improving learning outcomes, its value is not intrinsic: it depends on careful contextualization, with ethics as the first priority, on investment in human-led support infrastructure, and on adapting the technology to the local context rather than the reverse.\u003c/p\u003e\u003cp\u003eBased on the findings of this review, the following recommendations are offered to researchers, practitioners, and policymakers who want to implement such systems in environments such as Tigray, Ethiopia. 
For researchers, future studies should prioritize constructing and validating simple, interpretable models (e.g., Logistic Regression) with predictor variables that are readily available in RCCEs, such as Ethiopian University Entrance Examination scores and high school grades, rather than relying on LMS data.\u003c/p\u003e\u003cp\u003eFor practitioners and university administrators, it is important to invest in building core academic support infrastructures (e.g., tutoring centers, faculty advisor training) first, before investing heavily in advanced predictive modeling. The intervention system is more important than the model. First, use simple early warning systems (e.g., failing the first midterm) to trigger interventions while building capacity for data collection; then, develop a basic ML model from current historical and demographic data within institutional records; and finally, incorporate increasingly granular sources of data as the digital landscape (e.g., LMS use) becomes more stable.\u003c/p\u003e\u003cp\u003eFor policymakers and institutional leaders, it is important to establish clear national and institutional policies for the privacy, security, and ethical use of student data to promote trust and protect students from harm. 
It is also important to fund not only technology but also the human resources and training required to use predictive analytics effectively and sustainably.\u003c/p\u003e\u003cp\u003eBy following these recommendations, stakeholders can more easily navigate the complexities of putting ML in place so that the technology serves as a driver of caring and equitable education, not as a source of additional disparity.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eFunding:\u003c/strong\u003e This review received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests:\u003c/strong\u003e The authors have no relevant financial or non-financial interests to disclose.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eEthical approval:\u003c/strong\u003e This systematic review adhered to standard ethical principles in research methodology. It is based entirely on previously published studies and does not involve any new research with human participants conducted by the authors.\u003c/p\u003e\n\u003cp\u003eAcknowledgements:\u0026nbsp;The authors sincerely thank the scholars and researchers whose work provided the foundation for this review. We also thank colleagues at Aksum University for their constructive feedback and discussions, which enriched this manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCorresponding Author\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eMoges Birhanu Haileslassie\u003csup\u003e1\u003c/sup\u003e\u003c/strong\u003e\u003cbr\u003e\u0026nbsp;Department of Mathematics,\u003c/p\u003e\n\u003cp\u003e\u0026nbsp;Aksum University, Axum, Ethiopia.\u003cbr\u003e\u0026nbsp;E-mail: [email protected]\u003c/p\u003e\n"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eBaker, R. S., \u0026amp; Inventado, P. 
S. (2014). Educational data mining and learning analytics. In J. A. Larusson \u0026amp; B. White (Eds.), \u003cem\u003eLearning analytics: From research to practice\u003c/em\u003e (pp. 61-75). Springer.\u003c/li\u003e\n\u003cli\u003eBethell, G. (2016). \u003cem\u003eMathematics education in Sub-Saharan Africa: Status, challenges, and opportunities.\u003c/em\u003e Washington, DC: World Bank.\u003c/li\u003e\n\u003cli\u003eBorenstein, M., Hedges, L. V., Higgins, J. P. T., \u0026amp; Rothstein, H. R. (2021). \u003cem\u003eIntroduction to Meta-Analysis\u003c/em\u003e (2nd ed.). John Wiley \u0026amp; Sons.\u003c/li\u003e\n\u003cli\u003eBramer, W. M., Rethlefsen, M. L., Kleijnen, J., \u0026amp; Franco, O. H. (2017). Optimal database combinations for literature searches in systematic reviews: A prospective exploratory study. \u003cem\u003eSystematic Reviews, 6\u003c/em\u003e(1), 245.\u003c/li\u003e\n\u003cli\u003eGebremariam, H., \u0026amp; Gedamu, A. (2022). Challenges of teaching and learning mathematics in Ethiopian universities: A review. \u003cem\u003eJournal of Higher Education in Africa, 20\u003c/em\u003e(1), 45-62.\u003c/li\u003e\n\u003cli\u003eHijazi, S. T., \u0026amp; Naqvi, R. S. M. M. (2019). Factors affecting student\u0026rsquo;s performance: A case of private colleges in Bangladesh. \u003cem\u003eJournal of Sociology and Education, 8\u003c/em\u003e(1), 1-12.\u003c/li\u003e\n\u003cli\u003eLiberati, A., Altman, D. G., Tetzlaff, J., Mulrow, C., G\u0026oslash;tzsche, P. C., Ioannidis, J. P., ... \u0026amp; Moher, D. (2009). The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. \u003cem\u003eJournal of Clinical Epidemiology, 62\u003c/em\u003e(10), e1-e34.\u003c/li\u003e\n\u003cli\u003eOlaniyan, A., Adetunji, O., \u0026amp; Olubiyi, O. (2023). Machine learning for educational forecasting in resource-constrained contexts: A scoping review. 
\u003cem\u003eAfrican Journal of Science, Technology, Innovation and Development, 15\u003c/em\u003e(2), 145-159.\u003c/li\u003e\n\u003cli\u003ePage, M. J., McKenzie, J. E., Bossuyt, P. M., Boutron, I., Hoffmann, T. C., Mulrow, C. D., ... \u0026amp; Moher, D. (2021). The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. \u003cem\u003eSystematic Reviews, 10\u003c/em\u003e(1), 89.\u003c/li\u003e\n\u003cli\u003ePopay, J., Roberts, H., Sowden, A., Petticrew, M., Arai, L., Rodgers, M., \u0026hellip; \u0026amp; Duffy, S. (2006). \u003cem\u003eGuidance on the conduct of narrative synthesis in systematic reviews\u003c/em\u003e. ESRC Methods Programme.\u003c/li\u003e\n\u003cli\u003eRomero, C., \u0026amp; Ventura, S. (2020). Educational data mining and learning analytics: An updated survey. \u003cem\u003eWIREs Data Mining and Knowledge Discovery, 10\u003c/em\u003e(3), e1355.\u003c/li\u003e\n\u003cli\u003eSiemens, G., \u0026amp; Baker, R. S. (2012). Learning analytics and educational data mining: Towards communication and collaboration. In \u003cem\u003eProceedings of the 2nd International Conference on Learning Analytics and Knowledge\u003c/em\u003e (pp. 252-254).\u003c/li\u003e\n\u003cli\u003eTekle, B. (2025). \u003cem\u003eThe impact of educational mobile applications on student performance in mathematics: A study at Edaga-Berhe and Kaleb Secondary Schools\u003c/em\u003e [Unpublished master\u0026apos;s thesis]. Aksum University.\u003c/li\u003e\n\u003cli\u003eWohlin, C. (2014). Guidelines for snowballing in systematic literature studies and a replication in software engineering. In \u003cem\u003eProceedings of the 18th International Conference on Evaluation and Assessment in Software Engineering\u003c/em\u003e (pp. 1-10).\u003c/li\u003e\n\u003cli\u003eWoldehanna, T., Hagos, A., \u0026amp; Ruta, M. (2023). The potential of machine learning for poverty prediction in Ethiopia: Opportunities and challenges. In T. Woldehanna, A. Hagos, \u0026amp; M. 
Ruta (Eds.), \u003cem\u003ePoverty and Equity in Ethiopia: New Insights from Machine Learning and Satellite Data\u003c/em\u003e (pp. 1-20). Palgrave Macmillan.\u003c/li\u003e\n\u003cli\u003eWorld Bank. (2019). \u003cem\u003eEthiopia education public expenditure review\u003c/em\u003e. World Bank Group.\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":true,"hideJournal":true,"highlight":"","institution":"Aksaray University","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"Machine learning, predictive modeling, mathematics education, early intervention, learning analytics, higher education","lastPublishedDoi":"10.21203/rs.3.rs-7845029/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-7845029/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eThis systematic review synthesizes current literature on the use of machine learning (ML) models to predict undergraduate mathematics performance and support early intervention in resource-constrained higher education environments (RCCEs). A systematic search of five academic databases was conducted in accordance with PRISMA 2020 guidelines, resulting in 19 empirical studies being included. 
The findings reveal that ensemble techniques such as Random Forest and XGBoost demonstrate strong predictive performance in high-resource contexts, with accuracy of \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(78\\text{--}94\\%\\)\u003c/span\u003e\u003c/span\u003e and AUC values ranging from \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(0.84\\)\u003c/span\u003e\u003c/span\u003e to \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(0.96\\)\u003c/span\u003e\u003c/span\u003e, using predictors such as prior academic achievement and Learning Management System (LMS) usage data. However, the straightforward application of these models to RCCEs, such as Ethiopian universities, is challenged by infrastructural limitations and data scarcity. Effective implementation requires a context-sensitive strategy emphasizing (1) interpretable and transparent models based on readily available data, (2) substantial preliminary investment in foundational academic support systems prior to predictive analytics deployment, and (3) the adoption of a rigorous ethical framework to mitigate algorithmic bias. 
Overall, this review highlights the need to shift research and practice from a narrow focus on technical model performance toward contextually relevant, ethically responsible, and intervention-driven applications.\u003c/p\u003e","manuscriptTitle":"Machine Learning for Early Intervention: A Quantitative Systematic Review of Predictive Models for Undergraduate Mathematics Performance","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-10-15 18:08:45","doi":"10.21203/rs.3.rs-7845029/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"534fdf79-5424-4e9b-b7c5-41b7b62fb2a1","owner":[],"postedDate":"October 15th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[{"id":56178286,"name":"Educational Philosophy and Theory"}],"tags":[],"updatedAt":"2025-10-15T18:08:45+00:00","versionOfRecord":[],"versionCreatedAt":"2025-10-15 18:08:45","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-7845029","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-7845029","identity":"rs-7845029","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}


License: CC-BY-4.0