Comparative Evaluation of Machine Learning and Deep Learning Models for Blood Glucose Prediction on the OhioT1DM Dataset

doi:10.21203/rs.3.rs-7410777/v1

2025 · doi:10.21203/rs.3.rs-7410777/v1

preprint OA: closed CC-BY-4.0

Research Square · Research Article

Taofiq Olanrewaju MUSA, Arsene ADJEVI, Donaldo Omondi JACCOJWANG, and 7 more

This is a preprint; it has not been peer reviewed by a journal.

https://doi.org/10.21203/rs.3.rs-7410777/v1

This work is licensed under a CC BY 4.0 License.

Status: Posted (Version 1)

Abstract

Type 1 diabetes mellitus is a common condition among young individuals, highlighting the need for accurate blood glucose level (BGL) predictions for effective continuous glucose monitoring. Investigating and comparing the performance of extreme gradient boosting models using a data-driven approach is essential for improving BGL prediction accuracy.
This study extends the analysis of the OhioT1DM dataset by evaluating and comparing the performance of traditional machine learning models, extreme gradient boosting models (XGBoost, CatBoost, and LightGBM), and deep learning models (LSTM and Bi-LSTM) in predicting BGL. The findings demonstrate that extreme gradient boosting models can achieve competitive performance compared to certain deep learning architectures while being less computationally expensive. Using continuous glucose monitoring (CGM) as the sole input feature and data from all 12 patients, the LSTM model achieves an RMSE of 13.65 for a 30-minute prediction horizon, while the Bi-LSTM model records an RMSE of 21.73 for a 60-minute prediction horizon.

Keywords: Blood Glucose Prediction, Machine Learning, Deep Learning, OhioT1DM Dataset, SHAP Interpretability

1. Introduction

Type 1 diabetes mellitus (T1DM) is a serious disease that is mostly diagnosed in young people, although it can occur in any age group. T1DM arises when the immune system treats the insulin-producing beta cells of the pancreas as intruders and destroys them. The more beta cells are destroyed, the harder it becomes for the pancreas to make the insulin the body needs to live. Without enough insulin, too little blood glucose enters the body's cells to be used for energy, and the risk of hyperglycemia is high (Understanding Type 1 Diabetes | ADA, n.d.). It is estimated that 643 million people will be living with diabetes by 2030 and 783 million by 2045, an increase of approximately 22% in 15 years (International Diabetes Federation, 2025).
In recent years, continuous glucose monitoring (CGM) devices have been adopted to track the glucose levels of T1DM patients in the interstitial fluid and to help manage both hyperglycemia and hypoglycemia, and the international diabetes community has welcomed these devices (Petrie et al., 2017). Accurately tracking, monitoring, and maintaining a normal blood glucose level (BGL) in T1DM patients is essential for reducing death rates from the disease (Kavakiotis et al., 2017). CGM systems are also valuable for collecting glucose-related records, such as insulin intake and other physiological measurements. Over the years, machine learning (ML) and deep learning (DL) approaches have been adopted to predict BGL in real time, so that T1DM patients can be alerted proactively to dangers that may arise during their daily activities, improving their health and well-being.

Research into which models are best suited to predicting BGL in T1DM has been hindered by a lack of real patient data. In 2020, the OhioT1DM dataset was released for research purposes (Marling & Bunescu, 2020). It includes continuous glucose monitoring, insulin, physiological sensor, and self-reported life-event data for people with type 1 diabetes, covering eight weeks of data from 12 individual patients (2018 and 2020 cohorts), seven male and five female (Marling & Bunescu, 2020). Georga et al. (2012) use a random forest model on a multivariate dataset to predict subcutaneous glucose concentration in type 1 diabetes patients.
McShinsky and Marshall (2020) compare statistical time series and machine learning models, namely Auto-Regressive Integrated Moving-Average, Vector Auto-Regression, Kalman Filter, Unscented Kalman Filter, Ordinary Least Squares, Support Vector Machines, Random Forests, Gradient Boosted Trees, XGBoosted Trees, Adaptive Neuro-Fuzzy Inference System (ANFIS), and Multi-Layer Perceptron. They assess how effectively these models predict future blood glucose levels on the OhioT1DM dataset at 30-minute and 60-minute prediction horizons, using Root Mean Squared Error, Mean Absolute Error, Coefficient of Determination, Matthews Correlation Coefficient, and the Clarke Error Grid, with only six of the twelve patients selected. Szczepanek (2022) compares XGBoost, LightGBM, and CatBoost for daily streamflow forecasting in mountainous catchments, with linear regression as a baseline; XGBoost is the highest-performing model in terms of RMSE. The extreme gradient boosting family comprises XGBoost (Chen & Guestrin, 2016), LightGBM (Ke et al., 2017), and CatBoost (Prokhorenkova et al., 2018). Ding et al. (2020) show that the CatBoost model outperforms other traditional models on the Walmart sales forecast dataset.

XGBoost, LightGBM, and CatBoost are machine learning libraries that use the gradient boosting technique. XGBoost was created in 2014 and quickly gained popularity after performing strongly on large datasets and winning numerous data science competitions. Since then, it has evolved through version updates that added features such as GPU learning and distributed learning. LightGBM, developed by Microsoft in 2017, is faster and uses less memory than XGBoost. It is designed for high speed on very large datasets while maintaining high accuracy even on small data samples.
CatBoost, developed by Yandex in 2017, excels at handling categorical data; it is an optimized algorithm that automatically applies regularization to prevent overfitting and enables rapid learning on both CPU and GPU (Ahn et al., 2023).

The OhioT1DM dataset has been used in many publications proposing state-of-the-art ML or DL models for T1DM forecasting, although accurate prediction remains a challenge. Many researchers report that DL models achieve the best performance on the OhioT1DM dataset (Cui et al., 2021; Gómez-Castillo et al., 2022; Giancotti et al., 2024), but these studies do not compare DL against machine learning models, nor do they state the time and resources needed to achieve their results. In this paper, we compare ML and DL models in terms of both predictive performance and resource usage. We also extend our analysis to gradient boosting models, namely LightGBM, XGBoost, and CatBoost, which have shown strong predictive power in many forecasting tasks, and we evaluate the differences between univariate and multivariate modeling using Root Mean Squared Error (RMSE) and R-squared for BGL prediction at 30-minute and 60-minute horizons.

2. Methods

The BGL prediction task entails estimating future BGL values on the OhioT1DM test set at 30-minute and 60-minute prediction horizons, using a sequence of BGL values recorded at 5-minute intervals. The models are evaluated in both univariate and multivariate settings: the univariate setting uses only previous glucose values to predict future glucose values, whilst the multivariate setting includes additional features. The machine learning models accept only two-dimensional input arrays, whereas the deep learning models accept three-dimensional input arrays:
ML (Univariate): (N, L)
ML (Multivariate): (N, L * F)
DL (Univariate): (N, L, 1)
DL (Multivariate): (N, L, F)

Where:
N is the number of samples.
L is the length of the lag window (equal to the number of 5-minute steps in the prediction horizon).
F is the number of features.

Univariate setting: CGM (glucose)
Multivariate setting: CGM, fingerstick, total insulin dose, insulin dose 3-hour variability

3.1 Data

OhioT1DM contains eight weeks of blood glucose levels and other physiological parameters for twelve patients with type 1 diabetes, with the data for six patients released in 2018 and for the remaining six in 2020. The dataset includes seven males and five females; the 2020 cohort wore the Empatica sensor band and the 2018 cohort wore the Basis sensor band (Marling & Bunescu, 2020). After the XML files are loaded onto uniform 5-minute timestamps and combined into a single data frame, the training data has 134,789 rows and the test data has 31,743 rows.

3.2 Data Preprocessing

The XML files for the 2018 and 2020 cohorts, both train and test sets, are loaded onto uniform 5-minute timestamps and combined into a single data frame. We created a tag feature to denote whether an observation belongs to the train or test data and a year feature to denote the cohort, and then joined the data into complete train and test datasets covering all patients. The glucose values were shifted back by 15 minutes to account for delays in the CGM system, and all timestamps with missing glucose values were dropped to improve data quality, since the main goal is a model that accurately forecasts the glucose level.

3.3 Data Standardization

Data standardization, also known as feature scaling, is an important step in data processing, especially for numerical data, because it brings the data points into a similar range, which speeds up training and improves model performance.
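
As a minimal sketch of how the input shapes listed under Methods and the scaling step could fit together, the Python snippet below builds lag windows from a 5-minute series, flattens them for the ML models, and standardizes features with statistics taken from the training data only. The function names (`make_windows`, `flatten_for_ml`, `fit_standardizer`, `standardize`), the single-value target placed `horizon` steps after each window, and the synthetic readings are illustrative assumptions rather than the authors' pipeline; the SHAP plots discussed in the results suggest that the study's models emit several outputs per window.

```python
from statistics import mean, pstdev

STEP_MINUTES = 5  # CGM sampling interval stated in the text


def make_windows(rows, lag, horizon, target_col=0):
    """Slice a series into lag windows (assumed single-value target).

    rows: one list of F feature values per 5-minute timestamp.
    Returns X with shape (N, L, F) and y with N targets, each taken
    `horizon` steps after the last reading in its window.
    """
    X, y = [], []
    for start in range(len(rows) - lag - horizon + 1):
        end = start + lag
        X.append([list(r) for r in rows[start:end]])
        y.append(rows[end + horizon - 1][target_col])
    return X, y


def flatten_for_ml(X):
    """Reshape (N, L, F) windows to (N, L * F) rows for tabular models."""
    return [[v for step in window for v in step] for window in X]


def fit_standardizer(train_rows):
    """Per-feature mean and population standard deviation (training rows only)."""
    cols = list(zip(*train_rows))
    means = [mean(c) for c in cols]
    stds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return means, stds


def standardize(rows, means, stds):
    """Apply z-score scaling with previously fitted statistics."""
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


# Synthetic univariate CGM readings (hypothetical values, F = 1).
cgm = [[100.0 + i] for i in range(20)]
lag = horizon = 30 // STEP_MINUTES         # 6 steps for a 30-minute horizon
X_dl, y = make_windows(cgm, lag, horizon)  # DL input: (N, L, 1)
X_ml = flatten_for_ml(X_dl)                # ML input: (N, L)
```

Fitting the scaler on the training rows and reusing the same means and standard deviations on the test rows keeps test-set information out of the preprocessing step.
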
Several studies have compared different scaling methods (Ahsan et al., 2021; R & P, 2024; Raju et al., 2020; De Amorim et al., 2022), such as the standard scaler (StandardScaler, n.d.), min-max scaler (MinMaxScaler, n.d.), quantile transformer (QuantileTransformer, n.d.), robust scaler (RobustScaler, n.d.), and normalizer (Normalizer, n.d.). The standard scaler is used to scale the numerical training data for the deep learning models; it is not recommended for the extreme gradient boosting models (Jha, 2024).

3.4 Evaluation

Three evaluation metrics are used to assess the performance of the ML and DL models in predicting BGL on the OhioT1DM dataset: root mean squared error (RMSE), mean absolute error (MAE), and R-squared.

The RMSE (Eq-1) measures the prediction error between the actual and predicted BGL values. It is the square root of the mean squared error (MSE), so large errors are penalized more heavily. The closer the RMSE is to 0, the better the model performance on the test data.

\( \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\mathrm{Actual\ BGL}_{i} - \mathrm{Predicted\ BGL}_{i}\right)^{2}} \) (Eq-1)

The MAE (Eq-2) is the average absolute difference between the actual and predicted BGL values, with all errors weighted equally. The lower the MAE, the better the model performance on the test data.

\( \mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\mathrm{Actual\ BGL}_{i} - \mathrm{Predicted\ BGL}_{i}\right| \) (Eq-2)

The R-squared (Eq-3), also known as the coefficient of determination, is the proportion of the variance of the dependent variable that is explained by the independent variables. The closer the value is to 1, the more of the variance in the data is explained.
\( R^{2} = 1 - \frac{\sum_{i=1}^{n}\left(\mathrm{Actual\ BGL}_{i} - \mathrm{Predicted\ BGL}_{i}\right)^{2}}{\sum_{i=1}^{n}\left(\mathrm{Actual\ BGL}_{i} - \overline{\mathrm{Actual\ BGL}}\right)^{2}} \) (Eq-3)

where \( \overline{\mathrm{Actual\ BGL}} \) is the mean of the actual BGL values.

3.5 Model Interpretability

To increase trust in our models, we can adopt SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations), which help balance the trade-off between accuracy and interpretability, especially for the ML and DL models considered in this study (Lundberg & Lee, 2017). LIME explains individual predictions by approximating the model locally; it is motivated by the question of why a prediction should be trusted, since many high-performing ML and DL models are black boxes (Ribeiro et al., 2016). SHAP was developed later and explains model predictions through additive feature importance. It can interpret predictions both globally and for individual instances, which helps clinicians understand which parameters most influence the predicted glucose levels and also supports debugging of model performance when needed (Lundberg & Lee, 2017).

Many interpretable ML techniques have been developed in the healthcare domain to help physicians understand ML predictions. Use cases include cardiovascular diseases (Kennedy et al., 2021), eye diseases (Niu et al., 2021), cancer (Ustun et al., 2013), influenza and infectious diseases (Hu et al., 2020), COVID-19 prediction (Yan et al., 2020), depression diagnosis (Lakkaraju et al., 2017), and autism (Garbulowski et al., 2021), which shows that interpretable ML is already being adopted across the healthcare industry for better decision making.
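
To make the additive feature attribution idea behind SHAP concrete, the sketch below computes exact Shapley values by brute force for a toy linear predictor over three lagged CGM readings. It does not use the shap library, which relies on faster approximations; the predictor `toy_predict`, its weights, the instance `x`, and the `baseline` are all hypothetical values chosen for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for instance x relative to a baseline input.

    Features outside a coalition are replaced by their baseline value.
    The cost grows exponentially with the number of features.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Weight of every coalition of size k that excludes feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_predict(z):
    """Hypothetical predictor: weighted sum of lags t-0, t-1, t-2."""
    return 0.7 * z[0] + 0.2 * z[1] + 0.1 * z[2]


x = [180.0, 170.0, 160.0]         # made-up recent CGM readings (mg/dL)
baseline = [120.0, 120.0, 120.0]  # made-up reference input
phi = shapley_values(toy_predict, x, baseline)
```

For this toy model the attributions are approximately 42, 10, and 4, and they add up to the difference between the prediction for `x` and the prediction for `baseline`. That additivity is what lets a summary plot rank features by their contribution to each prediction.
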
Prendin et al. (2023) highlight the need for interpretable ML in blood glucose prediction for diabetes. Using SHAP, their study shows that it can help debug the features used in developing a BGL model and inform which model to deploy for decision making where patient safety is a priority.

3. Results

The results of the ML and DL models are presented for 30-minute and 60-minute prediction horizons under univariate and multivariate modeling, as shown in Tables 1 to 4.

Table 1 shows that LSTM achieved the lowest RMSE (13.65) and the highest R² (0.9416), making it the most accurate model for BGL prediction in this setting. Bi-LSTM performed slightly worse, with an RMSE of 14.48 and an R² of 0.9365, suggesting that it may not generalize as well as LSTM. Among the gradient boosting models, LightGBM performed the best, with an RMSE of 14.73 and R² of 0.9405, followed closely by CatBoost (RMSE = 14.77, R² = 0.9402) and XGBoost (RMSE = 14.98, R² = 0.9385). Linear regression had the highest RMSE (15.44) and lowest R² (0.9345), indicating that it struggled to capture the complexity of glucose level variations.

In terms of computational efficiency, linear regression was the fastest, completing in just 0.15 seconds, but it was also the least accurate. Gradient boosting models (LightGBM = 15.95s, XGBoost = 31.42s, CatBoost = 99.81s) provided a good balance between accuracy and efficiency. Meanwhile, deep learning models were the most computationally expensive, with LSTM taking 1229.26 seconds and Bi-LSTM requiring 3745.86 seconds.
Table 1 Univariate 30-minute prediction horizon

| Metrics | Linear Regression | LightGBM | XGBoost | CatBoost | LSTM | Bi-LSTM |
|---|---|---|---|---|---|---|
| RMSE | 15.44 | 14.73 | 14.98 | 14.77 | 13.65 | 14.48 |
| MAE | 9.46 | 8.84 | 8.92 | 8.83 | 8.81 | 9.54 |
| R-Squared | 0.9345 | 0.9405 | 0.9385 | 0.9402 | 0.9416 | 0.9365 |
| Training time (sec) | 0.15 | 15.95 | 31.42 | 99.81 | 1229.26 | 3745.86 |

Table 2 shows that Bi-LSTM achieved the lowest RMSE (21.73) and the highest R² (0.8511), making it the most accurate model for the longer 60-minute horizon. Among the gradient boosting models, CatBoost performed the best (RMSE = 23.37, R² = 0.8502), followed closely by LightGBM (RMSE = 23.40, R² = 0.8498) and XGBoost (RMSE = 23.59, R² = 0.8474). Among the ML models, linear regression had the weakest performance (RMSE = 24.41, R² = 0.8366), indicating its inability to capture the complex glucose trends over a longer horizon. LSTM was the weakest model overall, with an RMSE of 45.46 and a negative R² (-0.477), indicating poor predictive ability for a 60-minute horizon. This suggests that LSTM struggled with long-term dependencies, potentially due to overfitting or an inability to capture delayed glucose fluctuations effectively.

Regarding computational efficiency, linear regression remained the fastest model (0.67s), but at the cost of accuracy. Gradient boosting models had a moderate computational cost (LightGBM = 44.80s, XGBoost = 95.18s, CatBoost = 232.21s), making them viable for real-time applications. Deep learning models required significantly longer runtimes, with LSTM taking 1115.23 seconds and Bi-LSTM requiring 3847.55 seconds, making them computationally expensive.
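
A negative R², as reported for LSTM at the 60-minute horizon, may look surprising. Under the standard definition of the coefficient of determination, R² drops below zero whenever a model's squared error exceeds that of always predicting the mean of the actual values. The sketch below implements RMSE, MAE, and R² in plain Python and shows this effect on made-up glucose values; the function names and all numbers are illustrative assumptions, not data from the study.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up glucose values (mg/dL) for illustration only.
actual = [100.0, 120.0, 140.0, 160.0]
close_fit = [102.0, 118.0, 143.0, 158.0]   # small errors
mean_only = [130.0, 130.0, 130.0, 130.0]   # always predicts the mean
reversed_fit = [160.0, 140.0, 120.0, 100.0]  # worse than the mean predictor

r2_close = r_squared(actual, close_fit)
r2_mean = r_squared(actual, mean_only)
r2_reversed = r_squared(actual, reversed_fit)
```

Here the mean-only predictor scores exactly zero and the reversed predictor scores below zero, so a negative R² signals a model that does worse than a constant baseline on the evaluated data.
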
Table 2 Univariate 60-minute prediction horizon

| Metrics | Linear Regression | LightGBM | XGBoost | CatBoost | LSTM | Bi-LSTM |
|---|---|---|---|---|---|---|
| RMSE | 24.41 | 23.40 | 23.59 | 23.37 | 45.46 | 21.73 |
| MAE | 16.01 | 15.05 | 15.10 | 15.01 | 38.55 | 14.99 |
| R-Squared | 0.8366 | 0.8498 | 0.8474 | 0.8502 | -0.477 | 0.8511 |
| Training time (sec) | 0.67 | 44.80 | 95.18 | 232.21 | 1115.23 | 3847.55 |

Table 3 shows that Bi-LSTM achieved the lowest RMSE (14.64), with an R² of 0.9354, making it the most accurate model by RMSE for this prediction horizon, although CatBoost recorded the highest R² (0.9385). Among the gradient boosting models, CatBoost outperformed the others with an RMSE of 14.97 and R² of 0.9385, followed by LightGBM (RMSE = 15.21, R² = 0.9365) and XGBoost (RMSE = 15.74, R² = 0.9321). Linear regression, with an RMSE of 15.41 and R² of 0.9349, captured glucose trends reasonably well: it lagged behind CatBoost and LightGBM but was ahead of XGBoost in this setting. LSTM performed significantly worse than the other models, with an RMSE of 41.07 and a negative R² (-0.4251), indicating poor predictive ability. This suggests that LSTM struggled to learn the short-term dependencies effectively, potentially due to overfitting or challenges in capturing rapid glucose fluctuations.

Regarding computational efficiency, linear regression remained the fastest model (0.77s). Among the gradient boosting models, LightGBM was the most efficient (30.43s), followed by XGBoost (85.90s) and CatBoost (161.77s).
However, deep learning models required significantly longer runtimes, with LSTM taking 836.51 seconds and Bi-LSTM requiring 3673.86 seconds, making them computationally expensive.

Table 3 Multivariate 30-minute prediction horizon

| Metrics | Linear Regression | LightGBM | XGBoost | CatBoost | LSTM | Bi-LSTM |
|---|---|---|---|---|---|---|
| RMSE | 15.41 | 15.21 | 15.74 | 14.97 | 41.07 | 14.64 |
| MAE | 9.43 | 9.12 | 9.33 | 8.94 | 35.05 | 9.48 |
| R-Squared | 0.9349 | 0.9365 | 0.9321 | 0.9385 | -0.4251 | 0.9354 |
| Training time (sec) | 0.77 | 30.43 | 85.90 | 161.77 | 836.51 | 3673.86 |

Table 4 shows that, among the ML models, CatBoost achieved the lowest RMSE (23.52) and the highest R² (0.8483). LightGBM followed closely (RMSE = 23.65, R² = 0.8467), while XGBoost had a higher RMSE (24.81) with a reported R² of 0.8474. Linear regression, while computationally efficient (3.80s), had an RMSE of 24.36 and an R² of 0.8372, weaker than CatBoost and LightGBM. Among deep learning models, Bi-LSTM performed better than LSTM, with an RMSE of 22.54, the lowest of all models in this setting, and an R² of 0.8439, showing that bidirectional processing enhances predictive accuracy. LSTM had the worst performance, with an RMSE of 47.48 and a negative R² (-0.501), indicating that it failed to capture meaningful glucose trends for the 60-minute horizon. This suggests that LSTM struggled with the longer prediction window, potentially due to difficulties in learning long-term dependencies.

Regarding computational efficiency, linear regression remained the fastest model (3.80s). Among the gradient boosting models, LightGBM was the most efficient (99.90s), followed by XGBoost (297.06s) and CatBoost (497.42s). Deep learning models were significantly more computationally expensive, with LSTM taking 761.25 seconds and Bi-LSTM requiring 3861.32 seconds.
Table 4 Multivariate 60-minute prediction horizon

| Metrics | Linear Regression | LightGBM | XGBoost | CatBoost | LSTM | Bi-LSTM |
|---|---|---|---|---|---|---|
| RMSE | 24.36 | 23.65 | 24.81 | 23.52 | 47.48 | 22.54 |
| MAE | 15.97 | 15.22 | 15.82 | 15.10 | 40.09 | 15.74 |
| R-Squared | 0.8372 | 0.8467 | 0.8474 | 0.8483 | -0.501 | 0.8439 |
| Training time (sec) | 3.80 | 99.90 | 297.06 | 497.42 | 761.25 | 3861.32 |

4.5 Explainability Insights with SHAP

To investigate the behaviour of the best-performing model at both 30 and 60 minutes under the univariate and multivariate configurations, SHAP summary plots are used to explain each model globally and to support model debugging. LSTM is the best-performing model for the 30-minute univariate setting, while Bi-LSTM is the best-performing model for the 60-minute univariate setting and for both the 30-minute and 60-minute multivariate settings.

In a SHAP summary plot, each row represents a feature, and each dot in a row represents a specific instance of the training set. The dot's colour shows whether that instance has a low (cyan) or high (magenta) value of the feature relative to its mean value. On the y-axis, the features are ranked from most to least important, while the position on the x-axis is the SHAP value, which shows whether a feature value pushes the model prediction higher or lower. For example, if the magenta dots of a feature lie on the right-hand side of the x-axis, high values of that feature increase the model prediction.

Figure 9 shows the summary plot for the LSTM model in the 30-minute univariate setting. For all six outputs, t-0 (the most recent CGM value before prediction) is the most relevant feature, followed by t-1 and t-2, with t-3 having the lowest overall importance among the six.
The most recent CGM value has a significant impact on the prediction of the next CGM value: the higher the CGM value, the higher the prediction, and the lower the CGM value, the lower the prediction. The impact of t-1, t-2, t-3, t-4, and t-5 on SHAP values is relatively negligible, with the exception of t-1 in the first output, which appears to have some minor importance, as higher values are likewise related to higher predictions and vice versa.

Figure 10 shows the summary plot for the Bi-LSTM model in the 60-minute univariate setting. For all twelve outputs, among the inputs t-0 to t-11, t-0 (the most recent CGM value before prediction) is the most relevant feature. As in the 30-minute case, the most recent CGM value has a significant impact on the prediction of the next CGM value across all outputs: the higher the CGM value, the higher the prediction, and the lower the CGM value, the lower the prediction.

Figure 11 shows the summary plot for Bi-LSTM in the 30-minute multivariate setting. The most important feature is fingerstick, followed by glucose (CGM) level, insulin dosage three-hour variability, and total insulin dose. Fingerstick has a large impact, glucose has a moderate impact, and insulin dosage three-hour variability and total insulin dosage have a low impact. High fingerstick readings increase the predicted CGM value, and low fingerstick readings decrease it. Glucose values have varied effects; higher values of insulin dosage three-hour variability increase the predicted CGM value and vice versa; and total insulin dosage has a small but non-zero impact.

Figure 12 shows the summary plot for Bi-LSTM in the 60-minute multivariate setting. The most important feature is insulin dosage three-hour variability, followed by glucose (CGM) level, total insulin dose, and fingerstick.
Insulin dosage three-hour variability has a large impact, glucose and total insulin dosage have a moderate impact, and fingerstick has a low impact. High insulin dosage three-hour variability increases the predicted CGM value, and low variability decreases it. Glucose values have varied effects; higher values of total insulin dosage increase the predicted CGM value and vice versa; and fingerstick has a small but non-zero impact.

4.1 State-of-the-art comparison

To contextualize the performance of the evaluated models, we compare our results with previously published state-of-the-art studies that have also used the OhioT1DM dataset or similar CGM-based datasets. Table 5 presents a comparative overview, focusing on RMSE values for 30- and 60-minute prediction horizons using CGM as the primary input feature.

Table 5 State of the Art Comparison

| Study (Year) | Methods | Input Features | RMSE (30 min) | RMSE (60 min) |
|---|---|---|---|---|
| (Ghimire et al., 2024) | LSTM | CGM | 18.26 | 31.12 |
| (Zhu et al., 2022) | E33N | CGM | 18.92 | 32.54 |
| (Nemat et al., 2022) | Ensemble | CGM | 19.63 | 33.45 |
| (Dudukcu et al., 2021) | GRU | CGM | 21.90 | 35.10 |
| Our Method | CatBoost | CGM | 14.77 | 23.37 |
| Our Method | LightGBM | CGM | 14.73 | 23.40 |
| Our Method | XGBoost | CGM | 14.98 | 23.59 |

Table 5 compares the proposed models (CatBoost, LightGBM, and XGBoost) with several published state-of-the-art approaches for short-term prediction of blood glucose levels from Continuous Glucose Monitoring (CGM) data. Performance is evaluated with the Root Mean Squared Error (RMSE) for two prediction horizons: 30 minutes and 60 minutes. Among the baseline models, Ghimire et al. (2024) achieved the best results with an LSTM-based model, reporting RMSEs of 18.26 and 31.12 for the 30-minute and 60-minute horizons, respectively. Zhu et al. (2022) and Nemat et al. (2022) reported comparatively higher error rates with E33N and ensemble models. The GRU-based solution by Dudukcu et al.
(2021) yielded the highest RMSEs, indicating lower predictive accuracy. In contrast, our gradient boosting techniques show considerable performance gains. Specifically, LightGBM obtained the best RMSE at the 30-minute horizon (14.73), whereas CatBoost had the best performance at the 60-minute horizon (23.37). XGBoost also showed competitive performance, with RMSEs of 14.98 and 23.59, respectively. These results highlight the effectiveness of tree-based ensemble methods for modeling CGM data relative to deep learning models such as LSTM and GRU, particularly in settings with tabular, time-synchronized sensor inputs. The consistent improvement on both prediction horizons is further evidence of the suitability of the proposed models for real-time or near-real-time glucose monitoring systems.

This comparison shows that our implementation of extreme gradient boosting models outperforms the previously published deep learning results listed in Table 5 in terms of RMSE for both prediction horizons. These results emphasize that carefully tuned ML models, especially gradient boosting techniques, can achieve state-of-the-art results while requiring significantly fewer computational resources than deep learning methods. Additionally, while prior studies predominantly focused on deep neural networks, our study broadens the landscape by highlighting the efficacy of interpretable and scalable machine learning models. This adds a new dimension to the existing body of research and paves the way for real-world deployment scenarios where both accuracy and efficiency are essential.

4. Discussion

Recent research has shown that machine learning (ML) and deep learning (DL) models outperform traditional methods for forecasting time series. Notably, ML and DL approaches have surpassed traditional statistical methods in various benchmark competitions, including those held on Kaggle (Bojer & Meldgaard, 2020).
When applied to the T1DM dataset, recurrent self-attention networks with transfer learning produced cutting-edge results. Similarly, Nemat et al. (2024) compared classical time series (CTL), traditional machine learning (TML), and deep neural networks (DNN) in both univariate and multivariate prediction settings, concluding that TML models provided not only the best predictive performance but also the shortest training times. Building on previous work, we evaluate both ML and DL models on the OhioT1DM dataset, focusing on 30- and 60-minute prediction horizons. We find that Bi-LSTM and Vanilla-LSTM designs perform well in the DL category, with Bi-LSTM outperforming Vanilla-LSTM in three of the four settings, which is consistent with the findings of Butt et al. (2023). Furthermore, RMSE, correlation, and percentage error remain essential evaluation measures, similar to the January.ai methodology (Zahedani et al., 2023).

Our study compares the performance of various conventional models in BGL prediction with extreme gradient boosting models, which provide superior performance and enhanced computing efficiency relative to deep learning techniques used in prior research. Ahn et al. (2023) use XGBoost, LightGBM, CatBoost, and CNN-LSTM to forecast harmful algal blooms, tuning the models with Bayesian hyperparameter optimization. Gradient boosting models have shown strong performance on a variety of tabular datasets because they combine an ensemble of multiple decision trees to improve predictive performance, which also helps prevent overfitting.

Our results show that DL models achieved the best RMSE in every setting. Specifically, in the univariate setting, the LSTM model attained the lowest RMSE of 13.65 for the 30-minute prediction horizon, while the Bi-LSTM model achieved an RMSE of 21.73 for the 60-minute horizon.
Similarly, in the multivariate setting, the Bi-LSTM model achieved the lowest RMSE values of 14.64 and 22.54 for the 30-minute and 60-minute prediction horizons, respectively.

When comparing the best DL model in the univariate 30-minute prediction horizon to the best-performing ML model, the RMSE difference was only 7.91%, yet the ML model trained approximately 77 times faster than its DL counterpart. In the univariate 60-minute setting, the RMSE difference between the best DL and ML models was 7.55%, with the ML model training 16.57 times faster. For the multivariate 30-minute prediction horizon, the best DL model outperformed the best ML model by a modest margin of 2.25% in RMSE, while the ML model trained 22.71 times faster. In the 60-minute multivariate setting, the RMSE difference was 4.35%, with the ML model training 7.76 times faster than the DL model. This shows that, while DL models may exceed ML models in predictive accuracy (by less than 8% in RMSE in our study), ML models are more practical due to their computational efficiency, especially in real-time or resource-constrained healthcare situations.

Our ML models were assessed using the default hyperparameters. According to the literature, effective hyperparameter optimization can increase the performance of ML models (Pfob et al., 2022). However, this results in higher processing overhead and decreased interpretability (Ilemobayo et al., 2024).

Interpretability also remains a major challenge in medical AI applications. Unlike DL models, which function as black boxes, ML models can convey information about feature relevance, particularly in multivariate scenarios. We used SHAP values (Lundberg & Lee, 2017) to improve the interpretability of our findings, allowing doctors to better grasp the underlying causes of blood glucose predictions. This could help clinicians make better-informed decisions about diabetes care.
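
The accuracy and speed comparisons quoted in this section can be reproduced from the RMSE and training-time values in Tables 1 to 4. The sketch below is one way to do the arithmetic, assuming the RMSE gap is expressed relative to the best DL model's RMSE and the speedup is the ratio of training times; the function and variable names are illustrative.

```python
def rmse_gap_percent(dl_rmse, ml_rmse):
    """RMSE difference of the best ML model relative to the best DL model, in %."""
    return 100.0 * (ml_rmse - dl_rmse) / dl_rmse


def speedup(dl_seconds, ml_seconds):
    """How many times faster the ML model trained than the DL model."""
    return dl_seconds / ml_seconds


# (best DL RMSE, DL training s, best ML RMSE, ML training s) from Tables 1 to 4.
SETTINGS = {
    "univariate 30 min": (13.65, 1229.26, 14.73, 15.95),     # LSTM vs LightGBM
    "univariate 60 min": (21.73, 3847.55, 23.37, 232.21),    # Bi-LSTM vs CatBoost
    "multivariate 30 min": (14.64, 3673.86, 14.97, 161.77),  # Bi-LSTM vs CatBoost
    "multivariate 60 min": (22.54, 3861.32, 23.52, 497.42),  # Bi-LSTM vs CatBoost
}

summary = {
    name: (round(rmse_gap_percent(dl, ml), 2), round(speedup(dl_t, ml_t), 2))
    for name, (dl, dl_t, ml, ml_t) in SETTINGS.items()
}

for name, (gap, factor) in summary.items():
    print(f"{name}: RMSE gap {gap}%, ML trains {factor}x faster")
```

Expressing both quantities as ratios makes the trade-off easy to compare across settings: the accuracy gap stays below 8% in every setting, while the training-time ratio ranges from roughly 8 to 77.
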
These findings underscore significant trade-offs among model accuracy, computational efficiency, and interpretability, all of which should be weighed when selecting predictive models for blood glucose forecasting in clinical environments.

5. Conclusion

This study presents a comprehensive comparative analysis of traditional machine learning, advanced gradient boosting, and deep learning approaches for blood glucose level prediction using the OhioT1DM dataset. By evaluating models across both univariate and multivariate settings, and over short-term (30-minute) and long-term (60-minute) prediction horizons, we identified the most effective modeling strategies for glucose forecasting in Type 1 diabetes. Our findings show that while Long Short-Term Memory (LSTM) models perform well in short-term univariate predictions, their performance deteriorates significantly over longer horizons, likely due to limitations in capturing long-term temporal dependencies and overfitting risks. In contrast, Bidirectional LSTM (Bi-LSTM) models showed superior performance in multivariate settings, particularly for the 30-minute horizon, albeit with much higher computational costs. Among machine learning models, gradient boosting techniques, especially LightGBM and CatBoost, consistently delivered competitive or superior predictive performance compared to deep learning models, with significantly reduced training time and computational requirements. LightGBM was the top-performing model in the 30-minute univariate setting, while CatBoost emerged as a robust contender across all multivariate tasks. Moreover, our study stands out in its integration of explainable AI perspectives. We applied SHAP to the best-performing model in each setting to highlight which features drive its predictions; this also helps to debug model behavior and provides insights relevant to clinical decision-making.
This transparency is crucial for real-world adoption of AI in healthcare. In conclusion, this research contributes to the growing evidence that well-tuned gradient boosting models can offer state-of-the-art performance for medical time-series forecasting, often rivaling or surpassing deep learning models in both accuracy and efficiency. We also use SHAP for model interpretation and debugging to support clinical decision-making. Future work should focus on real-time deployment, model interpretability through XAI methods, and the incorporation of additional physiological and behavioral features to further personalize and refine glucose prediction systems.

Declarations

Data availability and access: The datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request.

CRediT authorship contribution statement: Taofiq Olanrewaju Musa: Writing – original draft, Methodology, Conceptualization. Arsene Adjevi: Writing – review & editing, Validation, Software. Donaldo Omondi Jaccowang: Visualization, Data analysis. Raheem Nasirudeen Adeleye: Data preprocessing, Modeling. Diyaolu Abdulmalik Opeyemi: Statistical analysis, Comparative evaluation. Süleyman Uzun: Supervision, Conceptualization, Final review, Methodological oversight. Mustafa Zahid Yıldız: Methodology, Model optimization, Visualization validation. Ali Lazım: Project administration, Software infrastructure. Rhobi Peter Mwita: Literature review, Final validation.

Declaration of competing interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments: This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

References

Ahn, J. M., Kim, J., & Kim, K. (2023).
Ensemble Machine Learning of Gradient Boosting (XGBoost, LightGBM, CatBoost) and Attention-Based CNN-LSTM for Harmful Algal Blooms Forecasting. Toxins , 15 (10), 608. https://doi.org/10.3390/toxins15100608 Ahsan, M., Mahmud, M., Saha, P., Gupta, K., & Siddique, Z. (2021). Effect of Data Scaling Methods on Machine Learning Algorithms and Model Performance. Technologies , 9 (3), 52. https://doi.org/10.3390/technologies9030052 Ashkan Dehghani Zahedani, Arvind, V., Saransh, A., Jiayu, Z., Jingyi, R., Michael, S., & Nima, A. (2023). Virtual blood glucose monitoring and prediction using machine learning . https://www.january.ai/ . https://www.january.ai/blog/white-paper-virtual-blood-glucose-monitoring-prediction-machine-learning Bojer, C. S., & Meldgaard, J. P. (2020). Kaggle forecasting competitions: An overlooked learning opportunity. International Journal of Forecasting , 37 (2), 587–603. https://doi.org/10.1016/j.ijforecast.2020.07.007 Butt, H., Khosa, I., & Iftikhar, M. A. (2023). Feature Transformation for Efficient Blood Glucose Prediction in Type 1 Diabetes Mellitus Patients. Diagnostics , 13 (3), 340. https://doi.org/10.3390/diagnostics13030340 Chen, T., & Guestrin, C. (2016). XGBoost. KDD , 785–794. https://doi.org/10.1145/2939672.2939785 Cui, R., Hettiarachchi, C., Nolan, C. J., Daskalaki, E., & Suominen, H. (2021). Personalised Short-Term Glucose Prediction via Recurrent Self-Attention Network. 34th International Symposium on Computer-Based Medical Systems (CBMS) , 154–159. https://doi.org/10.1109/cbms52027.2021.00064 De Amorim, L. B., Cavalcanti, G. D., & Cruz, R. M. (2022). The choice of scaling technique matters for classification performance. Applied Soft Computing , 133 , 109924. https://doi.org/10.1016/j.asoc.2022.109924 Ding, J., Chen, Z., Xiaolong, L., & Lai, B. (2020). Sales Forecasting Based on CatBoost. IEEE , 636–639. https://doi.org/10.1109/itca52113.2020.00138 Dudukcu, H. V., Taskiran, M., & Yildirim, T. (2021). 
Blood glucose prediction with deep neural networks using weighted decision level fusion. Journal of Applied Biomedicine , 41 (3), 1208–1223. https://doi.org/10.1016/j.bbe.2021.08.007 Garbulowski, M., Diamanti, K., Smolińska, K., Baltzer, N., Stoll, P., Bornelöv, S., Øhrn, A., Feuk, L., & Komorowski, J. (2021). R.ROSETTA: an interpretable machine learning framework. BMC Bioinformatics , 22 (1). https://doi.org/10.1186/s12859-021-04049-z Georga, E. I., Protopappas, V. C., Polyzos, D., & Fotiadis, D. I. (2012). A predictive model of subcutaneous glucose concentration in type 1 diabetes based on Random Forests. Annual International Conference of the IEEE Engineering in Medicine and Biology Society , 2889–2892. https://doi.org/10.1109/embc.2012.6346567 Ghimire, S., Celik, T., Gerdes, M., & Omlin, C. W. (2024). Deep learning for blood glucose level prediction: How well do models generalize across different data sets? PLoS ONE , 19 (9), e0310801. https://doi.org/10.1371/journal.pone.0310801 Giancotti, R., Bosoni, P., Vizza, P., Tradigo, G., Gnasso, A., Guzzi, P. H., Bellazzi, R., Irace, C., & Veltri, P. (2024). Forecasting glucose values for patients with type 1 diabetes using heart rate data. Computer Methods and Programs in Biomedicine , 257 , 108438. https://doi.org/10.1016/j.cmpb.2024.108438 Gómez-Castillo, N. Y., Cajilima-Cardenaz, P. E., Zhinin-Vera, L., Maldonado-Cuascota, B., Domínguez, D. L., Pineda-Molina, G., Hidalgo-Parra, A. A., & Gonzales-Zubiate, F. A. (2022). A machine learning approach for blood glucose level prediction using a LSTM network. In Communications in computer and information science (pp. 99–113). https://doi.org/10.1007/978-3-030-99170-8_8 Ilemobayo, J. A., Durodola, O., Alade, O., Awotunde, O. J., Olanrewaju, A. T., Falana, O., Ogungbire, A., Osinuga, A., Ogunbiyi, D., Ifeanyi, A., Odezuligbo, I. E., & Edu, O. E. (2024). Hyperparameter tuning in Machine Learning: A Comprehensive review. 
Journal of Engineering Research and Reports , 26 (6), 388–395. https://doi.org/10.9734/jerr/2024/v26i61188 International Diabetes Federation. (2025, February 14). Diabetes Facts and Figures | International Diabetes Federation . https://idf.org/about-diabetes/diabetes-facts-figures/ Jha, G. (2024, November 15). Feature Scaling in Machine Learning: Which Popular Algorithms Require It and Which Don’t? Medium . https://medium.com/@post.gourang/feature-scaling-in-machine-learning-which-popular-algorithms-require-it-and-which-dont-a71f5585d664 Kavakiotis, I., Tsave, O., Salifoglou, A., Maglaveras, N., Vlahavas, I., & Chouvarda, I. (2017). Machine Learning and Data Mining Methods in Diabetes Research. Computational and Structural Biotechnology Journal , 15 , 104–116. https://doi.org/10.1016/j.csbj.2016.12.005 Ke, G., Meng, Q., Finley, T., Wang, T., Chen, W., Ma, W., Ye, Q., & Liu, T. (2017). LightGBM: A Highly Efficient Gradient Boosting Decision Tree . https://papers.nips.cc/paper_files/paper/2017/hash/6449f44a102fde848669bdd9eb6b76fa-Abstract.html Lakkaraju, H., Kamar, E., Caruana, R., & Leskovec, J. (2017). Interpretable & Explorable Approximations of Black Box Models. arXiv (Cornell University) . https://doi.org/10.48550/arxiv.1707.01154 Lundberg, S. M., & Lee, S. (2017). A unified approach to interpreting model predictions. arXiv (Cornell University) . https://doi.org/10.48550/arxiv.1705.07874 Marling, C., & Bunescu, R. (2020, September 1). The OhioT1DM Dataset for Blood Glucose Level Prediction: Update 2020 . https://pmc.ncbi.nlm.nih.gov/articles/PMC7881904/ McShinsky, R., & Marshall, B. (2020). Comparison of Forecasting Algorithms for Type 1 Diabetic Glucose Prediction on 30 and 60-Minute Prediction Horizons. CEUR Workshop Proceedings , 12–18. http://ceur-ws.org/Vol-2675/paper2.pdf MinMaxScaler . (n.d.). Scikit-learn. https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html Nemat, H., Khadem, H., Eissa, M. 
R., Elliott, J., & Benaissa, M. (2022). Blood glucose level prediction: Advanced Deep-Ensemble Learning Approach. IEEE Journal of Biomedical and Health Informatics , 26 (6), 2758–2769. https://doi.org/10.1109/jbhi.2022.3144870 Nemat, H., Khadem, H., Elliott, J., & Benaissa, M. (2024). Data-driven blood glucose level prediction in type 1 diabetes: a comprehensive comparative analysis. Scientific Reports , 14 (1). https://doi.org/10.1038/s41598-024-70277-x Niu, Y., Gu, L., Zhao, Y., & Lu, F. (2021). Explainable diabetic retinopathy detection and retinal image generation. IEEE Journal of Biomedical and Health Informatics , 26 (1), 44–55. https://doi.org/10.1109/jbhi.2021.3110593 Normalizer . (n.d.). Scikit-learn. https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.Normalizer.html Petrie, J. R., Peters, A. L., Bergenstal, R. M., Holl, R. W., Fleming, G. A., & Heinemann, L. (2017). Improving the Clinical Value and Utility of CGM Systems: Issues and Recommendations. Diabetes Care , 40 (12), 1614–1621. https://doi.org/10.2337/dci17-0043 Pfob, A., Lu, S., & Sidey-Gibbons, C. (2022). Machine learning in medicine: a practical introduction to techniques for data pre-processing, hyperparameter tuning, and model comparison. BMC Medical Research Methodology , 22 (1). https://doi.org/10.1186/s12874-022-01758-8 Prokhorenkova, L., Gusev, G., Vorobev, A., Dorogush, A. V., & Gulin, A. (2018). CatBoost: unbiased boosting with categorical features. Neural Information Processing Systems , 31 , 6639–6649. https://papers.nips.cc/paper/7898-catboost-unbiased-boosting-with-categorical-features.pdf QuantileTransformer . (n.d.). Scikit-learn. https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.QuantileTransformer.html R, R. C., & P, S. C. (2024). Evaluating Deep Learning with different feature scaling techniques for EEG-based Music Entrainment Brain Computer Interface. e-Prime - Advances in Electrical Engineering Electronics and Energy , 7 , 100448. 
https://doi.org/10.1016/j.prime.2024.100448 Raju, V. N. G., Lakshmi, K. P., Jain, V. M., Kalidindi, A., & Padma, V. (2020). Study the Influence of Normalization/Transformation process on the Accuracy of Supervised Classification. 2020 Third International Conference on Smart Systems and Inventive Technology (ICSSIT) , 729–735. https://doi.org/10.1109/icssit48917.2020.9214160 Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). “Why should I trust you?”: explaining the predictions of any classifier. arXiv (Cornell University) . https://doi.org/10.48550/arxiv.1602.04938 RobustScaler . (n.d.). Scikit-learn. https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.RobustScaler.html StandardScaler . (n.d.). Scikit-learn. https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html Szczepanek, R. (2022). Daily Streamflow Forecasting in Mountainous Catchment Using XGBoost, LightGBM and CatBoost. Hydrology , 9 (12), 226. https://doi.org/10.3390/hydrology9120226 Understanding Type 1 Diabetes | ADA . (n.d.). https://diabetes.org/about-diabetes/type-1 Ustun, B., Tracà, S., & Rudin, C. (2013). Supersparse linear integer models for interpretable classification. arXiv (Cornell University) . https://doi.org/10.48550/arxiv.1306.6677 Yan, L., Zhang, H., Goncalves, J., Xiao, Y., Wang, M., Guo, Y., Sun, C., Tang, X., Jing, L., Zhang, M., Huang, X., Xiao, Y., Cao, H., Chen, Y., Ren, T., Wang, F., Xiao, Y., Huang, S., Tan, X., ... Yuan, Y. (2020). An interpretable mortality prediction model for COVID-19 patients. Nature Machine Intelligence , 2 (5), 283–288. https://doi.org/10.1038/s42256-020-0180-7 Zhu, T., Kuang, L., Daniels, J., Herrero, P., Li, K., & Georgiou, P. (2022). IOMT-Enabled Real-Time Blood Glucose Prediction with deep learning and edge computing. IEEE Internet of Things Journal , 10 (5), 3706–3719. https://doi.org/10.1109/jiot.2022.3143375 Additional Declarations No competing interests reported. 
Dataset","fulltext":[{"header":"1. Introduction","content":"\u003cp\u003eType 1 diabetes mellitus (T1DM) is a serious disease that is mostly diagnosed in young people, but it can be found in any age group. T1DM is mainly caused when the pancreas that creates insulin is destroyed because it treats the beta cells in the pancreas as an intruder in the immune system. The more the beta cells are being killed, the more difficult it becomes for the pancreas to make the necessary insulin the body needs to live. There is a high chance of hyperglycemia when not enough blood glucose enters the body\u0026rsquo;s cells to build energy. Not enough insulin is produced in the pancreas due to diabetes (\u003cem\u003eUnderstanding Type 1 Diabetes | ADA\u003c/em\u003e, n.d.). It is estimated that by the year 2030, the total number of people that will be living with diabetes is 643\u0026nbsp;million, and 783\u0026nbsp;million by 2045, which is approximately a 22% increase in 15 years (International Diabetes Federation, \u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e2025\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eIn recent years, there has been an adoption of continuous glucose monitoring (CGM) devices to actively monitor the glucose levels of T1DM patients in the interstitial fluids to control for both the hyperglycemia and hypoglycemia stages, and the international diabetes community has welcomed the CGM devices (Petrie et al., \u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e2017\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eAccurately tracking, monitoring, and maintaining a normal blood glucose level (BGL) for T1DM patients is essential for reducing the death rates from the disease (Kavakiotis et al., \u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e2017\u003c/span\u003e). 
The CGM system is also known to be vital for collecting some glucose-related records, like insulin intake and other physiological records.\u003c/p\u003e\u003cp\u003eOver the years, machine learning (ML) and deep learning (DL) approaches have been adopted in predicting BGL to actively monitor by taking a proactive approach in alerting the T1DM patients of any danger that may occur through their daily activities to improve their health and well-being using a real-time prediction method. The research on which models are being suited to accurately predict T1DM diabetes has been hindered by a lack of real patient data, but as of 2020, a group of researchers released the OhioT1DM dataset (Marling, C., \u0026amp; Bunescu, R., \u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e2020\u003c/span\u003e ) for research purposes, which includes continuous glucose monitoring, insulin, physiological sensor, and self-reported life-event data for people with type 1 diabetes that contains eight weeks\u0026rsquo; of data from 12 individual patients (2018 and 2020 cohorts), with the gender distribution being seven males and five females (Marling \u0026amp; Bunescu, \u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e2020\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eThe random forest model is used on a multivariate dataset to predict subcutaneous glucose concentration in type 1 diabetes patients (Georga et al., \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e2012\u003c/span\u003e). 
(McShinsky \u0026amp; Marshall, \u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e2020\u003c/span\u003e) do a comparison of statistical time series, and machine learning models, such as Auto-Regressive Integrated Moving-Average, Vector Auto-Regression, Kalman Filter, Unscented Kalman Filter, Ordinary Least Squares, Support Vector Machines, Random Forests, Gradient Boosted Trees, XGBoosted Trees, Adaptive Neuro-Fuzzy Inference System (ANFIS), and Multi-Layer Perceptron in terms of Root Mean Squared Error, Mean Absolute Error, Coefficient of Determination, Matthews Correlation Coefficient, and Clarke Error Grid to compare their effectiveness in predicting future blood glucose levels on the Ohio T1DM dataset on a 30-minute and 60-minute prediction horizon by selecting only six patients from the twelve unique number of patients. XGBoost, LightGBM, and CatBoost models are compared in daily streamflow forecasting in mountainous catchments, with XGBoost being the highest-performing model in terms of RMSE, Linear regression is used as a baseline model (Szczepanek, \u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e2022\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eThe eXtreme gradient boosting models, which contain XGBoost (Chen \u0026amp; Guestrin, \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e2016\u003c/span\u003e), LightGBM (Ke et al., \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e2017\u003c/span\u003e), and CatBoost (Prokhorenkova et al., \u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e2018\u003c/span\u003e). (Ding et al., \u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e2020\u003c/span\u003e) show that the CatBoost model outperforms other traditional models on the Walmart sales forecast dataset. XGBoost, LightGBM, and CatBoost are machine-learning libraries that use the Gradient Boosting technique. 
XGBoost was created in 2014 and quickly gained popularity after outperforming big datasets and winning numerous data science competitions. Since then, it has evolved by integrating new features, such as GPU learning and distributed learning via version updates. LightGBM, developed by Microsoft in 2017, is faster and uses less memory than XGBoost. It is designed to ensure high speed in huge data while maintaining high accuracy even in small data samples. CatBoost, developed by Yahoo in 2017, excels at handling categorical data and is an optimized algorithm that automatically performs regularization to prevent overfitting and enables rapid learning on both CPU and GPU (Ahn et al., \u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e2023\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eThe dataset has been used to publish a lot of state-of-the-art models in forecasting T1DM using either ML or DL models in the field of artificial intelligence (AI), even though accurate prediction remains a challenge for AI experts. It has been established by many researchers that DL models continue to have the best model performance in forecasting the OhioT1DM dataset (Cui et al., \u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e2021\u003c/span\u003e), (G\u0026oacute;mez-Castillo et al., \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e2022\u003c/span\u003e), (Giancotti et al., \u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e2024\u003c/span\u003e), while they are not comparing it with machine learning models and also stating the time and resources it takes to achieve the results. In this paper, we do a comparison of ML and DL models based on model performance and resources. 
We also extend our analysis to include prominent ML models such as LightGBM, XGBoost, and CatBoost, which have shown strong predictive power in many forecasting tasks, and we evaluate the differences between univariate and multivariate modeling using Root Mean Squared Error (RMSE) and R-squared when predicting the BGL at 30-minute and 60-minute horizons.\u003c/p\u003e"},{"header":"2. Methods","content":"\u003cp\u003eThe BGL prediction challenge entails estimating the future BGL on the OhioT1DM test set at 30-minute and 60-minute prediction horizons, using a sequence of BGL values obtained at 5-minute intervals. The models are used in both univariate and multivariate studies; univariate analysis uses only past glucose values to predict the future glucose value, whilst multivariate analysis includes additional features. The machine learning models accept only two-dimensional input, whereas the deep learning models accept three-dimensional input:\u003c/p\u003e\u003cp\u003eML (Univariate): (N, L)\u003c/p\u003e\u003cp\u003eML (Multivariate): (N, L * F)\u003c/p\u003e\u003cp\u003eDL (Univariate): (N, L, 1)\u003c/p\u003e\u003cp\u003eDL (Multivariate): (N, L, F)\u003c/p\u003e\u003cp\u003eWhere:\u003c/p\u003e\u003cp\u003eN is the number of samples.\u003c/p\u003e\u003cp\u003eL is the length of the lag window (also the prediction horizon value).\u003c/p\u003e\u003cp\u003eF is the number of features.\u003c/p\u003e\u003cp\u003eUnivariate setting: CGM (Glucose)\u003c/p\u003e\u003cp\u003eMultivariate setting: CGM, Fingerstick, Total insulin dose, Insulin dose 3-hour variability\u003c/p\u003e\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e\u003ch2\u003e3.1 Data\u003c/h2\u003e\u003cp\u003eOhioT1DM contains eight weeks of blood glucose levels as well as other physiological parameters for twelve patients with type 1 diabetes, with data for six patients released in 2018 and for the remaining six in 2020. 
The dataset includes seven males and five females, with the 2020 cohort using the Empatica sensor band and the 2018 cohort wearing the Basis sensor band (Marling \u0026amp; Bunescu, \u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e2020\u003c/span\u003e). After the XML files are loaded, aligned to uniform 5-minute timestamps, and combined into a single data frame, the training data has 134,789 rows and the test data has 31,743 rows.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec4\" class=\"Section2\"\u003e\u003ch2\u003e3.2 Data Preprocessing\u003c/h2\u003e\u003cp\u003eThe XML files for the 2018 and 2020 cohorts, both train and test sets, are loaded with uniform 5-minute timestamps and combined into a single data frame. We created a tag feature to denote whether an observation belongs to the train or test data and a year feature to denote the cohort year; the files were then joined to form complete train and test datasets covering all patients. The glucose values were shifted back by 15 minutes to account for delays that may occur in the CGM system, and all timestamps with missing glucose values were dropped to improve data quality, which is essential for a model to forecast the glucose level accurately.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec5\" class=\"Section2\"\u003e\u003ch2\u003e3.3 Data Standardization\u003c/h2\u003e\u003cp\u003eData standardization, also known as feature scaling, is an important step in data processing, especially for numerical features, because it brings the features into a similar range, which speeds up training and can improve model performance. 
Several studies have compared different scaling methods (Ahsan et al., \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2021\u003c/span\u003e), (R \u0026amp; P, \u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e2024\u003c/span\u003e), (Raju et al., \u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e2020\u003c/span\u003e), and (De Amorim et al., \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e2022\u003c/span\u003e), such as the Standard scaler (\u003cem\u003eStandardScaler\u003c/em\u003e, n.d.), Min-max scaler (\u003cem\u003eMinMaxScaler\u003c/em\u003e, n.d.), Quantile transformer (\u003cem\u003eQuantileTransformer\u003c/em\u003e, n.d.), Robust scaler (\u003cem\u003eRobustScaler\u003c/em\u003e, n.d.), Normalizer (\u003cem\u003eNormalizer\u003c/em\u003e, n.d.), etc. The Standard Scaler method is used to scale the numerical training data for the deep learning models; it is not applied to the eXtreme gradient boosting models, for which feature scaling is generally not recommended (Jha, \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e2024\u003c/span\u003e).\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec6\" class=\"Section2\"\u003e\u003ch2\u003e3.4 Evaluation\u003c/h2\u003e\u003cp\u003eThree evaluation metrics are used to assess the performance of the ML and DL models in predicting BGL on the OhioT1DM dataset: root mean squared error (RMSE), mean absolute error (MAE), and R-squared.\u003c/p\u003e\u003cp\u003eThe RMSE (Eq-1) measures the prediction error between the actual and predicted values of the BGL; it is calculated by taking the square root of the mean squared error (MSE), which penalizes large errors. 
The closer the RMSE is to 0, the better the model performance on the test data.\u003c/p\u003e\u003cp\u003eRMSE = \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\sqrt{\\frac{1}{n}{\\sum\\:}_{i=1}^{n}{\\left({Actual\\:BGL}_{i}\\:-\\:{Predicted\\:BGL}_{i}\\right)}^{2}}\\)\u003c/span\u003e\u003c/span\u003e (Eq-1)\u003c/p\u003e\u003cp\u003eThe MAE (Eq-2) is the average absolute difference between the actual and predicted values of the BGL, in which all errors are weighted equally. The lower the MAE, the better the model performance on the test data.\u003c/p\u003e\u003cp\u003eMAE = \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\frac{1}{n}{\\sum\\:}_{i=1}^{n}\\left|{Actual\\:BGL}_{i}\\:-\\:{Predicted\\:BGL}_{i}\\right|\\)\u003c/span\u003e\u003c/span\u003e (Eq-2)\u003c/p\u003e\u003cp\u003eThe R-squared (Eq-3), also known as the coefficient of determination, assesses the proportion of variance of the dependent variable that is explained by the independent variables. 
The closer the value is to 1, the more of the variance in the data is explained.\u003c/p\u003e\u003cp\u003eR\u003csup\u003e2\u003c/sup\u003e = \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:1\\:-\\:\\frac{{\\sum\\:}_{i=1}^{n}{\\left({Actual\\:BGL}_{i}\\:-\\:{Predicted\\:BGL}_{i}\\right)}^{2}}{{\\sum\\:}_{i=1}^{n}{\\left({Actual\\:BGL}_{i}\\:-\\:\\overline{Actual\\:BGL}\\right)}^{2}}\\)\u003c/span\u003e\u003c/span\u003e (Eq-3)\u003c/p\u003e\u003cp\u003eHere \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\overline{Actual\\:BGL}\\)\u003c/span\u003e\u003c/span\u003e is the mean of the actual BGL values.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec7\" class=\"Section2\"\u003e\u003ch2\u003e3.5 Model Interpretability\u003c/h2\u003e\u003cp\u003eTo strengthen trust in our ML models, we can adopt SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations), which help to balance the trade-off between accuracy and interpretability, especially for the ML and DL models considered in our study (Lundberg \u0026amp; Lee, \u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e2017\u003c/span\u003e). 
LIME explains individual predictions by approximating the model locally, motivated by the question of why a prediction should be trusted, since many high-performing ML and DL models are black boxes (Ribeiro et al., \u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e2016\u003c/span\u003e). SHAP was later developed to explain model predictions using additive feature importance; it can interpret the predictions of a model both globally and individually, which helps clinicians understand which parameters most influence the predicted glucose levels and also helps in debugging the models when needed (Lundberg \u0026amp; Lee, \u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e2017\u003c/span\u003e). Many interpretable ML techniques have been developed in the healthcare domain to help physicians understand ML predictions, with use cases in cardiovascular diseases (Kennedy et al., 2021), eye diseases (Niu et al., \u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e2021\u003c/span\u003e), cancer (Ustun et al., \u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e2013\u003c/span\u003e), influenza and infectious diseases (Hu et al., 2020), Covid-19 prediction (Yan et al., \u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e2020\u003c/span\u003e), depression diagnosis (Lakkaraju et al., \u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e2017\u003c/span\u003e), and autism (Garbulowski et al., \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e2021\u003c/span\u003e), which shows that the healthcare industry is already adopting interpretable ML for better decision making. 
Prendin et al. (2023) highlight the need for interpretable ML in blood glucose prediction for diabetes; their study shows that SHAP can help to debug the features used in developing a BGL model and to decide which model to deploy for decision making where patient safety is a priority.\u003c/p\u003e\u003c/div\u003e"},{"header":"3. Results","content":"\u003cp\u003eThe results of the ML and DL models are presented for 30-minute and 60-minute prediction horizons under univariate and multivariate settings, as shown in the tables below.\u003c/p\u003e\u003cp\u003eTable\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e shows that LSTM achieved the lowest RMSE (13.65) and the highest R\u0026sup2; (0.9416), making it the most accurate model for BGL prediction in this setting. Bi-LSTM performed slightly worse, with an RMSE of 14.48 and an R\u0026sup2; of 0.9365, which may indicate that it does not generalize as well as LSTM here. Among the gradient boosting models, LightGBM performed the best, with an RMSE of 14.73 and R\u0026sup2; of 0.9405, followed closely by CatBoost (RMSE\u0026thinsp;=\u0026thinsp;14.77, R\u0026sup2; = 0.9402) and XGBoost (RMSE\u0026thinsp;=\u0026thinsp;14.98, R\u0026sup2; = 0.9385). Linear regression had the highest RMSE (15.44) and lowest R\u0026sup2; (0.9345), indicating that it struggled to capture the complexity of glucose level variations.\u003c/p\u003e\u003cp\u003eIn terms of computational efficiency, linear regression was the fastest, completing in just 0.15 seconds, but it was the least accurate by RMSE and R\u0026sup2;. Gradient boosting models (LightGBM\u0026thinsp;=\u0026thinsp;15.95s, XGBoost\u0026thinsp;=\u0026thinsp;31.42s, CatBoost\u0026thinsp;=\u0026thinsp;99.81s) provided a good balance between accuracy and efficiency. 
Meanwhile, deep learning models were the most computationally expensive, with LSTM taking 1229.26 seconds and Bi-LSTM requiring 3745.86 seconds.\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eUnivariate 30-minute prediction horizon\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"7\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eMetrics\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eLinear Regression\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003eLightGBM\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003eXgboost\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003eCatboost\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c6\"\u003e\u003cp\u003eLSTM\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" 
colname=\"c7\"\u003e\u003cp\u003eBi-LSTM\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eRMSE\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e15.44\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e14.73\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e14.98\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e14.77\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e13.65\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e\u003cp\u003e14.48\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eMAE\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e9.46\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e8.84\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e8.92\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e8.83\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e8.81\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e\u003cp\u003e9.54\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eR-Squared\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e0.9345\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e0.9405\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" 
char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.9385\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.9402\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e0.9416\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e\u003cp\u003e0.9365\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eTraining time (sec)\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e0.15\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e15.95\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e31.42\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e99.81\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e1229.26\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e\u003cp\u003e3745.86\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eTable\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e shows that Bi-LSTM achieved the lowest RMSE (21.73) and the highest R\u0026sup2; (0.8511), making it the most accurate model for long-term glucose prediction. Among the gradient boosting models, CatBoost performed the best (RMSE\u0026thinsp;=\u0026thinsp;23.37, R\u0026sup2; = 0.8502), followed closely by LightGBM (RMSE\u0026thinsp;=\u0026thinsp;23.40, R\u0026sup2; = 0.8498) and XGBoost (RMSE\u0026thinsp;=\u0026thinsp;23.59, R\u0026sup2; = 0.8474). 
Among the ML models, linear regression had the weakest performance, with the highest RMSE (24.41) and lowest R\u0026sup2; (0.8366), indicating its inability to capture the complex glucose trends over a longer horizon.\u003c/p\u003e\u003cp\u003eLSTM performed worst overall, with an RMSE of 45.46 and a negative R\u0026sup2; (-0.477), meaning that its predictions were worse than a constant prediction of the mean BGL. This suggests that LSTM struggled at the 60-minute horizon, potentially due to overfitting or an inability to capture delayed glucose fluctuations effectively.\u003c/p\u003e\u003cp\u003eRegarding computational efficiency, linear regression remained the fastest model (0.67s), but at the cost of accuracy. Gradient boosting models had a moderate training cost (LightGBM\u0026thinsp;=\u0026thinsp;44.80s, XGBoost\u0026thinsp;=\u0026thinsp;95.18s, CatBoost\u0026thinsp;=\u0026thinsp;232.21s), which makes frequent retraining practical. However, deep learning models required significantly longer training times, with LSTM taking 1115.23 seconds and Bi-LSTM requiring 3847.55 seconds, making them computationally expensive.\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eUnivariate 60-minute prediction horizon\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"7\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" 
class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eMetrics\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eLinear Regression\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003eLightGBM\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003eXgboost\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003eCatboost\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c6\"\u003e\u003cp\u003eLSTM\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c7\"\u003e\u003cp\u003eBi-LSTM\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eRMSE\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e24.41\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e23.40\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e23.59\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e23.37\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e45.46\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e\u003cp\u003e21.73\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eMAE\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e16.01\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" 
colname=\"c3\"\u003e\u003cp\u003e15.05\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e15.10\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e15.01\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e38.55\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e\u003cp\u003e14.99\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eR-Squared\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e0.8366\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e0.8498\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.8474\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.8502\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e-0.477\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e\u003cp\u003e0.8511\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eTraining time (sec)\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e0.67\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e44.80\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e95.18\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e232.21\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e1115.23\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" 
colname=\"c7\"\u003e\u003cp\u003e3847.55\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eTable\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e shows that Bi-LSTM achieved the lowest RMSE (14.64), with an R\u0026sup2; of 0.9354, while CatBoost achieved the highest R\u0026sup2; (0.9385) and the lowest RMSE among the gradient boosting models (14.97). LightGBM followed (RMSE\u0026thinsp;=\u0026thinsp;15.21, R\u0026sup2; = 0.9365), and XGBoost was the weakest of the three (RMSE\u0026thinsp;=\u0026thinsp;15.74, R\u0026sup2; = 0.9321). Linear regression (RMSE\u0026thinsp;=\u0026thinsp;15.41, R\u0026sup2; = 0.9349) outperformed XGBoost on both metrics but lagged behind CatBoost and LightGBM.\u003c/p\u003e\u003cp\u003eLSTM performed significantly worse than the other models, with an RMSE of 41.07 and a negative R\u0026sup2; (-0.4251), meaning that its predictions were worse than a constant prediction of the mean BGL. This may be due to overfitting or to challenges in capturing rapid glucose fluctuations.\u003c/p\u003e\u003cp\u003eRegarding computational efficiency, linear regression remained the fastest model (0.77s). Among the gradient boosting models, LightGBM was the most efficient (30.43s), followed by XGBoost (85.90s) and CatBoost (161.77s). 
However, deep learning models required significantly longer runtimes, with LSTM taking 836.51 seconds and Bi-LSTM requiring 3673.86 seconds, making them computationally expensive\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab3\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 3\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eMultivariate 30-minute prediction horizon\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"7\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eMetrics\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eLinear Regression\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003eLightGBM\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003eXgboost\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003eCatboost\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c6\"\u003e\u003cp\u003eLSTM\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" 
colname=\"c7\"\u003e\u003cp\u003eBi-LSTM\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eRMSE\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e15.41\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e15.21\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e15.74\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e14.97\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e41.07\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e\u003cp\u003e14.64\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eMAE\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e9.43\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e9.12\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e9.33\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e8.94\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e35.05\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e\u003cp\u003e9.48\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eR-Squared\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e0.9349\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e0.9365\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" 
char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.9321\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.9385\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e-0.4251\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e\u003cp\u003e0.9354\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eTraining time (sec)\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e0.77\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e30.43\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e85.90\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e161.77\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e836.51\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e\u003cp\u003e3673.86\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eTable\u0026nbsp;\u003cspan refid=\"Tab4\" class=\"InternalRef\"\u003e4\u003c/span\u003e shows that CatBoost achieved the highest R\u0026sup2; (0.8483) and the lowest RMSE among the ML models (23.52), although Bi-LSTM achieved the lowest RMSE overall (22.54). LightGBM (RMSE\u0026thinsp;=\u0026thinsp;23.65, R\u0026sup2; = 0.8467) followed closely, while XGBoost had a higher RMSE (24.81) with an R\u0026sup2; of 0.8474. 
Linear regression, while computationally efficient (3.80s), had an RMSE of 24.36 and the lowest R\u0026sup2; among the ML models (0.8372), placing it behind CatBoost and LightGBM.\u003c/p\u003e\u003cp\u003eAmong deep learning models, Bi-LSTM performed better than LSTM, with an RMSE of 22.54 and an R\u0026sup2; of 0.8439, suggesting that bidirectional processing helped in this setting. LSTM had the worst performance, with an RMSE of 47.48 and a negative R\u0026sup2; (-0.501), indicating that it failed to capture meaningful glucose trends for the 60-minute horizon, potentially due to difficulties in learning long-term dependencies. Regarding computational efficiency, linear regression remained the fastest model. Among the other ML models, LightGBM was the most efficient (99.90s), followed by XGBoost (297.06s) and CatBoost (497.42s). Deep learning models were significantly more computationally expensive, with LSTM taking 761.25 seconds and Bi-LSTM requiring 3861.32 seconds.\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab4\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 4\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eMultivariate 60-minute prediction horizon\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"7\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" 
colnum=\"5\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eMetrics\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eLinear Regression\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003eLightGBM\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003eXgboost\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003eCatboost\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c6\"\u003e\u003cp\u003eLSTM\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c7\"\u003e\u003cp\u003eBi-LSTM\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eRMSE\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e24.36\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e23.65\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e24.81\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e23.52\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e47.48\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e\u003cp\u003e22.54\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eMAE\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e15.97\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" 
char=\".\" colname=\"c3\"\u003e\u003cp\u003e15.22\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e15.82\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e15.10\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e40.09\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e\u003cp\u003e15.74\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eR-Squared\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e0.8372\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e0.8467\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.8474\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0.8483\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e-0.501\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e\u003cp\u003e0.8439\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eTraining time (sec)\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e3.80\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e99.90\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e297.06\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e497.42\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e761.25\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" 
colname=\"c7\"\u003e\u003cp\u003e3861.32\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cdiv id=\"Sec9\" class=\"Section2\"\u003e\u003ch2\u003e4.5 Explainability Insights with SHAP\u003c/h2\u003e\u003cp\u003eTo examine the behaviour of the best-performing model at the 30-minute and 60-minute horizons under the univariate and multivariate configurations, we use SHAP summary plots, which explain each model globally and also support model debugging. LSTM is the best-performing model in the 30-minute univariate setting, while Bi-LSTM is the best-performing model in the 60-minute univariate setting and in both the 30-minute and 60-minute multivariate settings. In a SHAP summary plot, each row represents a feature, and each dot in a row represents a specific instance of the training set. The colour of a dot shows whether that instance has a low (cyan) or high (magenta) value of the feature relative to the feature mean. On the y-axis, the features are ranked from highest to lowest importance, while the position of a dot on the x-axis gives its SHAP value, that is, whether the feature value pushes the model prediction higher or lower. For example, magenta dots on the right-hand side of the x-axis mean that high values of the feature increase the model prediction.\u003c/p\u003e\u003cp\u003eFigure \u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e9\u003c/span\u003e shows the summary plot for the LSTM model in the univariate setting with a 30-minute forecast horizon. 
For all six outputs, t-0 (the most recent CGM value before prediction) is the most relevant feature, followed by t-1 and t-2, with t-3 having the lowest overall importance among the six features. The most recent CGM value has a strong impact on the predicted CGM values: the higher the most recent value, the higher the prediction, and the lower the most recent value, the lower the prediction. The impact of t-1, t-2, t-3, t-4, and t-5 on the SHAP values is comparatively negligible, with the exception of t-1 in the first output, which appears to have some minor importance, as higher values are likewise associated with higher predictions and vice versa.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eFigure \u003cspan refid=\"Fig10\" class=\"InternalRef\"\u003e10\u003c/span\u003e shows the summary plot for the Bi-LSTM model in the univariate setting with a 60-minute forecast horizon. For all twelve outputs, t-0 (the most recent CGM value before prediction) is the most relevant of the input features t-0 to t-11. Across all outputs, the higher the most recent CGM value, the higher the prediction, and the lower the most recent CGM value, the lower the prediction.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eFigure \u003cspan refid=\"Fig11\" class=\"InternalRef\"\u003e11\u003c/span\u003e shows the summary plot for the Bi-LSTM model in the multivariate setting with a 30-minute forecast horizon. The most important feature is fingerstick, followed by glucose (CGM) level, three-hour insulin dosage variability, and total insulin dose. Fingerstick has a large impact, glucose has a moderate impact, and three-hour insulin dosage variability and total insulin dose have a low impact. High fingerstick readings increase the predicted CGM value, and low fingerstick readings decrease it. 
Glucose values have varied effects, higher three-hour insulin dosage variability increases the predicted CGM value and vice versa, and total insulin dose has a small but non-zero impact.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eFigure \u003cspan refid=\"Fig12\" class=\"InternalRef\"\u003e12\u003c/span\u003e shows the summary plot for the Bi-LSTM model in the multivariate setting with a 60-minute forecast horizon. The most important feature is three-hour insulin dosage variability, followed by glucose (CGM) level, total insulin dose, and fingerstick. Three-hour insulin dosage variability has a large impact, glucose and total insulin dose have a moderate impact, and fingerstick has a low impact. High three-hour insulin dosage variability increases the predicted CGM value, and low variability decreases it. Glucose values have varied effects, a higher total insulin dose increases the predicted CGM value and vice versa, and fingerstick has a small but non-zero impact.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec10\" class=\"Section2\"\u003e\u003ch2\u003e4.6 State-of-the-art comparison\u003c/h2\u003e\u003cp\u003eTo contextualize the performance of the evaluated models, we compare our results with previously published state-of-the-art studies that have also used the OhioT1DM dataset or similar CGM-based datasets. 
Table\u0026nbsp;\u003cspan refid=\"Tab5\" class=\"InternalRef\"\u003e5\u003c/span\u003e presents a comparative overview, focusing on RMSE values for 30- and 60-minute prediction horizons using CGM as the primary input feature.\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab5\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 5\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eState of the Art Comparison\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"5\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eStudy (Year)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eMethods\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003eInput Features\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003eRMSE (30 min)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003eRMSE (60 min)\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e(Ghimire et al., \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2024\u003c/span\u003e)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eLSTM\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c3\"\u003e\u003cp\u003eCGM\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e18.26\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e31.12\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e(Zhu et al., \u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e2022\u003c/span\u003e)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eE33N\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eCGM\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e18.92\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e32.54\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e(Nemat et al., \u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e2022\u003c/span\u003e)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eEnsemble\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eCGM\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e19.63\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e33.45\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e(Dudukcu et al., \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e2021\u003c/span\u003e)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eGRU\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eCGM\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e21.90\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" 
colname=\"c5\"\u003e\u003cp\u003e35.10\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eOur Method\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCatBoost\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eCGM\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e14.77\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e23.37\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eOur Method\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eLightGBM\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eCGM\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e14.73\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e23.40\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eOur Method\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eXGBoost\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eCGM\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e14.98\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e23.59\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003eTable\u0026nbsp;\u003cspan refid=\"Tab5\" class=\"InternalRef\"\u003e5\u003c/span\u003e compares the proposed models (CatBoost, LightGBM, and XGBoost) with several published state-of-the-art approaches for short-term prediction of 
blood glucose levels from continuous glucose monitoring (CGM) data. Performance is evaluated using the root mean squared error (RMSE) at two prediction horizons: 30 minutes and 60 minutes. Among the baseline models, Ghimire et al. (\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2024\u003c/span\u003e) achieved the lowest RMSEs with an LSTM-based model, reporting 18.26 and 31.12 for the 30-minute and 60-minute horizons, respectively. Zhu et al. (\u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e2022\u003c/span\u003e) and Nemat et al. (\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e2022\u003c/span\u003e) reported higher errors with E33N and ensemble models, respectively. The GRU-based solution by Dudukcu et al. (\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e2021\u003c/span\u003e) yielded the highest RMSEs, indicating lower predictive accuracy. In contrast, our gradient boosting models show considerably lower errors. Specifically, LightGBM obtained the best RMSE at the 30-minute horizon (14.73), whereas CatBoost performed best at the 60-minute horizon (23.37). XGBoost was also competitive, with RMSEs of 14.98 and 23.59, respectively. These results highlight the effectiveness of tree-based ensemble methods for modeling CGM data relative to deep learning models such as LSTM and GRU, particularly in settings with tabular, time-synchronized sensor inputs. The consistent improvement at both prediction horizons further supports the suitability of the proposed models for real-time or near-real-time glucose monitoring systems.\u003c/p\u003e\u003cp\u003eThis comparison shows that our extreme gradient boosting models achieve lower RMSE than the previous deep learning approaches at both prediction horizons. 
These results emphasize that carefully tuned ML models, especially gradient boosting techniques, can achieve state-of-the-art results while requiring significantly fewer computational resources than deep learning methods. Additionally, while prior studies predominantly focused on deep neural networks, our study broadens the landscape by highlighting the efficacy of interpretable and scalable machine learning models. This adds a new dimension to the existing body of research and paves the way for real-world deployment scenarios where both accuracy and efficiency are essential.\u003c/p\u003e\u003c/div\u003e"},{"header":"4. Discussion","content":"\u003cp\u003eRecent research has shown that machine learning (ML) and deep learning (DL) models outperform traditional methods for time series forecasting. Notably, ML and DL approaches have surpassed traditional statistical methods in various benchmark competitions, including those held on Kaggle (Bojer \u0026amp; Meldgaard, \u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e2020\u003c/span\u003e). When applied to the T1DM dataset, recurrent self-attention networks with transfer learning produced cutting-edge results. Similarly, Nemat et al. (\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e2024\u003c/span\u003e) compared classical time series (CTL), traditional machine learning (TML), and deep neural networks (DNN) in both univariate and multivariate prediction settings, concluding that TML models provided not only the best predictive performance but also the shortest training times. Building on previous work, we evaluate both ML and DL models on the OhioT1DM dataset, focusing on 30- and 60-minute prediction horizons. We find that Bi-LSTM and Vanilla-LSTM designs perform well in the DL category, with Bi-LSTM outperforming Vanilla-LSTM, which is consistent with the findings of Butt et al. (\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e2023\u003c/span\u003e). 
Furthermore, RMSE, correlation, and percentage error remain essential evaluation measures, similar to the \u003cspan type=\"Underline\" class=\"Underline\" name=\"Emphasis\"\u003eJanuary.ai\u003c/span\u003e methodology (Zahedani et al., \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e2023\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eOur study compares the performance of various conventional models in BGL prediction with that of extreme gradient boosting models, which provide superior performance and enhanced computing efficiency relative to deep learning techniques used in prior research. Ahn et al. (\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e2023\u003c/span\u003e) used XGBoost, LightGBM, CatBoost, and CNN-LSTM to forecast harmful algal blooms, tuning the models with Bayesian hyperparameter optimization. Gradient boosting models have shown strong performance on a variety of tabular datasets because they combine an ensemble of multiple decision trees to improve predictive performance, which also helps to prevent overfitting.\u003c/p\u003e\u003cp\u003eOur results show that DL models achieved the lowest RMSE at both prediction horizons. Specifically, in the univariate setting, the LSTM model attained the lowest RMSE of 13.65 for the 30-minute prediction horizon, while the Bi-LSTM model achieved an RMSE of 21.73 for the 60-minute horizon. Similarly, in the multivariate setting, the DL models demonstrated superior performance with RMSE values of 14.64 and 22.54 for the 30-minute and 60-minute prediction horizons, respectively. When comparing the best DL model in the univariate 30-minute prediction horizon to the best-performing ML model, the relative RMSE difference was only 7.91%. However, the ML model was approximately 77 times faster than its DL counterpart. 
In the univariate 60-minute setting, the RMSE difference between the best DL and ML models was 7.55%, with the ML model being 16.57 times faster. For the multivariate 30-minute prediction horizon, the best DL model outperformed the best ML model by a modest margin of 2.25% in RMSE, yet the ML model demonstrated a speed advantage, being 22.71 times faster. In the 60-minute multivariate setting, the RMSE difference was 4.35%, with the ML model operating 7.76 times faster than the DL model.\u003c/p\u003e\u003cp\u003eThis shows that, while DL models may exceed ML models in predictive accuracy (by less than 8% in RMSE in our study), ML models are more practical due to their computational efficiency, especially in real-time or resource-constrained healthcare situations. Furthermore, our ML models were assessed using the default hyperparameters. According to the literature, effective hyperparameter optimization can increase the performance of ML models (Pfob et al., \u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e2022\u003c/span\u003e). However, this results in higher processing overhead and decreased interpretability (Ilemobayo et al., \u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e2024\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eIn addition, interpretability remains a major challenge in medical AI applications. Unlike DL models, which function as black boxes, ML models can convey information about feature relevance, particularly in multivariate scenarios. We used SHAP values (Lundberg \u0026amp; Lee, \u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e2017\u003c/span\u003e) to improve the interpretability of our findings, allowing clinicians to better grasp the underlying causes of blood glucose predictions. 
This could help clinicians make better-informed decisions about diabetes care.\u003c/p\u003e\u003cp\u003eThese findings underscore significant trade-offs among model accuracy, computational efficiency, and interpretability, all of which should be weighed when selecting predictive models for blood glucose forecasting in clinical environments.\u003c/p\u003e"},{"header":"5. Conclusion","content":"\u003cp\u003eThis study presents a comprehensive comparative analysis of traditional machine learning, advanced gradient boosting, and deep learning approaches for blood glucose level prediction using the OhioT1DM dataset. By evaluating models across both univariate and multivariate settings, and over short-term (30-minute) and long-term (60-minute) prediction horizons, we identified the most effective modeling strategies for glucose forecasting in Type 1 diabetes. Our findings show that while Long Short-Term Memory (LSTM) models perform well in short-term univariate predictions, their performance deteriorates significantly over longer horizons, likely due to limitations in capturing long-term temporal dependencies and to overfitting risks. In contrast, Bidirectional LSTM (Bi-LSTM) models showed superior performance in multivariate settings, particularly for the 30-minute horizon, albeit with much higher computational costs.\u003c/p\u003e\u003cp\u003eAmong machine learning models, gradient boosting techniques, especially LightGBM and CatBoost, consistently delivered competitive or superior predictive performance compared to deep learning models, with significantly reduced training time and computational requirements. LightGBM was the top-performing ML model in the 30-minute univariate setting, while CatBoost emerged as a robust contender across all multivariate tasks. Moreover, our study stands out in its integration of explainable AI perspectives. 
We applied SHAP to the best-performing model in each setting to highlight which features drive the model predictions and to help debug the models, providing insights relevant to clinical decision-making. This transparency is crucial for real-world adoption of AI in healthcare.\u003c/p\u003e\u003cp\u003eIn conclusion, this research contributes to the growing evidence that well-tuned gradient boosting models can offer state-of-the-art performance for medical time-series forecasting, often rivaling or surpassing deep learning models in both accuracy and efficiency. We also used SHAP for model interpretation and debugging to support clinical decision-making. Future work should focus on real-time deployment, model interpretability through XAI methods, and the incorporation of additional physiological and behavioral features to further personalize and refine glucose prediction systems.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eData availability and access: \u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCRediT authorship contribution statement:\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eTaofiq Olanrewaju Musa: Writing \u0026ndash; original draft, Methodology, Conceptualization.\u003c/p\u003e\n\u003cp\u003eArsene Adjevi: Writing \u0026ndash; review \u0026amp; editing, Validation, Software.\u003c/p\u003e\n\u003cp\u003eDonaldo Omondi Jaccowang: Visualization, Data analysis.\u003c/p\u003e\n\u003cp\u003eRaheem Nasirudeen Adeleye: Data preprocessing, Modeling.\u003c/p\u003e\n\u003cp\u003eDiyaolu Abdulmalik Opeyemi: Statistical analysis, Comparative evaluation.\u003c/p\u003e\n\u003cp\u003eS\u0026uuml;leyman Uzun: Supervision, 
Conceptualization, Final review, Methodological oversight.\u003c/p\u003e\n\u003cp\u003eMustafa Zahid Yıldız: Methodology, Model optimization, Visualization validation.\u003c/p\u003e\n\u003cp\u003eAli Lazım: Project administration, Software infrastructure.\u003c/p\u003e\n\u003cp\u003eRhobi Peter Mwita: Literature review, Final validation.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eDeclaration of competing interest:\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAcknowledgments:\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eAhn, J. M., Kim, J., \u0026amp; Kim, K. (2023). Ensemble Machine Learning of Gradient Boosting (XGBoost, LightGBM, CatBoost) and Attention-Based CNN-LSTM for Harmful Algal Blooms Forecasting. \u003cem\u003eToxins\u003c/em\u003e, \u003cem\u003e15\u003c/em\u003e(10), 608. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3390/toxins15100608\u003c/span\u003e\u003cspan address=\"10.3390/toxins15100608\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eAhsan, M., Mahmud, M., Saha, P., Gupta, K., \u0026amp; Siddique, Z. (2021). Effect of Data Scaling Methods on Machine Learning Algorithms and Model Performance. \u003cem\u003eTechnologies\u003c/em\u003e, \u003cem\u003e9\u003c/em\u003e(3), 52. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3390/technologies9030052\u003c/span\u003e\u003cspan address=\"10.3390/technologies9030052\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eAshkan Dehghani Zahedani, Arvind, V., Saransh, A., Jiayu, Z., Jingyi, R., Michael, S., \u0026amp; Nima, A. (2023). \u003cem\u003eVirtual blood glucose monitoring and prediction using machine learning\u003c/em\u003e. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.january.ai/\u003c/span\u003e\u003cspan address=\"https://www.january.ai/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.january.ai/blog/white-paper-virtual-blood-glucose-monitoring-prediction-machine-learning\u003c/span\u003e\u003cspan address=\"https://www.january.ai/blog/white-paper-virtual-blood-glucose-monitoring-prediction-machine-learning\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBojer, C. S., \u0026amp; Meldgaard, J. P. (2020). Kaggle forecasting competitions: An overlooked learning opportunity. \u003cem\u003eInternational Journal of Forecasting\u003c/em\u003e, \u003cem\u003e37\u003c/em\u003e(2), 587\u0026ndash;603. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.ijforecast.2020.07.007\u003c/span\u003e\u003cspan address=\"10.1016/j.ijforecast.2020.07.007\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eButt, H., Khosa, I., \u0026amp; Iftikhar, M. A. (2023). Feature Transformation for Efficient Blood Glucose Prediction in Type 1 Diabetes Mellitus Patients. 
\u003cem\u003eDiagnostics\u003c/em\u003e, \u003cem\u003e13\u003c/em\u003e(3), 340. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3390/diagnostics13030340\u003c/span\u003e\u003cspan address=\"10.3390/diagnostics13030340\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eChen, T., \u0026amp; Guestrin, C. (2016). XGBoost. \u003cem\u003eKDD\u003c/em\u003e, 785\u0026ndash;794. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1145/2939672.2939785\u003c/span\u003e\u003cspan address=\"10.1145/2939672.2939785\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eCui, R., Hettiarachchi, C., Nolan, C. J., Daskalaki, E., \u0026amp; Suominen, H. (2021). Personalised Short-Term Glucose Prediction via Recurrent Self-Attention Network. \u003cem\u003e34th International Symposium on Computer-Based Medical Systems (CBMS)\u003c/em\u003e, 154\u0026ndash;159. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/cbms52027.2021.00064\u003c/span\u003e\u003cspan address=\"10.1109/cbms52027.2021.00064\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDe Amorim, L. B., Cavalcanti, G. D., \u0026amp; Cruz, R. M. (2022). The choice of scaling technique matters for classification performance. \u003cem\u003eApplied Soft Computing\u003c/em\u003e, \u003cem\u003e133\u003c/em\u003e, 109924. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.asoc.2022.109924\u003c/span\u003e\u003cspan address=\"10.1016/j.asoc.2022.109924\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDing, J., Chen, Z., Xiaolong, L., \u0026amp; Lai, B. (2020). Sales Forecasting Based on CatBoost. \u003cem\u003eIEEE\u003c/em\u003e, 636\u0026ndash;639. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/itca52113.2020.00138\u003c/span\u003e\u003cspan address=\"10.1109/itca52113.2020.00138\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDudukcu, H. V., Taskiran, M., \u0026amp; Yildirim, T. (2021). Blood glucose prediction with deep neural networks using weighted decision level fusion. \u003cem\u003eBiocybernetics and Biomedical Engineering\u003c/em\u003e, \u003cem\u003e41\u003c/em\u003e(3), 1208\u0026ndash;1223. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.bbe.2021.08.007\u003c/span\u003e\u003cspan address=\"10.1016/j.bbe.2021.08.007\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGarbulowski, M., Diamanti, K., Smolińska, K., Baltzer, N., Stoll, P., Bornel\u0026ouml;v, S., \u0026Oslash;hrn, A., Feuk, L., \u0026amp; Komorowski, J. (2021). R.ROSETTA: an interpretable machine learning framework. \u003cem\u003eBMC Bioinformatics\u003c/em\u003e, \u003cem\u003e22\u003c/em\u003e(1). 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1186/s12859-021-04049-z\u003c/span\u003e\u003cspan address=\"10.1186/s12859-021-04049-z\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGeorga, E. I., Protopappas, V. C., Polyzos, D., \u0026amp; Fotiadis, D. I. (2012). A predictive model of subcutaneous glucose concentration in type 1 diabetes based on Random Forests. \u003cem\u003eAnnual International Conference of the IEEE Engineering in Medicine and Biology Society\u003c/em\u003e, 2889\u0026ndash;2892. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/embc.2012.6346567\u003c/span\u003e\u003cspan address=\"10.1109/embc.2012.6346567\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGhimire, S., Celik, T., Gerdes, M., \u0026amp; Omlin, C. W. (2024). Deep learning for blood glucose level prediction: How well do models generalize across different data sets? \u003cem\u003ePLoS ONE\u003c/em\u003e, \u003cem\u003e19\u003c/em\u003e(9), e0310801. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1371/journal.pone.0310801\u003c/span\u003e\u003cspan address=\"10.1371/journal.pone.0310801\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGiancotti, R., Bosoni, P., Vizza, P., Tradigo, G., Gnasso, A., Guzzi, P. H., Bellazzi, R., Irace, C., \u0026amp; Veltri, P. (2024). Forecasting glucose values for patients with type 1 diabetes using heart rate data. \u003cem\u003eComputer Methods and Programs in Biomedicine\u003c/em\u003e, \u003cem\u003e257\u003c/em\u003e, 108438. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.cmpb.2024.108438\u003c/span\u003e\u003cspan address=\"10.1016/j.cmpb.2024.108438\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eG\u0026oacute;mez-Castillo, N. Y., Cajilima-Cardenaz, P. E., Zhinin-Vera, L., Maldonado-Cuascota, B., Dom\u0026iacute;nguez, D. L., Pineda-Molina, G., Hidalgo-Parra, A. A., \u0026amp; Gonzales-Zubiate, F. A. (2022). A machine learning approach for blood glucose level prediction using a LSTM network. In \u003cem\u003eCommunications in computer and information science\u003c/em\u003e (pp. 99\u0026ndash;113). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/978-3-030-99170-8_8\u003c/span\u003e\u003cspan address=\"10.1007/978-3-030-99170-8_8\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eIlemobayo, J. A., Durodola, O., Alade, O., Awotunde, O. J., Olanrewaju, A. T., Falana, O., Ogungbire, A., Osinuga, A., Ogunbiyi, D., Ifeanyi, A., Odezuligbo, I. E., \u0026amp; Edu, O. E. (2024). Hyperparameter tuning in Machine Learning: A Comprehensive review. \u003cem\u003eJournal of Engineering Research and Reports\u003c/em\u003e, \u003cem\u003e26\u003c/em\u003e(6), 388\u0026ndash;395. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.9734/jerr/2024/v26i61188\u003c/span\u003e\u003cspan address=\"10.9734/jerr/2024/v26i61188\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eInternational Diabetes Federation. (2025, February 14). \u003cem\u003eDiabetes Facts and Figures | International Diabetes Federation\u003c/em\u003e. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://idf.org/about-diabetes/diabetes-facts-figures/\u003c/span\u003e\u003cspan address=\"https://idf.org/about-diabetes/diabetes-facts-figures/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eJha, G. (2024, November 15). Feature Scaling in Machine Learning: Which Popular Algorithms Require It and Which Don\u0026rsquo;t? \u003cem\u003eMedium\u003c/em\u003e. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://medium.com/@post.gourang/feature-scaling-in-machine-learning-which-popular-algorithms-require-it-and-which-dont-a71f5585d664\u003c/span\u003e\u003cspan address=\"https://medium.com/@post.gourang/feature-scaling-in-machine-learning-which-popular-algorithms-require-it-and-which-dont-a71f5585d664\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKavakiotis, I., Tsave, O., Salifoglou, A., Maglaveras, N., Vlahavas, I., \u0026amp; Chouvarda, I. (2017). Machine Learning and Data Mining Methods in Diabetes Research. \u003cem\u003eComputational and Structural Biotechnology Journal\u003c/em\u003e, \u003cem\u003e15\u003c/em\u003e, 104\u0026ndash;116. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.csbj.2016.12.005\u003c/span\u003e\u003cspan address=\"10.1016/j.csbj.2016.12.005\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKe, G., Meng, Q., Finley, T., Wang, T., Chen, W., Ma, W., Ye, Q., \u0026amp; Liu, T. (2017). \u003cem\u003eLightGBM: A Highly Efficient Gradient Boosting Decision Tree\u003c/em\u003e. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://papers.nips.cc/paper_files/paper/2017/hash/6449f44a102fde848669bdd9eb6b76fa-Abstract.html\u003c/span\u003e\u003cspan address=\"https://papers.nips.cc/paper_files/paper/2017/hash/6449f44a102fde848669bdd9eb6b76fa-Abstract.html\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLakkaraju, H., Kamar, E., Caruana, R., \u0026amp; Leskovec, J. (2017). Interpretable \u0026amp; Explorable Approximations of Black Box Models. \u003cem\u003earXiv (Cornell University)\u003c/em\u003e. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.48550/arxiv.1707.01154\u003c/span\u003e\u003cspan address=\"10.48550/arxiv.1707.01154\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLundberg, S. M., \u0026amp; Lee, S. (2017). A unified approach to interpreting model predictions. \u003cem\u003earXiv (Cornell University)\u003c/em\u003e. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.48550/arxiv.1705.07874\u003c/span\u003e\u003cspan address=\"10.48550/arxiv.1705.07874\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMarling, C., \u0026amp; Bunescu, R. (2020, September 1). \u003cem\u003eThe OhioT1DM Dataset for Blood Glucose Level Prediction: Update 2020\u003c/em\u003e. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://pmc.ncbi.nlm.nih.gov/articles/PMC7881904/\u003c/span\u003e\u003cspan address=\"https://pmc.ncbi.nlm.nih.gov/articles/PMC7881904/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMcShinsky, R., \u0026amp; Marshall, B. (2020). 
Comparison of Forecasting Algorithms for Type 1 Diabetic Glucose Prediction on 30 and 60-Minute Prediction Horizons. \u003cem\u003eCEUR Workshop Proceedings\u003c/em\u003e, 12\u0026ndash;18. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://ceur-ws.org/Vol-2675/paper2.pdf\u003c/span\u003e\u003cspan address=\"http://ceur-ws.org/Vol-2675/paper2.pdf\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003e\u003cem\u003eMinMaxScaler\u003c/em\u003e. (n.d.). Scikit-learn. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html\u003c/span\u003e\u003cspan address=\"https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eNemat, H., Khadem, H., Eissa, M. R., Elliott, J., \u0026amp; Benaissa, M. (2022). Blood glucose level prediction: Advanced Deep-Ensemble Learning Approach. \u003cem\u003eIEEE Journal of Biomedical and Health Informatics\u003c/em\u003e, \u003cem\u003e26\u003c/em\u003e(6), 2758\u0026ndash;2769. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/jbhi.2022.3144870\u003c/span\u003e\u003cspan address=\"10.1109/jbhi.2022.3144870\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eNemat, H., Khadem, H., Elliott, J., \u0026amp; Benaissa, M. (2024). Data-driven blood glucose level prediction in type 1 diabetes: a comprehensive comparative analysis. \u003cem\u003eScientific Reports\u003c/em\u003e, \u003cem\u003e14\u003c/em\u003e(1). 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41598-024-70277-x\u003c/span\u003e\u003cspan address=\"10.1038/s41598-024-70277-x\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eNiu, Y., Gu, L., Zhao, Y., \u0026amp; Lu, F. (2021). Explainable diabetic retinopathy detection and retinal image generation. \u003cem\u003eIEEE Journal of Biomedical and Health Informatics\u003c/em\u003e, \u003cem\u003e26\u003c/em\u003e(1), 44\u0026ndash;55. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/jbhi.2021.3110593\u003c/span\u003e\u003cspan address=\"10.1109/jbhi.2021.3110593\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003e\u003cem\u003eNormalizer\u003c/em\u003e. (n.d.). Scikit-learn. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.Normalizer.html\u003c/span\u003e\u003cspan address=\"https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.Normalizer.html\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePetrie, J. R., Peters, A. L., Bergenstal, R. M., Holl, R. W., Fleming, G. A., \u0026amp; Heinemann, L. (2017). Improving the Clinical Value and Utility of CGM Systems: Issues and Recommendations. \u003cem\u003eDiabetes Care\u003c/em\u003e, \u003cem\u003e40\u003c/em\u003e(12), 1614\u0026ndash;1621. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.2337/dci17-0043\u003c/span\u003e\u003cspan address=\"10.2337/dci17-0043\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePfob, A., Lu, S., \u0026amp; Sidey-Gibbons, C. (2022). Machine learning in medicine: a practical introduction to techniques for data pre-processing, hyperparameter tuning, and model comparison. \u003cem\u003eBMC Medical Research Methodology\u003c/em\u003e, \u003cem\u003e22\u003c/em\u003e(1). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1186/s12874-022-01758-8\u003c/span\u003e\u003cspan address=\"10.1186/s12874-022-01758-8\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eProkhorenkova, L., Gusev, G., Vorobev, A., Dorogush, A. V., \u0026amp; Gulin, A. (2018). CatBoost: unbiased boosting with categorical features. \u003cem\u003eNeural Information Processing Systems\u003c/em\u003e, \u003cem\u003e31\u003c/em\u003e, 6639\u0026ndash;6649. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://papers.nips.cc/paper/7898-catboost-unbiased-boosting-with-categorical-features.pdf\u003c/span\u003e\u003cspan address=\"https://papers.nips.cc/paper/7898-catboost-unbiased-boosting-with-categorical-features.pdf\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003e\u003cem\u003eQuantileTransformer\u003c/em\u003e. (n.d.). Scikit-learn. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.QuantileTransformer.html\u003c/span\u003e\u003cspan address=\"https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.QuantileTransformer.html\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eR, R. C., \u0026amp; P, S. C. (2024). Evaluating Deep Learning with different feature scaling techniques for EEG-based Music Entrainment Brain Computer Interface. \u003cem\u003ee-Prime - Advances in Electrical Engineering Electronics and Energy\u003c/em\u003e, \u003cem\u003e7\u003c/em\u003e, 100448. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.prime.2024.100448\u003c/span\u003e\u003cspan address=\"10.1016/j.prime.2024.100448\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRaju, V. N. G., Lakshmi, K. P., Jain, V. M., Kalidindi, A., \u0026amp; Padma, V. (2020). Study the Influence of Normalization/Transformation process on the Accuracy of Supervised Classification. \u003cem\u003e2020 Third International Conference on Smart Systems and Inventive Technology (ICSSIT)\u003c/em\u003e, 729\u0026ndash;735. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/icssit48917.2020.9214160\u003c/span\u003e\u003cspan address=\"10.1109/icssit48917.2020.9214160\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRibeiro, M. T., Singh, S., \u0026amp; Guestrin, C. (2016). \u0026ldquo;Why should I trust you?\u0026rdquo;: explaining the predictions of any classifier. \u003cem\u003earXiv (Cornell University)\u003c/em\u003e. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.48550/arxiv.1602.04938\u003c/span\u003e\u003cspan address=\"10.48550/arxiv.1602.04938\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003e\u003cem\u003eRobustScaler\u003c/em\u003e. (n.d.). Scikit-learn. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.RobustScaler.html\u003c/span\u003e\u003cspan address=\"https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.RobustScaler.html\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003e\u003cem\u003eStandardScaler\u003c/em\u003e. (n.d.). Scikit-learn. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html\u003c/span\u003e\u003cspan address=\"https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSzczepanek, R. (2022). Daily Streamflow Forecasting in Mountainous Catchment Using XGBoost, LightGBM and CatBoost. \u003cem\u003eHydrology\u003c/em\u003e, \u003cem\u003e9\u003c/em\u003e(12), 226. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3390/hydrology9120226\u003c/span\u003e\u003cspan address=\"10.3390/hydrology9120226\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003e\u003cem\u003eUnderstanding Type 1 Diabetes | ADA\u003c/em\u003e. (n.d.). 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://diabetes.org/about-diabetes/type-1\u003c/span\u003e\u003cspan address=\"https://diabetes.org/about-diabetes/type-1\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eUstun, B., Trac\u0026agrave;, S., \u0026amp; Rudin, C. (2013). Supersparse linear integer models for interpretable classification. \u003cem\u003earXiv (Cornell University)\u003c/em\u003e. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.48550/arxiv.1306.6677\u003c/span\u003e\u003cspan address=\"10.48550/arxiv.1306.6677\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eYan, L., Zhang, H., Goncalves, J., Xiao, Y., Wang, M., Guo, Y., Sun, C., Tang, X., Jing, L., Zhang, M., Huang, X., Xiao, Y., Cao, H., Chen, Y., Ren, T., Wang, F., Xiao, Y., Huang, S., Tan, X., ... Yuan, Y. (2020). An interpretable mortality prediction model for COVID-19 patients. \u003cem\u003eNature Machine Intelligence\u003c/em\u003e, \u003cem\u003e2\u003c/em\u003e(5), 283\u0026ndash;288. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s42256-020-0180-7\u003c/span\u003e\u003cspan address=\"10.1038/s42256-020-0180-7\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eZhu, T., Kuang, L., Daniels, J., Herrero, P., Li, K., \u0026amp; Georgiou, P. (2022). IOMT-Enabled Real-Time Blood Glucose Prediction with deep learning and edge computing. \u003cem\u003eIEEE Internet of Things Journal\u003c/em\u003e, \u003cem\u003e10\u003c/em\u003e(5), 3706\u0026ndash;3719. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/jiot.2022.3143375\u003c/span\u003e\u003cspan address=\"10.1109/jiot.2022.3143375\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":false,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":true,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"Blood Glucose Prediction, Machine Learning, Deep Learning, OhioT1DM Dataset, SHAP Interpretability","lastPublishedDoi":"10.21203/rs.3.rs-7410777/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-7410777/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eType 1 diabetes mellitus is a common condition among young individuals, highlighting the need for accurate blood glucose level (BGL) predictions for effective continuous glucose monitoring. Investigating and comparing the performance of extreme gradient boosting models using a data-driven approach is essential for improving BGL prediction accuracy. 
This study extends the analysis of the OhioT1DM dataset by evaluating and comparing the performance of traditional machine learning models, extreme gradient boosting models (XGBoost, CatBoost, and LightGBM), and deep learning models (LSTM and Bi-LSTM) in predicting BGL. The findings demonstrate that extreme gradient boosting models can achieve competitive performance compared to certain deep learning architectures while being less computationally expensive. In this study, the LSTM model achieves an RMSE of 13.65 for a 30-minute prediction horizon, while the Bi-LSTM model records an RMSE of 21.73 when using continuous glucose monitoring (CGM) as the sole feature for future predictions using all the 12 patients.\u003c/p\u003e","manuscriptTitle":"Comparative Evaluation of Machine Learning and Deep Learning Models for Blood Glucose Prediction on the OhioT1DM Dataset","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-08-21 10:33:08","doi":"10.21203/rs.3.rs-7410777/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"644bf22e-24e0-4f88-8fa3-c22fa409c251","owner":[],"postedDate":"August 21st, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2025-08-29T20:53:23+00:00","versionOfRecord":[],"versionCreatedAt":"2025-08-21 
10:33:08","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-7410777","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-7410777","identity":"rs-7410777","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}

License: CC-BY-4.0