Machine learning-enhanced prediction of sensible heat storage potential in Kano-Nigeria based on thermogravimetric analysis | Research Square

Research Article

Abubakar D. Maiwada, Abdullahi A. Adamu, Jamilu Usman, Umar D. Maiwada, and 2 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-6081166/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted. Version 1 posted.

Abstract

The challenge of efficiently predicting the sensible heat storage potential of natural materials like Dawakin Tofa clay for sustainable energy applications necessitates innovative solutions. This study investigates the use of machine learning models: Interactive Linear Regression (ILR), Stepwise Linear Regression (SWLR), Robust Linear Regression (RLR), and Kernel Support Vector Machine (KSVM).
Four non-linear models were also employed: G-Matern 5/2 (GM5/2), Trilayered Neural Network (TNN), Boosted Tree (BoT), and Bagged Tree Neural Network (BTNN). In addition, three ensemble methods were used: Simple Average Ensemble (SAE), Weighted Average Ensemble (WAE), and Neural Network Ensemble (NNE). The laboratory test was carried out at the Centre for Genetics Engineering and Biotechnology at the Federal University of Technology in Minna, Niger State, Nigeria. The clay sample was placed in a platinum pan and heated at a rate of 10°C per minute, with nitrogen and air as purge gases. The entire experiment took 33 minutes to complete, with results printed for documentation. To ensure accuracy, the analysis was repeated three times and the results were averaged. By utilizing locally abundant Dawakin Tofa clay, the research promotes sustainable and cost-effective energy storage solutions, reducing reliance on synthetic materials and lowering the environmental footprint. Among the ensemble models, NNE exhibited the best performance, achieving near-perfect accuracy with minimal error metrics (MSE = 0.000212, RMSE = 0.01456 in training; MSE = 0.0001696, RMSE = 0.01302 in testing). SAE demonstrated moderate accuracy with reliable generalization, while WAE showed high variability in training and weaker performance, despite improvement in the testing phase. This study highlights the superiority of nonlinear machine learning models, particularly NNE, in accurately modeling the thermal behavior of the sample. It also provides a foundation for optimizing natural materials for thermal storage, recommending material modifications, expanded datasets, pilot-scale studies, and economic assessments. It further underscores the potential of integrating advanced machine learning techniques with natural materials to create scalable, sustainable energy systems, addressing critical environmental challenges in the transition to renewable energy.
Subject areas: Mechanical Engineering, Renewable Resources, Energy Engineering, Artificial Intelligence and Machine Learning
Keywords: Sensible heat storage, Machine learning, Thermogravimetric analysis, Renewable energy integration, Sustainable energy systems

Introduction

Rising global energy demand, together with the urgent need for sustainable energy systems, has led to increased interest in thermal energy storage (TES) technologies (Amir et al., 2023). Sensible heat storage (SHS), which stores and releases heat through temperature changes in materials, has become a workable and scalable option for uses such as solar power plants, heating and cooling in buildings, and industrial operations (Chekifi & Boukraa, 2023). Among the materials being studied for SHS, natural clays are of interest because they are readily available, inexpensive, and environmentally friendly. However, how well clays work as TES materials depends heavily on their thermal properties (Tiskatine et al., 2017), so precise and effective models are needed to assess their performance.

According to Bisu et al. (2024), Nigeria, recognized as one of the most populous nations in Africa, is endowed with a wealth of natural resources, particularly substantial reserves of oil and gas. Despite these advantages, the country grapples with a persistent energy crisis that adversely affects both urban and rural populations. This ongoing challenge has catalyzed a renewed emphasis on sustainable development, prompting an increased focus on renewable energy solutions. Renewable energy sources, characterized by their environmental sustainability and abundance, are especially prevalent in the Northern regions of Nigeria.
The potential for harnessing solar, wind, and biomass energy in these areas presents a significant opportunity for diversifying the nation’s energy portfolio and reducing its reliance on fossil fuels. The transition towards renewable energy in Nigeria is influenced by a multifaceted interplay of policy frameworks, socio-economic factors, and technological advancements. Policymakers are increasingly recognizing the importance of integrating renewable energy into the national energy strategy to address the energy deficit and promote economic growth. Furthermore, socio-economic dynamics, including population growth and urbanization, necessitate urgent action to meet the rising energy demands sustainably. Technological progress plays a crucial role in this transition, as innovations in renewable energy technologies enhance efficiency and reduce costs, making these solutions more accessible. This convergence of policy, socio-economic needs, and technological advancements underscores the urgency of adopting renewable energy strategies to foster sustainable development in Nigeria.

Dawakin Tofa clay, which is abundant in the Dawakin Tofa Local Government Area of Kano State, Nigeria, can be a low-cost material for SHS because of its mineral composition and its ability to withstand heat (Maiwada & Abba, 2024). Using it in TES systems requires careful study of properties such as heat transfer, heat capacity, and thermal stability. Traditional methods like thermogravimetric analysis (TGA), differential scanning calorimetry (DSC), and thermal conductivity measurement are dependable, but they are time-consuming, resource-intensive, and often sample-specific (Mansa & Zou, 2021). It is therefore important to create models that can predict the thermal performance of Dawakin Tofa clay. TGA is an established method for examining how materials behave under heat.
It gives important information about how materials decompose, how much moisture they contain, and their stability when heated (Nurazzi et al., 2021). In this work, TGA data are used mainly as input for machine learning (ML) models, forming a strong foundation for predicting the SHS potential of Dawakin Tofa clay. Important thermal details, such as weight loss at certain temperatures and stability measures, are taken from TGA curves to inform the predictive models. ML methods have changed how predictive modeling works in materials science because they can handle complicated, non-linear relationships between variables. By using past data, ML models can find hidden trends and make precise predictions with little need for further experiments (Mobarak et al., 2023). Although interest in ML for thermal energy storage (TES) is increasing, research on the sensible heat storage (SHS) potential of natural clays, especially Dawakin Tofa clay, is limited. This study aims to address this gap by combining ML algorithms with TGA data to improve predictions of Dawakin Tofa clay's SHS potential. To improve prediction accuracy, a range of ML algorithms is used, such as linear regression (LR), support vector machines (SVM), and deep learning models like artificial neural networks (ANNs). Feature selection techniques are employed to find the most important input variables, while hyperparameter optimization helps find the best setup for each algorithm. The performance of the models is assessed using common statistical measures: mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and the coefficient of determination (R²). Data preprocessing consists of cleaning and adjusting the TGA-derived thermal parameters so that they are suitable for ML models. Outliers and missing data are handled using imputation methods and basic statistical checks.
The data are split into training and testing sets in an 80:20 ratio, which helps ensure an unbiased evaluation. All ML models are trained with k-fold cross-validation to reduce overfitting. The models are tuned with hyperparameter optimization using grid search and Bayesian optimization methods. Performance metrics are tracked to identify the best algorithms. For model evaluation, prediction accuracy is checked using several statistical measures. MSE, RMSE, MAPE, and MAE are used to evaluate prediction errors, while R² reflects the model's explanatory ability (Pravin et al., 2022). A comparison of the different ML models helps find the best predictive framework. This study is innovative in that it integrates thermal data obtained from TGA with advanced machine learning algorithms to forecast SHS potential. This method helps overcome limitations in current TES studies by reducing the need for many lab experiments, thus allowing for better and broader evaluations of natural clays. The expected outcomes of this work include a framework to predict the SHS capacity of Dawakin Tofa clay, the identification of important thermal parameters that affect SHS performance, and a demonstration of how ML can help in TES materials research. Additionally, this study highlights the wider possibilities of using ML in material design and optimization for sustainable energy. This research combines experimental thermal data with advanced machine learning techniques to provide a more precise, effective, and scalable assessment of the sensible heat storage (SHS) capabilities of natural clays. The results aim to help develop cost-effective and sustainable TES systems, aiding the global shift towards cleaner and more robust energy solutions. Some studies have looked at using ML methods to predict thermal features of TES materials, allowing for comparisons.
For instance, Darvishvand et al. (2022) employed computational fluid dynamics (CFD) simulations alongside machine learning models to predict the melting processes of phase change materials in thermal storage units. The study demonstrated the effectiveness of ANNs in forecasting thermal behaviors in engineered systems. Baghbani et al. (2023) introduced genetic programming (GP) as a grey-box artificial intelligence method for predicting the thermal conductivity of quartz sand. The study demonstrated that CRRF (classification and regression random forest) achieved the highest prediction accuracy, with a coefficient of determination R² = 0.993 and MAE = 0.045. GP achieved excellent predictive results with R² = 0.986 and MAE = 0.063, proving its reliability as a grey-box model. ANN, using Levenberg–Marquardt and Bayesian regularization algorithms, obtained R² = 0.916 and MAE = 0.151. In contrast, multiple linear regression (MLR) showed the weakest performance with R² = 0.737 and MAE = 0.300. Bang et al. (2020) found that traditional linear regression models had an RMSE of 0.1186, but machine learning methods significantly reduced this value. Among the tested models, Gaussian Process Regression (GPR) with an exponential kernel achieved an RMSE of 0.07338, while ensemble learning using XGBoost delivered the best performance with an RMSE of 0.07197. These models managed data uncertainty, noise, and outliers more effectively than traditional models, which are often limited by strict statistical assumptions and data distribution constraints. This capability underscores the essential role of machine learning in optimizing energy storage system design and ensuring operational safety. Incorporating TGA data into ML models can enhance predictive abilities compared to past research that relied on basic thermal measurements. The feature selection in this study helps retain only the most important variables, simplifying the model.
This contrasts with the simpler regression models noted in Kim et al. (2018), where too many features led to overfitting. Furthermore, advanced ML techniques, such as the ensemble learning methods (ELMs) used in this study, outperform traditional ML models in prediction accuracy. For example, the RF model in this research showed a lower RMSE than earlier TES material studies that used standard linear regression (SLR). The comparative analysis highlights the distinctiveness of integrating TGA-derived factors with advanced ML models. Whereas prior research has established the potential of ML in predicting the properties of TES materials, this study broadens the application to natural clays with considerable compositional variations, achieving better prediction results through careful model selection and hyperparameter tuning.

The aim of this study is to develop and evaluate advanced machine learning models for accurately predicting the sensible heat storage potential of Dawakin Tofa clay using thermogravimetric analysis data, with a focus on optimizing natural, locally abundant materials for sustainable energy storage. The novelty lies in integrating advanced ensemble machine learning techniques, including NNE, SAE, and WAE, to model the thermal properties of clay, offering a resource-efficient alternative to traditional experimental methods. This is one of the first studies to combine thermogravimetric analysis data with machine learning for evaluating natural materials in energy storage applications, demonstrating the transformative potential of predictive modeling in material science. The study contributes a detailed performance comparison of machine learning models, with NNE achieving near-perfect accuracy and outperforming the others in capturing the complex thermal behavior of clay. By utilizing Dawakin Tofa clay, the research promotes sustainable, eco-friendly solutions, reducing reliance on synthetic materials and minimizing environmental impacts.
It also enhances resource efficiency by reducing experimental requirements and aligns with green engineering principles. Furthermore, the findings lay the groundwork for optimizing clay materials through surface modifications, scaling up for real-world applications, and exploring hybrid models and expanded datasets for improved accuracy and generalizability, advancing the development of sustainable, efficient, and scalable thermal energy storage systems.

Proposed Intelligent Methods

The research presents advanced machine learning techniques aimed at predicting the derivative weight of Dawakin Tofa clay, a key thermogravimetric property critical for evaluating its heat storage potential. A dataset of over 5,000 samples was experimentally gathered under controlled laboratory conditions to ensure an accurate representation of the clay's thermal properties. Extensive preprocessing steps were performed, including data cleaning to handle missing values and outliers, enhancing the dataset's reliability. Important numerical features such as temperature, heating rate, and initial weight were normalized to the [0, 1] range to improve model performance, particularly for scaling-sensitive algorithms like KSVM. Feature selection methods, including correlation analysis and recursive feature elimination (RFE), were employed to isolate the most relevant predictors, simplifying the dataset while enhancing model interpretability. The dataset was divided into training and testing subsets in an 80:20 ratio, with cross-validation applied to fine-tune hyperparameters and mitigate overfitting risks. The proposed hybrid algorithm integrates four linear machine learning models (ILR, SWLR, RLR, and KSVM) and four non-linear models (GM5/2, TNN, BoT, and BTNN), capitalizing on their distinct advantages.
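The normalization, 80:20 split, and cross-validation steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names (`min_max_scale`, `train_test_split`, `k_fold_indices`), the fixed random seed, the choice of four folds, and the sample temperatures are all hypothetical.

```python
import random


def min_max_scale(values):
    """Rescale a list of numbers to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant feature: map everything to 0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle rows with a fixed seed, then split into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) index lists for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


# Hypothetical temperatures in °C (illustrative values, not measured data)
temperatures = [30.0, 60.0, 95.0, 130.0, 170.0, 210.0, 250.0, 290.0, 330.0, 360.0]

scaled = min_max_scale(temperatures)          # values now lie in [0, 1]
train, test = train_test_split(scaled)        # 80:20 split
folds = list(k_fold_indices(len(train), k=4)) # folds drawn from the training part only
```

In this sketch the cross-validation folds are drawn from the training portion only, so the held-out 20% is never seen during tuning. A common refinement is to compute the scaling bounds from the training subset alone and reuse them for the test subset.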
ILR is adept at managing linear relationships and improves prediction accuracy through iterative refinement. SWLR efficiently identifies relevant features by systematically adding or removing predictors based on their statistical significance, thereby optimizing model performance. RLR offers resilience against outliers, ensuring consistent predictions even in noisy datasets, which enhances reliability. On the non-linear side, GM5/2 is effective for modeling spatial data, providing smoothness and flexibility to capture complex relationships, even in irregular datasets. TNN leverages deep learning capabilities with multiple layers for intricate feature extraction, allowing for adaptive learning. The BoT method combines several weak learners to create a robust predictive model through iterative boosting, which enhances model strength and precision. Lastly, BTNN improves prediction stability and accuracy by aggregating outputs from multiple decision trees through bagging, thus reducing variance and addressing overfitting challenges. Final predictions were generated using a weighted ensemble approach, assigning model weights based on cross-validation performance to maximize overall accuracy. This integrated approach not only advances predictive modeling of thermogravimetric (TG) properties but also establishes a reliable framework for applying hybrid machine learning algorithms in material characterization, contributing substantially to research in sustainable energy storage.

The flowchart in Fig. 1 presents a detailed framework for assessing the sensible heat storage capacity of Dawakin Tofa clay, utilizing data obtained from TGA and differential thermal analysis (DTA). Essential parameters such as weight percentage, derivative weight, and temperature are extracted and undergo pre- and post-processing to enhance data quality and interpretability. Baseline predictions are established through linear models, including SWLR, RLR, and ILR.
In contrast, more sophisticated non-linear models, such as GM5/2, TNN, BoT, and BTNN, capture intricate thermal behaviors more effectively. The ensemble methods SAE, WAE, and NNE integrate predictions to enhance accuracy, with NNE showing the highest performance. Model assessment through metrics like MSE and RMSE ensures robustness, and predictions that meet established error thresholds yield conclusive results. This comprehensive approach underscores the potential of machine learning in optimizing natural materials for sustainable energy solutions.

Interactive Linear Regression (ILR)

ILR is a dynamic approach in ML and AI that allows real-time interaction between users and the regression model (RM). It builds on traditional linear regression (TLR), which models the relationship between a dependent variable and one or more independent variables using a linear equation (Sarker, 2021). The interactive component allows users to modify parameters, visualize model performance, and change data inputs while immediately seeing the effects on predictions. This capability improves the interpretability and comprehension of the model’s behavior, making it beneficial in educational settings, data analysis, and decision-making processes. It aids in identifying model biases, enhancing data preprocessing, and optimizing model performance (Hassija et al., 2024). For example, data scientists can interactively adjust feature weights, evaluate residual errors, and refine models for improved prediction accuracy. Applications include business forecasting, healthcare diagnostics, and environmental modeling, where understanding causal relationships is essential. This interactive framework encourages collaboration between AI developers and domain experts, ensuring that models are closely aligned with real-world expectations and boosting trust in AI-driven decisions.

2.1 Step Wise Linear Regression (SWLR)

SWLR is a technique used in machine learning to identify the most significant predictors for a target variable. Unlike standard linear regression, which evaluates all variables simultaneously, SWLR adopts a sequential approach that enhances model performance by reducing overfitting and improving interpretability. This allows researchers to focus on the most relevant variables. The SWLR process generally involves two main strategies: forward selection and backward elimination. In forward selection, the model begins without any predictors and adds variables one at a time, selecting those that most significantly improve the model’s performance. Conversely, backward elimination starts with all potential predictors and systematically removes the least significant ones. This iterative process continues until no further improvements can be made, ensuring the final model is both effective and parsimonious. By focusing on a subset of predictors, SWLR facilitates the identification of variables that contribute meaningfully to the prediction task (Miller et al., 2022). SWLR is applied across diverse fields such as finance, healthcare, and social sciences, where understanding the relationships between variables is paramount. By simplifying models, SWLR enhances interpretability, enabling stakeholders to better understand the influence of individual predictors. Moreover, by emphasizing significant variables, the method often improves generalization on unseen data, thereby minimizing the risk of overfitting (Papazafeiropoulos, 2024).

2.2 Robust Linear Regression (RLR)

RLR offers a more resilient alternative to TLR by reducing sensitivity to outliers and to violations of key assumptions like normality and homoscedasticity. In standard LR, outliers can heavily skew the estimated coefficients, leading to inaccurate conclusions (Ummah, 2019).
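The sensitivity of ordinary least squares to a single outlier can be seen in a small numerical sketch. The data below are hypothetical, and the helper `fit_line` is an illustrative closed-form fit for one predictor, not code from the study.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: five points that lie exactly on y = 1 + 2x
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
clean = [1.0, 3.0, 5.0, 7.0, 9.0]

# Same data, but the last observation is corrupted
with_outlier = [1.0, 3.0, 5.0, 7.0, 30.0]

_, slope_clean = fit_line(xs, clean)
_, slope_outlier = fit_line(xs, with_outlier)

print("slope without outlier:", slope_clean)
print("slope with one outlier:", slope_outlier)
```

With these made-up numbers, one corrupted point out of five moves the fitted slope from 2 to roughly 6.2, because the squared-error criterion gives large residuals a disproportionate influence.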
Robust regression (Rr) tackles this issue by applying methods that limit the influence of outliers, ensuring the model's stability even when anomalies are present in the data (Bottmer et al., 2022). A widely used approach in Rr is M-estimation, which assigns lower weights to data points with large residuals, minimizing their effect on the model. Other techniques include Least Absolute Deviations (LAD) regression, which minimizes the sum of absolute residuals instead of squared residuals, and RANSAC (Random Sample Consensus), which repeatedly fits models to random subsets of the data to identify the best fit for the majority (D. M. Khan et al., 2021). These methods make Rr particularly useful in practical scenarios involving real-world datasets that may contain errors or extreme values, offering more reliable predictions and parameter estimates than conventional linear regression (McNamara et al., 2022).

2.3 Kernel Support Vector Machine (KSVM)

KSVM enhances the traditional SVM by enabling it to classify non-linearly separable data, as shown in Fig. 2. While conventional SVMs excel at finding an optimal hyperplane for linearly separable datasets (Tomar & Agarwal, 2015), KSVM overcomes this limitation through the use of the kernel trick. This technique implicitly maps data into a higher-dimensional space where linear separation becomes feasible, bypassing the need for explicit transformation calculations. By computing inner products in this transformed space using a kernel function, KSVM effectively classifies data while operating in the original feature space. Commonly used kernels include the polynomial, radial basis function (RBF), and sigmoid kernels (SK), allowing KSVM to handle complex relationships found in tasks like image classification, bioinformatics, and text categorization (Ngu et al., 2024). The model's strength lies in its ability to learn non-linear decision boundaries while maintaining computational efficiency.
However, selecting an appropriate kernel and fine-tuning its parameters are essential for optimal performance in specific applications (Sarker, 2021).

2.4 G-Matern 5/2

The GM5/2 model significantly extends the traditional Matern class of covariance functions (CF), which are widely utilized in spatial statistics and geostatistics. This specific formulation, characterized by its smoothness and flexibility, is beneficial for modeling spatial phenomena with varying degrees of continuity (Porcu et al., 2024). The GM5/2 function is defined by its parameters, including the range, smoothness, and variance, allowing practitioners to tailor the model to fit empirical data effectively. One of the key advantages of GM5/2 is its ability to capture complex spatial structures (Ss), making it suitable for applications in environmental science, geophysics, and spatial epidemiology. The smoothness parameter, in particular, influences the differentiability of the realizations of the process, providing insights into the underlying spatial correlation. Recent advancements in computational techniques have facilitated the implementation of the GM5/2 model on large datasets, enhancing its applicability in real-world scenarios (Zhou et al., 2017). Overall, it serves as a robust tool for researchers aiming to understand and interpret spatial data, contributing to more informed decision-making in various scientific fields.

2.5 Trilayered Neural Network (TNN)

Trilayered neural networks, commonly referred to as three-layer neural networks, are foundational architectures in the field of artificial intelligence and machine learning. These networks consist of three distinct layers: an input layer, a hidden layer, and an output layer. The input layer receives the initial data, while the hidden layer processes this information through weighted connections and activation functions, allowing the network to learn complex patterns.
The output layer produces the final predictions or classifications based on the processed information. These networks are often trained using backpropagation, a method that adjusts the weights of the connections based on the error of the output compared to the expected result (Lin et al., 2024). Despite their limitations in handling highly complex datasets compared to deeper networks, trilayered neural networks serve as an excellent starting point for understanding neural network principles. They provide insights into the fundamental mechanisms of learning and generalization. As the field evolves, these networks remain relevant, often serving as benchmarks for more sophisticated architectures, including deep learning models (Learning, 2023). Their importance in both theoretical and practical applications continues to be a subject of active research.

2.6 Boosted Tree Network (BoT)

Boosted trees (BoT), illustrated in Fig. 3, are a powerful ensemble learning technique that has gained significant traction in the field of ML due to its robustness and predictive accuracy. This method builds a series of decision trees sequentially, where each new tree corrects the errors made by the previous ones. By focusing on the misclassified instances, boosted trees effectively reduce bias and variance, leading to improved performance on complex datasets (Barber, 2012). One of the most popular algorithms in this domain is the gradient boosting machine (GBM), which optimizes a loss function through gradient descent. Recent advancements, such as XGBoost and LightGBM, have further enhanced the efficiency and scalability of BoT, enabling its application to large-scale datasets and real-time predictions (Zheng et al., 2024). BoT networks are particularly effective in handling various types of data, including structured and unstructured inputs. Their interpretability, combined with feature importance metrics, allows researchers to gain insights into the underlying patterns within the data.
As the demand for accurate predictive models continues to rise across industries, boosted trees remain a vital area of research, promising ongoing developments and applications in diverse fields such as finance, healthcare, and marketing (Martinelli, 2022).

2.7 Bagged Tree Neural Network (BTNN)

As illustrated in Fig. 4, BTNN offers a blend of ensemble learning and neural network techniques, designed to boost predictive accuracy and reliability. At the heart of bagging, or bootstrap aggregating, is the idea of training multiple models on different subsets of data. This helps to reduce variability and combat overfitting, which can often plague machine learning models (de Zarzà et al., 2023). A study by A. A. Khan et al. (2024) shows that, in a BTNN, several decision trees are trained on randomly selected portions of the training data. Each tree then contributes to the final prediction, usually by averaging their outputs or voting on the most common result. This combination allows researchers to capture intricate, non-linear patterns in the data while still benefiting from the clarity that tree-based models provide. This approach has shown great promise in various fields, such as credit scoring, medical diagnosis, and image classification. Its ability to manage high-dimensional data and diverse feature types makes bagged tree neural networks adaptable and powerful tools in machine learning. Ongoing research aims to refine their structure and training methods, paving the way for new solutions across different sectors (Ahmed et al., 2023). As machine learning continues to evolve, BTNNs are set to play a vital role in enhancing predictive analytics.

Results and Discussion

4.1 Hyperparameter tuning

Hyperparameter tuning was essential to optimize the performance of the machine learning models by identifying the best parameter configurations for minimizing errors and maximizing accuracy.
For the NNE, the optimal configuration included three hidden layers with [64, 128, 64] neurons, a learning rate of 0.001, and the ReLU activation function, achieving near-perfect metrics with MSE = 0.000212 and RMSE = 0.01456 during training. The SAE, which lacks tunable hyperparameters, benefitted from data normalization to a range of [0, 1], enhancing its performance. For the WAE, tuning focused on the weight distribution among predictions, with the best results obtained by assigning 70% weight to NNE and 30% to linear models, ensuring a balance between precision and generalization. Tree-based models also underwent rigorous tuning, with Boosted Trees achieving optimal performance using 100 estimators, a maximum depth of 5, and a learning rate of 0.05, while Bagged Trees performed best with 50 base estimators and a maximum depth of 7. These tuning strategies ensured each model operated at its highest potential, delivering accurate predictions of the sensible heat storage potential of Dawakin Tofa clay, while validation confirmed their generalizability to unseen datasets.

4.2 Predictive Results of linear and nonlinear models

The results in Table 1 highlight a clear distinction between the predictive capabilities of linear and nonlinear ML models for estimating the SHS potential of Dawakin Tofa clay using TG analysis data. The linear models, SWLR, ILR, and RLR, show limited performance, with R² values of 0.85 in the training phase and 0.86 in the testing phase. These models struggle with high RMSE values (~16.37 in training and ~15.937 in testing) and substantial MAPE exceeding 1400, reflecting their inability to effectively capture the nonlinear complexities inherent in the data. In contrast, the nonlinear models demonstrate near-perfect accuracy, with R² = 1 across both phases and significantly lower error metrics.
The GM5/2 model stands out as the most precise, achieving near-zero RMSE (0.0139 in training and 0.01333 in testing) and extremely low MAE, underscoring its unparalleled ability to model nonlinear relationships. The TNN also performs exceptionally well, with slightly higher error rates than GM5/2 but maintaining remarkable generalization and computational efficiency. Ensemble tree-based models, such as BoT and BTNN, exhibit strong predictive power but display marginally higher RMSE and MAPE compared to GM5/2 and TNN, particularly in the testing phase. These findings underscore the superiority of nonlinear machine learning approaches over LR models for this application, emphasizing the importance of advanced techniques like GM5/2 and TNN for accurately modeling complex thermophysical properties. The robust performance of these models makes them highly suitable for practical applications in materials science, offering precise and reliable predictions critical for optimizing sensible heat storage systems. This analysis demonstrates the transformative potential of integrating machine learning techniques in material characterization, enabling researchers to unlock deeper insights and design more efficient energy storage systems. 
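The error measures used to compare the models in this section (R², RMSE, MAPE, MSE, and MAE) follow standard definitions. As a minimal sketch, assuming paired lists of observed and predicted values, they could be computed as shown here; the function name `regression_metrics` and the sample numbers are illustrative assumptions, not the authors' code or data.

```python
import math


def regression_metrics(observed, predicted):
    """Return R^2, RMSE, MAPE (in %), MSE and MAE for paired sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]

    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mae = sum(abs(e) for e in errors) / n

    # MAPE is undefined where the observed value is zero; those points are skipped.
    ratios = [abs(e / o) for o, e in zip(observed, errors) if o != 0]
    mape = 100.0 * sum(ratios) / len(ratios) if ratios else float("nan")

    # R^2 in its error-based form: 1 - SS_res / SS_tot.
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"R2": r2, "RMSE": rmse, "MAPE": mape, "MSE": mse, "MAE": mae}


# Hypothetical weight-percentage values, for illustration only.
observed = [100.0, 98.0, 95.0, 90.0]
predicted = [99.0, 98.5, 94.0, 91.0]
metrics = regression_metrics(observed, predicted)
```

By construction RMSE is the square root of MSE, which agrees with the values reported for the linear models (16.37² ≈ 267.97). MAPE divides each error by the observed value, so it becomes unstable when observations approach zero.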
Table 1: Predictive results of linear and nonlinear models

Training Phase

| Model | R² | RMSE | MAPE | MSE | MAE |
|---|---|---|---|---|---|
| SWLR | 0.85 | 16.37 | 1408 | 267.97 | 14.143 |
| ILR | 0.85 | 16.37 | 1408 | 267.97 | 14.143 |
| RLR | 0.85 | 16.371 | 1396.5 | 268 | 14.137 |
| G-Matern 5/2 | 1 | 0.0139 | 0.1 | 0.00019 | 0.00708 |
| TNN | 1 | 0.26286 | 4.5 | 0.0691 | 0.1606 |
| Boosted Trees | 1 | 2.5639 | 7.9 | 6.5735 | 1.8201 |
| Bagged Trees | 1 | 0.10749 | 5.6 | 0.01155 | 0.0799 |

Testing Phase

| Model | R² | RMSE | MAPE | MSE | MAE |
|---|---|---|---|---|---|
| SWLR | 0.86 | 15.937 | 1574.2 | 254 | 13.643 |
| ILR | 0.86 | 15.937 | 1574.2 | 254 | 13.643 |
| RLR | 0.86 | 15.933 | 1561 | 253.87 | 13.636 |
| G-Matern 5/2 | 1 | 0.01333 | 0.1 | 0.00018 | 0.0067 |
| TNN | 1 | 0.2341 | 4.8 | 0.0548 | 0.1665 |
| Boosted Trees | 1 | 2.6089 | 8.3 | 6.8063 | 1.8832 |
| Bagged Trees | 1 | 0.1113 | 7.4 | 0.0124 | 0.0832 |

The testing phase results in Table 1 reveal significant variations in the error performance criteria across the evaluated models, highlighting the superiority of nonlinear approaches over linear ones. The linear models (SWLR, ILR, and RLR) yield identical or near-identical error metrics during testing. Their RMSE is approximately 15.937, with a relatively high MAPE exceeding 1560 and an MAE of around 13.64. These high error values suggest limited accuracy and the inability of linear models to handle the nonlinear complexities of the dataset effectively. Such performance underscores the restrictive nature of linear models in scenarios where intricate relationships exist between input and output variables. Conversely, the nonlinear models deliver exceptional accuracy with minimal error metrics. The GM5/2 model exhibits the best performance, achieving an RMSE of 0.01333, a negligible MAPE of 0.1, and an MAE of 0.0067, reflecting near-zero deviation from actual values. This model demonstrates its ability to model complex nonlinear patterns with remarkable precision. The TNN also performs well, with an RMSE of 0.2341, a MAPE of 4.8, and an MAE of 0.1665, indicating a strong generalization capability during testing. Ensemble tree-based methods, including BoT and BTNN, perform well but lag behind GM5/2 and TNN.
BTNNs, for instance, achieve a low RMSE of 0.1113 and an MAE of 0.0832, whereas BoTs report noticeably higher errors, with an RMSE of 2.6089 and an MAE of 1.8832. The testing phase results clearly demonstrate the significant advantage of nonlinear models, particularly GM5/2 and TNN, over linear methods in minimizing error. These models are well-suited for capturing the intricate relationships in the dataset, leading to highly accurate predictions. On the other hand, the linear models' higher error metrics indicate their inadequacy for handling complex datasets like those generated by TG analysis. This comparison emphasizes the importance of adopting advanced machine learning techniques for applications requiring high precision and reliability.

The scatter plots in Fig. 5 illustrate the relationship between observed and predicted weight percentages (W%) for both the training (green background) and testing (red background) phases, offering insights into model performance. In the training phase, the data points closely align with the diagonal line, indicating high predictive accuracy and minimal deviation across models, especially for the NNE. The plots from the testing phase exhibit a similar pattern but reveal a slight increase in dispersion, which reflects the models' capacity to generalize to unseen data. The NNE consistently shows superior alignment in both phases, further supporting its robustness and reliability as emphasized in the manuscript. These visualizations highlight the effectiveness of the machine learning models, particularly NNE, in accurately capturing the thermal characteristics of Dawakin Tofa clay. They reinforce the study's conclusions regarding the potential of advanced models to enhance sustainable energy applications.

Fig. 6 illustrates the predictive capabilities of the studied models in estimating the relationship between weight percentage (%) and intensity.
The linear models (SWLR, RLR, and ILR) display gradual and consistent trends; however, they lack the flexibility necessary to account for non-linear relationships within the data. In contrast, the non-linear models (GM5/2, TNN, BoT, and BTNN) exhibit enhanced adaptability, effectively capturing more complex curves, especially at elevated weight percentages. The GM5/2 model demonstrates a strong alignment with the observed intensity values, indicating its robustness. The color gradient on the right serves as a visual representation of model hierarchy, with darker shades indicating non-linear models and lighter shades representing linear approaches. This visualization highlights the superiority of non-linear models in accurately capturing intricate relationships, thereby improving predictive performance.

Fig. 7 presents the cumulative predictive performance of all the tested models (linear and non-linear) in estimating weight percentage (%) as a function of intensity. The area under the curve for each model is illustrated, with the segments beneath the curves representing the error or deviation for each respective model. The GM5/2 model encompasses the largest area, indicating strong performance and excellent alignment with the observed data. Non-linear models such as TNN and BoT exhibit greater adaptability and enhanced predictive accuracy compared to linear models (SWLR, RLR, and ILR), as demonstrated by their closer fit to the cumulative trend and reduced deviations. This graph visually reinforces the manuscript's conclusions that non-linear models, particularly GM5/2 and TNN, surpass linear approaches in effectively capturing the complex thermal behavior of Dawakin Tofa clay. Additionally, the layering emphasizes the potential for error reduction through the application of advanced ensemble methods.
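The simple-average (SAE) and weighted-average (WAE) ensembles combine the predictions of individual models point by point. The following is a minimal sketch under stated assumptions: the function names and the sample predictions are hypothetical, and the 0.7/0.3 weights mirror the 70%/30% split reported in Section 4.1 rather than reproducing the authors' implementation.

```python
def simple_average(predictions):
    """SAE: average the predictions of several models point by point."""
    if not predictions:
        raise ValueError("at least one prediction series is required")
    n_models = len(predictions)
    return [sum(column) / n_models for column in zip(*predictions)]


def weighted_average(predictions, weights):
    """WAE: weighted point-by-point combination; weights are normalised to sum to 1."""
    if len(predictions) != len(weights) or not predictions:
        raise ValueError("need exactly one weight per prediction series")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return [
        sum(w * p for w, p in zip(weights, column)) / total
        for column in zip(*predictions)
    ]


# Hypothetical predictions of weight percentage from two models.
nonlinear_pred = [99.0, 95.0, 90.0]
linear_pred = [97.0, 96.0, 92.0]

sae = simple_average([nonlinear_pred, linear_pred])
wae = weighted_average([nonlinear_pred, linear_pred], [0.7, 0.3])
```

With equal weights the weighted average reduces to the simple average, so the two ensembles differ only through the choice of weights.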
4.3 Predictive results of single models' ensemble

The results in Table 2 showcase the predictive performance of the ensemble models (SAE, WAE, and NNE) in both the training and testing phases, emphasizing their goodness of fit through the Determination Coefficient (DC), the Pearson Correlation Coefficient (PCC), and regression equations. During the training phase, SAE and WAE exhibit identical performance, with DC = 0.8908 and PCC = 0.9438, accompanied by the regression equation Y = 0.7065X + 18.911, indicating a proportional but suboptimal fit between predicted and observed values. In contrast, NNE achieves perfect metrics, with DC = PCC = 1 and a regression equation Y = 1X + 0.00007, reflecting an almost ideal alignment with the training data. In the testing phase, SAE and WAE improve significantly, achieving DC = 0.9982 and PCC = 0.9991, with their regression equation shifting to Y = 7.0486X − 17.916, a steeper slope that suggests heightened sensitivity to the testing data and potentially signals overcompensation. Meanwhile, NNE maintains its superior performance with DC = 0.999 and PCC = 0.999, along with a regression equation Y = 0.9996X + 0.0003, which demonstrates its ability to generalize accurately without overfitting. These results reveal NNE's robustness and adaptability, while SAE and WAE, despite showing near-perfect results in the testing phase, display weaker training-phase performance and steeper slopes in the testing phase, raising potential concerns about scaling issues or overcompensation that might require further investigation.
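The source does not give the formula used for DC. The sketch below, with hypothetical names and data, computes PCC, the least-squares line of predicted (Y) on observed (X) values, and two common candidates for DC: the squared correlation and the error-based form 1 − SSE/SST. Treating X as observed and Y as predicted is an assumption.

```python
import math


def goodness_of_fit(observed, predicted):
    """Return PCC, two DC candidates, and the fitted line Y = slope * X + intercept."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(observed)
    mean_x = sum(observed) / n
    mean_y = sum(predicted) / n
    sxx = sum((x - mean_x) ** 2 for x in observed)
    syy = sum((y - mean_y) ** 2 for y in predicted)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(observed, predicted))
    if sxx == 0 or syy == 0:
        raise ValueError("observed and predicted values must each vary")

    pcc = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx                      # least-squares slope of Y on X
    intercept = mean_y - slope * mean_x

    # Error-based DC penalises any departure from the 1:1 line.
    sse = sum((x - y) ** 2 for x, y in zip(observed, predicted))
    dc_error_based = 1.0 - sse / sxx

    return {
        "PCC": pcc,
        "DC_squared_pcc": pcc ** 2,
        "DC_error_based": dc_error_based,
        "slope": slope,
        "intercept": intercept,
    }


# Hypothetical data: predictions are perfectly correlated but mis-scaled (Y = 2X + 1).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [3.0, 5.0, 7.0, 9.0]
fit = goodness_of_fit(observed, predicted)
```

For SAE and WAE the reported values satisfy DC ≈ PCC² (0.9438² ≈ 0.8908 and 0.9991² ≈ 0.9982), which would be consistent with the squared-correlation form. That form is insensitive to slope and intercept, so a high DC can coexist with a regression slope far from 1, as the mis-scaled example illustrates.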
Table 2: Ensemble results based on goodness of fit for single models

Training Phase

| Model | DC | PCC | Equation |
|---|---|---|---|
| SAE | 0.8908 | 0.9438 | Y = 0.7065X + 18.911 |
| WAE | 0.8908 | 0.9438 | Y = 0.7065X + 18.911 |
| NNE | 1 | 1 | Y = 1X + 0.00007 |

Testing Phase

| Model | DC | PCC | Equation |
|---|---|---|---|
| SAE | 0.9982 | 0.9991 | Y = 7.0486X − 17.916 |
| WAE | 0.9982 | 0.9991 | Y = 7.0486X − 17.916 |
| NNE | 0.999 | 0.999 | Y = 0.9996X + 0.0003 |

The percentage differences between the ensemble models during the testing phase highlight minor variations in DC and PCC but significant disparities in their regression slopes. SAE and WAE achieve an identical DC of 0.9982, while NNE slightly outperforms them with a DC of 0.999, a difference of 0.08% that indicates a marginally better fit for NNE. For PCC, SAE and WAE both reach 0.9991, whereas NNE reaches 0.999, a negligible difference of 0.01% that reflects comparable correlations between predicted and actual values across all models. However, the regression slopes reveal substantial differences: SAE and WAE exhibit a steep slope of 7.0486, whereas NNE maintains a near-ideal slope of 0.9996, an 85.82% difference relative to the SAE and WAE slope. This stark contrast suggests that SAE and WAE may overcompensate or exhibit heightened sensitivity to the testing data, potentially impacting scalability, while NNE demonstrates superior robustness and alignment with the data. Despite these variations in regression behavior, the minor differences in DC and PCC reinforce the high predictive reliability and accuracy of all three models during the testing phase.

Fig. 8 compares the observed W% with the predictions generated by the NNE, SAE, and WAE models. The "Observed W" curve serves as a reference point, illustrating the actual thermal behavior of Dawakin Tofa clay. The "Simulated-NNE" curve closely aligns with the observed data, indicating its superior predictive accuracy and capacity to capture complex patterns.
In contrast, the simulations from SAE and WAE exhibit more significant deviations, particularly at the curve's tail, reflecting reduced precision in their predictions. These visualizations support the manuscript's conclusions, highlighting the robustness and reliability of NNE in modeling thermal properties and its potential to enhance sustainable energy applications.

4.4 Predictive results of AI models ensemble

Table 3 provides the error metrics (MSE and RMSE) for the ensemble models SAE, WAE, and NNE during both the training and testing phases, focusing on their ability to fit nonlinear models. The results highlight significant differences in error performance across the models. In the training phase, NNE stands out with exceptional performance, achieving an MSE of 0.000212 and an RMSE of 0.01456, indicating almost negligible errors and a near-perfect fit to the training data. In contrast, SAE performs moderately, with an MSE of 0.521145 and an RMSE of 0.721903, reflecting a reasonable but imperfect alignment with the data. WAE, however, performs poorly in the training phase, with an extremely high MSE of 3568.649 and an RMSE of 59.7382, indicating significant deviations and a weak ability to fit the training data. This suggests that WAE struggles to capture the underlying patterns in the training dataset compared to SAE and NNE. In the testing phase, all models improve significantly, reflecting better generalization. NNE again delivers the best performance, with an MSE of 0.0001696 and an RMSE of 0.01302, demonstrating its superior ability to generalize with minimal error and maintain robustness. SAE follows with an MSE of 0.004406 and an RMSE of 0.066377, showing substantial improvement over its training phase and confirming its reliability in capturing the data structure with moderate accuracy.
WAE also improves in the testing phase, achieving an MSE of 0.034514 and an RMSE of 0.18578, but it still lags behind SAE and NNE, indicating that while its predictive power is enhanced during testing, it remains less reliable than the other models. Overall, the comparison reveals that NNE is the most accurate and consistent model across both phases, exhibiting the lowest error metrics and the strongest fit to both training and testing datasets. SAE demonstrates moderate performance, offering reliable generalization despite higher errors than NNE, while WAE, with its exceptionally high errors in the training phase, performs relatively poorly but shows some improvement in the testing phase. This highlights the dominance of NNE in nonlinear model fitting and suggests its robustness and adaptability make it the optimal choice for applications requiring high precision.

Table 3: Ensemble results based on error of fit for nonlinear models

| Model | Training MSE | Training RMSE | Testing MSE | Testing RMSE |
|---|---|---|---|---|
| SAE | 0.521145 | 0.721903 | 0.004406 | 0.066377 |
| WAE | 3568.649 | 59.7382 | 0.034514 | 0.18578 |
| NNE | 0.000152 | 0.012337 | 0.000254 | 0.015946 |

The bar graph in Fig. 9 illustrates the behaviour of the three ensemble models (SAE, WAE, and NNE) with respect to predictive accuracy and error metrics. The NNE demonstrates superior performance, achieving the highest predictive accuracy and exhibiting minimal error bars, which signifies its robustness and reliability in capturing the thermal characteristics of Dawakin Tofa clay. In contrast, SAE shows moderate performance, delivering acceptable yet less consistent results, as evidenced by its slightly larger error margins. Although WAE employs a weighted methodology, it exhibits considerable variability and overall weaker performance relative to NNE, with some improvements noted in specific cases.
These results align with the manuscript's conclusions, emphasizing the enhanced effectiveness of non-linear and neural network-based ensembles in accurately predicting the material's sensible heat storage potential. This visualization highlights the importance of model selection in optimizing performance for sustainable energy storage applications.

4.5 Comparison of the results

The results across Tables 1, 2, and 3 provide a holistic comparison of the predictive performance of the ensemble models (SAE, WAE, and NNE) for nonlinear modeling tasks. These models were assessed based on their goodness of fit (DC, PCC, and regression equations) and error metrics (MSE and RMSE) during both the training and testing phases, highlighting key differences in their predictive capabilities. NNE consistently outperforms the other models, demonstrating exceptional accuracy and reliability. Its goodness of fit metrics in Table 2 show near-perfect performance, with DC and PCC both equal to 1 in the training phase and 0.999 in the testing phase. Furthermore, its regression equation (Y = 1X + 0.00007 in training and Y = 0.9996X + 0.0003 in testing) reflects an almost ideal alignment between predicted and observed values. The error metrics in Table 3 further confirm its dominance, with MSE = 0.000212 in training and 0.0001696 in testing, and RMSE = 0.01456 and 0.01302, respectively. These results indicate NNE's strong ability to capture complex nonlinear relationships while maintaining robustness and generalizability across datasets. SAE shows moderate performance, with significant improvements in the testing phase. Its goodness of fit metrics (DC = 0.9982, PCC = 0.9991) and regression slope in Table 2 indicate reliable generalization, though its alignment is less precise than that of NNE.
The error metrics in Table 3 reveal an MSE=0.004406 and RMSE=0.066377 in the testing phase, which, while significantly better than its training phase performance (MSE = 0.521145, RMSE = 0.721903), still lag behind those of NNE. SAE demonstrates a strong capacity to generalize predictions despite its relatively weaker performance in capturing the finer details of the data during training. WAE exhibits the weakest performance among the three models, particularly during the training phase, where it struggles with a massive MSE=3568.649 and RMSE=59.7382. While Table 2 shows that WAE achieves comparable goodness of fit metrics (DC = 0.9982, PCC = 0.9991) to SAE in the testing phase, its regression slope and higher error metrics in Table 3 (MSE = 0.034514, RMSE = 0.18578) suggest that its performance, though improved in testing, is less reliable. WAE's steep regression slope in testing also raises concerns about over-sensitivity to the data, highlighting its limitations in capturing complex relationships. In summary, NNE emerges as the most robust and reliable model across all phases, with minimal errors, near-perfect goodness of fit, and the ability to generalize accurately. SAE provides a viable alternative for applications requiring acceptable accuracy, especially given its strong generalization capabilities in the testing phase. In contrast, WAE's variability and weaker performance, particularly in the training phase, make it less suitable for tasks requiring precision. These findings underscore the importance of selecting advanced ensemble models like NNE for complex nonlinear predictive tasks, ensuring both accuracy and robustness for practical applications such as modeling the heat storage potential of materials. The study focuses on predicting the sensible heat storage potential of Dawakin Tofa clay using advanced machine learning models, a task with significant environmental implications. 
Sensible heat storage systems, especially those utilizing natural and locally abundant materials like Dawakin Tofa clay, play a critical role in enhancing energy efficiency and sustainability in thermal management applications. By accurately predicting the heat storage capacity of this material, the study contributes to the broader goal of optimizing renewable energy systems, reducing dependency on fossil fuels, and promoting energy conservation. One key environmental benefit of this study lies in its contribution to energy storage systems, which are vital for integrating renewable energy sources such as solar and wind into the energy grid. Efficient heat storage materials ensure that excess energy generated during peak production periods can be stored and used later, reducing energy wastage and enhancing the reliability of renewable energy systems. By leveraging locally available clay, the study also supports the use of sustainable and low-cost materials, minimizing the environmental footprint associated with synthetic or imported alternatives. Moreover, the application of machine learning models to predict material performance ensures a more efficient research and development process. Traditional experimental methods for evaluating thermal properties are often resource-intensive, requiring extensive time, energy, and raw materials. The use of advanced predictive tools reduces the need for excessive experimentation, thereby conserving resources and decreasing laboratory emissions. This approach aligns with the principles of green engineering, emphasizing resource efficiency and minimal environmental impact. The study also highlights the potential for utilizing clay, a natural and abundant material, as a sustainable solution for TES. This reduces reliance on environmentally harmful materials and promotes the use of renewable resources. 
Additionally, the focus on sensible heat storage supports the development of passive energy systems, which are known for their low environmental impact compared to active energy systems that require additional energy inputs for operation. Finally, this research contributes to the fight against climate change by promoting sustainable thermal energy storage solutions that can lower greenhouse gas emissions. Efficient heat storage systems enable better utilization of renewable energy, reducing reliance on carbon-intensive energy sources. Furthermore, by enhancing the performance of natural materials through predictive modeling, the study supports the development of sustainable infrastructure for energy storage, which is crucial for achieving global energy transition goals. In conclusion, this study's environmental implications are far-reaching. It not only advances the scientific understanding of natural heat storage materials but also supports sustainable practices in energy storage, resource conservation, and the promotion of renewable energy technologies. By integrating machine learning with material science, this research paves the way for innovative, eco-friendly solutions to address global energy and environmental challenges.

Conclusion

This study demonstrated the successful application of advanced machine learning models, including SAE, WAE, and NNE, to predict the sensible heat storage potential of Dawakin Tofa clay using TGA data. Among the models, NNE consistently outperformed others, achieving near-perfect predictive performance in both training and testing phases, with exceptionally low error metrics (MSE = 0.000212 and RMSE = 0.01456 in training, and MSE = 0.0001696 and RMSE = 0.01302 in testing). SAE showed moderate accuracy and reliable generalization, while WAE displayed significant variability and weaker performance, particularly during training, despite improved results in the testing phase.
These findings highlight the critical role of robust nonlinear models like NNE in capturing the complex thermal behavior of materials and their potential for practical applications. The environmental implications of this study are substantial, as it emphasizes the use of Dawakin Tofa clay, a natural and locally abundant material, in the development of sustainable energy storage systems. By leveraging machine learning, the research reduces the need for resource-intensive experimental evaluations, promoting resource efficiency, lowering emissions, and aligning with green engineering principles. Moreover, the use of clay as a heat storage material offers a cost-effective and environmentally friendly alternative to synthetic materials, supporting renewable energy integration, reducing energy wastage, and contributing to the mitigation of greenhouse gas emissions. To build on these findings, several recommendations and directions for future work are proposed. Material optimization should be prioritized by exploring surface treatments or composite formation to enhance the thermal properties of Dawakin Tofa clay, making it more efficient for practical applications. Pilot-scale studies are necessary to test the scalability and real-world performance of clay-based thermal energy storage systems, especially in renewable energy applications such as solar or industrial thermal systems. Further, integrating clay into solar thermal systems could unlock its full potential by demonstrating its ability to store and redistribute energy efficiently. Expanding the dataset by including a broader range of clay samples from different regions is essential for validating the predictive models and improving their generalizability. Future research should also explore hybrid machine learning models or deep learning architectures to further refine predictive accuracy and minimize error metrics. 
Additionally, a comprehensive economic analysis should be undertaken to assess the feasibility and scalability of using clay in thermal energy storage, along with a life cycle analysis (LCA) to evaluate its environmental trade-offs and benefits comprehensively. The study showcases the transformative potential of combining natural materials with cutting-edge machine-learning techniques to advance sustainable energy storage solutions. By addressing the limitations of traditional experimental approaches and emphasizing cost-effective and environmentally conscious alternatives, this research paves the way for developing efficient, scalable, and eco-friendly systems that align with global sustainability goals. Expanding on these findings through material optimization, advanced modeling, and practical implementation will further contribute to achieving a sustainable energy future. References Ahmed, S. F., Alam, M. S. Bin, Hassan, M., Rozbu, M. R., Ishtiak, T., Rafa, N., Mofijur, M., Shawkat Ali, A. B. M., & Gandomi, A. H. (2023). Deep learning modelling techniques: current progress, applications, advantages, and challenges. In Artificial Intelligence Review (Vol. 56, Issue 11). Springer Netherlands. https://doi.org/10.1007/s10462-023-10466-8 Amir, M., Deshmukh, R. G., Khalid, H. M., Said, Z., Raza, A., Muyeen, S. M., Nizami, A. S., Elavarasan, R. M., Saidur, R., & Sopian, K. (2023). Energy storage technologies: An integrated survey of developments, global economical/environmental effects, optimal scheduling model, and sustainable adaption policies. Journal of Energy Storage , 72 (PE), 108694. https://doi.org/10.1016/j.est.2023.108694 Baghbani, A., Abuel-Naga, H., & Shirkavand, D. (2023). Accurately Predicting Quartz Sand Thermal Conductivity Using Machine Learning and Grey-Box AI Models. Geotechnics , 3 (3), 638–660. https://doi.org/10.3390/geotechnics3030035 Bang, H. T., Yoon, S., & Jeon, H. (2020). 
Application of machine learning methods to predict a thermal conductivity model for compacted bentonite. Annals of Nuclear Energy , 142 , 107395. https://doi.org/10.1016/j.anucene.2020.107395 Barber, D. (2012). Statistics for machine learning. In Bayesian Reasoning and Machine Learning . https://doi.org/10.1017/cbo9780511804779.012 Bisu, A. A., Ahmed, T. G., Ahmad, U. S., & Maiwada, A. D. (2024). A SWOT Analysis Approach for the Development of Photovoltaic (PV) Energy in Northern Nigeria. Cleaner Energy Systems , 9 (June), 100128. https://doi.org/10.1016/j.cles.2024.100128 Bottmer, L., Croux, C., & Wilms, I. (2022). Sparse regression for large data sets with outliers. European Journal of Operational Research , 297 (2), 782–794. https://doi.org/10.1016/j.ejor.2021.05.049 Chekifi, T., & Boukraa, M. (2023). CFD applications for sensible heat storage: A comprehensive review of numerical studies. Journal of Energy Storage , 68 (June), 107893. https://doi.org/10.1016/j.est.2023.107893 Darvishvand, L., Safari, V., Kamkari, B., Alamshenas, M., & Afrand, M. (2022). Machine learning-based prediction of transient latent heat thermal storage in finned enclosures using group method of data handling approach: A numerical simulation. Engineering Analysis with Boundary Elements , 143 (June), 61–77. https://doi.org/10.1016/j.enganabound.2022.06.009 de Zarzà, I., de Curtò, J., Hernández-Orallo, E., & Calafate, C. T. (2023). Cascading and Ensemble Techniques in Deep Learning. Electronics (Switzerland) , 12 (15), 1–18. https://doi.org/10.3390/electronics12153354 Hassija, V., Chamola, V., Mahapatra, A., Singal, A., Goel, D., Huang, K., Scardapane, S., Spinelli, I., Mahmud, M., & Hussain, A. (2024). Interpreting Black-Box Models: A Review on Explainable Artificial Intelligence. Cognitive Computation , 16 (1), 45–74. https://doi.org/10.1007/s12559-023-10179-8 Khan, A. A., Chaudhari, O., & Chandra, R. (2024). 
A review of ensemble learning and data augmentation models for class imbalanced problems: Combination, implementation and evaluation. Expert Systems with Applications , 244 (December 2023), 122778. https://doi.org/10.1016/j.eswa.2023.122778 Khan, D. M., Ali, M., Ahmad, Z., Manzoor, S., & Hussain, S. (2021). A New Efficient Redescending M-Estimator for Robust Fitting of Linear Regression Models in the Presence of Outliers. Mathematical Problems in Engineering , 2021 . https://doi.org/10.1155/2021/3090537 Learning, D. (2023). SS symmetry Deep Learning and Neural Networks : Decision-Making . Lin, Y., Endo, Y., Lee, J., & Kamijo, S. (2024). Neural Architecture Search via Trainless Pruning Algorithm: A Bayesian Evaluation of a Network with Multiple Indicators. Electronics (Switzerland) , 13 (22). https://doi.org/10.3390/electronics13224547 Maiwada, A. D., & Abba, S. I. (2024). Development of a Hybrid Intelligence Algorithm to Estimate the Derivative Weight of Dawakin Tofa Clay for Heat Storage. 1 (2). Mansa, R., & Zou, S. (2021). Thermogravimetric analysis of microplastics: A mini review. Environmental Advances , 5 , 100117. https://doi.org/10.1016/j.envadv.2021.100117 Martinelli, D. D. (2022). Generative machine learning for de novo drug discovery: A systematic review. Computers in Biology and Medicine , 145 (March), 105403. https://doi.org/10.1016/j.compbiomed.2022.105403 McNamara, M. E., Zisser, M., Beevers, C. G., & Shumake, J. (2022). Not just “big” data: Importance of sample size, measurement error, and uninformative predictors for developing prognostic models for digital interventions. Behaviour Research and Therapy , 153 (June 2021), 104086. https://doi.org/10.1016/j.brat.2022.104086 Miller, A., Panneerselvam, J., & Liu, L. (2022).
A review of regression and classification techniques for analysis of common and rare variants and gene-environmental factors. Neurocomputing , 489 , 466–485. https://doi.org/10.1016/j.neucom.2021.08.150 Mobarak, M. H., Mimona, M. A., Islam, M. A., Hossain, N., Zohura, F. T., Imtiaz, I., & Rimon, M. I. H. (2023). Scope of machine learning in materials research—A review. Applied Surface Science Advances , 18 (November), 100523. https://doi.org/10.1016/j.apsadv.2023.100523 Ngu, J. C. Y., Yeo, W. S., Thien, T. F., & Nandong, J. (2024). A comprehensive overview of the applications of kernel functions and data-driven models in regression and classification tasks in the context of software sensors. Applied Soft Computing , 164 (July), 111975. https://doi.org/10.1016/j.asoc.2024.111975 Nurazzi, N. M., Asyraf, M. R. M., Rayung, M., Norrrahim, M. N. F., Shazleen, S. S., Rani, M. S. A., Shafi, A. R., Aisyah, H. A., Radzi, M. H. M., Sabaruddin, F. A., Ilyas, R. A., Zainudin, E. S., & Abdan, K. (2021). Thermogravimetric analysis properties of cellulosic natural fiber polymer composites: A review on influence of chemical treatments. Polymers , 13 (16). https://doi.org/10.3390/polym13162710 Papazafeiropoulos, G. (2024). Stepwise Regression for Increasing the Predictive Accuracy of Artificial Neural Networks: Applications in Benchmark and Advanced Problems. Modelling , 5 (1), 153–179. https://doi.org/10.3390/modelling5010009 Porcu, E., Bevilacqua, M., Schaback, R., & Oates, C. J. (2024). The Matérn Model: A Journey Through Statistics, Numerical Analysis and Machine Learning. Statistical Science , 39 (3), 469–492. https://doi.org/10.1214/24-STS923 Pravin, P. S., Tan, J. Z. M., Yap, K. S., & Wu, Z. (2022). Hyperparameter optimization strategies for machine learning-based stochastic energy efficient scheduling in cyber-physical production systems. Digital Chemical Engineering , 4 (July), 100047. https://doi.org/10.1016/j.dche.2022.100047 Sarker, I. H. (2021). 
Machine Learning: Algorithms, Real-World Applications and Research Directions. SN Computer Science , 2 (3), 1–21. https://doi.org/10.1007/s42979-021-00592-x Tiskatine, R., Aharoune, A., Bouirden, L., & Ihlal, A. (2017). Identification of suitable storage materials for solar thermal power plant using selection methodology. Applied Thermal Engineering , 117 , 591–608. https://doi.org/10.1016/j.applthermaleng.2017.01.107 Tomar, D., & Agarwal, S. (2015). Twin Support Vector Machine: A review from 2007 to 2014. Egyptian Informatics Journal , 16 (1), 55–69. https://doi.org/10.1016/j.eij.2014.12.003 Ummah, M. S. (2019). In Sustainability (Switzerland) (Vol. 11, Issue 1). http://scioteca.caf.com/bitstream/handle/123456789/1091/RED2017-Eng-8ene.pdf?sequence=12&isAllowed=y%0Ahttp://dx.doi.org/10.1016/j.regsciurbeco.2008.06.005%0Ahttps://www.researchgate.net/publication/305320484_ Zheng, R., Jia, Y., Ullagaddi, C., Allen, C., Rausch, K., Singh, V., Schnable, J. C., & Kamruzzaman, M. (2024). Optimizing feature selection with gradient boosting machines in PLS regression for predicting moisture and protein in multi-country corn kernels via NIR spectroscopy. Food Chemistry , 456 (June), 140062. https://doi.org/10.1016/j.foodchem.2024.140062 Zhou, L., Pan, S., Wang, J., & Vasilakos, A. V. (2017). Machine learning on big data: Opportunities and challenges. Neurocomputing , 237 (September 2016), 350–361. https://doi.org/10.1016/j.neucom.2017.01.026

Additional Declarations

The authors declare no competing interests.
We do this by developing innovative software and high quality services for the global research community. Our growing team is made up of researchers and industry professionals working together to solve the most critical problems facing scientific publishing. Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-6081166","acceptedTermsAndConditions":true,"allowDirectSubmit":true,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":419196461,"identity":"78b78ae1-a173-49ce-9a50-8f631763f1f4","order_by":0,"name":"Abubakar D. Maiwada","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAABNUlEQVRIie3RPWvCQBjA8SccxCWa9cKJfoULgYj07atEMrgIxcUOHRQCTgHXSr+EULhZCMTl2q4HcYiLk0NcioNDL0lph4S+bB3uP91x/HiOOwCV6j+GQQdY5ysv3wCYhpaWR+vfEitE9A8kj/IfiPkYuumJX99Cw9+n/TnrOjzSyXgEnZbwUHaqGbLlPTsUfn9m7HvUmic2ew50smTgWMLTrbBmjBi52MjkXbDnYkk09mrGpMlgsJIEjKroSmKds6kkw7ec3DwtkJ6TqSToeK4SKgkxRCTJqJgyWDWDgnhUeEBqptjb+M5p8w3VjcME45fEf+ARulgybC/5bk7aVdJJArY7xPfUbAwZwZPkahH6WjJml93Wxo+Oh/qHLpK/Awh/bFD5q9rsG1CmZV9EpVKpVJ+9A7adaOji45ZbAAAAAElFTkSuQmCC","orcid":"","institution":"Material Science and Engineering Department, King Fahd University of Petroleum and Minerals, Dhahran 31261, Saudi Arabia.","correspondingAuthor":true,"prefix":"","firstName":"Abubakar","middleName":"D.","lastName":"Maiwada","suffix":""},{"id":419196462,"identity":"9d7bb0ab-1f97-4d57-8702-a55309c6f356","order_by":1,"name":"Abdullahi A. 
Adamu","email":"","orcid":"","institution":"Mechanical Engineering Department, Bayero University, Kano, Nigeria","correspondingAuthor":false,"prefix":"","firstName":"Abdullahi","middleName":"A.","lastName":"Adamu","suffix":""},{"id":419196463,"identity":"ffcafe36-1b69-429c-af5d-e06b6acde8c4","order_by":2,"name":"Jamilu Usman","email":"","orcid":"","institution":"Interdisciplinary Research Center for Membrane and Water Security, King Fahd University of Petroleum and Minerals, Dhahran 31261, Saudi Arabia","correspondingAuthor":false,"prefix":"","firstName":"Jamilu","middleName":"","lastName":"Usman","suffix":""},{"id":419196464,"identity":"a461dd33-619d-4965-879e-5c6df71b946e","order_by":3,"name":"Umar D. Maiwada","email":"","orcid":"","institution":"Umaru Musa ‘Yar’adua University, Katsina State","correspondingAuthor":false,"prefix":"","firstName":"Umar","middleName":"D.","lastName":"Maiwada","suffix":""},{"id":419196465,"identity":"867c00e0-b779-4884-819e-ccd2b9d9ae63","order_by":4,"name":"Suleiman Abdulrahman","email":"","orcid":"","institution":"Interdisciplinary Research Center for Construction and Building Materials, King Fahd University of Petroleum \u0026 Minerals, Dhahran 31261, Saudi Arabia","correspondingAuthor":false,"prefix":"","firstName":"Suleiman","middleName":"","lastName":"Abdulrahman","suffix":""},{"id":419196466,"identity":"758c9c49-1e07-4e94-8dfc-89b997597246","order_by":5,"name":"Sani I. 
Abba","email":"","orcid":"","institution":"Department of Civil Engineering, Prince Mohammad Bin Fahd University, Al Khobar, 31952, Saudi Arabia","correspondingAuthor":false,"prefix":"","firstName":"Sani","middleName":"I.","lastName":"Abba","suffix":""}],"badges":[],"createdAt":"2025-02-21 16:49:04","currentVersionCode":1,"declarations":{"humanSubjects":false,"vertebrateSubjects":true,"conflictsOfInterestStatement":false,"humanSubjectEthicalGuidelines":false,"humanSubjectConsent":false,"humanSubjectClinicalTrial":false,"humanSubjectCaseReport":false,"vertebrateSubjectEthicalGuidelines":true},"doi":"10.21203/rs.3.rs-6081166/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-6081166/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":77220732,"identity":"df2c194c-ac06-49eb-ba56-284096634df8","added_by":"auto","created_at":"2025-02-26 10:38:06","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":169320,"visible":true,"origin":"","legend":"\u003cp\u003eFlowchart depicting test, analysis and data storage of the methods employed in the study\u003c/p\u003e","description":"","filename":"1.png","url":"https://assets-eu.researchsquare.com/files/rs-6081166/v1/c1fa196c36c40a6895f2520a.png"},{"id":77220734,"identity":"0188fecc-1339-4e01-8959-d8c5c52a5dae","added_by":"auto","created_at":"2025-02-26 10:38:06","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":153328,"visible":true,"origin":"","legend":"\u003cp\u003eSchematic diagram of kernel SVM\u003c/p\u003e","description":"","filename":"2.png","url":"https://assets-eu.researchsquare.com/files/rs-6081166/v1/f1c8b32d5be0a93e13376ccc.png"},{"id":77222082,"identity":"c3d9cd06-1314-4d16-b83d-58a203657e44","added_by":"auto","created_at":"2025-02-26 10:46:06","extension":"png","order_by":3,"title":"Figure 
3","display":"","copyAsset":false,"role":"figure","size":144438,"visible":true,"origin":"","legend":"\u003cp\u003eImage of Boosted Tree Network\u003c/p\u003e","description":"","filename":"3.png","url":"https://assets-eu.researchsquare.com/files/rs-6081166/v1/17abae93b94dfe4bec2af9c4.png"},{"id":77220736,"identity":"846c9b05-a592-41d8-8f3e-00af7cb0bca0","added_by":"auto","created_at":"2025-02-26 10:38:06","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":203861,"visible":true,"origin":"","legend":"\u003cp\u003eImage of Bagged Tree Neural Network\u003c/p\u003e","description":"","filename":"4.png","url":"https://assets-eu.researchsquare.com/files/rs-6081166/v1/97918e39900ef6e648c5b9bf.png"},{"id":77223534,"identity":"b7707c54-8203-49f3-b2bb-5dc06a15cdf5","added_by":"auto","created_at":"2025-02-26 10:54:06","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":214826,"visible":true,"origin":"","legend":"\u003cp\u003eScatter plot between the observed and computed (W%) for (a) training phase (b) testing phase\u003c/p\u003e","description":"","filename":"5.png","url":"https://assets-eu.researchsquare.com/files/rs-6081166/v1/1c94d6ae216b9c6f741970a1.png"},{"id":77222084,"identity":"40bca78c-a735-477d-a311-e37a48528ff4","added_by":"auto","created_at":"2025-02-26 10:46:06","extension":"png","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":89785,"visible":true,"origin":"","legend":"\u003cp\u003ePerformance of Linear and Non-Linear Models for Predicting Weight Percentage vs. 
Intensity Relationship\u003c/p\u003e","description":"","filename":"6.png","url":"https://assets-eu.researchsquare.com/files/rs-6081166/v1/4c2b72ff189cf7fdba3fae39.png"},{"id":77222085,"identity":"c2bd3eb5-935f-4ace-b779-345df375e204","added_by":"auto","created_at":"2025-02-26 10:46:06","extension":"png","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":213393,"visible":true,"origin":"","legend":"\u003cp\u003eGraph of Linear and Non-Linear Models against Percentage Weight\u003c/p\u003e","description":"","filename":"7.png","url":"https://assets-eu.researchsquare.com/files/rs-6081166/v1/5671d1337d5728e41d81e4eb.png"},{"id":77220740,"identity":"a9b0c954-4b43-42c2-9e94-8df39f8eba8a","added_by":"auto","created_at":"2025-02-26 10:38:06","extension":"png","order_by":8,"title":"Figure 8","display":"","copyAsset":false,"role":"figure","size":40723,"visible":true,"origin":"","legend":"\u003cp\u003ePlots of\u003cstrong\u003e \u003c/strong\u003eComparative Performance between W% and Simulated Models\u003c/p\u003e","description":"","filename":"8.png","url":"https://assets-eu.researchsquare.com/files/rs-6081166/v1/2fe42b514a9c4a7a15f2953b.png"},{"id":77222089,"identity":"0e073ddd-4f7d-4bdd-9f27-36068758ad8e","added_by":"auto","created_at":"2025-02-26 10:46:06","extension":"png","order_by":9,"title":"Figure 9","display":"","copyAsset":false,"role":"figure","size":46959,"visible":true,"origin":"","legend":"\u003cp\u003eBar graphs of\u003cstrong\u003e \u003c/strong\u003eerror of ensemble models\u003c/p\u003e","description":"","filename":"9.png","url":"https://assets-eu.researchsquare.com/files/rs-6081166/v1/b7812630fbb810d3c85f9807.png"},{"id":77225223,"identity":"d8a27260-98c6-448d-880a-9b03153fa2a6","added_by":"auto","created_at":"2025-02-26 
11:10:07","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1760430,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-6081166/v1/695f3383-fa46-4a86-adc1-f0b0ca1df39b.pdf"}],"financialInterests":"The authors declare no competing interests.","formattedTitle":"\u003cp\u003e\u003cstrong\u003eMachine learning-enhanced prediction of sensible heat storage potential in Kano-Nigeria based on thermogravimetric analysis\u003c/strong\u003e\u003c/p\u003e","fulltext":[{"header":"Introduction","content":"\u003cp\u003eThe rising global energy need, along with the urgent requirement for energy systems that are sustainable, has led to increased interest in thermal energy storage (TES) technologies (Amir et al., 2023). Sensible heat storage (SHS), which stores and releases heat through temperature changes in materials, has become a workable and scalable option for different uses, such as solar power plants, heating and cooling in buildings, and industrial operations (Chekifi \u0026amp; Boukraa, 2023). Among the materials being studied for SHS, natural clays are of interest because they are readily available, inexpensive, and environmentally friendly. However, how well clays work as TES materials relies heavily on their thermal properties (Tiskatine et al., 2017), which means there is a need for precise and effective models to assess their performance.\u003c/p\u003e\n\u003cp\u003eAccording to (Bisu et al., 2024), Nigeria, recognized as one of the most populous nations in Africa, is endowed with a wealth of natural resources, particularly substantial reserves of oil and gas. Despite these advantages, the country grapples with a persistent energy crisis that adversely affects both urban and rural populations. 
This ongoing challenge has catalyzed a renewed emphasis on sustainable development, prompting an increased focus on renewable energy solutions. Renewable energy sources, characterized by their environmental sustainability and abundance, are especially prevalent in the Northern regions of Nigeria. The potential for harnessing solar, wind, and biomass energy in these areas presents a significant opportunity for diversifying the nation\u0026rsquo;s energy portfolio and reducing its reliance on fossil fuels. The transition towards renewable energy in Nigeria is influenced by a multifaceted interplay of policy frameworks, socio-economic factors, and technological advancements. Policymakers are increasingly recognizing the importance of integrating renewable energy into the national energy strategy to address the energy deficit and promote economic growth. Furthermore, socio-economic dynamics, including population growth and urbanization, necessitate urgent action to meet the rising energy demands sustainably. Technological progress plays a crucial role in this transition, as innovations in renewable energy technologies enhance efficiency and reduce costs, making these solutions more accessible. This convergence of policy, socio-economic needs, and technological advancements underscores the urgency of adopting renewable energy strategies to foster sustainable development in Nigeria.\u003c/p\u003e\n\u003cp\u003eDawakin Tofa clay, which is abundant in the Dawakin Tofa Local Government Area of Kano State, Nigeria, is a promising low-cost material for SHS because of its mineral composition and ability to withstand heat (Maiwada \u0026amp; Abba, 2024). Using it in TES systems requires careful characterization of its heat transfer behavior, heat capacity, and thermal stability. 
Traditional methods such as thermogravimetric analysis (TGA), differential scanning calorimetry (DSC), and thermal conductivity measurement are dependable, but they are time-consuming, resource-intensive, and often sample-specific (Mansa \u0026amp; Zou, 2021). It is therefore important to develop models that can predict the thermal performance of Dawakin Tofa clay. TGA is an established method for characterizing how materials behave under heat. It provides information on how materials decompose, how much moisture they contain, and how stable they are when heated (Nurazzi et al., 2021). In this work, TGA data serve as the main input to the machine learning (ML) models, forming the basis for predicting the SHS potential of Dawakin Tofa clay. Key thermal descriptors, such as weight loss at given temperatures and stability measures, are extracted from the TGA curves and supplied to the predictive models. ML methods have changed predictive modeling in materials science because they can capture complicated, non-linear relationships between variables. By learning from existing data, ML models can uncover hidden trends and make accurate predictions with little additional experimentation (Mobarak et al., 2023). Although interest in ML for thermal energy storage (TES) is increasing, research on the sensible heat storage (SHS) potential of natural clays, especially Dawakin Tofa clay, remains limited.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eThis study aims to address this gap by combining ML algorithms with TGA data to improve predictions of Dawakin Tofa clay\u0026apos;s SHS potential. To improve prediction accuracy, a range of ML algorithms is used, including linear regression (LR), support vector machines (SVM), and deep learning models such as artificial neural networks (ANNs). Feature selection techniques identify the most important input variables, while hyperparameter optimization finds the best configuration for each algorithm. 
The performance of the models is assessed using common statistical measures: mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and the coefficient of determination (R\u0026sup2;). Data preprocessing consists of cleaning and rescaling the TGA-derived thermal parameters so that they are suitable for ML models. Outliers and missing data are handled using imputation methods and basic statistical checks. The data are split into training and testing sets in an 80:20 ratio, which helps ensure an unbiased evaluation. All ML models are trained with k-fold cross-validation to reduce overfitting. The models are tuned by hyperparameter optimization using grid search and Bayesian optimization. Performance metrics are tracked throughout to identify the best algorithms. MSE, RMSE, MAPE, and MAE are used to evaluate prediction errors, while R\u0026sup2; reflects the model\u0026apos;s explanatory ability (Pravin et al., 2022). A comparison of the different ML models identifies the best predictive framework.\u003c/p\u003e\n\u003cp\u003eThis study is innovative as it integrates thermal data obtained from TGA with advanced machine learning algorithms to forecast SHS potential. This method helps overcome limitations in current TES studies by reducing the number of laboratory experiments required, thus allowing for better and broader evaluations of natural clays. The expected outcomes of this work include creating a framework to predict the SHS capacity of Dawakin Tofa clay, identifying the thermal parameters that most affect SHS performance, and showing how ML can support TES materials research. Additionally, this study highlights the wider possibilities of using ML in material design and optimization for sustainable energy. 
\u0026nbsp;This research combines experimental thermal data with advanced machine learning techniques to provide a more precise, effective, and scalable assessment of the sensible heat storage (SHS) capabilities of natural clays. The results aim to help develop cost-effective and sustainable TES systems, aiding the global shift towards cleaner and more robust energy solutions.\u003c/p\u003e\n\u003cp\u003eSeveral studies have used ML methods to predict thermal properties of TES materials, which allows for comparison. For instance, Darvishvand et al. (2022) employed computational fluid dynamics (CFD) simulations alongside machine learning models to predict the melting processes of phase change materials in thermal storage units. The study demonstrated the effectiveness of ANNs in forecasting thermal behaviors in engineered systems. Baghbani et al. (2023) introduced genetic programming (GP) as a grey-box artificial intelligence method for predicting the thermal conductivity of quartz sand. The study demonstrated that CRRF (classification and regression random forest) achieved the highest prediction accuracy, with a coefficient of determination R\u003csup\u003e2\u003c/sup\u003e = 0.993 and MAE = 0.045. GP achieved excellent predictive results with R\u003csup\u003e2\u003c/sup\u003e = 0.986 and MAE = 0.063, supporting its reliability as a grey-box model. ANN, using Levenberg\u0026ndash;Marquardt and Bayesian regularization algorithms, obtained R\u003csup\u003e2\u003c/sup\u003e = 0.916 and MAE = 0.151. In contrast, multiple linear regression (MLR) showed the weakest performance with R\u003csup\u003e2\u003c/sup\u003e = 0.737 and MAE = 0.300.\u003c/p\u003e\n\u003cp\u003eBang et al. (2020) found that traditional linear regression models had an RMSE of 0.1186, but machine learning methods significantly reduced this value. 
Among the tested models, Gaussian Process Regression (GPR) with an exponential kernel achieved an RMSE of 0.07338, while ensemble learning using XGBoost delivered the best performance with an RMSE of 0.07197. These models managed data uncertainty, noise, and outliers more effectively than traditional models, which are often limited by strict statistical assumptions and data distribution constraints. This capability underscores the essential role of machine learning in optimizing energy storage system design and ensuring operational safety. Incorporating TGA data into ML models can enhance predictive abilities compared to past research that relied on basic thermal measurements. The feature selection in this study helps retain only the most important variables, simplifying the model. This is different from the simpler regression models noted in Kim et al. (2018), where too many features led to overfitting. \u0026nbsp;Furthermore, advanced ML techniques, such as the ensemble learning methods (ELMs) used in this study, outperform traditional ML models in terms of prediction accuracy. For example, the RF model in this research showed a lower RMSE than earlier TES material studies that used standard linear regression (SLR). The comparative analysis highlights the distinctiveness of integrating TGA-derived factors with advanced ML models. Whereas prior research has established the potential of ML for predicting the properties of TES materials, this study broadens the application to natural clays with considerable compositional variations, achieving better prediction results through careful model selection and hyperparameter tuning.\u003c/p\u003e\n\u003cp\u003eThe aim of this study is to develop and evaluate advanced machine learning models for accurately predicting the sensible heat storage potential of Dawakin Tofa clay using thermogravimetric analysis data, with a focus on optimizing natural, locally abundant materials for sustainable energy storage. 
The novelty lies in integrating advanced ensemble machine learning techniques, including NNE, SAE, and WAE, to model the thermal properties of clay, offering a resource-efficient alternative to traditional experimental methods. This is one of the first studies to combine thermogravimetric analysis data with machine learning for evaluating natural materials in energy storage applications, demonstrating the transformative potential of predictive modeling in material science. The study contributes by providing a detailed performance comparison of machine learning models, with NNE achieving near-perfect accuracy and outperforming others in capturing the complex thermal behavior of clay. By utilizing Dawakin Tofa clay, the research promotes sustainable, eco-friendly solutions, reducing reliance on synthetic materials and minimizing environmental impacts. It also enhances resource efficiency by reducing experimental requirements and aligns with green engineering principles. Furthermore, the findings lay the groundwork for optimizing clay materials through surface modifications, scaling up for real-world applications, and exploring hybrid models and expanded datasets for improved accuracy and generalizability, advancing the development of sustainable, efficient, and scalable thermal energy storage systems.\u003c/p\u003e"},{"header":"Proposed Intelligent Methods","content":"\u003cp\u003eThe research presents advanced machine learning techniques aimed at predicting the derivative weight of Dawakin Tofa clay, a key thermogravimetric property critical for evaluating its heat storage potential. A dataset of over 5,000 samples was experimentally gathered under controlled laboratory conditions to ensure an accurate representation of the clay\u0026apos;s thermal properties. Extensive preprocessing steps were performed, including data cleaning to handle missing values and outliers, enhancing the dataset\u0026apos;s reliability. 
Important numerical features such as temperature, heating rate, and initial weight were normalized to the [0, 1] range to improve model performance, particularly for scaling-sensitive algorithms like KSVM. Feature selection methods, including correlation analysis and recursive feature elimination (RFE), were employed to isolate the most relevant predictors, simplifying the dataset while enhancing model interpretability. The dataset was divided into training and testing subsets in an 80:20 ratio, with cross-validation applied to fine-tune hyperparameters and mitigate overfitting risks.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eThe proposed hybrid algorithm integrates four linear machine learning models: Interactive Linear Regression (ILR), SWLR, RLR, and KSVM. Four non-linear models were also employed: GM5/2, TNN, BoT, and BTNN. Combining linear and non-linear models capitalizes on their distinct advantages. ILR is adept at managing linear relationships and improves prediction accuracy through iterative refinement. SWLR efficiently identifies relevant features by systematically adding or removing predictors based on their statistical significance, thereby optimizing model performance. RLR offers resilience against outliers, ensuring consistent predictions even in noisy datasets, which enhances reliability. On the non-linear side, GM5/2 is effective for modeling spatial data, providing smoothness and flexibility to capture complex relationships, even in irregular datasets. TNN leverages deep learning capabilities with multiple layers for intricate feature extraction, allowing for adaptive learning. The BoT method combines several weak learners to create a robust predictive model through iterative boosting, which enhances model strength and precision. Lastly, BTNN improves prediction stability and accuracy by aggregating outputs from multiple decision trees through bagging, thus reducing variance and addressing overfitting challenges. 
Final predictions were generated using a weighted ensemble approach, assigning model weights based on cross-validation performance to maximize overall accuracy. This integrated approach not only advances predictive modeling of TG properties but also establishes a reliable framework for applying hybrid machine learning algorithms in material characterization, contributing substantially to research in sustainable energy storage.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eThe flowchart in Fig. 1 above presents a detailed framework for assessing the sensible heat storage capacity of Dawakin Tofa clay, utilizing data obtained from TGA and DTA. Essential parameters such as weight percentage, derivative weight, and temperature are extracted and undergo pre- and post-processing to enhance data quality and interpretability. Baseline predictions are established through linear models, including SWLR, RLR, and ILR. In contrast, more sophisticated non-linear models, such as GM5/2, TNN, BoT, and BTNN, more effectively capture intricate thermal behaviors. Ensemble methods SAE, WAE, and NNE integrate predictions to enhance accuracy, with NNE showing the highest performance. Model assessment through metrics like MSE and RMSE ensures robustness, and predictions that meet established error thresholds yield conclusive results. This comprehensive approach underscores the potential of machine learning in optimizing natural materials for sustainable energy solutions.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eInteractive Linear Regression (ILR)\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis is a dynamic approach in ML and AI that allows real-time interaction between users and the regression model (RM). It builds on traditional linear regression (TLR), which models the relationship between a dependent variable and one or more independent variables using a linear equation (Sarker, 2021). 
The interactive component allows users to modify parameters, visualize model performance, and change data inputs while immediately seeing the effects on predictions. This capability improves the interpretability and comprehension of the model\u0026rsquo;s behavior, making it beneficial in educational settings, data analysis, and decision-making processes. It aids in identifying model biases, enhancing data preprocessing, and optimizing model performance (Hassija et al., 2024). For example, data scientists can interactively adjust feature weights, evaluate residual errors, and refine models for improved prediction accuracy. Applications include business forecasting, healthcare diagnostics, and environmental modeling, where understanding causal relationships is essential. This interactive framework encourages collaboration between AI developers and domain experts, ensuring that models are closely aligned with real-world expectations and boosting trust in AI-driven decisions.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e2.1\u0026nbsp;\u003c/strong\u003e\u003cstrong\u003eStep Wise Linear Regression (SWLR)\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eSWLR is a technique used in machine learning to identify the most significant predictors for a target variable. Unlike standard linear regression, which evaluates all variables simultaneously, SWLR adopts a sequential approach that enhances model performance by reducing overfitting and improving interpretability. This allows researchers to focus on the most relevant variables. The SWLR process generally involves two main strategies: forward selection and backward elimination. In forward selection, the model begins without any predictors and incrementally adds variables one at a time, selecting those that most significantly improve the model\u0026rsquo;s performance. Conversely, backward elimination starts with all potential predictors and systematically removes the least significant ones. 
This iterative process continues until no further improvements can be made, ensuring the final model is both effective and parsimonious. By focusing on a subset of predictors, SWLR facilitates the identification of variables that contribute meaningfully to the prediction task (Miller et al., 2022). SWLR is applied across diverse fields such as finance, healthcare, and social sciences, where understanding the relationships between variables is paramount. By simplifying models, SWLR enhances interpretability, enabling stakeholders to better understand the influence of individual predictors. Moreover, by emphasizing significant variables, the method often improves generalization on unseen data, thereby minimizing the risk of overfitting (Papazafeiropoulos, 2024).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e2.2\u0026nbsp;Robust Linear Regression (RLR)\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eRLR offers a more resilient alternative to TLR by reducing sensitivity to outliers and violations of key assumptions like normality and homoscedasticity. In standard LR, outliers can heavily skew the estimated coefficients, leading to inaccurate conclusions (Ummah, 2019). Robust regression (Rr) tackles this issue by applying methods that limit the influence of outliers, ensuring the model\u0026apos;s stability even when anomalies are present in the data (Bottmer et al., 2022). A widely used approach in Rr is M-estimation, which assigns lower weights to data points with large residuals, minimizing their effect on the model. Other techniques include Least Absolute Deviations (LAD) regression, which minimizes the sum of absolute residuals instead of squared residuals, and RANSAC (Random Sample Consensus), which repeatedly fits models to random subsets of the data to identify the best fit for the majority (D. M. Khan et al., 2021). 
These methods make Rr particularly useful in practical scenarios involving real-world datasets that may contain errors or extreme values, offering more reliable predictions and parameter estimates compared to conventional linear regression (McNamara et al., 2022).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e2.3\u0026nbsp;\u003c/strong\u003e\u003cstrong\u003eKernel Support Vector Machine (KSVM)\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eKSVM enhances the traditional SVM by enabling it to classify non-linearly separable data, as seen in Fig. 2. Conventional SVMs excel at finding an optimal hyperplane for linearly separable datasets (Tomar \u0026amp; Agarwal, 2015) but struggle when no such hyperplane exists; KSVM overcomes this limitation through the use of the kernel trick. This technique implicitly maps data into a higher-dimensional space where linear separation becomes feasible, bypassing the need for explicit transformation calculations. By computing inner products in this transformed space using a kernel function, KSVM effectively classifies data while operating in the original feature space. Commonly used kernels include the polynomial, radial basis function (RBF), and sigmoid kernels (SK), allowing KSVM to handle complex relationships found in tasks like image classification, bioinformatics, and text categorization (Ngu et al., 2024). The model\u0026apos;s strength lies in its ability to learn non-linear decision boundaries while maintaining computational efficiency. However, selecting an appropriate kernel and fine-tuning its parameters are essential for optimal performance in specific applications (Sarker, 2021).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e2.4\u0026nbsp;\u003c/strong\u003e\u003cstrong\u003eG-Matern 5/2\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe GM5/2 model significantly extends the traditional Matern class of covariance functions (CF), which are widely utilized in spatial statistics and geostatistics. 
This specific formulation, characterized by its smoothness and flexibility, is beneficial for modeling spatial phenomena with varying degrees of continuity (Porcu et al., 2024). The GM5/2 function is defined by its parameters, including the range, smoothness, and variance, allowing practitioners to tailor the model to fit empirical data effectively. One of the key advantages of the GM5/2 is its ability to capture complex spatial structures (Ss), making it suitable for environmental science, geophysics, and spatial epidemiology applications. The smoothness parameter, in particular, influences the differentiability of the realizations of the process, providing insights into the underlying spatial correlation. Recent advancements in computational techniques have facilitated the application of the GM5/2 model to large datasets, enhancing its applicability in real-world scenarios (Zhou et al., 2017). Overall, it serves as a robust tool for researchers aiming to understand and interpret spatial data, contributing to more informed decision-making in various scientific fields.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e2.5\u0026nbsp;\u003c/strong\u003e\u003cstrong\u003eTrilayered Neural Network (TNN)\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eTNNs, commonly referred to as three-layer neural networks, are foundational architectures in the field of artificial intelligence and machine learning. These networks consist of three distinct layers: an input layer, a hidden layer, and an output layer. The input layer receives the initial data, while the hidden layer processes this information through weighted connections and activation functions, allowing the network to learn complex patterns. The output layer produces the final predictions or classifications based on the processed information. 
They are often trained using backpropagation, a method that adjusts the weights of the connections based on the error of the output compared to the expected result (Lin et al., 2024). Despite their limitations in handling highly complex datasets compared to deeper networks, trilayered neural networks serve as an excellent starting point for understanding neural network principles. They provide insights into the fundamental mechanisms of learning and generalization. As the field evolves, these networks remain relevant, often serving as benchmarks for more sophisticated architectures, including deep learning models (Learning, 2023). Their importance in both theoretical and practical applications continues to be a subject of active research.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e2.6\u0026nbsp;\u003c/strong\u003e\u003cstrong\u003eBoosted Tree Network (BoT)\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eBoT, illustrated in Fig. 3, is a powerful ensemble learning technique that has gained significant traction in the field of ML due to its robustness and predictive accuracy. This method builds a series of decision trees in a sequential manner, where each new tree corrects the errors made by the previous ones. By focusing on the misclassified instances, boosted trees effectively reduce bias and variance, leading to improved performance on complex datasets (Barber, 2012). One of the most popular algorithms in this domain is the gradient boosting machine (GBM), which optimizes a loss function through gradient descent. Recent advancements, such as XGBoost and LightGBM, have further enhanced the efficiency and scalability of BoT, enabling their application to large-scale datasets and real-time predictions (Zheng et al., 2024). BoT networks are particularly effective in handling various types of data, including structured and unstructured inputs. Their interpretability, combined with feature importance metrics, allows researchers to gain insights into the underlying patterns within the data. 
As the demand for accurate predictive models continues to rise across industries, boosted trees remain a vital area of research, promising ongoing developments and applications in diverse fields such as finance, healthcare, and marketing (Martinelli, 2022).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e2.7\u0026nbsp;\u003c/strong\u003e\u003cstrong\u003eBagged Tree Neural Network (BTNN)\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAs shown in Fig. 4, BTNN offers a blend of ensemble learning and neural network techniques designed to boost predictive accuracy and reliability. At the heart of bagging, or bootstrap aggregating, is the idea of training multiple models on different subsets of data. This helps to reduce variability and combat overfitting, which can often plague machine learning models (de Zarz\u0026agrave; et al., 2023). A study by A. A. Khan et al. (2024) shows that, in a BTNN, several decision trees are trained on randomly selected portions of the training data. Each tree then contributes to the final prediction, usually by averaging their outputs or voting on the most common result. This combination allows researchers to capture intricate, non-linear patterns in the data while still benefiting from the clarity that tree-based models provide. This approach has shown promise in various fields, such as credit scoring, medical diagnosis, and image classification. Its ability to manage high-dimensional data and diverse feature types makes bagged tree neural networks adaptable and powerful tools in machine learning. Ongoing research aims to refine their structure and training methods, paving the way for new solutions across different sectors (Ahmed et al., 2023). 
As machine learning continues to evolve, BTNNs are set to play a vital role in enhancing predictive analytics.\u003c/p\u003e"},{"header":"Results and Discussion","content":"\u003cp\u003e\u003cstrong\u003e4.1 Hyperparameter tuning\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eHyperparameter tuning was essential to optimize the performance of the machine learning models by identifying the best parameter configurations for minimizing errors and maximizing accuracy. For the NNE, the optimal configuration included three hidden layers with [64, 128, 64] neurons, a learning rate of 0.001, and the ReLU activation function, achieving near-perfect metrics with MSE = 0.000212 and RMSE = 0.01456 during training. The SAE, which lacks tunable hyperparameters, benefitted from data normalization to a range of [0, 1], enhancing its performance. For the WAE, tuning focused on the weight distribution among predictions, with the best results obtained by assigning 70% weight to NNE and 30% to linear models, ensuring a balance between precision and generalization. Tree-based models also underwent rigorous tuning, with Boosted Trees achieving optimal performance using 100 estimators, a maximum depth of 5, and a learning rate of 0.05, while Bagged Trees performed best with 50 base estimators and a maximum depth of 7. These tuning strategies ensured each model operated at its highest potential, delivering accurate predictions of the sensible heat storage potential of Dawakin Tofa clay, while validation confirmed their generalizability to unseen datasets.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e4.2 Predictive Results of linear and nonlinear models\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe results in Table 1 highlight a clear distinction between the predictive capabilities of linear and nonlinear ML models for estimating the SHS potential of Dawakin Tofa clay using TG analysis data. 
The linear models, including SWLR, ILR, and RLR, show limited performance, with training and testing phase R\u003csup\u003e2\u003c/sup\u003e values of 0.85 and 0.86, respectively. These models struggle with high RMSE values (~16.37 in training and ~15.937 in testing) and substantial MAPE values (between about 1397 and 1574), reflecting their inability to effectively capture the nonlinear complexities inherent in the data. In contrast, the nonlinear models demonstrate near-perfect accuracy, with R\u003csup\u003e2\u003c/sup\u003e=1 across both phases and significantly lower error metrics. The GM5/2 model stands out as the most precise, achieving near-zero RMSE (0.0139 in training and 0.01333 in testing) and extremely low MAE, underscoring its ability to model nonlinear relationships. The TNN also performs exceptionally well, with higher error rates than GM5/2 but maintaining remarkable generalization and computational efficiency. Ensemble tree-based models, such as BoT and BTNN, exhibit strong predictive power but display higher MAPE than GM5/2 and TNN, particularly in the testing phase; BoT also has the highest RMSE among the nonlinear models, whereas the RMSE of BTNN falls between those of GM5/2 and TNN. These findings underscore the superiority of nonlinear machine learning approaches over LR models for this application, emphasizing the importance of advanced techniques like GM5/2 and TNN for accurately modeling complex thermophysical properties. The robust performance of these models makes them highly suitable for practical applications in materials science, offering precise and reliable predictions critical for optimizing sensible heat storage systems. 
This analysis demonstrates the transformative potential of integrating machine learning techniques in material characterization, enabling researchers to unlock deeper insights and design more efficient energy storage systems.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTable 1:\u0026nbsp;\u003c/strong\u003ePredictive results of linear and nonlinear models\u003c/p\u003e\n\u003cdiv align=\"Left\"\u003e\n \u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\" width=\"407\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd valign=\"bottom\" style=\"width: 102px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 49px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"bottom\" style=\"width: 128px;\"\u003e\n \u003cp\u003eTraining Phase\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"bottom\" style=\"width: 102px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 49px;\"\u003e\n \u003cp\u003eR\u003csup\u003e2\u003c/sup\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 69px;\"\u003e\n \u003cp\u003eRMSE\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 59px;\"\u003e\n \u003cp\u003eMAPE\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003eMSE\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003eMAE\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"bottom\" style=\"width: 102px;\"\u003e\n \u003cp\u003eSWLR\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 49px;\"\u003e\n \u003cp\u003e0.85\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" 
style=\"width: 69px;\"\u003e\n \u003cp\u003e16.37\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 59px;\"\u003e\n \u003cp\u003e1408\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e267.97\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e14.143\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"bottom\" style=\"width: 102px;\"\u003e\n \u003cp\u003eILR\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 49px;\"\u003e\n \u003cp\u003e0.85\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 69px;\"\u003e\n \u003cp\u003e16.37\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 59px;\"\u003e\n \u003cp\u003e1408\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e267.97\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e14.143\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"bottom\" style=\"width: 102px;\"\u003e\n \u003cp\u003eRLR\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 49px;\"\u003e\n \u003cp\u003e0.85\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 69px;\"\u003e\n \u003cp\u003e16.371\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 59px;\"\u003e\n \u003cp\u003e1396.5\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e268\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e14.137\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"bottom\" style=\"width: 102px;\"\u003e\n \u003cp\u003eG-Matern 5/2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" 
style=\"width: 49px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 69px;\"\u003e\n \u003cp\u003e0.0139\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 59px;\"\u003e\n \u003cp\u003e0.1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e0.00019\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e0.00708\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"bottom\" style=\"width: 102px;\"\u003e\n \u003cp\u003eTNN\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 49px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 69px;\"\u003e\n \u003cp\u003e0.26286\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 59px;\"\u003e\n \u003cp\u003e4.5\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e0.0691\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e0.1606\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"bottom\" style=\"width: 102px;\"\u003e\n \u003cp\u003eBoosted Trees\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 49px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 69px;\"\u003e\n \u003cp\u003e2.5639\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 59px;\"\u003e\n \u003cp\u003e7.9\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e6.5735\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e1.8201\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"bottom\" 
style=\"width: 102px;\"\u003e\n \u003cp\u003eBagged Trees\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 49px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 69px;\"\u003e\n \u003cp\u003e0.10749\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 59px;\"\u003e\n \u003cp\u003e5.6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e0.01155\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e0.0799\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"bottom\" style=\"width: 102px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 49px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"bottom\" style=\"width: 128px;\"\u003e\n \u003cp\u003eTesting Phase\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"bottom\" style=\"width: 102px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 49px;\"\u003e\n \u003cp\u003eR\u003csup\u003e2\u003c/sup\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 69px;\"\u003e\n \u003cp\u003eRMSE\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 59px;\"\u003e\n \u003cp\u003eMAPE\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003eMSE\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003eMAE\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"bottom\" style=\"width: 102px;\"\u003e\n \u003cp\u003eSWLR\u003c/p\u003e\n 
\u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 49px;\"\u003e\n \u003cp\u003e0.86\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 69px;\"\u003e\n \u003cp\u003e15.937\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 59px;\"\u003e\n \u003cp\u003e1574.2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e254\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e13.643\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"bottom\" style=\"width: 102px;\"\u003e\n \u003cp\u003eILR\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 49px;\"\u003e\n \u003cp\u003e0.86\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 69px;\"\u003e\n \u003cp\u003e15.937\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 59px;\"\u003e\n \u003cp\u003e1574.2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e254\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e13.643\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"bottom\" style=\"width: 102px;\"\u003e\n \u003cp\u003eRLR\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 49px;\"\u003e\n \u003cp\u003e0.86\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 69px;\"\u003e\n \u003cp\u003e15.933\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 59px;\"\u003e\n \u003cp\u003e1561\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e253.87\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e13.636\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n 
\u003ctr\u003e\n \u003ctd valign=\"bottom\" style=\"width: 102px;\"\u003e\n \u003cp\u003eG-Matern 5/2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 49px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 69px;\"\u003e\n \u003cp\u003e0.01333\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 59px;\"\u003e\n \u003cp\u003e0.1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e0.00018\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e0.0067\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"bottom\" style=\"width: 102px;\"\u003e\n \u003cp\u003eTNN\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 49px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 69px;\"\u003e\n \u003cp\u003e0.2341\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 59px;\"\u003e\n \u003cp\u003e4.8\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e0.0548\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e0.1665\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"bottom\" style=\"width: 102px;\"\u003e\n \u003cp\u003eBoosted Trees\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 49px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 69px;\"\u003e\n \u003cp\u003e2.6089\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 59px;\"\u003e\n \u003cp\u003e8.3\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e6.8063\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd 
valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e1.8832\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"bottom\" style=\"width: 102px;\"\u003e\n \u003cp\u003eBagged Trees\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 49px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 69px;\"\u003e\n \u003cp\u003e0.1113\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 59px;\"\u003e\n \u003cp\u003e7.4\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e0.0124\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e0.0832\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n \u003c/table\u003e\n\u003c/div\u003e\n\u003cp\u003eThe testing phase results in Table 1 reveal significant variations in the error performance criteria across the evaluated models, highlighting the superiority of nonlinear approaches over linear ones. The linear models (SWLR, ILR, and RLR) yield identical or near-identical error metrics during testing. Their RMSE is approximately 15.937, with a relatively high MAPE exceeding 1560, and MAE around 13.64. These high error values suggest limited accuracy and the inability of linear models to handle the nonlinear complexities of the dataset effectively. Such performance underscores the restrictive nature of linear models in scenarios where intricate relationships exist between input and output variables. Conversely, the nonlinear models deliver exceptional accuracy with minimal error metrics. The GM5/2 model exhibits the best performance, achieving an RMSE=0.01333, a negligible MAPE=0.1, and an MAE=0.0067, reflecting near-zero deviation from actual values. This model demonstrates its ability to model complex nonlinear patterns with remarkable precision. 
The TNN also performs well, with an RMSE=0.2341, MAPE=4.8, and MAE=0.1665, indicating a strong generalization capability during testing. Ensemble tree-based methods, including BoT and BTNN, also perform well but lag behind GM5/2. BTNN achieves a low RMSE=0.1113 and MAE=0.0832, both below the corresponding TNN values, although its MAPE=7.4 is higher than that of TNN, whereas BoT reports markedly higher error rates with an RMSE=2.6089 and MAE=1.8832. The testing phase results clearly demonstrate the significant advantage of nonlinear models, particularly GM5/2 and TNN, over linear methods in minimizing error. These models are well-suited for capturing the intricate relationships in the dataset, leading to highly accurate predictions. On the other hand, the linear models\u0026apos; higher error metrics indicate their inadequacy for handling complex datasets like those generated by TG analysis. This comparison emphasizes the importance of adopting advanced machine learning techniques for applications requiring high precision and reliability.\u003c/p\u003e\n\u003cp\u003eThe scatter plots in Fig. 5 illustrate the relationship between observed and predicted weight percentages (W%) for both the training (green background) and testing (red background) phases, offering insights into model performance. In the training phase, the data points closely align with the diagonal line, indicating high predictive accuracy and minimal deviation across models, especially for the NNE. The plots from the testing phase exhibit a similar pattern but reveal a slight increase in dispersion, which reflects the models\u0026apos; capacity to generalize to unseen data. The NNE consistently shows superior alignment in both phases, further supporting its robustness and reliability as emphasized in the manuscript. These visualizations highlight the effectiveness of the machine learning models, particularly NNE, in accurately capturing the thermal characteristics of Dawakin Tofa clay. 
They reinforce the study\u0026apos;s conclusions regarding the potential of advanced models to enhance sustainable energy applications.\u003c/p\u003e\n\u003cp\u003eFig. 6 illustrates the predictive capabilities of the studied models in estimating the correlation between weight percentage (%) and intensity. The linear models (SWLR, RLR, and ILR) display gradual and consistent trends; however, they lack the flexibility necessary to account for non-linear relationships within the data. In contrast, the non-linear models (GM5/2, TNN, BoT, and BTNN) exhibit enhanced adaptability, effectively capturing more complex curves, especially at elevated weight percentages. The GM5/2 model demonstrates a strong alignment with the observed intensity values, indicating its robustness. The color gradient on the right serves as a visual representation of model hierarchy, with darker shades indicating non-linear models and lighter shades representing linear approaches. This visualization highlights the superiority of non-linear models in accurately capturing intricate relationships, thereby improving predictive performance.\u003c/p\u003e\n\u003cp\u003eFig. 7 presents the cumulative predictive performance of all the tested models (linear and non-linear) in estimating weight percentage (%) as a function of intensity. The area under the curve for each model is illustrated, with the segments beneath the curves representing the error or deviation for each respective model. The GM5/2 model encompasses the largest area, indicating strong performance and excellent alignment with the observed data. Non-linear models such as TNN and BoT exhibit greater adaptability and enhanced predictive accuracy compared to linear models (SWLR, RLR, and ILR), as demonstrated by their closer fit to the cumulative trend and reduced deviations. 
This graph visually reinforces the manuscript\u0026apos;s conclusions that non-linear models, particularly\u0026nbsp;GM5/2 and TNN, surpass linear approaches in effectively capturing the complex thermal behavior of Dawakin Tofa clay. Additionally, the layering emphasizes the potential for error reduction through the application of advanced ensemble methods.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e4.3 Predictive results of single models\u0026rsquo; ensemble\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe results in Table 2 showcase the predictive performance of ensemble models: SAE, WAE, and NNE in both the training and testing phases, emphasizing their goodness of fit through Determination Coefficient (DC), Pearson Correlation Coefficient (PCC), and regression equations. During the training phase, SAE and WAE exhibit identical performance with a DC=0.8908 and PCC=0.9438, accompanied by the regression equation Y=0.7065X+18.911, indicating a proportional but suboptimal fit between predicted and observed values. In contrast, NNE achieves perfect metrics, with a DC=PCC=1, and a regression equation Y=1X+0.00007, reflecting an almost ideal alignment with the training data. In the testing phase, SAE and WAE improve significantly, achieving a DC=0.9982 and PCC=0.9991, with their regression equation shifting to Y=7.0486X\u0026minus;17.916, reflecting a steeper slope that suggests heightened sensitivity to the testing data, potentially signaling overcompensation. Meanwhile, NNE maintains its superior performance with a DC=0.999 and PCC=0.999, along with a regression equation Y=0.9996X+0.0003, which demonstrates its ability to generalize accurately without overfitting. 
These results reveal NNE\u0026rsquo;s robustness and adaptability, while SAE and WAE, despite showing near-perfect results in the testing phase, display weaker training-phase performance and steeper slopes in the testing phase, raising potential concerns about scaling issues or overcompensation that might require further investigation.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTable 2:\u003c/strong\u003e Ensemble results based on goodness of fit for single models\u003c/p\u003e\n\u003cdiv align=\"Left\"\u003e\n \u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\" width=\"336\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"4\" valign=\"bottom\" style=\"width: 336px;\"\u003e\n \u003cp\u003eTraining Phase\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003eDC\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003ePCC\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 144px;\"\u003e\n \u003cp\u003eEquations\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003eSAE\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e0.8908\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e0.9438\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 144px;\"\u003e\n \u003cp\u003eY=0.7065X+18.911\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003eWAE\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e0.8908\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd 
valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e0.9438\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 144px;\"\u003e\n \u003cp\u003eY=0.7065X+18.911\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003eNNE\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 144px;\"\u003e\n \u003cp\u003eY=1X+0.00007\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"4\" valign=\"bottom\" style=\"width: 336px;\"\u003e\n \u003cp\u003eTesting Phase\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003eDC\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003ePCC\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 144px;\"\u003e\n \u003cp\u003eEquations\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003eSAE\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e0.9982\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e0.9991\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 144px;\"\u003e\n \u003cp\u003eY=7.0486X-17.916\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003eWAE\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" 
style=\"width: 64px;\"\u003e\n \u003cp\u003e0.9982\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e0.9991\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 144px;\"\u003e\n \u003cp\u003eY=7.0486X-17.916\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003eNNE\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e0.999\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 64px;\"\u003e\n \u003cp\u003e0.999\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 144px;\"\u003e\n \u003cp\u003eY=0.9996X+0.0003\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n \u003c/table\u003e\n\u003c/div\u003e\n\u003cp\u003eThe percentage differences between the ensemble models during the testing phase highlight minor variations in DC and PCC but significant disparities in their regression slopes. SAE and WAE achieve an identical DC=0.9982, while NNE slightly outperforms them with a DC=0.999, representing a percentage difference of 0.08% and indicating a marginally better fit for NNE. SAE and WAE share a PCC=0.9991, whereas NNE has a PCC=0.999, resulting in a negligible percentage difference of 0.01% and reflecting comparable correlations between predicted and actual values across all models. However, the regression slopes reveal substantial differences: SAE and WAE exhibit a steep slope of 7.0486, whereas NNE maintains a near-ideal slope of 0.9996, an 85.82% difference relative to the SAE and WAE slope. This stark contrast suggests that SAE and WAE may overcompensate or exhibit heightened sensitivity to the testing data, potentially impacting scalability, while NNE demonstrates superior robustness and alignment with the data. 
Despite these variations in regression behavior, the minor differences in DC and PCC reinforce the high predictive reliability and accuracy of all three models during the testing phase.\u003c/p\u003e\n\u003cp\u003eThe plots presented in Fig. 8 above compare the observed W% with the predictions generated by the NNE, SAE, and WAE models. The \u0026quot;Observed W\u0026quot; curve serves as a reference point, illustrating the actual thermal behavior of Dawakin Tofa clay. The \u0026quot;Simulated-NNE\u0026quot; curve closely aligns with the observed data, indicating its superior predictive accuracy and capacity to capture complex patterns. In contrast, the simulations from SAE and WAE exhibit larger deviations, particularly at the curve\u0026apos;s tail, reflecting reduced precision in their predictions. These visualizations support the manuscript\u0026apos;s conclusions, highlighting the robustness and reliability of NNE in modeling thermal properties and its potential to enhance sustainable energy applications.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e4.4 Predictive results of AI models ensemble\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eTable 3 provides the error metrics, MSE and RMSE, for the ensemble models SAE, WAE, and NNE during both the training and testing phases, focusing on their ability to fit nonlinear models. The results highlight significant differences in the error performance across the models. In the training phase, NNE stands out with exceptional performance, achieving an MSE=0.000212 and an RMSE=0.01456, indicating almost negligible errors and a near-perfect fit to the training data. In contrast, SAE performs moderately, with an MSE=0.521145 and RMSE=0.721903, reflecting a reasonable but imperfect alignment with the data. WAE, however, performs poorly in the training phase, with an extremely high MSE=3568.649 and RMSE=59.7382, indicating significant deviations and a weak ability to fit the training data effectively. 
This suggests that WAE struggles with capturing the underlying patterns in the training dataset compared to SAE and NNE. In the testing phase, all models improve significantly, reflecting better generalization. NNE again delivers the best performance, with an MSE=0.0001696 and RMSE=0.01302, demonstrating its superior ability to generalize with minimal error and maintain robustness. SAE follows with an MSE=0.004406 and RMSE=0.066377, showing substantial improvement over its training phase and confirming its reliability in capturing the data structure with moderate accuracy. WAE also improves in the testing phase, achieving an MSE=0.034514 and RMSE=0.18578, but it still lags behind SAE and NNE, indicating that while its predictive power is enhanced during testing, it remains less reliable than the other models. Overall, the comparison reveals that NNE is the most accurate and consistent model across both phases, exhibiting the lowest error metrics and the strongest fit to both training and testing datasets. SAE demonstrates moderate performance, offering reliable generalization despite higher errors than NNE, while WAE, with its exceptionally high errors in the training phase, performs relatively poorly but shows some improvement in the testing phase. 
This highlights the dominance of NNE in nonlinear model fitting and suggests its robustness and adaptability make it the optimal choice for applications requiring high precision.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTable 3\u003c/strong\u003e: Ensemble results based on error of fit for nonlinear models\u003c/p\u003e\n\u003ctable border=\"0\" cellspacing=\"0\" cellpadding=\"0\" width=\"399\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd valign=\"bottom\" style=\"width: 22px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"bottom\" style=\"width: 194px;\"\u003e\n \u003cp\u003eTraining Phase\u003c/p\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 112px;\"\u003e\n \u003cp\u003eTesting Phase\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 70px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"bottom\" style=\"width: 22px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 120px;\"\u003e\n \u003cp\u003eMSE\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 74px;\"\u003e\n \u003cp\u003eRMSE\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 112px;\"\u003e\n \u003cp\u003eMSE\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 70px;\"\u003e\n \u003cp\u003eRMSE\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"bottom\" style=\"width: 22px;\"\u003e\n \u003cp\u003eSAE\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 120px;\"\u003e\n \u003cp\u003e0.521145\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 74px;\"\u003e\n \u003cp\u003e0.721903\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 112px;\"\u003e\n \u003cp\u003e0.004406\u003c/p\u003e\n 
\u003c/td\u003e\n \u003ctd style=\"width: 70px;\"\u003e\n \u003cp\u003e0.066377\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"bottom\" style=\"width: 22px;\"\u003e\n \u003cp\u003eWAE\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 120px;\"\u003e\n \u003cp\u003e3568.649\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 74px;\"\u003e\n \u003cp\u003e59.7382\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 112px;\"\u003e\n \u003cp\u003e0.034514\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 70px;\"\u003e\n \u003cp\u003e0.18578\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"bottom\" style=\"width: 22px;\"\u003e\n \u003cp\u003eNNE\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 120px;\"\u003e\n \u003cp\u003e0.000152\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"bottom\" style=\"width: 74px;\"\u003e\n \u003cp\u003e0.012337\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 112px;\"\u003e\n \u003cp\u003e0.000254\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 70px;\"\u003e\n \u003cp\u003e0.015946\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\n\u003cp\u003eThe bar graph as seen in Fig. 9 illustrates the behaviour of three ensemble models SAE, WAE, and NNE with respect to predictive accuracy and error metrics. The NNE demonstrates superior performance, achieving the highest predictive accuracy and exhibiting minimal error bars, which signifies its robustness and reliability in capturing the thermal characteristics of Dawakin Tofa clay. In contrast, SAE shows moderate performance, delivering acceptable yet less consistent results, as evidenced by its slightly larger error margins. 
Although WAE employs a weighted methodology, it exhibits considerable variability and overall weaker performance relative to NNE, with some improvements noted in specific cases. These results align with the manuscript\u0026apos;s conclusions, emphasizing the enhanced effectiveness of non-linear and neural network-based ensembles in accurately predicting the material\u0026apos;s sensible heat storage potential. This visualization highlights the importance of model selection in optimizing performance for sustainable energy storage applications.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e4.5 Comparison of the results\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe results across Tables 1, 2, and 3 provide a holistic comparison of the predictive performance of the ensemble models SAE, WAE, and NNE for nonlinear modeling tasks. These models were assessed based on their goodness of fit (DC, PCC, and regression equations) and error metrics (MSE and RMSE) during both the training and testing phases, highlighting key differences in their predictive capabilities. NNE consistently outperforms the other models, demonstrating exceptional accuracy and reliability. Its goodness of fit metrics in Table 2 show near-perfect performance, with DC and PCC both equal to 1 in the training phase and 0.999 in the testing phase. Furthermore, its regression equation (Y=1X+0.00007 in training and Y=0.9996X+0.0003 in testing) reflects an almost ideal alignment between predicted and observed values. The error metrics in Table 3 further confirm its dominance, with MSE=0.000212 in training and 0.0001696 in testing, and RMSE=0.01456 and 0.01302, respectively. 
These results indicate NNE\u0026apos;s unparalleled ability to capture complex nonlinear relationships while maintaining robustness and generalizability across datasets.\u003c/p\u003e\n\u003cp\u003eSAE shows moderate performance, with significant improvements in the testing phase. Its goodness of fit metrics (DC = 0.9982, PCC = 0.9991) and regression slope in Table 2 indicate reliable generalization, though its alignment is less precise than that of NNE. The error metrics in Table 3 reveal an MSE=0.004406 and RMSE=0.066377 in the testing phase, which, while significantly better than its training phase performance (MSE = 0.521145, RMSE = 0.721903), still lag behind those of NNE. SAE demonstrates a strong capacity to generalize predictions despite its relatively weaker performance in capturing the finer details of the data during training. \u0026nbsp;WAE exhibits the weakest performance among the three models, particularly during the training phase, where it struggles with a massive MSE=3568.649 and RMSE=59.7382. While Table 2 shows that WAE achieves comparable goodness of fit metrics (DC = 0.9982, PCC = 0.9991) to SAE in the testing phase, its regression slope and higher error metrics in Table 3 (MSE = 0.034514, RMSE = 0.18578) suggest that its performance, though improved in testing, is less reliable. WAE\u0026apos;s steep regression slope in testing also raises concerns about over-sensitivity to the data, highlighting its limitations in capturing complex relationships.\u003c/p\u003e\n\u003cp\u003eIn summary, NNE emerges as the most robust and reliable model across all phases, with minimal errors, near-perfect goodness of fit, and the ability to generalize accurately. SAE provides a viable alternative for applications requiring acceptable accuracy, especially given its strong generalization capabilities in the testing phase. 
In contrast, WAE\u0026apos;s variability and weaker performance, particularly in the training phase, make it less suitable for tasks requiring precision. These findings underscore the importance of selecting advanced ensemble models like NNE for complex nonlinear predictive tasks, ensuring both accuracy and robustness for practical applications such as modeling the heat storage potential of materials.\u003cbr\u003e\u0026nbsp;The study focuses on predicting the sensible heat storage potential of Dawakin Tofa clay using advanced machine learning models, a task with significant environmental implications. Sensible heat storage systems, especially those utilizing natural and locally abundant materials like Dawakin Tofa clay, play a critical role in enhancing energy efficiency and sustainability in thermal management applications. By accurately predicting the heat storage capacity of this material, the study contributes to the broader goal of optimizing renewable energy systems, reducing dependency on fossil fuels, and promoting energy conservation.\u003c/p\u003e\n\u003cp\u003eOne key environmental benefit of this study lies in its contribution to energy storage systems, which are vital for integrating renewable energy sources such as solar and wind into the energy grid. Efficient heat storage materials ensure that excess energy generated during peak production periods can be stored and used later, reducing energy wastage and enhancing the reliability of renewable energy systems. By leveraging locally available clay, the study also supports the use of sustainable and low-cost materials, minimizing the environmental footprint associated with synthetic or imported alternatives. Moreover, the application of machine learning models to predict material performance ensures a more efficient research and development process. Traditional experimental methods for evaluating thermal properties are often resource-intensive, requiring extensive time, energy, and raw materials. 
The use of advanced predictive tools reduces the need for extensive experimentation, thereby conserving resources and decreasing laboratory emissions. This approach aligns with the principles of green engineering, emphasizing resource efficiency and minimal environmental impact.\u003c/p\u003e\n\u003cp\u003eThe study also highlights the potential for utilizing clay, a natural and abundant material, as a sustainable solution for thermal energy storage (TES). This reduces reliance on environmentally harmful materials and promotes the use of renewable resources. Additionally, the focus on sensible heat storage supports the development of passive energy systems, which are known for their low environmental impact compared to active energy systems that require additional energy inputs for operation. Finally, this research contributes to the fight against climate change by promoting sustainable thermal energy storage solutions that can lower greenhouse gas emissions. Efficient heat storage systems enable better utilization of renewable energy, reducing reliance on carbon-intensive energy sources. Furthermore, by enhancing the performance of natural materials through predictive modeling, the study supports the development of sustainable infrastructure for energy storage, which is crucial for achieving global energy transition goals. In conclusion, this study\u0026apos;s environmental implications are far-reaching. It not only advances the scientific understanding of natural heat storage materials but also supports sustainable practices in energy storage, resource conservation, and the promotion of renewable energy technologies. 
By integrating machine learning with material science, this research paves the way for innovative, eco-friendly solutions to address global energy and environmental challenges.\u003c/p\u003e"},{"header":"Conclusion","content":"\u003cp\u003eThis study demonstrated the successful application of advanced machine learning models, including SAE, WAE, and NNE, to predict the sensible heat storage potential of Dawakin Tofa clay using TGA data. Among the models, NNE consistently outperformed others, achieving near-perfect predictive performance in both training and testing phases, with exceptionally low error metrics (MSE = 0.000212 and RMSE = 0.01456 in training, and MSE = 0.0001696 and RMSE = 0.01302 in testing). SAE showed moderate accuracy and reliable generalization, while WAE displayed significant variability and weaker performance, particularly during training, despite improved results in the testing phase. These findings highlight the critical role of robust nonlinear models like NNE in capturing the complex thermal behavior of materials and their potential for practical applications. The environmental implications of this study are substantial, as it emphasizes the use of Dawakin Tofa clay, a natural and locally abundant material, in the development of sustainable energy storage systems. By leveraging machine learning, the research reduces the need for resource-intensive experimental evaluations, promoting resource efficiency, lowering emissions, and aligning with green engineering principles. Moreover, the use of clay as a heat storage material offers a cost-effective and environmentally friendly alternative to synthetic materials, supporting renewable energy integration, reducing energy wastage, and contributing to the mitigation of greenhouse gas emissions. To build on these findings, several recommendations and directions for future work are proposed. 
Material optimization should be prioritized by exploring surface treatments or composite formation to enhance the thermal properties of Dawakin Tofa clay, making it more efficient for practical applications. Pilot-scale studies are necessary to test the scalability and real-world performance of clay-based thermal energy storage systems, especially in renewable energy applications such as solar or industrial thermal systems. Further, integrating clay into solar thermal systems could unlock its full potential by demonstrating its ability to store and redistribute energy efficiently. Expanding the dataset by including a broader range of clay samples from different regions is essential for validating the predictive models and improving their generalizability. Future research should also explore hybrid machine learning models or deep learning architectures to further refine predictive accuracy and minimize error metrics. Additionally, a comprehensive economic analysis should be undertaken to assess the feasibility and scalability of using clay in thermal energy storage, along with a life cycle analysis (LCA) to evaluate its environmental trade-offs and benefits comprehensively.\u003c/p\u003e\n\u003cp\u003eThe study showcases the transformative potential of combining natural materials with cutting-edge machine-learning techniques to advance sustainable energy storage solutions. By addressing the limitations of traditional experimental approaches and emphasizing cost-effective and environmentally conscious alternatives, this research paves the way for developing efficient, scalable, and eco-friendly systems that align with global sustainability goals. Expanding on these findings through material optimization, advanced modeling, and practical implementation will further contribute to achieving a sustainable energy future.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n \u003cli\u003eAhmed, S. F., Alam, M. S. Bin, Hassan, M., Rozbu, M. 
R., Ishtiak, T., Rafa, N., Mofijur, M., Shawkat Ali, A. B. M., \u0026amp; Gandomi, A. H. (2023). Deep learning modelling techniques: current progress, applications, advantages, and challenges. In \u003cem\u003eArtificial Intelligence Review\u003c/em\u003e (Vol. 56, Issue 11). Springer Netherlands. https://doi.org/10.1007/s10462-023-10466-8\u003c/li\u003e\n \u003cli\u003eAmir, M., Deshmukh, R. G., Khalid, H. M., Said, Z., Raza, A., Muyeen, S. M., Nizami, A. S., Elavarasan, R. M., Saidur, R., \u0026amp; Sopian, K. (2023). Energy storage technologies: An integrated survey of developments, global economical/environmental effects, optimal scheduling model, and sustainable adaption policies. \u003cem\u003eJournal of Energy Storage\u003c/em\u003e, \u003cem\u003e72\u003c/em\u003e(PE), 108694. https://doi.org/10.1016/j.est.2023.108694\u003c/li\u003e\n \u003cli\u003eBaghbani, A., Abuel-Naga, H., \u0026amp; Shirkavand, D. (2023). Accurately Predicting Quartz Sand Thermal Conductivity Using Machine Learning and Grey-Box AI Models. \u003cem\u003eGeotechnics\u003c/em\u003e, \u003cem\u003e3\u003c/em\u003e(3), 638\u0026ndash;660. https://doi.org/10.3390/geotechnics3030035\u003c/li\u003e\n \u003cli\u003eBang, H. T., Yoon, S., \u0026amp; Jeon, H. (2020). Application of machine learning methods to predict a thermal conductivity model for compacted bentonite. \u003cem\u003eAnnals of Nuclear Energy\u003c/em\u003e, \u003cem\u003e142\u003c/em\u003e, 107395. https://doi.org/10.1016/j.anucene.2020.107395\u003c/li\u003e\n \u003cli\u003eBarber, D. (2012). Statistics for machine learning. In \u003cem\u003eBayesian Reasoning and Machine Learning\u003c/em\u003e. https://doi.org/10.1017/cbo9780511804779.012\u003c/li\u003e\n \u003cli\u003eBisu, A. A., Ahmed, T. G., Ahmad, U. S., \u0026amp; Maiwada, A. D. (2024). A SWOT Analysis Approach for the Development of Photovoltaic (PV) Energy in Northern Nigeria. 
\u003cem\u003eCleaner Energy Systems\u003c/em\u003e, \u003cem\u003e9\u003c/em\u003e(June), 100128. https://doi.org/10.1016/j.cles.2024.100128\u003c/li\u003e\n \u003cli\u003eBottmer, L., Croux, C., \u0026amp; Wilms, I. (2022). Sparse regression for large data sets with outliers. \u003cem\u003eEuropean Journal of Operational Research\u003c/em\u003e, \u003cem\u003e297\u003c/em\u003e(2), 782\u0026ndash;794. https://doi.org/10.1016/j.ejor.2021.05.049\u003c/li\u003e\n \u003cli\u003eChekifi, T., \u0026amp; Boukraa, M. (2023). CFD applications for sensible heat storage: A comprehensive review of numerical studies. \u003cem\u003eJournal of Energy Storage\u003c/em\u003e, \u003cem\u003e68\u003c/em\u003e(June), 107893. https://doi.org/10.1016/j.est.2023.107893\u003c/li\u003e\n \u003cli\u003eDarvishvand, L., Safari, V., Kamkari, B., Alamshenas, M., \u0026amp; Afrand, M. (2022). Machine learning-based prediction of transient latent heat thermal storage in finned enclosures using group method of data handling approach: A numerical simulation. \u003cem\u003eEngineering Analysis with Boundary Elements\u003c/em\u003e, \u003cem\u003e143\u003c/em\u003e(June), 61\u0026ndash;77. https://doi.org/10.1016/j.enganabound.2022.06.009\u003c/li\u003e\n \u003cli\u003ede Zarz\u0026agrave;, I., de Curt\u0026ograve;, J., Hern\u0026aacute;ndez-Orallo, E., \u0026amp; Calafate, C. T. (2023). Cascading and Ensemble Techniques in Deep Learning. \u003cem\u003eElectronics (Switzerland)\u003c/em\u003e, \u003cem\u003e12\u003c/em\u003e(15), 1\u0026ndash;18. https://doi.org/10.3390/electronics12153354\u003c/li\u003e\n \u003cli\u003eHassija, V., Chamola, V., Mahapatra, A., Singal, A., Goel, D., Huang, K., Scardapane, S., Spinelli, I., Mahmud, M., \u0026amp; Hussain, A. (2024). Interpreting Black-Box Models: A Review on Explainable Artificial Intelligence. \u003cem\u003eCognitive Computation\u003c/em\u003e, \u003cem\u003e16\u003c/em\u003e(1), 45\u0026ndash;74. 
https://doi.org/10.1007/s12559-023-10179-8\u003c/li\u003e\n \u003cli\u003eKhan, A. A., Chaudhari, O., \u0026amp; Chandra, R. (2024). A review of ensemble learning and data augmentation models for class imbalanced problems: Combination, implementation and evaluation. \u003cem\u003eExpert Systems with Applications\u003c/em\u003e, \u003cem\u003e244\u003c/em\u003e(December 2023), 122778. https://doi.org/10.1016/j.eswa.2023.122778\u003c/li\u003e\n \u003cli\u003eKhan, D. M., Ali, M., Ahmad, Z., Manzoor, S., \u0026amp; Hussain, S. (2021). A New Efficient Redescending M-Estimator for Robust Fitting of Linear Regression Models in the Presence of Outliers. \u003cem\u003eMathematical Problems in Engineering\u003c/em\u003e, \u003cem\u003e2021\u003c/em\u003e. https://doi.org/10.1155/2021/3090537\u003c/li\u003e\n \u003cli\u003eLearning, D. (2023). \u003cem\u003eSS symmetry Deep Learning and Neural Networks : Decision-Making\u003c/em\u003e.\u003c/li\u003e\n \u003cli\u003eLin, Y., Endo, Y., Lee, J., \u0026amp; Kamijo, S. (2024). Neural Architecture Search via Trainless Pruning Algorithm: A Bayesian Evaluation of a Network with Multiple Indicators. \u003cem\u003eElectronics (Switzerland)\u003c/em\u003e, \u003cem\u003e13\u003c/em\u003e(22). https://doi.org/10.3390/electronics13224547\u003c/li\u003e\n \u003cli\u003eMaiwada, A. D., \u0026amp; Abba, S. I. (2024). \u003cem\u003eDevelopment of a Hybrid Intelligence Algorithm to Estimate the Derivative Weight of Dawakin Tofa Clay for Heat Storage\u003c/em\u003e. \u003cem\u003e1\u003c/em\u003e(2).\u003c/li\u003e\n \u003cli\u003eMansa, R., \u0026amp; Zou, S. (2021). Thermogravimetric analysis of microplastics: A mini review. \u003cem\u003eEnvironmental Advances\u003c/em\u003e, \u003cem\u003e5\u003c/em\u003e, 100117. https://doi.org/10.1016/j.envadv.2021.100117\u003c/li\u003e\n \u003cli\u003eMartinelli, D. D. (2022). 
Generative machine learning for de novo drug discovery: A systematic review. \u003cem\u003eComputers in Biology and Medicine\u003c/em\u003e, \u003cem\u003e145\u003c/em\u003e(March), 105403. https://doi.org/10.1016/j.compbiomed.2022.105403\u003c/li\u003e\n \u003cli\u003eMcNamara, M. E., Zisser, M., Beevers, C. G., \u0026amp; Shumake, J. (2022). Not just \u0026ldquo;big\u0026rdquo; data: Importance of sample size, measurement error, and uninformative predictors for developing prognostic models for digital interventions. \u003cem\u003eBehaviour Research and Therapy\u003c/em\u003e, \u003cem\u003e153\u003c/em\u003e(June 2021), 104086. https://doi.org/10.1016/j.brat.2022.104086\u003c/li\u003e\n \u003cli\u003eMiller, A., Panneerselvam, J., \u0026amp; Liu, L. (2022). A review of regression and classification techniques for analysis of common and rare variants and gene-environmental factors.\u0026nbsp;\u003cem\u003eNeurocomputing\u003c/em\u003e, \u003cem\u003e489\u003c/em\u003e, 466\u0026ndash;485. https://doi.org/10.1016/j.neucom.2021.08.150\u003c/li\u003e\n \u003cli\u003eMobarak, M. H., Mimona, M. A., Islam, M. A., Hossain, N., Zohura, F. T., Imtiaz, I., \u0026amp; Rimon, M. I. H. (2023). Scope of machine learning in materials research\u0026mdash;A review. \u003cem\u003eApplied Surface Science Advances\u003c/em\u003e, \u003cem\u003e18\u003c/em\u003e(November), 100523. https://doi.org/10.1016/j.apsadv.2023.100523\u003c/li\u003e\n \u003cli\u003eNgu, J. C. Y., Yeo, W. S., Thien, T. F., \u0026amp; Nandong, J. (2024). A comprehensive overview of the applications of kernel functions and data-driven models in regression and classification tasks in the context of software sensors. \u003cem\u003eApplied Soft Computing\u003c/em\u003e, \u003cem\u003e164\u003c/em\u003e(July), 111975. https://doi.org/10.1016/j.asoc.2024.111975\u003c/li\u003e\n \u003cli\u003eNurazzi, N. M., Asyraf, M. R. M., Rayung, M., Norrrahim, M. N. F., Shazleen, S. S., Rani, M. S. A., Shafi, A. R., Aisyah, H. 
A., Radzi, M. H. M., Sabaruddin, F. A., Ilyas, R. A., Zainudin, E. S., \u0026amp; Abdan, K. (2021). Thermogravimetric analysis properties of cellulosic natural fiber polymer composites: A review on influence of chemical treatments. \u003cem\u003ePolymers\u003c/em\u003e, \u003cem\u003e13\u003c/em\u003e(16). https://doi.org/10.3390/polym13162710\u003c/li\u003e\n \u003cli\u003ePapazafeiropoulos, G. (2024). Stepwise Regression for Increasing the Predictive Accuracy of Artificial Neural Networks: Applications in Benchmark and Advanced Problems. \u003cem\u003eModelling\u003c/em\u003e, \u003cem\u003e5\u003c/em\u003e(1), 153\u0026ndash;179. https://doi.org/10.3390/modelling5010009\u003c/li\u003e\n \u003cli\u003ePorcu, E., Bevilacqua, M., Schaback, R., \u0026amp; Oates, C. J. (2024). The Mat\u0026eacute;rn Model: A Journey Through Statistics, Numerical Analysis and Machine Learning. \u003cem\u003eStatistical Science\u003c/em\u003e, \u003cem\u003e39\u003c/em\u003e(3), 469\u0026ndash;492. https://doi.org/10.1214/24-STS923\u003c/li\u003e\n \u003cli\u003ePravin, P. S., Tan, J. Z. M., Yap, K. S., \u0026amp; Wu, Z. (2022). Hyperparameter optimization strategies for machine learning-based stochastic energy efficient scheduling in cyber-physical production systems. \u003cem\u003eDigital Chemical Engineering\u003c/em\u003e, \u003cem\u003e4\u003c/em\u003e(July), 100047. https://doi.org/10.1016/j.dche.2022.100047\u003c/li\u003e\n \u003cli\u003eSarker, I. H. (2021). Machine Learning: Algorithms, Real-World Applications and Research Directions. \u003cem\u003eSN Computer Science\u003c/em\u003e, \u003cem\u003e2\u003c/em\u003e(3), 1\u0026ndash;21. https://doi.org/10.1007/s42979-021-00592-x\u003c/li\u003e\n \u003cli\u003eTiskatine, R., Aharoune, A., Bouirden, L., \u0026amp; Ihlal, A. (2017). Identification of suitable storage materials for solar thermal power plant using selection methodology. 
\u003cem\u003eApplied Thermal Engineering\u003c/em\u003e, \u003cem\u003e117\u003c/em\u003e, 591\u0026ndash;608. https://doi.org/10.1016/j.applthermaleng.2017.01.107\u003c/li\u003e\n \u003cli\u003eTomar, D., \u0026amp; Agarwal, S. (2015). Twin Support Vector Machine: A review from 2007 to 2014. \u003cem\u003eEgyptian Informatics Journal\u003c/em\u003e, \u003cem\u003e16\u003c/em\u003e(1), 55\u0026ndash;69. https://doi.org/10.1016/j.eij.2014.12.003\u003c/li\u003e\n \u003cli\u003eUmmah, M. S. (2019). \u0026nbsp;In \u003cem\u003eSustainability (Switzerland)\u003c/em\u003e (Vol. 11, Issue 1). http://scioteca.caf.com/bitstream/handle/123456789/1091/RED2017-Eng-8ene.pdf?sequence=12\u0026amp;isAllowed=y%0Ahttp://dx.doi.org/10.1016/j.regsciurbeco.2008.06.005%0Ahttps://www.researchgate.net/publication/305320484_\u003c/li\u003e\n \u003cli\u003eZheng, R., Jia, Y., Ullagaddi, C., Allen, C., Rausch, K., Singh, V., Schnable, J. C., \u0026amp; Kamruzzaman, M. (2024). Optimizing feature selection with gradient boosting machines in PLS regression for predicting moisture and protein in multi-country corn kernels via NIR spectroscopy. \u003cem\u003eFood Chemistry\u003c/em\u003e, \u003cem\u003e456\u003c/em\u003e(June), 140062. https://doi.org/10.1016/j.foodchem.2024.140062\u003c/li\u003e\n \u003cli\u003eZhou, L., Pan, S., Wang, J., \u0026amp; Vasilakos, A. V. (2017). Machine learning on big data: Opportunities and challenges. \u003cem\u003eNeurocomputing\u003c/em\u003e, \u003cem\u003e237\u003c/em\u003e(September 2016), 350\u0026ndash;361. 
https://doi.org/10.1016/j.neucom.2017.01.026\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":true,"hideJournal":true,"highlight":"","institution":"Bayero University Kano","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"Sensible heat storage, Machine learning, Thermogravimetric analysis, Renewable energy integration, Sustainable energy systems","lastPublishedDoi":"10.21203/rs.3.rs-6081166/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-6081166/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eThe challenge of efficiently predicting the sensible heat storage potential of natural materials like Dawakin Tofa clay for sustainable energy applications necessitates innovative solutions. This study investigates the use of machine learning models: Interactive Linear Regression (ILR), Stepwise Linear Regression (SWLR), Robust Linear Regression (RLR), and (Kernel Support Vector Machine (KSVM). Also, four non-linear models were employed as: G-Matern 5/2 (GM5/2), Trilayered neural network (TNN), Boosted Tree (BoT) and bagged Tree Neural Networks (BTNN). Further, some ensemble methods used are: Simple Average Ensemble (SAE), Weighted Average Ensemble (WAE), and Neural Network Ensemble (NNE). In the laboratory, the test was carried out at the Centre for Genetics Engineering and Biotechnology at the Federal University of Technology in Minna, Niger State, Nigeria. The clay sample was placed in a platinum pan, then heated it at a rate of 10°C per minute while using nitrogen and air as purge gases. The entire experiment took 33 minutes to complete, with results printed for documentation. To ensure accuracy, we repeated the analysis three times and averaged the results. 
By utilizing locally abundant Dawakin Tofa clay, the research promotes sustainable and cost-effective energy storage solutions, reducing reliance on synthetic materials and lowering the environmental footprint. Among the models, NNE exhibited the best performance, achieving near-perfect accuracy with minimal error metrics (MSE = 0.000212, RMSE = 0.01456 in training; MSE = 0.0001696, RMSE = 0.01302 in testing). SAE demonstrated moderate accuracy with reliable generalization, while WAE showed high variability in training and weaker performance, despite improvement in the testing phase. This study highlights the superiority of nonlinear machine learning models, particularly Neural Network Ensemble (NNE), in accurately modeling the thermal behavior of the sample. It also provides a foundation for optimizing natural materials for thermal storage, recommending material modifications, expanded datasets, pilot-scale studies, and economic assessments. It further underscores the potential of integrating advanced machine learning techniques with natural materials to create scalable, sustainable energy systems, addressing critical environmental challenges in the transition to renewable energy.\u003c/p\u003e","manuscriptTitle":"Machine learning-enhanced prediction of sensible heat storage potential in Kano-Nigeria based on thermogravimetric analysis","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-02-26 10:38:01","doi":"10.21203/rs.3.rs-6081166/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"4832159b-19e5-4f60-bad6-5ba1bb7aae43","owner":[],"postedDate":"February 26th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[{"id":44697934,"name":"Mechanical Engineering"},{"id":44697935,"name":"Renewable Resources"},{"id":44697936,"name":"Energy Engineering"},{"id":44697937,"name":"Artificial Intelligence and Machine Learning"}],"tags":[],"updatedAt":"2025-02-26T10:38:01+00:00","versionOfRecord":[],"versionCreatedAt":"2025-02-26 10:38:01","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-6081166","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-6081166","identity":"rs-6081166","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
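The percentage differences quoted in the discussion of Table 2 and the MSE/RMSE pairs quoted in Sections 4.4 and 4.5 can be rechecked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' code. It assumes the percentage difference is taken relative to the larger of the two values, a convention that happens to reproduce the reported 0.08%, 0.01%, and 85.82%. The helper names `pct_diff` and `rmse_from_mse` are hypothetical, and the NNE figures used are those quoted in the running text.

```python
import math


def pct_diff(a: float, b: float) -> float:
    """Percentage difference relative to the larger value (assumed convention)."""
    hi, lo = max(a, b), min(a, b)
    return (hi - lo) / hi * 100.0


def rmse_from_mse(mse: float) -> float:
    """RMSE is the square root of MSE."""
    return math.sqrt(mse)


# Testing-phase values quoted in the text for SAE/WAE versus NNE.
dc_diff = pct_diff(0.9982, 0.999)        # determination coefficient
pcc_diff = pct_diff(0.9991, 0.999)       # Pearson correlation coefficient
slope_diff = pct_diff(7.0486, 0.9996)    # regression slope

# Training-phase MSE values quoted in the text.
training_mse = {"SAE": 0.521145, "WAE": 3568.649, "NNE": 0.000212}
training_rmse = {name: rmse_from_mse(mse) for name, mse in training_mse.items()}

if __name__ == "__main__":
    print(f"DC difference:    {dc_diff:.2f}%")
    print(f"PCC difference:   {pcc_diff:.2f}%")
    print(f"Slope difference: {slope_diff:.2f}%")
    for name, rmse in training_rmse.items():
        print(f"{name}: RMSE = {rmse:.6g}")
```

Because RMSE is the square root of MSE, each reported pair can be checked for internal consistency in this way. Under the stated assumption, the slope comparison is the only one of the three differences that separates NNE from the averaging ensembles.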