Enhancing Infiltration Rate Predictions with Hybrid Machine Learning and Empirical Models: Addressing Challenges in Southern India

Research Article

Mooganayakanakote Veeranna Ramaswamy, Yashas Kumar Hanumapura Kumaraswamy, and 2 more

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-4869876/v1 This work is licensed under a CC BY 4.0 License.

Status: Published. Journal publication published 22 Feb, 2025; the published version appears in Acta Geophysica. Version 1.

Abstract

Despite the success of machine learning (ML) in many disciplines, its application in hydrology, especially in water-scarce regions, faces challenges due to the lack of interpretability and physical consistency.
This study addresses these challenges by integrating established empirical hydrological models with ML techniques to predict infiltration rates in water-scarce regions of southern India. Data from 199 observations across 11 sites, including soil characteristics and infiltration measurements, were used to parameterize traditional models like Philip's, Horton's, and Kostiakov's, which were then combined with Artificial Neural Networks (ANN) and the MissForest (MF) algorithm to form hybrid models. The results demonstrate that hybrid models, particularly those based on Philip's model, significantly improve prediction accuracy (R²: 0.76–0.92, RMSE: 0.08–0.2 cm/min, and LCE: 0.11–0.71 with more predictors) across all target sites while retaining interpretability. This approach leverages the strengths of both empirical models and machine learning, addressing the limitations of each. The study highlights that while empirical models are data-driven and may introduce uncertainties, combining them with ML techniques can enhance predictive power and provide a more robust understanding of infiltration dynamics. This is particularly valuable in regions where direct measurement is challenging. The hybrid models facilitate accurate predictions using minimal data from readily accessible locations, offering a practical solution for effective water resource management and soil conservation in semi-arid and data-scarce regions. By blending empirical knowledge with machine learning algorithms, this approach not only improves accuracy but also enhances the physical meaningfulness of hydrological models, providing a balanced and innovative solution to hydrological modeling challenges.

Keywords: Soil; Infiltration prediction; Infiltration models; Artificial Neural Network (ANN); MissForest; Hybrid hydrological model

1. Introduction

In the fields of hydrology, irrigation, and drainage engineering, the soil infiltration process plays a fundamental role. In general, infiltration refers to the vertical and lateral movement of water from the soil surface down through the soil layers. Initially, when water is applied, whether through rainfall or irrigation, it penetrates the soil at a high rate, termed the potential infiltration rate or maximum infiltration capacity. Over time, this rate declines and stabilizes at a constant value known as the saturated infiltration rate or steady-state infiltration capacity. Understanding this transition from rapid to stable infiltration rates is crucial for analysing water movement within soil profiles. Describing the infiltration process is challenging because of its complex nature, particularly under isotropic and heterogeneous soil conditions (He et al. 2024).

Infiltration is strongly influenced by several factors, including soil depth, geomorphological features, hydraulic properties, and climatic conditions. Among these, the arrangement of soil particles and the moisture content within the soil layers are crucial determinants of the soil's ability to absorb and retain water during rainfall or irrigation events (Arya et al. 1999; Dexter and Richard 2009). Understanding these factors is essential for developing strategies to mitigate soil erosion, manage groundwater recharge, and optimize the design and management of irrigation and hydrological systems (Mattar et al. 2015).

Over the years, researchers have struggled to assess infiltration rates accurately because of the spatial and temporal variability inherent in field measurements (Mahapatra et al. 2020). This variability stems from the heterogeneity of soil properties and is further complicated by equipment limitations, logistical constraints, and environmental interferences.
Limited access in remote locations adds another layer of difficulty. Despite these challenges, many authors have developed equations to predict infiltration, encompassing physical, semi-empirical, and empirical forms. Examples include Green and Ampt (1911), Philip (1969), Smith (1972), Smith and Parlange (1978), Horton (1941), Holtan (1961), Overton (1964), Richards (1931), Kostiakov (1932), modified Green-Ampt (Wang et al. 1999), and modified Kostiakov (Haverkamp et al. 1988), among others. While these equations offer valuable predictive tools, they rely on simplifying assumptions such as (i) homogeneity of soil and (ii) constant soil moisture content, and they apply only to specific soil and environmental conditions; both limitations can introduce inaccuracies.

In addition, numerical and physically based models such as SWAT (Soil and Water Assessment Tool) and MIKE SHE (Système Hydrologique Européen), among others, are renowned for accurately predicting infiltration processes. However, acquiring data at high spatial resolution, especially the soil data required to run these models, proves challenging, particularly for large heterogeneous catchments (Christiaens and Feyen 2001).

To overcome these limitations, our study integrates these established empirical models with machine learning (ML) techniques, specifically Artificial Neural Networks (ANN) and the MissForest (MF) algorithm. This hybrid approach leverages the strengths of both empirical and ML models, addressing the simplifying assumptions and enhancing prediction accuracy while maintaining interpretability. By combining empirical knowledge with advanced data-driven techniques, we mitigate the challenges of data acquisition and model applicability, providing a more robust and practical solution for predicting infiltration in water-scarce and data-scarce regions.
Ongoing advancements in measurement techniques and data analysis methodologies offer new possibilities. Soft computing and data-driven methods, including Artificial Neural Networks (ANN), Random Forest (RF), Multi-Linear Regression (MLR), and Support Vector Regressor (SVR), among others, have emerged as powerful tools in hydrology and irrigation engineering, addressing various complex challenges (Sayari et al. 2021; Ahmed et al. 2024; Chen et al. 2024; Teshome et al. 2024). In hydrology, they are used for predictive analytics, flood and drought prediction, and water quality assessment. In irrigation engineering, they are used to optimize schedules, estimate crop water needs, assess system performance, and recognize patterns (Sidhu et al. 2020). Notably, the literature on ML methods for soil water infiltration remains limited, even though ML algorithms can make accurate predictions, improve with additional data, reveal hidden patterns in complex datasets, and automate tasks.

Sy (2006) applied ANN to model infiltration using data from plot-scale rainfall simulator experiments. The research highlighted the efficiency of ANN in capturing infiltration dynamics, with soil moisture and hydraulic conductivity identified as critical factors. Furthermore, compared to traditional methods such as Philip and Green-Ampt, ANN exhibited superior accuracy in predicting cumulative infiltration. Sayari et al. (2021) compared five artificial intelligence (AI) models and their integrative versions with the Firefly Algorithm (FA) to forecast infiltrated water in furrow irrigation systems. Utilizing data from both literature sources and field experiments conducted in Iran, the study incorporated key input parameters including furrow length, inflow rate, advance time, cross-sectional area of inflow, and infiltration opportunity time.
Evaluation metrics highlighted the significant enhancement in accuracy achieved by integrating FA. These findings underscore the potential of AI models in refining complex hydrological processes. Sihag et al. (2019) evaluated the performance of Adaptive Neuro-Fuzzy Inference System (ANFIS), SVM, and RF models in estimating cumulative infiltration and infiltration rate in arid areas of Iran, concluding that SVM, particularly with a radial basis kernel function, outperformed ANFIS and RF. In a subsequent study, Sihag et al. (2020) compared ANN, Gaussian process (GP), Gene Expression Programming (GEP), and Generalized Regression Neural Network (GRNN) to estimate soil infiltration rates, finding that ANN with specific parameters achieved higher correlation coefficients than the other algorithms. Singh et al. (2017) evaluated the performance of RF, ANN, and M5P model tree techniques in predicting the infiltration rate and reported that RF gave closer estimates than ANN and the M5P model tree.

According to the literature, no single algorithm is universally suitable for all site-specific scenarios. Many predictive algorithms, including ANN, can have a time-consuming training process due to their sensitivity to hyperparameter selection. Nevertheless, ANNs continue to demonstrate high predictive accuracy in estimating infiltration rates, and they remain adaptable and robust even with small datasets (Jia and Culver 2006). Another ML technique, which builds on the RF algorithm, is MissForest (MF). It performs well with small datasets because it handles missing and heterogeneous data and trains efficiently compared to other algorithms (Ispirova et al. 2020; Naranjo-Fernández et al. 2020; Bikše et al. 2023). Given the success of Random Forest in predicting infiltration in numerous studies, algorithms derived from it are promising successors.
However, the MissForest implementation in the Python 'missingpy' library (its 'fit_transform()' function) remains largely unexplored in soil infiltration prediction, signifying a notable research gap in the field. According to Parchami-Araghi et al. (2013) and Sy (2006), coupling ML techniques with physical-empirical models results in reliable infiltration prediction. Therefore, the aim of this study is to develop hybrid ML and hydrological models for predicting infiltration rates. This will be achieved by using in-situ observations of gravel (%), sand (%), clay and silt (%), and soil moisture content (%) as predictor variables, measured from soil samples collected at the surface and at 50 cm and 100 cm depth. The study will also assess the robustness of the hybrid models, particularly in scenarios involving an increasing number of predictor variables collected from various regions of Southern India.

2. Study area

A total of eleven infiltration points from various regions of Southern India (Fig. 1) were deliberately selected for this study to encompass a range of soil and climatic conditions. These points are situated within the premises of the Indian Institute of Astrophysics (IIA), a government institution dedicated to interplanetary observations. Specifically, four of the eleven points are located within the Hoskote Bengaluru Campus (site 1), another four are situated at the Kavalur Tamil Nadu campus (site 2), and the remaining three are found at the Gauribidanur Karnataka Campus (site 3). According to the Food and Agriculture Organization (FAO) soil database, site 1 is predominantly characterized by sandy clay loam texture, while site 2 and site 3 exhibit clay loam texture. Additionally, the upper layer of the soil (SOL_Z) at these sites is less than 300 mm in depth.
Hoskote experiences an average annual rainfall of 843 mm, with maximum and minimum temperatures of 33.6 ºC and 15 ºC, while Gauribidanur receives 694 mm of average annual rainfall, with temperatures ranging from 10 ºC to 40 ºC. Kavalur has an average rainfall of 917 mm, and temperatures range from 18.5 ºC to 40.4 ºC, as per the Census of India (CIA) handbook 2011. The climates at these sites are seasonally dry tropical savanna and semi-arid, providing a diverse range of soil and climatic conditions for the infiltration work. The infiltration sites within the IIA premises were therefore selected deliberately to capture a variety of environmental conditions.

3. Methodology

3.1 Field measurement and data collection

Data were collected across various climate regions in southern India, yielding 199 observations. These observations included measurements of infiltration rates ( \(f(t)\) ) and cumulative infiltration ( \(F(t)\) ) across 11 distinct sites. Detailed descriptions of all 11 infiltration test points are provided in Table 1, and the infiltration observations \(f(t)\) and \(F(t)\) recorded at each test point are illustrated in Fig. 2. Furthermore, soil characteristics were examined, with close to 33 observations gathered at each site, encompassing percentages of gravel, sand, and combined silt and clay content, as well as moisture content. This dataset, collected at the surface and at 50 cm and 100 cm depths, provides a robust foundation for investigating infiltration processes.
Table 1 Detailed description of the infiltration test points

| Location | Test point ID | Latitude | Longitude | Elevation above MSL (m) | Vegetation cover | Land use |
|---|---|---|---|---|---|---|
| Hoskote Bengaluru Campus | H1 | 13° 6'47.00"N | 77°48'45.00"E | 937 | Sparse Shrubs | Mixed Use |
| Hoskote Bengaluru Campus | H2 | 13° 6'43.34"N | 77°48'53.86"E | 931 | Lawn | Mixed Use |
| Hoskote Bengaluru Campus | H3 | 13° 6'50.51"N | 77°48'38.71"E | 941 | Mixed Forest | Mixed Use |
| Hoskote Bengaluru Campus | H4 | 13° 6'51.55"N | 77°48'43.89"E | 938 | Sparse Shrubs | Mixed Use |
| Kavalur Tamil Nadu campus | K1 | 12°34'33.62"N | 78°49'17.59"E | 718 | Sparse Shrubs | Mixed Use |
| Kavalur Tamil Nadu campus | K2 | 12°34'27.76"N | 78°49'13.50"E | 724 | Mixed Forest | Forest |
| Kavalur Tamil Nadu campus | K3 | 12°34'38.30"N | 78°49'20.85"E | 723 | Sparse Shrubs | Mixed Use |
| Kavalur Tamil Nadu campus | K4 | 12°34'42.66"N | 78°49'30.19"E | 716 | Meadow Grass | Mixed Use |
| Gauribidanur Karnataka Campus | G1 | 13°36'12.85"N | 77°25'43.17"E | 725 | Meadow Grass | Mixed Use |
| Gauribidanur Karnataka Campus | G2 | 13°36'9.97"N | 77°25'37.39"E | 724 | Bare Soil | Mixed Use |
| Gauribidanur Karnataka Campus | G3 | 13°36'8.69"N | 77°25'45.40"E | 723 | Sparse Shrubs | Mixed Use |

The field process used a double ring infiltrometer made of 10 mm thick metal plate, with a depth of 42 cm and inner and outer ring diameters of 25 cm and 48 cm, respectively. The upper existing soil layer was removed to eliminate surface irregularities and organic matter, ensuring accurate infiltration measurements. Both rings of the infiltrometer were driven simultaneously into the ground to a depth of 5 cm using a wooden plank and hammer. The observations were carried out in the inner ring so that the infiltration measurements reflect the vertical downward movement of water into the soil strata, providing a more precise observation of soil water conductivity. The outer ring limits the lateral movement of water from the inner ring, which is one of the error sources of this type of infiltrometer. Observations were taken at time intervals of 2, 3, 5, 10, 15, 30, 45, 60, 90, and 120 minutes, and continued until the infiltration reached a steady rate.
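The readings described above can be turned into interval infiltration rates with a simple finite difference. The following is only a minimal sketch: it assumes the listed times are elapsed minutes since the start of the test, that cumulative infiltration is recorded in cm, and the readings themselves are hypothetical values invented for the example, not data from this study.

```python
def interval_infiltration_rates(elapsed_min, cumulative_cm):
    """Return (interval midpoint in min, rate in cm/min) for each reading.

    Assumes cumulative infiltration is zero at t = 0 and that both
    sequences are ordered by increasing elapsed time.
    """
    if len(elapsed_min) != len(cumulative_cm):
        raise ValueError("time and depth sequences must have equal length")
    rates = []
    prev_t, prev_depth = 0.0, 0.0
    for t, depth in zip(elapsed_min, cumulative_cm):
        dt = t - prev_t
        if dt <= 0:
            raise ValueError("elapsed times must be strictly increasing")
        # Average rate over the interval, assigned to the interval midpoint.
        rates.append(((prev_t + t) / 2.0, (depth - prev_depth) / dt))
        prev_t, prev_depth = t, depth
    return rates


# Hypothetical readings (not measured values from the paper).
elapsed = [2, 3, 5, 10, 15, 30]
cumulative = [1.0, 1.4, 2.0, 3.0, 3.8, 5.3]
rates = interval_infiltration_rates(elapsed, cumulative)
for midpoint, rate in rates:
    print(f"t = {midpoint:5.1f} min  f = {rate:.3f} cm/min")
```

Assigning each rate to the interval midpoint is one common convention; assigning it to the end of the interval is an equally plausible choice and would shift the fitted model parameters slightly.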
To determine the percentages of gravel, sand, and combined silt and clay content, as well as the moisture content, various sample types were collected from different depths at each test point. Three disturbed samples, each weighing approximately 100 g, and six samples, each weighing approximately 1000 g, were collected from each test point (collectively 99 samples). The disturbed samples were used for moisture content analysis, while the other samples were used for determining the percentages of gravel, sand, and combined silt and clay content. Each 100 g disturbed sample was divided into two equal parts of 50 g, and the moisture content of each part was determined using the ASTM Standard Test Method for Laboratory Determination of Water Content of Soil Sample by Mass (ASTM Standard D2216–19 2019). To separate the soil particles by size, the 1000 g soil samples were oven-dried for 24 hours at 105℃. Subsequently, the dried samples were sieved using 4.75 mm and 0.0075 mm sieves to determine the percentage of gravel (particles greater than 4.75 mm), sand (particles ranging from 4.75 to 0.0075 mm), and combined silt and clay (particles less than 0.0075 mm). The results of the two samples collected from the same depth were averaged and used for further study.

The infiltration rate measured with the double ring infiltrometer serves as the foundation for the formulation of infiltration models. These models, including Philip's (Philip 1969), Horton's (Horton 1941), and Kostiakov's (Kostiakov 1932), can be parameterized by fitting them to the observed infiltration data. The principal equation of each model is given in Table 2, along with a summary of its parameters. By fitting these models, the governing parameters that describe the infiltration process for a specific soil and land-use condition can be derived; these parameters are then used for training and testing the ML techniques.
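One possible way to carry out such a fitting procedure is to linearize each model and apply ordinary least squares. The sketch below does this for the Kostiakov rate form \(f(t)=(ab)t^{(b-1)}\) and the Horton form \(f(t)=i_c+m\,e^{-K_h t}\). It is an illustration under stated assumptions, not the fitting code used in the study: the function names are hypothetical, the steady rate \(i_c\) is assumed to be supplied by the user (for example, from the final steady readings), and the data are synthetic values generated from known parameters.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_kostiakov(t, f):
    """Fit f(t) = (a*b) * t**(b - 1) via ln f = ln(a*b) + (b - 1) * ln t."""
    slope, intercept = linear_fit([math.log(ti) for ti in t],
                                  [math.log(fi) for fi in f])
    b = slope + 1.0
    a = math.exp(intercept) / b
    return a, b


def fit_horton(t, f, i_c):
    """Fit f(t) = i_c + m * exp(-K_h * t) for a user-supplied steady rate i_c.

    Uses ln(f - i_c) = ln(m) - K_h * t; readings with f <= i_c are skipped
    because their logarithm is undefined.
    """
    pairs = [(ti, math.log(fi - i_c)) for ti, fi in zip(t, f) if fi > i_c]
    slope, intercept = linear_fit([p[0] for p in pairs], [p[1] for p in pairs])
    return math.exp(intercept), -slope  # (m, K_h)


# Synthetic rates generated from known parameters (not data from the paper).
t_min = [2, 3, 5, 10, 15, 30, 45, 60]
f_kost = [1.2 * 0.6 * ti ** (0.6 - 1.0) for ti in t_min]
f_hort = [0.05 + 0.8 * math.exp(-0.1 * ti) for ti in t_min]

a_hat, b_hat = fit_kostiakov(t_min, f_kost)
m_hat, kh_hat = fit_horton(t_min, f_hort, i_c=0.05)
print(f"Kostiakov: a = {a_hat:.3f}, b = {b_hat:.3f}")
print(f"Horton:    m = {m_hat:.3f}, K_h = {kh_hat:.3f}")
```

Log-linearization weights small rates more heavily than a direct nonlinear fit would, so with noisy field data the two approaches can give somewhat different parameters.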
Figure 3 shows the detailed method of research and evaluation of hybrid ML and traditional infiltration models.

Table 2 Approximate equations for infiltration rate derived from both theoretical principles and empirical observations

| Model | Equation | Model parameters | Application context |
|---|---|---|---|
| Philip (1969) | \(f(t)=s\,t^{-1/2}+2A\) | \(s\) is sorptivity ( \(LT^{-0.5}\) ), the capacity of the soil to absorb water due to capillarity; \(A\) is the transmissivity factor ( \(LT^{-1}\) ), indicating the rate at which water moves through the soil under a unit gradient; \(t\) is the time of infiltration (T). | Suitable for short-term infiltration studies and homogeneous soils. |
| Horton (1941) | \(f(t)=i_{c}+m\,e^{-K_{h}t}\) | \(i_{c}\) ( \(LT^{-1}\) ) is the steady rate or ultimate infiltration capacity; \(m=(i_{o}-i_{c})\), where \(i_{o}\) ( \(LT^{-1}\) ) is the infiltration capacity at \(t=0\); \(K_{h}\) is an empirical soil constant. | Ideal for predicting infiltration capacity over time, especially in varied soil conditions. |
| Kostiakov (1932) | \(f(t)=(ab)\,t^{(b-1)}\) | \(a>0\) and \(0<b<1\) are empirical dimensionless constants. | Commonly used for irrigation studies and quick field estimations. |

3.2 Machine learning algorithms

3.2.1 Artificial Neural Network (ANN)

Inspired by the intricate connections of human neurons, ANNs serve as computational tools in hydrology, mimicking this natural complexity to model and predict water-related phenomena (ASCE Task Committee on Application of Artificial Neural Networks in Hydrology 2000a, b). An ANN consists of one input layer, one or more hidden layers, and one output layer. These layers process the information fed as input variables in interconnected processing components called nodes or neurons. Neurons in adjacent layers are linked through weighted connections, which function as communication channels.
The connections between nodes represent weights (W), which determine the strength of influence one neuron has on neurons in the subsequent layer. Additionally, biases (B) are constant values added to the weighted sum of inputs for each neuron. These weights and biases enhance the flexibility of the ANN model to fit the input variables. Inputs to a neuron are multiplied by their corresponding weights, summed, and then processed through a transfer function, which controls the signal strength relayed through the neuron's output. Hidden and output layers in neural networks use special functions called "activation functions" to introduce non-linearity. This enables the network to learn complex patterns in the data that simple linear models cannot capture. Popular choices include the Rectified Linear Unit (ReLU, known for its simplicity and efficiency) and the Hyperbolic Tangent function (tanh, which is similar to the sigmoid function but faster to compute and with different learning behavior). These functions are essential for the network's ability to capture intricate relationships within the data (Dubey et al. 2022).

In our approach, a custom Python script utilizing the scikit-learn library was used to develop the multilayer feed-forward ANN model with a back-propagation training algorithm, specifically the 'MLPRegressor'. The data, loaded from an Excel sheet, consisted of various features and target variables. The available data was split such that the last row was reserved for testing, while the rest was used for training. Hyperparameter tuning was conducted using 'GridSearchCV' with a parameter grid that included variations in hidden layer sizes, activation functions, and solvers. The best estimator was selected based on the mean squared error scoring.
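The workflow described above could look roughly like the following sketch, which tunes an 'MLPRegressor' with 'GridSearchCV' and reserves the last row for testing. This is not the authors' script: the data are random placeholders, and the array shape (22 hypothetical inputs, 2 outputs), the grid values, the three-fold cross-validation, and the 'StandardScaler' step are all assumptions made for illustration.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Placeholder data standing in for the Excel sheet (assumed layout:
# 22 feature columns followed by 2 target columns).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 100.0, size=(12, 22))
y = rng.uniform(0.0, 10.0, size=(12, 2))

# Last row reserved for testing, all earlier rows used for training.
X_train, y_train = X[:-1], y[:-1]
X_test = X[-1:]

# Scaling is an added assumption; it usually helps MLP training converge.
pipeline = make_pipeline(
    StandardScaler(),
    MLPRegressor(max_iter=2000, random_state=0),
)

# Hypothetical grid over hidden layer sizes, activations, and solvers.
param_grid = {
    "mlpregressor__hidden_layer_sizes": [(5,), (10,)],
    "mlpregressor__activation": ["relu", "tanh"],
    "mlpregressor__solver": ["lbfgs", "adam"],
}

# scikit-learn maximizes scores, so mean squared error is negated.
search = GridSearchCV(
    pipeline, param_grid, scoring="neg_mean_squared_error", cv=3
)
search.fit(X_train, y_train)

pred = search.predict(X_test)  # shape (1, 2): the two target parameters
print(search.best_params_)
print(pred)
```

With so few training rows, cross-validated scores are noisy, so the selected configuration should be read as indicative rather than definitive.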
This tuning step was intended to maximize prediction accuracy for the target variables. We also explored various scenarios to test the performance of the ANN model by incorporating data from multiple neighbouring test points and a target test point. We refer to this approach as a hybrid model because it integrates infiltration models such as Philip's, Horton's, and Kostiakov's with ANN. As an example, the 22 input parameters used were the percentages of gravel ( \(G\) ), sand ( \(S\) ), silt and clay ( \(CS\) ), and moisture content ( \(M\) ) for all test points (1, 2, 3, and 4), as well as the sorptivity ( \(s\) ) and transmissivity factor ( \(A\) ) for the nearby points (1, 2, and 3); that is, 4 soil properties × 4 points = 16 inputs, plus 2 model parameters × 3 nearby points = 6 inputs. The output parameters were the sorptivity ( \(s_{t}\) ) and transmissivity factor ( \(A_{t}\) ) for the target test point (t). Therefore, there were 22 nodes in the input layer and two in the output layer. The suggested structure was 22-j-2, where j is the number of nodes in the hidden layer (Fig. 4). In a similar manner, we tested the ANN model using Horton's and Kostiakov's methods as hybrid models, integrating their respective parameters to evaluate and enhance the network's prediction capabilities across different scenarios.

3.2.2 MissForest (MF)

Stekhoven and Bühlmann (2012) introduced the MF algorithm, an iterative imputation method that is an enhanced version of the RF algorithm. The RF algorithm (Breiman 2001) grows many decision trees and averages their results. However, averaging can mask underlying variability and interactions in the data, leading to biased imputations, which MF addresses by iteratively capturing these complex relationships for more accurate imputation. To illustrate the concept, consider an example of a hybrid MF algorithm combined with Philip's model.
In each iteration of the adapted MF algorithm, aimed at imputing missing values of sorptivity ( \(s_{t}\) ) and transmissivity factor ( \(A_{t}\) ) at the target test site 't', the process begins by preparing the data from an Excel file. Features \(G,\ S,\ CS\) and \(M\) and targets \(s_{t}\) and \(A_{t}\) are separated, with the last row reserved for testing. Specifically, 'X_train = df.iloc[:-1, :-2]' includes all rows except the last one, excluding the last two columns, and 'y_train = df.iloc[:-1, -2:]' includes the last two columns for all rows except the last one. The MF model fits an RF model with \(P_{k}\sim\{G,\ S,\ CS,\ M\}\) , where \(P_{k}\) is either \(s_{t}\) or \(A_{t}\) , using data from rows at three nearby test points that do not have missing values. These test points include data like \(G_{1},\ S_{1},\ CS_{1},\ M_{1}\) , \(s_{1}\) and \(A_{1}\) for test point 1, and similarly for test points 2 and 3. A grid search optimizes hyperparameters such as 'n_estimators', 'max_features', 'max_depth' and 'random_state'. Once the best RF model is identified, it is used to impute missing values in the last row of the test data. This involves initializing the MF model, using 'GridSearchCV' to fit the model and find the best parameters, imputing missing values in the test data, and predicting \(s_{t}\) and \(A_{t}\) for the last row. The best parameter values are selected using the mean squared error. Initial mean imputation provides a baseline, and iterative refinement improves the predictions based on the optimized model.
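The idea of starting from a mean imputation and refining it iteratively with random forests can be sketched as follows. Note that this is a stand-in, not the 'missingpy' MissForest used in the study: it uses scikit-learn's 'IterativeImputer' with a 'RandomForestRegressor' estimator, which follows a similar scheme. The grid search step is omitted, and the table values, column order (G, S, CS, M, s, A), and hyperparameters are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

# Hypothetical table: columns are G, S, CS, M, s, A. The first five rows
# play the role of nearby test points with known Philip parameters; the
# last row is the target point, whose s and A are missing.
data = np.array([
    [20.0, 78.0, 2.0, 9.0, 10.0, 0.50],
    [35.0, 63.0, 2.0, 6.0, 6.0, 0.30],
    [28.0, 70.0, 2.0, 8.0, 8.0, 0.40],
    [40.0, 58.0, 2.0, 5.0, 4.0, 0.20],
    [15.0, 84.0, 1.0, 11.0, 12.0, 0.60],
    [30.0, 68.0, 2.0, 7.0, np.nan, np.nan],
])

# Mean imputation gives the starting values; each round then refits a
# random forest per column and updates the missing entries.
imputer = IterativeImputer(
    estimator=RandomForestRegressor(n_estimators=50, random_state=0),
    initial_strategy="mean",
    max_iter=5,
    random_state=0,
)
completed = imputer.fit_transform(data)

s_t, A_t = completed[-1, -2:]  # imputed parameters for the target point
print(f"s_t = {s_t:.3f}, A_t = {A_t:.3f}")
```

Because a random forest averages training targets, the imputed values always fall within the range of the observed values in each column, which is worth keeping in mind when the target site lies outside the conditions sampled at the nearby points.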
Models were built in the same way using Horton's and Kostiakov's methods, combining the classical infiltration models with the MF algorithm to robustly impute and predict the missing values under different scenarios with an increase in nearby stations' data.

3.3 Performance evaluation criteria

The infiltration rates obtained from the hybrid models for varying target test points were evaluated by computing three standard statistical performance indicators and one graphical indicator. These indicators were the coefficient of determination ( \(R^{2}\) ), Root Mean Square Error ( \(RMSE\) ), Legates's Coefficient of Efficiency (LCE), and the Taylor diagram. The three statistical indicators were expressed as:

\(R^{2}=\frac{\left[\sum_{i=1}^{N}\left(p_{i}-P_{m}\right)\left(o_{i}-O_{m}\right)\right]^{2}}{\sum_{i=1}^{N}\left(p_{i}-P_{m}\right)^{2}\,\sum_{i=1}^{N}\left(o_{i}-O_{m}\right)^{2}}\) (1)

\(RMSE=\left[\frac{\sum_{i=1}^{N}\left(p_{i}-o_{i}\right)^{2}}{N}\right]^{1/2}\) (2)

\(LCE=1-\frac{\sum_{i=1}^{N}\left|p_{i}-o_{i}\right|}{\sum_{i=1}^{N}\left|o_{i}-O_{m}\right|}\) (3)

where \(o_{i}\) is the i-th observed infiltration rate and \(p_{i}\) is the i-th predicted infiltration rate. The mean values of the observed and predicted rates, each consisting of N values, are represented as \(O_{m}\) and \(P_{m}\) respectively. R² is a commonly used metric to measure the degree of correlation between predicted and observed values; values close to 1 indicate a strong match. To quantify the error between these values, RMSE is frequently employed, expressed in the same units as the observed values; lower RMSE values suggest better predictive accuracy. Additionally, LCE provides a dimensionless measure of model prediction accuracy relative to observed values, with values near 1 indicating near-perfect agreement (Legates and McCabe 1999).
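Equations (1) to (3) translate directly into code. The sketch below is one straightforward reading of those formulas in plain Python; the function names and the observed and predicted values are hypothetical, chosen only to exercise the calculations.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def r_squared(observed, predicted):
    """Eq. (1): squared correlation between predicted and observed values."""
    o_m, p_m = _mean(observed), _mean(predicted)
    cross = sum((p - p_m) * (o - o_m) for p, o in zip(predicted, observed))
    ss_p = sum((p - p_m) ** 2 for p in predicted)
    ss_o = sum((o - o_m) ** 2 for o in observed)
    return cross ** 2 / (ss_p * ss_o)


def rmse(observed, predicted):
    """Eq. (2): root mean square error, in the units of the observations."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


def lce(observed, predicted):
    """Eq. (3): Legates's coefficient of efficiency (1 is a perfect match)."""
    o_m = _mean(observed)
    abs_err = sum(abs(p - o) for p, o in zip(predicted, observed))
    abs_dev = sum(abs(o - o_m) for o in observed)
    return 1.0 - abs_err / abs_dev


# Hypothetical infiltration rates in cm/min (not results from the paper).
observed = [1.0, 0.6, 0.4, 0.3, 0.2]
predicted = [0.9, 0.7, 0.4, 0.2, 0.2]

print(f"R2   = {r_squared(observed, predicted):.3f}")
print(f"RMSE = {rmse(observed, predicted):.3f} cm/min")
print(f"LCE  = {lce(observed, predicted):.3f}")
```

Unlike R², which is insensitive to systematic bias, LCE penalizes every absolute deviation, so a model can score a high R² and still a low LCE if its predictions are consistently offset.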
A Taylor diagram offers a comprehensive visual representation of model performance by combining RMSE, correlation, and standard deviation (SD) into a single polar plot (Taylor 2001). Its main advantage is the ability to compare multiple model predictions on a single plot, providing a more holistic view than individual summary statistics. In this study, a Python script was developed to generate Taylor diagrams, facilitating rapid assessment of model predictions relative to observed values and enabling efficient comparison of multiple models.

4. Results and Discussions

4.1. Predictor variables from lab-measured soil parameters

The soil parameters for different test points, including gravel (G), sand (S), silt and clay (SC), and moisture content (M), were measured at three different depths: at the surface, 50 cm, and 100 cm below the surface. The data obtained from the laboratory analysis revealed notable variability across different test points and depths (Table 3). Gravel content (G) varied significantly, with surface levels ranging from 14% at H3 to 47.8% at K4, generally decreasing with soil depth. Sand content (S) showed notable variability, with the highest surface content at H3 (85.2%) and increasing with depth, particularly at K4 (92.5% at 100 cm). Silt and clay (SC) content was relatively low across all test points and depths, with surface levels ranging from 0.0% at K1, K2 and K3 to 4.1% at G3. Moisture content (M) varied widely, with surface levels from 2.1% at G1 to 14.2% at K3, generally increasing with depth, reaching up to 15.0% at H2 at 100 cm. Site-specific observations showed that Hoskote Station (H), with elevations from 931 to 941 m and mixed-use land, exhibited a decrease in gravel content and an increase in moisture content with depth. Kavalur Station (K), in a hilly area with elevations from 716 to 724 m, showed high surface gravel content and low clay and silt, indicative of well-drained conditions.
Gauribidanur Station (G), with elevations from 723 to 725 m and known for water scarcity, exhibited high sand content at depth and low moisture, highlighting its water scarcity issues. These variations in soil parameters across different sites and depths reflect the heterogeneity of soil properties in the study area, which is crucial for understanding soil behaviour and management in semi-arid regions.

Table 3 Soil parameters from laboratory analysis for training hybrid models (G: gravel; S: sand; SC: silt and clay; M: moisture content; all in %)

| Test point | Surface G | Surface S | Surface SC | Surface M | 50 cm G | 50 cm S | 50 cm SC | 50 cm M | 100 cm G | 100 cm S | 100 cm SC | 100 cm M |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| H1 | 23.2 | 76.1 | 0.7 | 10.9 | 17 | 81.1 | 1.9 | 8.8 | 11.9 | 86.2 | 1.9 | 9.8 |
| H2 | 17.7 | 81.2 | 1.1 | 7.8 | 12.9 | 84.9 | 2.2 | 12.1 | 6.2 | 90.2 | 3.6 | 15 |
| H3 | 14 | 85.2 | 0.8 | 9.4 | 14.6 | 83.6 | 1.8 | 11.8 | 22.5 | 74.7 | 2.8 | 8.7 |
| H4 | 42.9 | 55.9 | 1.2 | 9.2 | 23.3 | 76.3 | 0.4 | 11.5 | 14.7 | 82.5 | 2.8 | 9.6 |
| H5 | 26.6 | 71.8 | 1.6 | 9.5 | 13.1 | 86.2 | 0.7 | 12.4 | 22.5 | 76.7 | 0.8 | 12.2 |
| K1 | 39.8 | 60.2 | 0 | 5.8 | 15 | 84 | 1 | 10.6 | 11.8 | 82.2 | 6 | 4.2 |
| K2 | 44.8 | 55.2 | 0 | 13.3 | 19.8 | 80 | 0.2 | 10.6 | 19.1 | 80.2 | 0.7 | 7.2 |
| K3 | 32.4 | 67.6 | 0 | 14.2 | 26.7 | 73.1 | 0.2 | 16.4 | 25.8 | 74.2 | 0 | 8.6 |
| K4 | 47.8 | 51.3 | 0.9 | 12 | 32 | 68 | 0 | 11.1 | 7.1 | 92.5 | 0.4 | 9 |
| G1 | 25.5 | 72.9 | 1.6 | 2.1 | 48.5 | 51.1 | 0.4 | 4.5 | 12.8 | 85.7 | 1.5 | 4.6 |
| G2 | 33.5 | 65.1 | 1.4 | 2.5 | 33.2 | 65.8 | 1 | 5.4 | 7.6 | 91.4 | 1 | 5.2 |
| G3 | 25.4 | 70.5 | 4.1 | 8.2 | 14.6 | 83 | 2.4 | 7.8 | 5.9 | 89.6 | 4.5 | 3.6 |

These results align with other studies on soil variability in semi-arid regions. Yuan et al. (2024) emphasized the influence of land use and topography on soil properties, noting that urban and disturbed areas tend to exhibit higher variability in surface soil composition due to anthropogenic interferences. Similarly, Bonanomi et al. (2024) and Qiu et al.
(2001) highlighted the impact of elevation and land use on soil moisture content, with lower moisture levels typically observed in hilly and well-drained areas, and higher moisture retention in flatter, less disturbed regions. These studies support the observed patterns in the current analysis, where moisture content increases with depth and is higher in flatter areas like Hoskote and Gauribidanur compared to the hilly Kavalur. Moreover, the high sand content observed at deeper levels in Gauribidanur is consistent with findings by Ceballos et al. (2002), reporting similar trends in semi-arid regions, where deeper soil layers often exhibit higher sand fractions due to historical deposition processes. This high sand content correlates with the low moisture retention capacity, exacerbating water scarcity issues in these areas. In contrast, the mixed-use and forest land use at Kavalur contribute to lower moisture levels and higher gravel content, typical of well-drained, hilly terrains as described by Lado et al. (2004). The soil parameters obtained from this study will be used as predictor variables to train and test hybrid ML and infiltration models to predict infiltration rates. Recent studies have demonstrated the efficacy of ML techniques in hydrological modeling, significantly enhancing prediction accuracy (Mosavi et al. 2018). Integrating these parameters into hybrid ML and hydrological models will substantially improve the accuracy of infiltration rate predictions, which is crucial for effective water resource management and soil conservation in semi-arid regions. The heterogeneity in soil properties observed in this study underscores the necessity for tailored approaches in model training and validation to account for site-specific characteristics, ensuring robust predictions. This approach aligns with Salvadore et al. 
(2015), emphasizing the importance of developing site-specific models in heterogeneous environments to significantly boost predictive accuracy and management effectiveness.

4.2. Derived infiltration parameters for hybrid model development

4.2.1 Sorptivity and transmissivity from Philip's model

Fitting Philip's model to the observed infiltration rates \(f(t)\) against \(t^{-0.5}\) for the infiltration test sites provided two key parameters: sorptivity (\(s\)) and the transmissivity factor (\(A\)) (Fig. 5). The analysis showed that H1 had the highest sorptivity (13.378), indicating rapid initial infiltration due to its high sand content (76.1%) and low clay content (0.7%) (Table 3). K3 and K2 also exhibited high sorptivity values (12.720 and 4.962, respectively), corresponding to their relatively high sand content (67.6% and 55.2%, respectively). In contrast, G3 had the lowest sorptivity (1.497), aligning with its relatively high silt and clay content (4.1%) and high moisture content (8.2%). The transmissivity factors further supported these findings, with H1 and H3 displaying high values (0.529 and 1.124, respectively), indicating efficient water movement through the sandy soil. Conversely, G3's very low transmissivity factor (0.001) suggested that its soil structure hinders water movement, consistent with its higher silt and clay content. The sandy soils observed at H1 and K3 generally exhibit higher infiltration rates (Fig. 2) due to larger pore spaces that facilitate rapid water movement. These findings align with observations by Allaire et al. (2009) and Manns et al. (2024), which indicate that coarse-textured soils like sands promote greater hydraulic conductivity due to their interconnected macropores. Conversely, soils with higher silt and clay content, along with higher moisture content (e.g., G3), show reduced infiltration rates.
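The straight-line fit described in this subsection can be sketched as follows. This is a minimal illustration, assuming the common two-term form of Philip's rate equation, \(f(t) = \tfrac{1}{2} s\, t^{-0.5} + A\), so that the slope of \(f(t)\) against \(t^{-0.5}\) equals \(s/2\) and the intercept equals \(A\). The function names, the time values, and the synthetic readings are hypothetical, and the study may have reported the slope differently.

```python
import numpy as np

def philip_rate(t, s, A):
    """Philip's two-term infiltration rate (assumed form): f(t) = 0.5*s*t**-0.5 + A."""
    t = np.asarray(t, dtype=float)
    return 0.5 * s * t ** -0.5 + A

def fit_philip(t, f):
    """Least-squares line through (t**-0.5, f); returns (s, A)."""
    x = np.asarray(t, dtype=float) ** -0.5
    slope, intercept = np.polyfit(x, np.asarray(f, dtype=float), 1)
    return 2.0 * slope, intercept  # slope = s / 2 under the assumed form

# Synthetic, noise-free readings (illustrative values, not the study's data)
t_min = np.array([1.0, 2.0, 5.0, 10.0, 20.0, 40.0, 60.0])
f_obs = philip_rate(t_min, s=4.0, A=0.2)

s_hat, A_hat = fit_philip(t_min, f_obs)
print(f"s = {s_hat:.3f}, A = {A_hat:.3f}")
```

With field data the points scatter around the line, so the fitted \(s\) and \(A\) are best read together with the goodness of fit of the regression.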
The smaller particle sizes and larger specific surface area of fine-textured soils create a more tortuous path for water, hindering hydraulic conductivity (Mantoglou and Gelhar 1987). Additionally, subsurface moisture content significantly affects a soil's ability to absorb and transmit water after initial infiltration. Drier soils, such as H1 with its lower subsurface moisture content (8.8% at 50 cm and 9.8% at 100 cm), exhibit higher sorptivity, which translates to a greater capacity for prolonged water uptake. Conversely, soils with higher subsurface moisture (e.g., H2 with 12.1% at 50 cm and 15% at 100 cm) demonstrate lower sorptivity. This can be attributed to reduced capillary forces driving water absorption, as reported by Rosenbom et al. (2009).

4.2.2 Initial infiltration capacity and soil empirical constant from Horton's model

The scatter plots shown in Fig. 6 illustrate the fit of Horton's infiltration model to the observed infiltration data from the various sites. Each plot gives the logarithmic difference in infiltration capacity (\(\ln(i_{o}-i_{c})\)) against time, along with the derived parameters \(K_{h}\) (empirical soil constant or Horton's decay coefficient) and \(i_{o}-i_{c}\) (the initial infiltration potential). The parameters were estimated from linear fits to the logarithmic infiltration data and indicate the rate of decrease in infiltration capacity over time and the initial infiltration potential. For the Hoskote sites, the \(K_{h}\) values ranged from 0.015 to 0.031, with \(i_{o}-i_{c}\) values ranging from 0.253 to 4.253. The Kavalur sites exhibited \(K_{h}\) values between 0.019 and 0.048, and \(i_{o}-i_{c}\) values from 0.418 to 2.094. The Gauribidanur sites demonstrated \(K_{h}\) values ranging from 0.025 to 0.039, with \(i_{o}-i_{c}\) values from 0.215 to 0.527. Here, the decay constants and initial infiltration potentials serve as primary indicators of soil infiltration performance.
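The linear fit behind these Horton parameters can be sketched as follows. This is a minimal illustration, assuming the standard form \(i(t) = i_{c} + (i_{o}-i_{c})\,e^{-K_{h}t}\), whose linearisation is \(\ln(i(t)-i_{c}) = \ln(i_{o}-i_{c}) - K_{h}t\). It also assumes the steady-state rate \(i_{c}\) is known beforehand (for example, taken from the end of the test). The function names and synthetic readings are hypothetical.

```python
import numpy as np

def horton_rate(t, i_c, di, k_h):
    """Horton's infiltration capacity (assumed form): i(t) = i_c + di*exp(-k_h*t), di = i_o - i_c."""
    t = np.asarray(t, dtype=float)
    return i_c + di * np.exp(-k_h * t)

def fit_horton(t, i, i_c):
    """Fit ln(i - i_c) against t; returns (K_h, i_o - i_c)."""
    t = np.asarray(t, dtype=float)
    i = np.asarray(i, dtype=float)
    keep = i > i_c  # the logarithm is only defined for a positive excess over i_c
    slope, intercept = np.polyfit(t[keep], np.log(i[keep] - i_c), 1)
    return -slope, np.exp(intercept)

# Synthetic, noise-free readings (illustrative values, not the study's data)
t_min = np.arange(0.0, 100.0, 10.0)
i_obs = horton_rate(t_min, i_c=0.1, di=2.0, k_h=0.03)

k_hat, di_hat = fit_horton(t_min, i_obs, i_c=0.1)
print(f"K_h = {k_hat:.4f}, i_o - i_c = {di_hat:.3f}")
```

The choice of \(i_{c}\) matters in practice: readings at or below the assumed steady rate are dropped by the mask, so an overestimated \(i_{c}\) shortens the usable record.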
The varied decay constants across different sites underscore the influence of site-specific factors. For example, the higher decay constants observed at H4, K4, G1, and G3 suggest quicker saturation, with infiltration rates that decrease asymptotically towards the basic (steady-state) rate over time (Fig. 2), which could indicate higher silt and clay content in the upper soil profile (Table 3). These observations align with the findings of Adhikary et al. (2008), who reported a significant inverse relationship between silt and clay content and infiltration rates, demonstrating that increased silt and clay concentrations substantially impede soil permeability and water infiltration. Additionally, the initial infiltration potential offers valuable insights into the soil's initial response to water application, a key parameter for designing efficient irrigation and drainage management (Gjettermann et al. 1997). The high sand content observed in Hoskote corresponds to its high initial infiltration potential. This reinforces the notion that the larger pore spaces characteristic of sandy soils promote faster infiltration. Sandy soils, dominated by large, interconnected macropores, typically exhibit higher initial infiltration potential due to their enhanced hydraulic conductivity (Zhang and Schaap 2019). This rapid infiltration minimizes surface runoff and promotes deep percolation, which in turn is crucial for groundwater recharge in the vadose zone (Shanafield and Cook 2014; He et al. 2024).

4.2.3 Empirical dimensionless constants from Kostiakov's model

Figure 7 uses scatter plots to show the application of Kostiakov's infiltration model to field observations of cumulative infiltration (\(F(t)\)) across the test sites. The x-axis depicts the natural logarithm of time (\(\ln t\)), while the y-axis represents the natural logarithm of cumulative infiltration (\(\ln F(t)\)).
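The log-log fit can be sketched as follows. This is a minimal illustration, assuming Kostiakov's cumulative form \(F(t) = a\,t^{b}\), so that \(\ln F(t) = \ln a + b \ln t\) is a straight line with slope \(b\) and intercept \(\ln a\). The function names and synthetic readings are hypothetical.

```python
import numpy as np

def kostiakov_cumulative(t, a, b):
    """Kostiakov's cumulative infiltration (assumed form): F(t) = a * t**b."""
    return a * np.asarray(t, dtype=float) ** b

def fit_kostiakov(t, F):
    """Fit ln F against ln t; returns (a, b). Requires t > 0 and F > 0."""
    ln_t = np.log(np.asarray(t, dtype=float))
    ln_F = np.log(np.asarray(F, dtype=float))
    b, ln_a = np.polyfit(ln_t, ln_F, 1)
    return np.exp(ln_a), b

def kostiakov_rate(t, a, b):
    """Infiltration rate as the time derivative of F: f(t) = a*b*t**(b-1)."""
    return a * b * np.asarray(t, dtype=float) ** (b - 1.0)

# Synthetic, noise-free readings (illustrative values, not the study's data)
t_min = np.array([1.0, 2.0, 5.0, 10.0, 20.0, 40.0, 60.0])
F_obs = kostiakov_cumulative(t_min, a=7.6, b=0.7)

a_hat, b_hat = fit_kostiakov(t_min, F_obs)
print(f"a = {a_hat:.3f}, b = {b_hat:.3f}")
```

For \(0 < b < 1\) the derived rate decreases monotonically with time, which is the behaviour the fitted \(b\) values describe.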
This logarithmic transformation facilitates a clearer visualization of the relationship between infiltration and time, enabling a comprehensive analysis of infiltration patterns at each location. The successful fitting of the model to the data isolates the initial phase of rapid infiltration, characterized by the parameter \(a\), and the subsequent decline in infiltration rate, represented by \(b\). These parameters offer valuable information regarding the hydraulic properties of the soil at each site, reflecting the impact of soil texture and structure on water infiltration. H1 and K3 exhibited the highest \(a\) values (7.600 and 7.200, respectively), indicating rapid initial infiltration consistent with their high sand content (76.1% for H1 and 67.6% for K3), which promotes higher hydraulic conductivity. Conversely, G1 and G3 showed the lowest \(a\) values (0.353 and 0.800, respectively), reflecting slower initial infiltration rates due to higher silt and clay content, which hinder infiltration. The \(b\) values ranged between 0.579 (G3) and 0.843 (H2), indicating a moderate decrease in infiltration rate over time across all sites. The consistent \(b\) values suggest a uniform decrease in infiltration rate over time, reflecting similar soil behaviour in terms of infiltration rate reduction. This could be attributed to factors like pore clogging by fine particles, as reported by Mantoglou and Gelhar (1987) in their modeling of water flow in stratified soils. Additionally, the gradual saturation of the upper soil horizons, as infiltration progresses, can contribute to the observed decrease in the infiltration rate.

4.3. Evaluation of hybrid machine learning and hydrological models for infiltration rate prediction under varying scenarios

For a comprehensive and unbiased evaluation, two target test sites (H4 and K4) were randomly chosen.
This approach minimizes potential site-specific biases and provides a more generalizable assessment of the models' performance. A diverse set of predictor variables is used, including both direct soil data and features extracted from existing hydrological models. This broadens the information available to the models about infiltration processes and makes the evaluation less susceptible to site-specific quirks. The following sections detail each hybrid model's performance and its effectiveness for predicting infiltration rates.

4.3.1 Hybrid ANN and hydrological models keeping H4 as a target site

The evaluation of hybrid ANN and hydrological models for predicting infiltration rates with H4 as the target site reveals significant insights. Increasing the number of predictor sites enhances the models' ability to accurately predict the observed infiltration rates. This trend is visible in Fig. 8a, where prediction curves for models with 7 and 10 predictor sites align more closely with the observed infiltration rates than those with only 3 predictor sites. Quantitatively, the ANN + Horton model demonstrates the most consistent improvement across the error metrics with increasing predictor sites, followed by the ANN + Philip and ANN + Kostiakov models. For the ANN + Horton model, \(R^{2}\) increases from 0.85 with 3 predictor sites to 0.94 with 10 predictor sites, indicating a significant enhancement in model performance (Fig. 8b). The RMSE decreases from 0.36 to 0.08 cm/min, highlighting reduced prediction errors. Additionally, the LCE improves from −0.42 to 0.76, reflecting enhanced model efficiency and predictive accuracy. The Taylor diagram corroborates these findings, showing that with more predictor sites, the model points move closer to the reference point, indicating better agreement with observed values (Fig. 8c).
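The three error measures quoted in this section can be computed as in the sketch below. Two definitions are assumptions rather than statements from the text: \(R^{2}\) is taken as the squared Pearson correlation between observed and predicted rates, and LCE is taken as the absolute-error coefficient of efficiency of Legates and McCabe (1999), \(1 - \sum|O_{i}-P_{i}| / \sum|O_{i}-\bar{O}|\). Under these definitions a model can combine a high \(R^{2}\) with a low or negative LCE when its predictions are well correlated with the observations but biased. The function name and sample values are hypothetical.

```python
import numpy as np

def error_measures(obs, pred):
    """Return (R2, RMSE, LCE) under the assumed definitions given above."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    r = np.corrcoef(obs, pred)[0, 1]            # Pearson correlation
    r2 = r ** 2                                  # assumed: squared correlation
    rmse = np.sqrt(np.mean((obs - pred) ** 2))   # same units as the data
    lce = 1.0 - np.sum(np.abs(obs - pred)) / np.sum(np.abs(obs - obs.mean()))
    return r2, rmse, lce

# Illustrative values only: predictions perfectly correlated but biased by +1
obs = np.array([1.0, 2.0, 3.0, 4.0])
pred = obs + 1.0

r2, rmse, lce = error_measures(obs, pred)
print(f"R2 = {r2:.2f}, RMSE = {rmse:.2f}, LCE = {lce:.2f}")
```

In this toy case the correlation is perfect while the efficiency is no better than predicting the observed mean, which is why the three measures are worth reading together.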
The ANN + Philip model maintains a high \(R^{2}\) (0.87 to 0.91) across the scenarios, while its RMSE decreases from 1.60 to 0.08 cm/min and its LCE shifts from −5.90 to 0.71, a substantial improvement comparable to that of the ANN + Horton model. The ANN + Kostiakov model also shows a high \(R^{2}\), with RMSE decreasing from 1.34 to 0.09 cm/min and LCE improving from −4.41 to 0.67, and improved performance on the Taylor diagram, though less consistently than the ANN + Horton model.

4.3.2 Hybrid MF and hydrological models keeping H4 as a target station

The assessment of hybrid MF and hydrological models for predicting infiltration rates at the target station H4 is shown in Fig. 9. As with the hybrid ANN models, incorporating more predictor sites enhances the models' ability to predict the infiltration rates (Fig. 9a). The MF + Philip model performs best, with \(R^{2}\) values of 0.89, 0.91, and 0.92 for the scenarios with 3, 7, and 10 predictor sites. As illustrated in Fig. 9b and c, increasing the number of predictor sites clearly benefits this model, reducing errors (RMSE from 1.57 to 0.1 cm/min) and improving efficiency (LCE from −5.87 to 0.62), as confirmed by the Taylor diagram's convergence towards the reference point. The MF + Horton model exhibits a more nuanced behaviour. While its \(R^{2}\) starts well (0.87 with 3 sites), it dips slightly (0.84 with 10 sites) with more predictors, suggesting potential overfitting. However, this is outweighed by a substantial decrease in errors (RMSE from 1.0 to 0.19 cm/min) and a clear improvement in efficiency (LCE from −3.71 to 0.47) as the number of sites increases.
The Taylor diagram aligns with this, showing the model better matching observations with more predictors. Finally, the MF + Kostiakov model also benefits from increasing predictor sites. Its \(R^{2}\) increases from 0.88 to 0.9, indicating that more of the variability in the data is explained. Similarly, the RMSE falls from 1.59 to 0.12 cm/min, a considerable error reduction. The LCE improves markedly as well (from −6.31 to 0.69), demonstrating enhanced model efficiency and accuracy. As with the other models, the Taylor diagram confirms this positive trend.

4.3.3 Hybrid ANN and hydrological models keeping K4 as a target station

Analysing predicted infiltration rates at K4 reveals that incorporating more predictor sites (7 and 10) significantly improves the accuracy of hybrid ANN and hydrological models, as evident in Fig. 10a, where the models produce prediction curves closely aligned with observed data (particularly ANN + Philip, whose curves progressively approach the observed rates). This is further supported by the bar plot of error measures (Fig. 10b): ANN + Philip exhibits a substantial increase in \(R^{2}\) (0.72 to 0.80), a drastic reduction in RMSE (1.05 to 0.11 cm/min), and an improved LCE (−5.38 to 0.57), indicating enhanced efficiency and accuracy. Improvements in the ANN + Horton and ANN + Kostiakov models are less pronounced: some metrics improve with more predictor sites, while others show the opposite behaviour. For instance, ANN + Horton's \(R^{2}\) decreases from 0.59 (3 sites) to 0.54 (10 sites), its RMSE increases from 0.16 to 0.33 cm/min, and its LCE worsens from −0.05 to −0.92. Similarly, ANN + Kostiakov's \(R^{2}\) reduces from 0.72 to 0.67, although its RMSE decreases from 0.34 to 0.11 cm/min.
Its LCE improves from −0.49 to 0.63. The Taylor diagram (Fig. 10c) reinforces this, demonstrating that model points move closer to the reference point with more data, signifying better agreement with observed values. Overall, for Kavalur site 4, increasing predictor sites generally enhances model performance, with ANN + Philip exhibiting the most significant improvements.

4.3.4 Hybrid MF and hydrological models keeping K4 as a target station

Figure 11 illustrates the prediction accuracy of hybrid MF and hydrological models for infiltration rates at K4. The results reveal a clear benefit for the MF + Philip model, which exhibits a substantial decrease in RMSE from 0.87 to 0.20 cm/min, indicating a significant reduction in prediction error. Additionally, the LCE values show a marked improvement from −3.58 to 0.13, reflecting enhanced model efficiency and accuracy. While \(R^{2}\) values show some variability, the overall trend suggests improvement for MF + Philip with more predictor sites. The \(R^{2}\) values change inconsistently in both the MF + Horton and MF + Kostiakov models. For MF + Horton, RMSE generally improves (0.31 to 0.22 cm/min) but increases slightly at 10 sites, and LCE shows mixed improvements, suggesting some performance gains but also inconsistencies. Similarly, MF + Kostiakov's \(R^{2}\) shows a slight decrease, while its RMSE improves (0.51 to 0.47 cm/min); its LCE initially improves but worsens slightly with more data. The Taylor diagram (Fig. 11c) reinforces these findings. MF + Philip displays the most consistent movement towards the reference point, indicating improved agreement with observed values. Conversely, the MF + Horton and MF + Kostiakov models show less consistent trends.
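The quantities that place a model on a Taylor diagram, and the sense in which a point "moves towards the reference point", can be sketched as follows. This is a minimal illustration of the statistics defined by Taylor (2001), not the plotting script used in the study. The function name and sample values are hypothetical. The distance between a model point and the reference point equals the centred RMS difference \(E'\), which satisfies \(E'^{2} = \sigma_{p}^{2} + \sigma_{o}^{2} - 2\sigma_{p}\sigma_{o}R\).

```python
import numpy as np

def taylor_stats(obs, pred):
    """Statistics that locate one model on a Taylor diagram (Taylor 2001)."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    sd_obs = obs.std()     # radius of the reference point (population SD)
    sd_pred = pred.std()   # radius of the model point
    r = np.corrcoef(obs, pred)[0, 1]
    # Centred RMS difference: means are removed before differencing
    crmsd = np.sqrt(np.mean(((pred - pred.mean()) - (obs - obs.mean())) ** 2))
    theta = np.arccos(np.clip(r, -1.0, 1.0))  # polar angle of the model point
    return {"sd_obs": sd_obs, "sd_pred": sd_pred, "r": r,
            "crmsd": crmsd, "theta": theta}

# Illustrative infiltration rates in cm/min (not the study's data)
obs = np.array([0.90, 0.60, 0.45, 0.38, 0.33])
pred = np.array([1.00, 0.70, 0.40, 0.35, 0.36])

st = taylor_stats(obs, pred)
# Law-of-cosines identity that underlies the diagram's geometry
rhs = st["sd_pred"] ** 2 + st["sd_obs"] ** 2 - 2 * st["sd_pred"] * st["sd_obs"] * st["r"]
print(np.isclose(st["crmsd"] ** 2, rhs))
```

Because the centred difference removes the mean, a Taylor diagram does not show overall bias, so it complements rather than replaces RMSE and LCE.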
The study aims to predict infiltration rates at target sites from minimal data, such as soil parameters (% gravel, % sand, % silt + clay, % moisture), by leveraging datasets from accessible locations. This approach is particularly valuable for locations where direct measurement is impractical. However, some models (e.g., ANN + Horton and MF + Horton) exhibit mixed results with increasing predictor sites. This could be due to overfitting and to the inherent complexity of soil-water interactions not being fully captured by the models. To comprehensively evaluate model performance, we employed different error measures. Previous research has demonstrated significant improvements in predictive accuracy by integrating ML techniques with hydrological models. This study aligns with the trend in data-driven hydrological modeling, emphasizing the importance of combining physical process-based empirical models with ML approaches for enhanced prediction. As shown in Fig. 12, standalone infiltration models (Horton's, Philip's, Kostiakov's) exhibited limited success in predicting infiltration rates compared to the superior performance achieved by the hybrid models (particularly ANN + Philip) at both target sites. Althoff et al. (2021), Xu et al. (2024) and Young et al. (2017) also highlighted that hybrid models can effectively capture complex hydrological processes that traditional models cannot. Similarly, Zubelzu et al. (2024) underscored the benefits of ML algorithms in improving the accuracy of hydrological predictions, especially in data-scarce regions. Furthermore, the adaptability of ML-based hybrid models to the inherent variability and uncertainties within hydrological data is particularly valuable for predicting infiltration rates across diverse soil and climatic conditions, where conventional models often struggle (Parchami-Araghi et al. 2013).
Among the evaluated models, hybrids incorporating Philip's equation (ANN + Philip and MF + Philip) demonstrated superior performance in predicting infiltration rates at remote sites. Philip's model, a well-established hydrological approach, describes infiltration as a function of time and initial soil moisture. Its robust mathematical foundation provides a solid representation of infiltration processes, making it a strong foundation for integration with ML models. Furthermore, the superior performance of these Philip's model-based hybrids can be attributed to their ability to leverage the strengths of both approaches. The combination of an ANN or MF with Philip's model enhances predictive accuracy. While Philip's model offers a strong foundation, ANN and MF can capture complex, non-linear relationships in the data that the physical model alone might miss (Sy 2006). This synergy between physical process understanding and data-driven techniques ultimately leads to improved prediction capabilities. In contrast, the Horton and Kostiakov models, while useful, have limitations that might explain their relatively poorer performance when combined with ML models. Both Horton and Kostiakov models are primarily empirical and might not generalize well across different environments, which can limit the hybrid models' ability to capture the full complexity of infiltration processes (Parchami-Araghi et al. 2013). To overcome these limitations, the integration of ML techniques can be tailored to enhance the empirical models' flexibility and generalizability. By using ANN and MF, these hybrid models can identify and adjust for site-specific factors and non-linear relationships that the empirical models alone may miss. Additionally, incorporating techniques such as cross-validation and sensitivity analysis can help quantify and reduce uncertainties, ensuring more robust and reliable predictions across diverse environments. 
This combined approach not only leverages the empirical models' simplicity and ease of use but also enriches them with the adaptability and precision of machine learning, leading to better performance in various hydrological contexts.

5. Conclusions

This study successfully integrated traditional hydrological models with ML techniques to predict infiltration rates in semi-arid regions of southern India. Hybrid models, particularly those based on Philip's equation, outperformed standalone traditional methods by leveraging both physical understanding and ML's predictive power. For instance, with 10 predictor sites the ANN + Philip model achieved \(R^{2}\), RMSE, and LCE values of 0.91, 0.08 cm/min, and 0.71 at target site H4 and 0.80, 0.11 cm/min, and 0.57 at K4, while the MF + Philip model reached 0.92, 0.1 cm/min, and 0.62 at H4. A robust dataset with detailed soil characteristics from various depths and locations was crucial for model training. The study further demonstrates the value of spatially diverse data, as models trained with data from more sites generally exhibited higher performance. Accounting for the inherent variability in semi-arid soils was essential for robust predictions. Importantly, the hybrid models can predict infiltration rates at remote sites with minimal data, making them a valuable tool for areas with limited measurement capabilities. By integrating theory-guided data science with physics-informed ML, these hybrid models offer interpretable and accurate predictions, overcoming a major limitation of traditional approaches in hydrology. This novel approach has the potential to significantly improve water resource management and soil conservation in semi-arid and data-scarce regions. However, the study observed that hybrid models struggled to predict chaotic patterns, such as sudden dips and peaks in infiltration rates, especially at the start of the process. These fluctuations, likely due to initial soil conditions, water repellency, or micro-variations in soil texture, were not well captured.
To overcome these limitations and address the concerns regarding the integration of empirical models with ML, future research should aim to improve the models' ability to handle these chaotic changes by incorporating more detailed data, utilizing advanced ML techniques, and enhancing training algorithms. Specifically, integrating cross-validation and sensitivity analysis can help quantify and reduce uncertainties, ensuring more robust and reliable predictions. Furthermore, exploring advanced ML techniques such as recurrent neural networks (RNNs) or long short-term memory (LSTM) networks could better capture temporal dynamics and sudden variations in infiltration rates. Additionally, using high-resolution temporal and spatial data can help in understanding and modeling initial soil conditions and micro-variations more accurately. This combined approach not only leverages the empirical models' simplicity but also enriches them with the adaptability and precision of machine learning, leading to better performance in various hydrological contexts. By addressing these challenges, the prediction accuracy and reliability of hybrid models can be further enhanced across diverse conditions.

Declarations

Acknowledgements

The authors would like to express their sincere gratitude to the Director of the Indian Institute of Astrophysics, Government of India, for granting access to the campus to perform the infiltration tests. This support was instrumental in the successful completion of our study.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Conflict of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability statement

The data and code used in this study will be made available upon request.
References Adhikary PP, Chakraborty D, Kalra N, et al (2008) Pedotransfer functions for predicting the hydraulic properties of Indian soils. Aust J Soil Res 46:476–484. https://doi.org/10.1071/SR07042 Ahmed AA, Sayed S, Abdoulhalik A, et al (2024) Applications of machine learning to water resources management: A review of present status and future opportunities. J Clean Prod 441:140715. https://doi.org/10.1016/j.jclepro.2024.140715 Allaire SE, Roulier S, Cessna AJ (2009) Quantifying preferential flow in soils: A review of different techniques. J Hydrol 378:179–204. https://doi.org/10.1016/j.jhydrol.2009.08.013 Althoff D, Bazameb HC, Nascimentob JG (2021) Untangling hybrid hydrological models with explainable artificial intelligence. H2Open J 4:13–28. https://doi.org/10.2166/H2OJ.2021.066 Arya LM, Leij FJ, Shouse PJ, van Genuchten MT (1999) Relationship between the Hydraulic Conductivity Function and the Particle‐Size Distribution. Soil Sci Soc Am J 63:1063–1070. https://doi.org/10.2136/sssaj1999.6351063x ASCE Task Committee on Application of Artificial Neural Networks in Hydrology. (2000a) Artificial Neural Networks in Hydrology. I: Preliminary Concepts. J Hydrol Eng 5:115–123. https://doi.org/10.1061/(ASCE)1084-0699(2000)5:2(115) ASCE Task Committee on Application of Artificial Neural Networks in Hydrology. (2000b) Artificial Neural Networks in Hydrology. II: Hydrologic Applications. J Hydrol Eng 5:124–137. https://doi.org/https://doi.org/10.1061/(ASCE)1084-0699(2000)5:2(124) ASTM Standard D2216–19 (2019) Standard Test Methods for Laboratory Determination of Water (Moisture) Content of Soil and Rock by Mass. West Conshohocken, PA Bikše J, Retike I, Haaf E, Kalvāns A (2023) Assessing automated gap imputation of regional scale groundwater level data sets with typical gap patterns. J Hydrol 620: 129424. 
https://doi.org/10.1016/j.jhydrol.2023.129424 Bonanomi G, Motti R, Abd-ElGawad AM, Idbella M (2024) Soil water repellency along elevation gradients: The role of climate, land use and soil chemistry. Geoderma 443:116847. https://doi.org/10.1016/j.geoderma.2024.116847 Breiman L (2001) Random forests. Mach Learn 45:5–32. https://doi.org/10.1023/A:1010933404324 Ceballos A, Martı́nez-Fernández J, Santos F, Alonso P (2002) Soil-water behaviour of sandy soils under semi-arid conditions in the Duero Basin (Spain). J Arid Environ 51:501–519. https://doi.org/10.1006/jare.2002.0973 Chen G, Hou J, Liu Y, et al (2024) Urban inundation rapid prediction method based on multi-machine learning algorithm and rain pattern analysis. J Hydrol 633:131059. https://doi.org/10.1016/j.jhydrol.2024.131059 Christiaens K, Feyen J (2001) Analysis of uncertainties associated with different methods to determine soil hydraulic properties and their propagation in the distributed hydrological MIKE SHE model. J Hydrol 246:63–81. https://doi.org/10.1016/S0022-1694(01)00345-6 Dexter AR, Richard G (2009) The saturated hydraulic conductivity of soils with n-modal pore size distributions. Geoderma 154:76–85. https://doi.org/10.1016/j.geoderma.2009.09.015 Dubey SR, Singh SK, Chaudhuri BB (2022) Activation functions in deep learning: A comprehensive survey and benchmark. Neurocomputing 503:92–108. https://doi.org/10.1016/j.neucom.2022.06.111 Gjettermann B, Nielsen KL, Petersen CT, et al (1997) Preferential flow in sandy loam soils as affected by irrigation intensity. Soil Technol 11:139–152. https://doi.org/10.1016/S0933-3630(97)00001-9 Haverkamp R, Kutilek M, Parlange J-Y, et al (1988) Infiltration under ponded conditions: 2. Infiltration equationstested for parameter time-dependence and predictive use. Soil Sci 145:317–329 He Y, Wang Y, Liu Y, et al (2024) Focus on the nonlinear infiltration process in deep vadose zone. Earth-Science Rev 252:104719. 
https://doi.org/10.1016/j.earscirev.2024.104719 Heber Green W, Ampt GA (1911) Studies on Soil Phyics. J Agric Sci 4:1–24. https://doi.org/10.1017/S0021859600001441 Holtan HN (1961) A concept for infiltration estimates in watershed engineering, 41st edn. Agricultural Research Service, US Department of Agriculture Horton RE (1941) An Approach Toward a Physical Interpretation of Infiltration‐Capacity. Soil Sci Soc Am J 5:399–417. https://doi.org/10.2136/sssaj1941.036159950005000C0075x Ispirova G, Eftimov T, Seljak BK (2020) Evaluating missing value imputation methods for food composition databases. Food Chem Toxicol 141:111368. https://doi.org/10.1016/j.fct.2020.111368 Jia Y, Culver TB (2006) Bootstrapped artificial neural networks for synthetic flow generation with a small data sample. J Hydrol 331:580–590. https://doi.org/10.1016/j.jhydrol.2006.06.005 Kostiakov AN (1932) On the dynamics of the coefficient of water-percolation in soils and on the necessity of studying it from a dynamic point of view for purposes of amelioration. Trans 6th Cong Int Soil Sci Russ Part A 17–21 Lado M, Paz A, Ben-Hur M (2004) Organic Matter and Aggregate‐Size Interactions in Saturated Hydraulic Conductivity. Soil Sci Soc Am J 68:234–242. https://doi.org/10.2136/sssaj2004.2340 Legates DR, McCabe GJ (1999) Evaluating the use of “goodness-of-fit” measures in hydrologic and hydroclimatic model validation. Water Resour Res 35:233–241. https://doi.org/10.1029/1998WR900018 Mahapatra S, Jha MK, Biswal S, Senapati D (2020) Assessing Variability of Infiltration Characteristics and Reliability of Infiltration Models in a Tropical Sub-humid Region of India. Sci Rep 10:1–18. https://doi.org/10.1038/s41598-020-58333-8 Manns HR, Jiang Y, Parkin G (2024) Soil pores in preferential flow terminology and permeability equations. Vadose Zo J 1–12. https://doi.org/10.1002/vzj2.20365 Mantoglou A, Gelhar LW (1987) Effective hydraulic conductivities of transient unsaturated flow in stratified soils. 
Water Resour Res 23:57–67. https://doi.org/10.1029/WR023i001p00057 Mattar MA, Alazba AA, Zin El-Abedin TK (2015) Forecasting furrow irrigation infiltration using artificial neural networks. Agric Water Manag 148:63–71. https://doi.org/10.1016/j.agwat.2014.09.015 Mosavi A, Ozturk P, Chau KW (2018) Flood prediction using machine learning models: Literature review. Water (Switzerland) 10:1–40. https://doi.org/10.3390/w10111536 Naranjo-Fernández N, Guardiola-Albert C, Aguilera H, et al (2020) Clustering groundwater level time series of the exploited almonte-marismas aquifer in southwest Spain. Water (Switzerland) 12:1–20. https://doi.org/10.3390/W12041063 Overton D (1964) Mathematical refinement of an infiltration equation for watershed engineering. Agricultural Research Service, US Department of Agriculture Parchami-Araghi F, Mirlatifi SM, Ghorbani Dashtaki S, Mahdian MH (2013) Point estimation of soil water infiltration process using Artificial Neural Networks for some calcareous soils. J Hydrol 481:35–47. https://doi.org/10.1016/j.jhydrol.2012.12.007 Philip JR (1969) Theory of Infiltration. In: Advances in Hydroscience. Academic PRESS, INC., pp 215–296 Qiu Y, Fu B, Wang J, Chen L (2001) Soil moisture variation in relation to topography and land use in a hillslope catchment of the Loess Plateau, China. J Hydrol 240:243–263. https://doi.org/10.1016/S0022-1694(00)00362-0 Richards LA (1931) Capillary conduction of liquids through porous mediums. J Appl Phys 1:318–333. https://doi.org/10.1063/1.1745010 Rosenbom AE, Therrien R, Refsgaard JC, et al (2009) Numerical analysis of water and solute transport in variably-saturated fractured clayey till. J Contam Hydrol 104:137–152. https://doi.org/10.1016/j.jconhyd.2008.09.001 Salvadore E, Bronders J, Batelaan O (2015) Hydrological modelling of urbanized catchments: A review and future directions. J Hydrol 529:62–81. 
https://doi.org/10.1016/j.jhydrol.2015.06.028
Sayari S, Mahdavi-Meymand A, Zounemat-Kermani M (2021) Irrigation water infiltration modeling using machine learning. Comput Electron Agric 180:105921. https://doi.org/10.1016/j.compag.2020.105921
Shanafield M, Cook PG (2014) Transmission losses, infiltration and groundwater recharge through ephemeral and intermittent streambeds: A review of applied methods. J Hydrol 511:518–529. https://doi.org/10.1016/j.jhydrol.2014.01.068
Sidhu RK, Kumar R, Rana PS (2020) Machine learning based crop water demand forecasting using minimum climatological data. Multimed Tools Appl 79:13109–13124. https://doi.org/10.1007/s11042-019-08533-w
Sihag P, Singh B, Sepah Vand A, Mehdipour V (2020) Modeling the infiltration process with soft computing techniques. ISH J Hydraul Eng 26:138–152. https://doi.org/10.1080/09715010.2018.1464408
Sihag P, Singh VP, Angelaki A, et al (2019) Modelling of infiltration using artificial intelligence techniques in semi-arid Iran. Hydrol Sci J 64:1647–1658. https://doi.org/10.1080/02626667.2019.1659965
Singh B, Sihag P, Singh K (2017) Modelling of impact of water quality on infiltration rate of soil by random forest regression. Model Earth Syst Environ 3:999–1004. https://doi.org/10.1007/s40808-017-0347-3
Smith RE (1972) The infiltration envelope: Results from a theoretical infiltrometer. J Hydrol 17:1–22. https://doi.org/10.1016/0022-1694(72)90063-7
Smith RE, Parlange J-Y (1978) A parameter‐efficient hydrologic infiltration model. Water Resour Res 14:533–538. https://doi.org/10.1029/WR014i003p00533
Stekhoven DJ, Bühlmann P (2012) MissForest: non-parametric missing value imputation for mixed-type data. Bioinformatics 28:112–118. https://doi.org/10.1093/bioinformatics/btr597
Sy NL (2006) Modelling the infiltration process with a multi-layer perceptron artificial neural network. Hydrol Sci J 51:3–20.
https://doi.org/10.1623/hysj.51.1.3
Taylor KE (2001) Summarizing multiple aspects of model performance in a single diagram. J Geophys Res Atmos 106:7183–7192. https://doi.org/10.1029/2000JD900719
Teshome FT, Bayabil HK, Schaffer B, et al (2024) Simulating soil hydrologic dynamics using crop growth and machine learning models. Comput Electron Agric 224:109186. https://doi.org/10.1016/j.compag.2024.109186
Wang Q, Shao M, Horton R (1999) Modified Green and Ampt models for layered soil infiltration and muddy water infiltration. Soil Sci 164:445–453
Xu W, Chen J, Corzo G, et al (2024) Coupling Deep Learning and Physically Based Hydrological Models for Monthly Streamflow Predictions. Water Resour Res 60:1–25. https://doi.org/10.1029/2023WR035618
Young CC, Liu WC, Wu MC (2017) A physically based and machine learning hybrid approach for accurate rainfall-runoff modeling during extreme typhoon events. Appl Soft Comput J 53:205–216. https://doi.org/10.1016/j.asoc.2016.12.052
Yuan J, Yao Y, Guan Y, et al (2024) Effects of land use patterns on soil properties and nitrous oxide flux on a semi-arid environmental conditions of Loess Plateau China. Glob Ecol Conserv 51:e02899. https://doi.org/10.1016/j.gecco.2024.e02899
Zhang Y, Schaap MG (2019) Estimation of saturated hydraulic conductivity with pedotransfer functions: A review. J Hydrol 575:1011–1030. https://doi.org/10.1016/j.jhydrol.2019.05.058
Zubelzu S, Ghalkha A, Ben Issaid C, et al (2024) Coupling machine learning and physical modelling for predicting runoff at catchment scale. J Environ Manage 354:120404.
https://doi.org/10.1016/j.jenvman.2024.120404
Supplementary Files Supplementarymaterials.docx
Status: Published Journal Publication published 22 Feb, 2025 Read the published version in Acta Geophysica → Version 1 posted Editorial decision: Major revisions 16 Oct, 2024 Reviewers agreed at journal 16 Aug, 2024 Reviewers invited by journal 16 Aug, 2024 Editor invited by journal 15 Aug, 2024 Editor assigned by journal 09 Aug, 2024 First submitted to journal 06 Aug, 2024
© Research Square 2026 | ISSN 2693-5015 (online) {"props":{"pageProps":{"initialData":{"identity":"rs-4869876","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":341156170,"identity":"762e43fa-9bb1-451e-9f49-b021b33eb181","order_by":0,"name":"Mooganayakanakote Veeranna Ramaswamy","email":"","orcid":"","institution":"University Visvesvaraya College of Engineering Department of Civil Engineering","correspondingAuthor":false,"prefix":"","firstName":"Mooganayakanakote","middleName":"Veeranna","lastName":"Ramaswamy","suffix":""},{"id":341156171,"identity":"e18f80aa-bfce-494b-95c3-bd475756f298","order_by":1,"name":"Yashas Kumar Hanumapura Kumaraswamy","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA6UlEQVRIiWNgGAWjYHACNhAhx8/ffABIS8gQrcVYcsaxBJAWHqK1JG44kGMAYhDWYnD8+LMHP/4cZtxw4MznVzdqLHgY2A8f3YBXy5kcc8PetsPMkod7t1nnHAM6jCct7QZeLQdy2CR4Gw6z8R04u80YyOYBescMv5bzz59J/vlzmIfhQM4z45x/xGi5kWAmzcN2WELgQA7z49w2IrRI3nhjJi3blm4ADGQz5tw+CR42Qn7hO5/+TPLNH+v6fv7mx59zvtXJ8bMfPoZXi8IBBJtNAkziUw4C8g0INvMHQqpHwSgYBaNgZAIAccxNVOEFK20AAAAASUVORK5CYII=","orcid":"https://orcid.org/0000-0003-2653-6172","institution":"National Institute of Technology Karnataka","correspondingAuthor":true,"prefix":"","firstName":"Yashas","middleName":"Kumar Hanumapura","lastName":"Kumaraswamy","suffix":""},{"id":341156172,"identity":"12fb6d51-684f-4433-bd29-4d3e334c4ee6","order_by":2,"name":"Varshini Jaganatha Reddy","email":"","orcid":"","institution":"Northeastern University - Boston Campus: Northeastern
University","correspondingAuthor":false,"prefix":"","firstName":"Varshini","middleName":"Jaganatha","lastName":"Reddy","suffix":""},{"id":341156173,"identity":"12fb97ed-2e2d-44da-a7e3-32f89b4d0a13","order_by":3,"name":"Shivakumar J Nyamathi","email":"","orcid":"","institution":"University Visvesvaraya College of Engineering Department of Civil Engineering","correspondingAuthor":false,"prefix":"","firstName":"Shivakumar","middleName":"J","lastName":"Nyamathi","suffix":""}],"badges":[],"createdAt":"2024-08-06 16:25:22","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-4869876/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-4869876/v1","draftVersion":[],"editorialEvents":[{"content":"https://doi.org/10.1007/s11600-025-01535-3","type":"published","date":"2025-02-22T15:57:41+00:00"}],"editorialNote":"","failedWorkflow":false,"files":[{"id":64409943,"identity":"8d6043c7-ba4a-4e8e-aacf-f0a4463a80c3","added_by":"auto","created_at":"2024-09-12 19:18:38","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":6721999,"visible":true,"origin":"","legend":"\u003cp\u003eGeographical map of the studied area\u003c/p\u003e","description":"","filename":"1.png","url":"https://assets-eu.researchsquare.com/files/rs-4869876/v1/8648d6f230983d73afb01c17.png"},{"id":64410111,"identity":"613f33c2-1ac1-4170-837d-1f072fe6dde8","added_by":"auto","created_at":"2024-09-12 19:26:38","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":1392052,"visible":true,"origin":"","legend":"\u003cp\u003eInfiltration rate and cumulative infiltration recorded at different test points\u003c/p\u003e","description":"","filename":"2.png","url":"https://assets-eu.researchsquare.com/files/rs-4869876/v1/198830b99229d9c73a9fc4aa.png"},{"id":64409947,"identity":"3751d7e0-f635-466c-986f-e8a183ceb6cc","added_by":"auto","created_at":"2024-09-12 19:18:38","extension":"png","order_by":3,"title":"Figure
3","display":"","copyAsset":false,"role":"figure","size":7311897,"visible":true,"origin":"","legend":"\u003cp\u003eDiagram for method of research and evaluation of coupled ML and traditional infiltration techniques\u003c/p\u003e","description":"","filename":"3.png","url":"https://assets-eu.researchsquare.com/files/rs-4869876/v1/8986a8283a778f7df345f519.png"},{"id":64409942,"identity":"f8254199-56da-4289-a7ea-62a8877173d9","added_by":"auto","created_at":"2024-09-12 19:18:38","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":7011671,"visible":true,"origin":"","legend":"\u003cp\u003eConfiguration of the ANN model for predicting useful parameters required for an infiltration model (e.g., Philip’s model) using neighbouring site data\u003c/p\u003e","description":"","filename":"4.png","url":"https://assets-eu.researchsquare.com/files/rs-4869876/v1/1597cf374f6a3ba7eea4fd6d.png"},{"id":64409939,"identity":"86fab366-548b-4ecf-aa4b-92b964d0a4a7","added_by":"auto","created_at":"2024-09-12 19:18:38","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":1405431,"visible":true,"origin":"","legend":"\u003cp\u003eParameters of Philip's infiltration model: Sorptivity and transmissivity\u003c/p\u003e","description":"","filename":"5.png","url":"https://assets-eu.researchsquare.com/files/rs-4869876/v1/281dc1135b0120c0fbe197a7.png"},{"id":64409941,"identity":"807e9f78-804d-4201-b992-7549659b9e4e","added_by":"auto","created_at":"2024-09-12 19:18:38","extension":"png","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":1497183,"visible":true,"origin":"","legend":"\u003cp\u003eParameters of Horton’s infiltration model: Initial infiltration capacity and decay 
coefficient\u003c/p\u003e","description":"","filename":"6.png","url":"https://assets-eu.researchsquare.com/files/rs-4869876/v1/4b5654d76b153a92f54bfe95.png"},{"id":64410113,"identity":"8021e60e-efc8-495e-b46b-e8687f8dae35","added_by":"auto","created_at":"2024-09-12 19:26:39","extension":"png","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":1294575,"visible":true,"origin":"","legend":"\u003cp\u003eParameters of Kostiakov’s infiltration model: Empirical dimensionless constants a and b\u003c/p\u003e","description":"","filename":"7.png","url":"https://assets-eu.researchsquare.com/files/rs-4869876/v1/2572c1095dd795ba9a902625.png"},{"id":64410112,"identity":"dceddcda-a963-475a-a17f-f5a4ffa5b251","added_by":"auto","created_at":"2024-09-12 19:26:38","extension":"png","order_by":8,"title":"Figure 8","display":"","copyAsset":false,"role":"figure","size":1943502,"visible":true,"origin":"","legend":"\u003cp\u003ePerformance of hybrid ANN and hydrological models for predicting infiltration rates, with H4 as the target site, (a) prediction of infiltration rates by various hybrid models, (b) error measure by statistical indicators, (c) error measure by Taylor diagram\u003c/p\u003e","description":"","filename":"8.png","url":"https://assets-eu.researchsquare.com/files/rs-4869876/v1/1159c86e957cfcaa258b287d.png"},{"id":64409944,"identity":"9157dc00-de48-4903-994a-cc08eb9ea132","added_by":"auto","created_at":"2024-09-12 19:18:38","extension":"png","order_by":9,"title":"Figure 9","display":"","copyAsset":false,"role":"figure","size":2093794,"visible":true,"origin":"","legend":"\u003cp\u003ePerformance of hybrid MF and hydrological models for predicting infiltration rates, with H4 as the target site, (a) prediction of infiltration rates by various hybrid models, (b) error measure by statistical indicators, (c) error measure by Taylor
diagram\u003c/p\u003e","description":"","filename":"9.png","url":"https://assets-eu.researchsquare.com/files/rs-4869876/v1/64ad64472a6e04035da09e88.png"},{"id":64409946,"identity":"9f1edc6f-3b99-4831-9097-25e7ca10c428","added_by":"auto","created_at":"2024-09-12 19:18:38","extension":"png","order_by":10,"title":"Figure 10","display":"","copyAsset":false,"role":"figure","size":1059091,"visible":true,"origin":"","legend":"\u003cp\u003ePerformance of hybrid ANN and hydrological models for predicting infiltration rates, with K4 as the target site, (a) prediction of infiltration rates by various hybrid models, (b) error measure by statistical indicators, (c) error measure by Taylor diagram\u003c/p\u003e","description":"","filename":"10.png","url":"https://assets-eu.researchsquare.com/files/rs-4869876/v1/86c3de66d96ffd6e794d5827.png"},{"id":64409948,"identity":"4ada5298-fddd-4000-9299-55336f348dfd","added_by":"auto","created_at":"2024-09-12 19:18:39","extension":"png","order_by":11,"title":"Figure 11","display":"","copyAsset":false,"role":"figure","size":2076684,"visible":true,"origin":"","legend":"\u003cp\u003ePerformance of hybrid MF and hydrological models for predicting infiltration rates, with K4 as the target site, (a) prediction of infiltration rates by various hybrid models, (b) error measure by statistical indicators, (c) error measure by Taylor diagram\u003c/p\u003e","description":"","filename":"11.png","url":"https://assets-eu.researchsquare.com/files/rs-4869876/v1/9a81229c252b8bf6cafd5cca.png"},{"id":64409950,"identity":"2b7635ef-7c66-443a-ab31-f44276b669d6","added_by":"auto","created_at":"2024-09-12 19:18:39","extension":"png","order_by":12,"title":"Figure 12","display":"","copyAsset":false,"role":"figure","size":1254848,"visible":true,"origin":"","legend":"\u003cp\u003ePerformance of Hydrological Models in Predicting Infiltration Rates, (a) prediction at target site H4, (b) prediction error at target site H4, (c) prediction at target site K4, (d) prediction
error at target site K4\u003c/p\u003e","description":"","filename":"12.png","url":"https://assets-eu.researchsquare.com/files/rs-4869876/v1/be972e1ea561f18f3b68a206.png"},{"id":77052640,"identity":"ca66f948-6604-4407-ad8c-3efb99cf1dda","added_by":"auto","created_at":"2025-02-24 16:19:51","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":36414245,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-4869876/v1/71a42a88-72b3-400b-95c3-5061f4154144.pdf"},{"id":64409951,"identity":"8e7dfbd4-b1f1-4477-866b-b1aa9bce1648","added_by":"auto","created_at":"2024-09-12 19:18:39","extension":"docx","order_by":6,"title":"","display":"","copyAsset":false,"role":"supplement","size":46130445,"visible":true,"origin":"","legend":"","description":"","filename":"Supplementarymaterials.docx","url":"https://assets-eu.researchsquare.com/files/rs-4869876/v1/feff89e5c216b210a37f4e02.docx"}],"financialInterests":"","formattedTitle":"Enhancing Infiltration Rate Predictions with Hybrid Machine Learning and Empirical Models: Addressing Challenges in Southern India","fulltext":[{"header":"1. Introduction","content":"\u003cp\u003eIn the fields of hydrology, irrigation, and drainage engineering, the soil infiltration process plays a fundamental role. In general, infiltration refers to the vertical and lateral movement of water from the soil surface down through the layers. Initially, when water is applied, whether through rainfall or irrigation, it rapidly penetrates the soil at a high rate, termed the potential infiltration rate or maximum infiltration capacity. However, over time, this rate stabilizes and reaches a constant value known as the saturated infiltration rate or steady-state infiltration capacity. Understanding this transition from rapid to stable infiltration rates is crucial for analysing water movement dynamics within soil profiles. 
Describing the infiltration process is challenging because of its complexity, particularly under isotropic and heterogeneous soil conditions (He et al. 2024). Infiltration is influenced by soil depth, geomorphological features, hydraulic properties, and climatic conditions. Among these factors, the arrangement of soil particles and the moisture content within the soil layers are crucial determinants of the soil's ability to absorb and retain water during rainfall or irrigation events (Arya et al. 1999; Dexter and Richard 2009). Understanding these factors is essential for devising strategies to mitigate soil erosion, manage groundwater recharge, and optimize the design and management of irrigation and hydrological systems (Mattar et al. 2015).\u003c/p\u003e \u003cp\u003eAccurate assessment of infiltration rates has long been difficult because of the spatial and temporal variability inherent in field measurements (Mahapatra et al. 2020). This variability stems from the heterogeneity of soil properties, and measurement is further complicated by equipment limitations, logistical constraints, and environmental interference. Limited access in remote locations adds another layer of difficulty.\u003c/p\u003e \u003cp\u003eDespite these challenges, many authors have developed physical, semi-empirical, and empirical equations to predict infiltration. Examples include Green and Ampt (1911), Philip (1969), Smith (1972), Smith and Parlange (1978), Horton (1941), Holtan (1961), Overton (1964), Richards (1931), Kostiakov (1932), modified Green-Ampt (Wang et al. 1999), modified Kostiakov (Haverkamp et al. 1988), among others.
While these equations are valuable predictive tools, they rely on simplifying assumptions, such as (i) homogeneous soil and (ii) constant soil moisture content, and apply only to specific soil and environmental conditions, both of which can introduce inaccuracies. In addition, numerical and physically based models such as SWAT (Soil and Water Assessment Tool) and MIKE SHE (MIKE Système Hydrologique Européen) are renowned for accurately predicting infiltration processes. However, acquiring data at high spatial resolution, especially the soil data required to run these models, is challenging, particularly for large heterogeneous catchments (Christiaens and Feyen 2001). To overcome these limitations, our study integrates established empirical models with machine learning (ML) techniques, specifically Artificial Neural Networks (ANN) and the MissForest (MF) algorithm. This hybrid approach leverages the strengths of both empirical and ML models, relaxing the simplifying assumptions and improving prediction accuracy while maintaining interpretability. By combining empirical knowledge with data-driven techniques, we mitigate the challenges of data acquisition and model applicability, providing a more robust and practical solution for predicting infiltration in water-scarce and data-scarce regions.\u003c/p\u003e \u003cp\u003eOngoing advances in measurement techniques and data analysis offer further promise. Soft computing and data-driven methods, including Artificial Neural Networks (ANN), Random Forest (RF), Multi-Linear Regression (MLR), and Support Vector Regression (SVR), have emerged as powerful tools in hydrology and irrigation engineering, addressing various complex challenges (Sayari et al. 2021; Ahmed et al. 2024; Chen et al. 2024; Teshome et al. 2024).
In hydrology, they are used for predictive analytics, flood and drought prediction, and water quality assessment. In irrigation engineering, they are used to optimize schedules, estimate crop water needs, assess system performance, and recognize patterns (Sidhu et al. 2020). The literature on machine learning (ML) methods for soil water infiltration nevertheless remains limited, even though ML algorithms can make accurate predictions, improve over time, reveal hidden patterns in complex datasets, and automate tasks.\u003c/p\u003e \u003cp\u003eSy (2006) applied an ANN to model infiltration using data from plot-scale rainfall simulator experiments. The study showed that the ANN captured infiltration dynamics efficiently and identified soil moisture and hydraulic conductivity as critical factors. Furthermore, compared with traditional models such as Philip and Green-Ampt, the ANN predicted cumulative infiltration more accurately.\u003c/p\u003e \u003cp\u003eSayari et al. (2021) compared five artificial intelligence (AI) models, and their versions integrated with the Firefly Algorithm (FA), for forecasting infiltrated water in furrow irrigation systems. Using data from both the literature and field experiments conducted in Iran, the study took furrow length, inflow rate, advance time, cross-sectional area of inflow, and infiltration opportunity time as input parameters. The evaluation metrics showed that integrating FA significantly improved accuracy. These findings underscore the potential of AI models for modelling complex hydrological processes.\u003c/p\u003e \u003cp\u003eSihag et al.
(2019) evaluated Adaptive Neuro-Fuzzy Inference System (ANFIS), Support Vector Machine (SVM), and RF models for estimating cumulative infiltration and infiltration rate in arid areas of Iran, concluding that SVM, particularly with a radial basis kernel function, outperformed ANFIS and RF. In a subsequent study, Sihag et al. (2020) compared ANN, Gaussian Process (GP), Gene Expression Programming (GEP), and Generalized Regression Neural Network (GRNN) models for estimating soil infiltration rates, finding that ANN with specific parameters achieved higher correlation coefficients than the other algorithms. Singh et al. (2017) evaluated RF, ANN, and M5P model tree techniques for predicting the infiltration rate and reported that RF gave closer estimates than ANN and the M5P model tree.\u003c/p\u003e \u003cp\u003eAccording to the literature, no single algorithm fits all site-specific scenarios. Many predictive algorithms, including ANN, can be time-consuming to train because they are sensitive to hyperparameter selection. Nevertheless, ANNs continue to predict infiltration rates accurately and remain adaptable and robust even with small datasets (Jia and Culver 2006). Another ML technique, MissForest (MF), is built on the RF algorithm. It performs well on small datasets because it handles missing and heterogeneous data and trains efficiently compared with other algorithms (Ispirova et al. 2020; Naranjo-Fern\u0026aacute;ndez et al. 2020; Bikše et al. 2023). Given the success of Random Forest in predicting infiltration in numerous studies, algorithms derived from it are promising successors. However, MissForest, available in Python through the 'fit_transform()' method of the 'missingpy' library, remains largely unexplored in soil infiltration prediction, which marks a notable research gap. According to Parchami-Araghi et al.
(2013) and Sy (2006), coupling ML techniques with physically and empirically based models yields reliable infiltration predictions. Therefore, the aim of this study is to develop hybrid ML and hydrological models for predicting infiltration rates. The predictor variables are in-situ observations of gravel (%), sand (%), clay and silt (%), and soil moisture content (%), measured from soil samples collected at the surface and at 50 cm and 100 cm depth. The study also assesses the robustness of the hybrid models, particularly in scenarios with an increasing number of predictor variables collected from various regions of southern India.\u003c/p\u003e"},{"header":"2. Study area","content":"\u003cp\u003eA total of eleven infiltration points from various regions of southern India (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e) were deliberately selected for this study to encompass a range of soil and climatic conditions. These points are situated within the premises of the Indian Institute of Astrophysics (IIA), a government institution dedicated to interplanetary observations. Specifically, four of the eleven points are located within the Hoskote Bengaluru Campus (site 1), another four at the Kavalur Tamil Nadu campus (site 2), and the remaining three at the Gauribidanur Karnataka Campus (site 3). According to the Food and Agriculture Organization (FAO) soil database, site 1 is predominantly characterized by a sandy clay loam texture, while site 2 and site 3 exhibit a clay loam texture.
Additionally, the upper soil layer (SOL_Z) at these sites is less than 300 mm deep.\u003c/p\u003e \u003cp\u003eHoskote receives an average annual rainfall of 843 mm, with temperatures ranging from 15\u0026ordm;C to 33.6\u0026ordm;C, while Gauribidanur receives 694 mm, with temperatures ranging from 10\u0026ordm;C to 40\u0026ordm;C. Kavalur receives an average of 917 mm, with temperatures ranging from 18.5\u0026ordm;C to 40.4\u0026ordm;C, as per the Census of India (CIA) handbook 2011. The sites fall under seasonally dry tropical savanna and semi-arid climates, providing a diverse range of soil and climatic conditions for the infiltration work. The infiltration sites within the Indian Institute of Astrophysics (IIA) premises were thus selected to capture a variety of environmental conditions for a comprehensive study.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e"},{"header":"3. Methodology","content":"\u003cdiv id=\"Sec4\" class=\"Section2\"\u003e \u003ch2\u003e3.1 Field measurement and data collection\u003c/h2\u003e \u003cp\u003eThe dataset comprises 199 observations collected across various climate regions in southern India. These observations included measurements of infiltration rate (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:f\\left(t\\right)\\)\u003c/span\u003e\u003c/span\u003e) and cumulative infiltration (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:F\\left(t\\right)\\)\u003c/span\u003e\u003c/span\u003e) across 11 distinct sites.
Detailed descriptions of all 11 infiltration test points are provided in Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e, and the infiltration observations \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:f\\left(t\\right)\\)\u003c/span\u003e\u003c/span\u003e and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:F\\left(t\\right)\\)\u003c/span\u003e\u003c/span\u003e recorded at each test point are illustrated in Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e. Furthermore, soil characteristics were examined, with close to 33 observations gathered at each site, covering the percentages of gravel, sand, and combined silt and clay, as well as moisture content. This dataset, collected at the surface and at 50 cm and 100 cm depths, provides a robust foundation for investigating infiltration processes.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eDetailed description of the infiltration test points\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"7\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cdiv align=\"left\"
class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eLocation\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eTest point ID\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eLatitude\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eLongitude\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eElevation above MSL\u003c/p\u003e \u003cp\u003e(m)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003eVegetation cover\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c7\"\u003e \u003cp\u003eLand use\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"3\" rowspan=\"4\"\u003e \u003cp\u003eHoskote Bengaluru Campus\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eH1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e13\u0026deg; 6'47.00\"N\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e77\u0026deg;48'45.00\"E\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e937\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003eSparse Shrubs\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eMixed Use\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eH2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e13\u0026deg; 6'43.34\"N\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e77\u0026deg;48'53.86\"E\u003c/p\u003e 
\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e931\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003eLawn\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eMixed Use\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eH3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e13\u0026deg; 6'50.51\"N\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e77\u0026deg;48'38.71\"E\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e941\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003eMixed Forest\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eMixed Use\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eH4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e13\u0026deg; 6'51.55\"N\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e77\u0026deg;48'43.89\"E\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e938\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003eSparse Shrubs\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eMixed Use\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"3\" rowspan=\"4\"\u003e \u003cp\u003eKavalur Tamil Nadu campus\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eK1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e12\u0026deg;34'33.62\"N\u003c/p\u003e \u003c/td\u003e \u003ctd 
align=\"left\" colname=\"c4\"\u003e \u003cp\u003e78\u0026deg;49'17.59\"E\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e718\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003eSparse Shrubs\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eMixed Use\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eK2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e12\u0026deg;34'27.76\"N\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e78\u0026deg;49'13.50\"E\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e724\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003eMixed Forest\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eForest\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eK3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e12\u0026deg;34'38.30\"N\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e78\u0026deg;49'20.85\"E\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e723\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003eSparse Shrubs\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eMixed Use\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eK4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e12\u0026deg;34'42.66\"N\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e 
\u003cp\u003e78\u0026deg;49'30.19\"E\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e716\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003eMeadow Grass\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eMixed Use\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"2\" rowspan=\"3\"\u003e \u003cp\u003eGauribidanur Karnataka Campus\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eG1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e13\u0026deg;36'12.85\"N\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e77\u0026deg;25'43.17\"E\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e725\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003eMeadow Grass\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eMixed Use\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eG2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e13\u0026deg;36'9.97\"N\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e77\u0026deg;25'37.39\"E\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e724\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003eBare Soil\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eMixed Use\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eG3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e 
\u003cp\u003e13\u0026deg;36'8.69\"N\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e77\u0026deg;25'45.40\"E\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e723\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003eSparse Shrubs\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eMixed Use\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eField measurements were made with a double ring infiltrometer fabricated from 10 mm thick metal plate, with a depth of 42 cm and inner and outer ring diameters of 25 cm and 48 cm, respectively. The existing upper soil layer was removed to eliminate surface irregularities and organic matter that could distort the infiltration measurements. Both rings of the infiltrometer were driven simultaneously into the ground to a depth of 5 cm using a wooden plank and hammer. Observations were made in the inner ring so that the infiltration measurements reflect the vertical downward movement of water into the soil strata, giving a more precise observation of soil water conductivity. The outer ring limits the lateral movement of water from the inner ring, which is one of the error sources of this type of infiltrometer. Observations were taken at time intervals of 2, 3, 5, 10, 15, 30, 45, 60, 90, and 120 minutes, and continued until the infiltration reached a steady rate.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eTo determine the percentages of gravel, sand, and combined silt and clay, as well as the moisture content, samples of different types were collected from different depths at each test point. 
Three messy samples, each weighing approximately 100 g, and six samples, each weighing approximately 1000 g, were collected from each test point (99 samples in total across the 11 test points). The messy samples were used for moisture content analysis, while the other samples were used to determine the percentages of gravel, sand, and combined silt and clay. Each 100 g messy sample was divided into two equal parts of 50 g, and the moisture content of each part was determined using the ASTM Standard Test Methods for Laboratory Determination of Water (Moisture) Content of Soil and Rock by Mass (ASTM Standard D2216\u0026ndash;19 2019). To determine the particle size distribution, the 1000 g soil samples were oven-dried for 24 hours at 105 \u0026deg;C. Subsequently, the dried samples were sieved using 4.75 mm and 0.0075 mm sieves to determine the percentage of gravel (particles greater than 4.75 mm), sand (particles ranging from 4.75 to 0.0075 mm), and combined silt and clay (particles less than 0.0075 mm). The results of the two samples collected from the same depth were averaged and used in the subsequent analysis.\u003c/p\u003e \u003cp\u003eThe infiltration rate measured with the double ring infiltrometer serves as the foundation for the formulation of infiltration models. These models, namely Philip\u0026rsquo;s (Philip 1969), Horton\u0026rsquo;s (Horton 1941), and Kostiakov\u0026rsquo;s (Kostiakov 1932), can be parameterized by fitting them to the observed infiltration data. The principal equation of each model is given in Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e, along with a summary of each associated parameter. Fitting these models yields the governing parameters that describe the infiltration process for a specific soil and land-use condition; these parameters are then used for training and testing the ML techniques. 
Figure\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e shows the detailed method of research and evaluation of hybrid ML and traditional infiltration models.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eApproximate equations for infiltration rate derived from both theoretical principles and empirical observations\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"4\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eModel\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eEquation\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eModel Parameters\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eApplication Context\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003ePhilip (1969)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:f\\left(t\\right)=s{t}^{-1/2}+2A\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003es is sorptivity (\u003cspan 
class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:L{T}^{-0.5}\\)\u003c/span\u003e\u003c/span\u003e), which is the capacity of the soil to absorb water due to capillarity. A is transmissivity factor (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:L{T}^{-1}\\)\u003c/span\u003e\u003c/span\u003e), indicating the rate at which water moves through the soil under a unit gradient, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:t\\)\u003c/span\u003e\u003c/span\u003e is the time of infiltration (T).\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eSuitable for short-term infiltration studies and homogeneous soils.\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eHorton (1941)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:f\\left(t\\right)={i}_{c}+m\\left({e}^{-{K}_{h}t}\\right)\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{i}_{c}\\)\u003c/span\u003e\u003c/span\u003e (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:L{T}^{-1}\\)\u003c/span\u003e\u003c/span\u003e), is the steady rate or ultimate infiltration capacity, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:m=\\left({i}_{o}-{i}_{c}\\right),\\:\\)\u003c/span\u003e\u003c/span\u003ewhere \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{i}_{o}\\)\u003c/span\u003e\u003c/span\u003e(\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:L{T}^{-1}\\)\u003c/span\u003e\u003c/span\u003e), is the infiltration 
capacity at \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:t=0\\)\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{K}_{h}\\)\u003c/span\u003e\u003c/span\u003e is an empirical soil constant.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eIdeal for predicting infiltration capacity over time, especially in varied soil conditions.\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eKostiakov (1932)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:f\\left(t\\right)=\\left(ab\\right){t}^{(b-1)}\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:a\u0026gt;0\\:and\\:0\u0026lt;b\u0026lt;1\\)\u003c/span\u003e\u003c/span\u003e are empirical dimensionless constants\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eCommonly used for irrigation studies and quick field estimations.\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec5\" class=\"Section2\"\u003e \u003ch2\u003e3.2 Machine learning algorithms\u003c/h2\u003e \u003cdiv id=\"Sec6\" class=\"Section3\"\u003e \u003ch2\u003e3.2.1 Artificial Neural Network (ANN)\u003c/h2\u003e \u003cp\u003eInspired by the intricate connections of human neurons, ANNs serve as computational tools in hydrology, mimicking this natural complexity to model and predict water-related phenomena (ASCE Task Committee on Application of Artificial Neural Networks in Hydrology 
2000a, b). An ANN consists of one input layer, one or more hidden layers, and one output layer. These layers process the input variables in interconnected processing components called nodes or neurons. Neurons in adjacent layers are linked through weighted connections, which function as communication channels. The weights (W) on these connections determine the strength of influence one neuron has on neurons in the subsequent layer. Additionally, biases (B) are constant values added to the weighted sum of inputs for each neuron. Together, the weights and biases give the ANN model the flexibility to fit the input variables. Inputs to a neuron are multiplied by their corresponding weights, summed, and then passed through a transfer function, which controls the signal strength relayed through the neuron's output.\u003c/p\u003e \u003cp\u003eHidden and output layers in neural networks use \"activation functions\" to introduce non-linearity. This enables the network to learn complex patterns in the data that simple linear models cannot capture. Popular choices include the Rectified Linear Unit (ReLU, known for its simplicity and efficiency) and the Hyperbolic Tangent function (tanh, which resembles the sigmoid function but is zero-centred, with outputs between -1 and 1). These functions are essential for the network's ability to capture intricate relationships within the data (Dubey et al. 2022).\u003c/p\u003e \u003cp\u003eIn our approach, a custom Python script using the scikit-learn library was used to develop the multilayer feed-forward ANN model with a back-propagation training algorithm, specifically the \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{MLPRegressor}\\)\u003c/span\u003e\u003c/span\u003e estimator. 
The data, loaded from an Excel sheet, consisted of various features and target variables. The available data were split so that the last row was reserved for testing and the remaining rows were used for training. Hyperparameter tuning was conducted using \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{GridSearchCV}\\)\u003c/span\u003e\u003c/span\u003e with a parameter grid covering hidden layer sizes, activation functions, and solvers. The best estimator was selected using mean squared error as the scoring metric, so that the chosen configuration minimized the cross-validated prediction error for the target variables.\u003c/p\u003e \u003cp\u003eWe also explored various scenarios to test the performance of the ANN model by incorporating data from multiple neighbouring test points and a target test point. We refer to this approach as a hybrid model, because it integrates an infiltration model (Philip's, Horton's, or Kostiakov's) with the ANN. 
As an example, for the hybrid based on Philip's model, the 22 input parameters were the percentages of gravel (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:G\\)\u003c/span\u003e\u003c/span\u003e), sand (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:S\\)\u003c/span\u003e\u003c/span\u003e), silt and clay (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:CS\\)\u003c/span\u003e\u003c/span\u003e), and moisture content (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:M\\)\u003c/span\u003e\u003c/span\u003e) for all four test points (1, 2, 3, and 4), giving 16 inputs, together with the sorptivity (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:s\\)\u003c/span\u003e\u003c/span\u003e) and transmissivity factor (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:A\\)\u003c/span\u003e\u003c/span\u003e) for the three nearby points (1, 2, and 3), giving a further 6. The output parameters were the sorptivity (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{s}_{t}\\)\u003c/span\u003e\u003c/span\u003e) and transmissivity factor (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{A}_{t}\\)\u003c/span\u003e\u003c/span\u003e) for the target test point (t). Therefore, there were 22 nodes in the input layer and two in the output layer. 
The suggested structure was 22-j-2, where j is the number of nodes in the hidden layer (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eIn a similar manner, we tested the ANN model using Horton's and Kostiakov's methods as hybrid models, integrating their respective parameters to evaluate and enhance the network's prediction capabilities across different scenarios.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec7\" class=\"Section3\"\u003e \u003ch2\u003e3.2.2 MissForest (MF)\u003c/h2\u003e \u003cp\u003eStekhoven and B\u0026uuml;hlmann (2012) introduced the MF algorithm, an iterative imputation method built on the random forest (RF) algorithm. The RF algorithm (Breiman 2001) grows many decision trees and averages their results. However, averaging can mask underlying variability and interactions in the data, leading to biased imputations, which MF addresses by iteratively capturing these complex relationships for more accurate imputation.\u003c/p\u003e \u003cp\u003eTo illustrate the concept, consider an example of a hybrid MF algorithm combined with Philip's model. In each iteration of the adapted MF algorithm, aimed at imputing missing values of sorptivity (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{s}_{t}\\)\u003c/span\u003e\u003c/span\u003e) and transmissivity factor (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{A}_{t}\\)\u003c/span\u003e\u003c/span\u003e) at the target test site \u0026lsquo;t\u0026rsquo;, the process begins by preparing the data from an Excel file. 
Features \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:G,\\:S,\\:CS\\)\u003c/span\u003e\u003c/span\u003e and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:M\\)\u003c/span\u003e\u003c/span\u003e and targets \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{s}_{t}\\)\u003c/span\u003e\u003c/span\u003e and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{A}_{t}\\)\u003c/span\u003e\u003c/span\u003e are separated, with the last row reserved for testing. Specifically, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{X}_{train}=df.iloc[:\\:-1,\\::\\:-2]\\)\u003c/span\u003e\u003c/span\u003e includes all rows except the last one, excluding the last two columns, and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{y}_{train}=df.iloc[:\\:-1,\\:-2:]\\)\u003c/span\u003e\u003c/span\u003e includes the last two columns for all rows except the last one. The MF model fits an RF model with \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{P}_{k}\\sim\\{G,\\:S,\\:CS,\\:M\\}\\)\u003c/span\u003e\u003c/span\u003e, where \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{P}_{k}\\)\u003c/span\u003e\u003c/span\u003e is either \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{s}_{t}\\)\u003c/span\u003e\u003c/span\u003e or \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{A}_{t}\\)\u003c/span\u003e\u003c/span\u003e, using data from rows at three nearby test points that do not have missing values. 
The data for these test points comprise \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{G}_{1},\\:{\\:S}_{1},\\:{CS}_{1},\\)\u003c/span\u003e\u003c/span\u003e \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{M}_{1}\\)\u003c/span\u003e\u003c/span\u003e, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{s}_{1}\\)\u003c/span\u003e\u003c/span\u003e and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{A}_{1}\\)\u003c/span\u003e\u003c/span\u003e for test point 1, and similarly for test points 2 and 3. A grid search optimizes hyperparameters such as \u0026lsquo;\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{n\\_estimators}\\)\u003c/span\u003e\u003c/span\u003e\u0026rsquo;, \u0026lsquo;\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{max\\_features}\\)\u003c/span\u003e\u003c/span\u003e\u0026rsquo;, \u0026lsquo;\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{max\\_depth}\\)\u003c/span\u003e\u003c/span\u003e\u0026rsquo; and \u0026lsquo;\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{random\\_state}\\)\u003c/span\u003e\u003c/span\u003e\u0026rsquo;. Once the best RF model is identified, it is used to impute missing values in the last row of the test data. 
This involves initializing the MF model, using \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{GridSearchCV}\\)\u003c/span\u003e\u003c/span\u003e to fit the model and find the best parameters, imputing missing values in the test data, and predicting \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{s}_{t}\\)\u003c/span\u003e\u003c/span\u003e and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{A}_{t}\\)\u003c/span\u003e\u003c/span\u003e for the last row. The best parameter values are selected by minimizing the mean squared error. Initial mean imputation provides a baseline, and iterative refinement improves the predictions based on the optimized model.\u003c/p\u003e \u003cp\u003eSimilarly, models were built using Horton's and Kostiakov's methods, combining the classical infiltration models with the MF algorithm to impute and predict the missing values under different scenarios in which data from an increasing number of nearby stations were included.\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003e3.3 Performance evaluation criteria\u003c/h2\u003e \u003cp\u003eThe infiltration rates obtained from the hybrid models for varying target test points were evaluated by computing three standard statistical performance indicators and one graphical indicator. 
These indicators were the coefficient of determination (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{R}^{2}\\)\u003c/span\u003e\u003c/span\u003e), the Root Mean Square Error (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:RMSE\\)\u003c/span\u003e\u003c/span\u003e), Legates\u0026rsquo;s Coefficient of Efficiency (LCE), and the Taylor diagram.\u003c/p\u003e \u003cp\u003eThe three statistical indicators are defined as:\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"No\" id=\"Taba\" border=\"1\"\u003e \u003ccolgroup cols=\"2\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{R}^{2}=\\:\\left[\\frac{{\\left[\\sum\\:_{i=1}^{N}\\left({p}_{i}-{P}_{m}\\right)\\left({o}_{i}-{O}_{m}\\right)\\right]}^{2}}{\\sum\\:_{i=1}^{N}{\\left({p}_{i}-{P}_{m}\\right)}^{2}\\:\\sum\\:_{i=1}^{N}{\\left({o}_{i}-{O}_{m}\\right)}^{2}}\\right]\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003e(1)\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:RMSE={\\left[\\frac{\\sum\\:_{i=1}^{N}{({p}_{i}-{o}_{i})}^{2}}{N}\\right]}^{1/2}\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e(2)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cspan 
class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:LCE=1-\\frac{\\left[\\sum\\:_{i=1}^{N}\\left|{p}_{i}-{o}_{i}\\right|\\right]}{\\left[\\sum\\:_{i=1}^{N}\\left|{o}_{i}-{O}_{m}\\right|\\right]}\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e(3)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003ewhere \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{o}_{i}\\)\u003c/span\u003e\u003c/span\u003e is the i-th observed infiltration rate and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{p}_{i}\\)\u003c/span\u003e\u003c/span\u003e is the i-th predicted infiltration rate. The mean values of the observed and predicted rates, each consisting of N values, are represented as \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{O}_{m}\\)\u003c/span\u003e\u003c/span\u003e and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{P}_{m}\\)\u003c/span\u003e\u003c/span\u003e respectively.\u003c/p\u003e \u003cp\u003eR\u0026sup2; is a commonly used metric of the degree of correlation between predicted and observed values; values close to 1 indicate a strong match. RMSE quantifies the error between these values and is expressed in the same units as the observed values; lower RMSE values indicate better predictive accuracy. 
Additionally, LCE provides a dimensionless measure of model prediction accuracy relative to observed values, with values near 1 indicating near-perfect agreement (Legates and McCabe 1999).\u003c/p\u003e \u003cp\u003eA Taylor diagram offers a comprehensive visual representation of model performance by combining RMSE, the correlation coefficient, and standard deviation (SD) in a single polar plot (Taylor 2001). Its main advantage is that multiple model predictions can be compared on one plot, providing a more holistic view than individual summary statistics. In this study, a Python script was developed to generate Taylor diagrams, facilitating rapid assessment of model predictions relative to observed values and enabling efficient comparison of multiple models.\u003c/p\u003e \u003c/div\u003e"},{"header":"4. Results and Discussions","content":"\u003cdiv id=\"Sec10\" class=\"Section2\"\u003e \u003ch2\u003e4.1. Predictor variables from lab-measured soil parameters\u003c/h2\u003e \u003cp\u003eThe soil parameters for the different test points, including gravel (G), sand (S), silt and clay (SC), and moisture content (M), were measured at three depths: at the surface, and at 50 cm and 100 cm below the surface. The laboratory data revealed notable variability across test points and depths (Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e). Gravel content (G) varied widely, with surface levels ranging from 14% at H3 to 47.8% at K4, and generally decreased with soil depth. Sand content (S) also varied notably, with the highest surface content at H3 (85.2%), and increased with depth, particularly at K4 (92.5% at 100 cm). Silt and clay (SC) content was relatively low across all test points and depths, with surface levels ranging from 0.0% at K1, K2 and K3 to 4.1% at G3. 
Moisture content (M) varied widely, with surface levels from 2.1% at G1 to 14.2% at K3, and generally increased with depth, reaching up to 15.0% at H2 at 100 cm.\u003c/p\u003e \u003cp\u003eSite-specific observations showed that Hoskote Station (H), with elevations from 941 to 931 m and mixed-use land, exhibited a decrease in gravel content and an increase in moisture content with depth. Kavalur Station (K), in a hilly area with elevations from 724 to 716 m, showed high surface gravel content and low silt and clay content, indicative of well-drained conditions. Gauribidanur Station (G), with elevations from 725 to 723 m, exhibited high sand content at depth and low moisture, consistent with the water scarcity for which the area is known. These variations in soil parameters across sites and depths reflect the heterogeneity of soil properties in the study area, which is crucial for understanding soil behaviour and management in semi-arid regions.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab3\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 3\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eSoil parameters from laboratory analysis for training hybrid models\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"13\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" 
colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c8\" colnum=\"8\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c9\" colnum=\"9\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c10\" colnum=\"10\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c11\" colnum=\"11\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c12\" colnum=\"12\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c13\" colnum=\"13\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eTest point\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{G}^{a}\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003cp\u003e(%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{S}^{b}\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003cp\u003e(%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{SC}^{c}\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003cp\u003e(%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{M}^{d}\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003cp\u003e(%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:G\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e 
\u003cp\u003e(%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c7\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:S\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003cp\u003e(%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c8\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:SC\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003cp\u003e(%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c9\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:M\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003cp\u003e(%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c10\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:G\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003cp\u003e(%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c11\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:S\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003cp\u003e(%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c12\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:SC\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003cp\u003e(%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c13\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:M\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003cp\u003e(%)\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003ctr\u003e \u003cth align=\"left\" colspan=\"4\" nameend=\"c5\" namest=\"c2\"\u003e \u003cp\u003eAt surface\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"4\" nameend=\"c9\" namest=\"c6\"\u003e \u003cp\u003eAt 50 cm 
below surface\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"4\" nameend=\"c13\" namest=\"c10\"\u003e \u003cp\u003eAt 100 cm below surface\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:H1\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e23.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e76.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e10.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e17\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e81.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e1.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c9\"\u003e \u003cp\u003e8.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c10\"\u003e \u003cp\u003e11.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c11\"\u003e \u003cp\u003e86.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c12\"\u003e \u003cp\u003e1.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c13\"\u003e \u003cp\u003e9.8\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:H2\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e17.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" 
colname=\"c3\"\u003e \u003cp\u003e81.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e1.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e7.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e12.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e84.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e2.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c9\"\u003e \u003cp\u003e12.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c10\"\u003e \u003cp\u003e6.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c11\"\u003e \u003cp\u003e90.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c12\"\u003e \u003cp\u003e3.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c13\"\u003e \u003cp\u003e15\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:H3\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e14\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e85.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e9.4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e14.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e83.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e1.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c9\"\u003e 
\u003cp\u003e11.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c10\"\u003e \u003cp\u003e22.5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c11\"\u003e \u003cp\u003e74.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c12\"\u003e \u003cp\u003e2.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c13\"\u003e \u003cp\u003e8.7\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:H4\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e42.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e55.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e1.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e9.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e23.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e76.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e0.4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c9\"\u003e \u003cp\u003e11.5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c10\"\u003e \u003cp\u003e14.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c11\"\u003e \u003cp\u003e82.5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c12\"\u003e \u003cp\u003e2.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c13\"\u003e \u003cp\u003e9.6\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan 
class=\"mathinline\"\u003e\\(\\:H5\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e26.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e71.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e1.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e9.5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e13.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e86.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e0.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c9\"\u003e \u003cp\u003e12.4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c10\"\u003e \u003cp\u003e22.5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c11\"\u003e \u003cp\u003e76.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c12\"\u003e \u003cp\u003e0.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c13\"\u003e \u003cp\u003e12.2\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:K1\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e39.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e60.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e5.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e15\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" 
colname=\"c7\"\u003e \u003cp\u003e84\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c9\"\u003e \u003cp\u003e10.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c10\"\u003e \u003cp\u003e11.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c11\"\u003e \u003cp\u003e82.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c12\"\u003e \u003cp\u003e6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c13\"\u003e \u003cp\u003e4.2\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:K2\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e44.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e55.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e13.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e19.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e80\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e0.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c9\"\u003e \u003cp\u003e10.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c10\"\u003e \u003cp\u003e19.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c11\"\u003e \u003cp\u003e80.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c12\"\u003e \u003cp\u003e0.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" 
colname=\"c13\"\u003e \u003cp\u003e7.2\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:K3\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e32.4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e67.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e14.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e26.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e73.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e0.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c9\"\u003e \u003cp\u003e16.4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c10\"\u003e \u003cp\u003e25.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c11\"\u003e \u003cp\u003e74.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c12\"\u003e \u003cp\u003e0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c13\"\u003e \u003cp\u003e8.6\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:K4\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e47.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e51.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0.9\u003c/p\u003e \u003c/td\u003e 
\u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e12\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e32\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e68\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c9\"\u003e \u003cp\u003e11.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c10\"\u003e \u003cp\u003e7.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c11\"\u003e \u003cp\u003e92.5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c12\"\u003e \u003cp\u003e0.4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c13\"\u003e \u003cp\u003e9\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:G1\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e25.5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e72.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e1.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e2.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e48.5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e51.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e0.4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c9\"\u003e \u003cp\u003e4.5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c10\"\u003e \u003cp\u003e12.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" 
char=\".\" colname=\"c11\"\u003e \u003cp\u003e85.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c12\"\u003e \u003cp\u003e1.5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c13\"\u003e \u003cp\u003e4.6\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:G2\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e33.5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e65.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e1.4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e2.5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e33.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e65.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c9\"\u003e \u003cp\u003e5.4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c10\"\u003e \u003cp\u003e7.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c11\"\u003e \u003cp\u003e91.4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c12\"\u003e \u003cp\u003e1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c13\"\u003e \u003cp\u003e5.2\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:G3\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e25.4\u003c/p\u003e \u003c/td\u003e 
\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e70.5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e4.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e8.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e14.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e83\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e2.4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c9\"\u003e \u003cp\u003e7.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c10\"\u003e \u003cp\u003e5.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c11\"\u003e \u003cp\u003e89.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c12\"\u003e \u003cp\u003e4.5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c13\"\u003e \u003cp\u003e3.6\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eThese results align with other studies on soil variability in semi-arid regions. Yuan et al. (2024) emphasized the influence of land use and topography on soil properties, noting that urban and disturbed areas tend to exhibit higher variability in surface soil composition due to anthropogenic interferences. Similarly, Bonanomi et al. (2024) and Qiu et al. (2001) highlighted the impact of elevation and land use on soil moisture content, with lower moisture levels typically observed in hilly and well-drained areas, and higher moisture retention in flatter, less disturbed regions. These studies support the observed patterns in the current analysis, where moisture content increases with depth and is higher in flatter areas like Hoskote and Gauribidanur compared to the hilly Kavalur. 
Moreover, the high sand content observed at deeper levels in Gauribidanur is consistent with Ceballos et al. (2002), who reported similar trends in semi-arid regions, where deeper soil layers often exhibit higher sand fractions due to historical deposition processes. This high sand content correlates with low moisture retention capacity, exacerbating water scarcity in these areas. In contrast, the mixed-use and forest land use at Kavalur contributes to lower moisture levels and higher gravel content, typical of well-drained, hilly terrains as described by Lado et al. (2004). The soil parameters obtained in this study are used as predictor variables to train and test hybrid ML and infiltration models for predicting infiltration rates. Recent studies have demonstrated the efficacy of ML techniques in hydrological modeling, reporting improved prediction accuracy (Mosavi et al. 2018). Integrating these parameters into hybrid ML and hydrological models is expected to improve the accuracy of infiltration rate predictions, which is crucial for effective water resource management and soil conservation in semi-arid regions. The heterogeneity in soil properties observed in this study underscores the need for tailored approaches to model training and validation that account for site-specific characteristics. This approach aligns with Salvadore et al. (2015), who emphasized the importance of developing site-specific models in heterogeneous environments to improve predictive accuracy and management effectiveness.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003e4.2. 
Derived infiltration parameters for hybrid model development\u003c/h2\u003e \u003cdiv id=\"Sec12\" class=\"Section3\"\u003e \u003ch2\u003e4.2.1 Sorptivity and transmissivity from Philip's model\u003c/h2\u003e \u003cp\u003eThe application of Philip's model to the observed infiltration rates \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:f\\left(t\\right)\\)\u003c/span\u003e\u003c/span\u003e against \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{t}^{-0.5}\\)\u003c/span\u003e\u003c/span\u003e for the infiltration test sites provided two key parameters: sorptivity (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:s\\)\u003c/span\u003e\u003c/span\u003e) and transmissivity factor (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:A\\)\u003c/span\u003e\u003c/span\u003e) (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e). The analysis showed that H1 had the highest sorptivity (13.378), indicating rapid initial infiltration due to its high sand content (76.1%) and low clay content (0.7%) (Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e). K3 and K2 also exhibited high sorptivity values (12.720 and 4.962, respectively), corresponding to their relatively high sand content (67.6% and 55.2%, respectively). In contrast, G3 had the lowest sorptivity (1.497), aligning with its relatively high silt and clay content (4.1%) and high moisture content (8.2%). The transmissivity factors further supported these findings, with H1 and H3 displaying high values (0.529 and 1.124, respectively), indicating efficient water movement through the sandy soil. 
Conversely, G3's very low transmissivity factor (0.001) suggests that its soil structure hinders water movement, consistent with its higher silt and clay content.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe sandy soils observed at H1 and K3 generally exhibit higher infiltration rates (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e) due to larger pore spaces that facilitate rapid water movement. These findings align with observations by Allaire et al. (2009) and Manns et al. (2024), who indicate that coarse-textured soils like sands promote greater hydraulic conductivity due to their interconnected macropores. Conversely, soils with higher silt and clay content, along with higher moisture content (e.g., G3), show reduced infiltration rates. The smaller particle sizes and larger specific surface area in these soils create a more tortuous path for water, hindering hydraulic conductivity (Mantoglou and Gelhar 1987). Additionally, subsurface moisture content significantly impacts a soil's ability to absorb and transmit water after initial infiltration. Dry soils, like those observed at H1 with a low moisture content (8.8% at 50 cm and 9.8% at 100 cm), exhibit higher sorptivity. This translates to a greater capacity for prolonged water uptake. Conversely, soils with higher moisture content (e.g., H2 with 12.1% at 50 cm and 15% at 100 cm) demonstrate lower sorptivity. This can be attributed to reduced capillary forces driving water absorption, as reported by Rosenbom et al. (2009).\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec13\" class=\"Section3\"\u003e \u003ch2\u003e4.2.2 Initial infiltration capacity and soil empirical constant from Horton\u0026rsquo;s model\u003c/h2\u003e \u003cp\u003eThe scatter plots shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e illustrate the fit of Horton's infiltration model to the observed infiltration data from various sites. 
Each plot gives the logarithmic difference in infiltration capacity (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:ln\\left({i}_{o}-{i}_{c}\\right)\\)\u003c/span\u003e\u003c/span\u003e) against time, along with the derived parameters \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{K}_{h}\\)\u003c/span\u003e\u003c/span\u003e (empirical soil constant or Horton's decay coefficient) and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{i}_{o}-{i}_{c}\\)\u003c/span\u003e\u003c/span\u003e (the initial infiltration potential). The parameters were estimated from the linear fits to the logarithmic infiltration data, indicating the rate of decrease in infiltration capacity over time and the initial infiltration potential.\u003c/p\u003e \u003cp\u003eFor the Hoskote sites, the \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{K}_{h}\\)\u003c/span\u003e\u003c/span\u003e values ranged from 0.015 to 0.031, with \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{i}_{o}-{i}_{c}\\)\u003c/span\u003e\u003c/span\u003e values ranging from 0.253 to 4.253. The Kavalur sites exhibited \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{K}_{h}\\)\u003c/span\u003e\u003c/span\u003e values between 0.019 and 0.048, and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{i}_{o}-{i}_{c}\\)\u003c/span\u003e\u003c/span\u003e values from 0.418 to 2.094. The Gauribidanur sites demonstrated \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{K}_{h}\\)\u003c/span\u003e\u003c/span\u003e values ranging from 0.025 to 0.039, with \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{i}_{o}-{i}_{c}\\)\u003c/span\u003e\u003c/span\u003e values from 0.215 to 0.527. 
Here, the decay constants and initial infiltration potentials serve as primary indicators of soil infiltration performance. The varied decay constants across different sites underscore the influence of site-specific factors. For example, the higher decay constants observed at H4, K4, G1, and G3 suggest quicker saturation, with infiltration rates that decrease asymptotically toward the basic (steady-state) rate over time (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e), which could be indicative of higher silt and clay content in the upper profile of the soil (Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e). These observations align with the findings of Adhikary et al. (2008), who reported a significant inverse relationship between silt and clay content and infiltration rates, demonstrating that increased silt and clay concentrations substantially impede soil permeability and water infiltration. Additionally, the initial infiltration potential offers valuable insights into the soil's initial response to water application, a key parameter for designing efficient irrigation and drainage management (Gjettermann et al. 1997). The high sand content observed in Hoskote corresponds to its high initial infiltration potential. This reinforces the notion that the larger pore spaces characteristic of sandy soils promote faster infiltration. Sandy soils, dominated by large, interconnected macropores, typically exhibit higher initial infiltration potential due to their enhanced hydraulic conductivity (Zhang and Schaap 2019). This rapid infiltration minimizes surface runoff and promotes deep percolation, which in turn is crucial for groundwater recharge in the vadose zone (Shanafield and Cook 2014; He et al. 
2024).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec14\" class=\"Section3\"\u003e \u003ch2\u003e4.2.3 Empirical dimensionless constants from Kostiakov\u0026rsquo;s model\u003c/h2\u003e \u003cp\u003eFig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003e uses scatter plots to show the application of Kostiakov's infiltration model to field observations of cumulative infiltration (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:F\\left(t\\right)\\)\u003c/span\u003e\u003c/span\u003e) across the test sites. The x-axis depicts the natural logarithm of time (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{ln}\\,t\\)\u003c/span\u003e\u003c/span\u003e), while the y-axis represents the natural logarithm of cumulative infiltration (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{ln}F\\left(t\\right)\\)\u003c/span\u003e\u003c/span\u003e). This logarithmic transformation linearizes the relationship between infiltration and time, enabling a comparison of infiltration patterns at each location. Fitting the model to the data separates the initial phase of rapid infiltration, characterized by the parameter '\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:a\\)\u003c/span\u003e\u003c/span\u003e', from the subsequent decline in infiltration rate, represented by '\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:b\\)\u003c/span\u003e\u003c/span\u003e'. 
These parameters offer valuable information regarding the hydraulic properties of the soil at each site, reflecting the impact of soil texture and structure on water infiltration.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eH1 and K3 exhibited the highest \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:a\\)\u003c/span\u003e\u003c/span\u003e values (7.600 and 7.200, respectively), indicating rapid initial infiltration consistent with their high sand content (76.1% for H1 and 67.6% for K3), which promotes higher hydraulic conductivity. Conversely, G1 and G3 showed the lowest \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:a\\)\u003c/span\u003e\u003c/span\u003e values (0.353 and 0.800, respectively), reflecting slower initial infiltration rates due to higher silt and clay content, which hinder infiltration. The \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:b\\)\u003c/span\u003e\u003c/span\u003e values ranged between 0.579 (G3) and 0.843 (H2), indicating a moderate decrease in infiltration rate over time across all sites. The consistent \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:b\\)\u003c/span\u003e\u003c/span\u003e values suggest a uniform decrease in infiltration rate over time, reflecting similar soil behaviour in terms of infiltration rate reduction. This could be attributed to factors like pore clogging by fine particles, as reported by Mantoglou and Gelhar (1987) in their modeling of water flow in stratified soils. Additionally, the gradual saturation of the upper soil horizons, as infiltration progresses, can contribute to the observed decrease in the infiltration rate.\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv id=\"Sec15\" class=\"Section2\"\u003e \u003ch2\u003e4.3. 
Evaluation of hybrid machine learning and hydrological models for infiltration rate prediction under varying scenarios\u003c/h2\u003e \u003cp\u003eFor a comprehensive and unbiased evaluation, two target test sites (H4 and K4) were randomly chosen. This approach minimizes potential site-specific biases and provides a more generalizable assessment of the models' performance. A diverse set of predictor variables was used, including both direct soil data and features extracted from existing hydrological models. This broadens the models' representation of infiltration processes, leading to a more robust evaluation that is less susceptible to site-specific effects.\u003c/p\u003e \u003cp\u003eThe following subsections detail the performance of each hybrid model and its effectiveness in predicting infiltration rates.\u003c/p\u003e \u003cdiv id=\"Sec16\" class=\"Section3\"\u003e \u003ch2\u003e4.3.1 Hybrid ANN and hydrological models keeping H4 as a target site\u003c/h2\u003e \u003cp\u003eThe hybrid ANN and hydrological models were first evaluated for predicting infiltration rates with H4 as the target site. Increasing the number of predictor sites enhances the models' ability to accurately predict the observed infiltration rates. This trend is visible in Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e8\u003c/span\u003ea, where prediction curves for models with 7 and 10 predictor sites align more closely with the observed infiltration rates than those with only 3 predictor sites.
Quantitatively, the ANN\u0026thinsp;+\u0026thinsp;Horton model demonstrates the most consistent improvement across various error metrics with increasing predictor sites, followed by the ANN\u0026thinsp;+\u0026thinsp;Philip and ANN\u0026thinsp;+\u0026thinsp;Kostiakov models.\u003c/p\u003e \u003cp\u003eFor the ANN\u0026thinsp;+\u0026thinsp;Horton model, the \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{\\text{R}}^{2}\\)\u003c/span\u003e\u003c/span\u003e increases from 0.85 with 3 predictor sites to 0.94 with 10 predictor sites, indicating a significant enhancement in model performance (Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e8\u003c/span\u003eb). The \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{R}\\text{M}\\text{S}\\text{E}\\)\u003c/span\u003e\u003c/span\u003e decreases from 0.36 to 0.08 cm/min, highlighting reduced prediction errors. Additionally, the \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{L}\\text{C}\\text{E}\\)\u003c/span\u003e\u003c/span\u003e improves from \u0026minus;\u0026thinsp;0.42 to 0.76, reflecting enhanced model efficiency and predictive accuracy. The Taylor diagram corroborates these findings, showing that with more predictor sites, the model points move closer to the reference point, indicating better agreement with observed values (Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e8\u003c/span\u003ec). 
The ANN\u0026thinsp;+\u0026thinsp;Philip model maintains a high \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{\\text{R}}^{2}\\)\u003c/span\u003e\u003c/span\u003e (0.87 to 0.91) across the scenarios, while its \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{R}\\text{M}\\text{S}\\text{E}\\)\u003c/span\u003e\u003c/span\u003e decreases from 1.60 to 0.08 cm/min and its \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{L}\\text{C}\\text{E}\\)\u003c/span\u003e\u003c/span\u003e improves from \u0026minus;\u0026thinsp;5.90 to 0.71, indicating improvements as substantial as those of the ANN\u0026thinsp;+\u0026thinsp;Horton model. The ANN\u0026thinsp;+\u0026thinsp;Kostiakov model also shows a high \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{\\text{R}}^{2}\\)\u003c/span\u003e\u003c/span\u003e, with \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{R}\\text{M}\\text{S}\\text{E}\\)\u003c/span\u003e\u003c/span\u003e decreasing from 1.34 to 0.09 cm/min and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{L}\\text{C}\\text{E}\\)\u003c/span\u003e\u003c/span\u003e improving from \u0026minus;\u0026thinsp;4.41 to 0.67; its position on the Taylor diagram also improves, though less consistently than that of the ANN\u0026thinsp;+\u0026thinsp;Horton model.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec17\" class=\"Section3\"\u003e \u003ch2\u003e4.3.2 Hybrid MF and hydrological models keeping H4 as a target station\u003c/h2\u003e \u003cp\u003eThe assessment of hybrid MF and hydrological models for predicting infiltration rates at the target station H4 is shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e9\u003c/span\u003e.
As with the hybrid ANN models, incorporating more predictor sites enhances the models' ability to predict infiltration rates (Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e9\u003c/span\u003ea). The MF\u0026thinsp;+\u0026thinsp;Philip model excels, with \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{\\text{R}}^{2}\\)\u003c/span\u003e\u003c/span\u003e values of 0.89, 0.91, and 0.92 for the scenarios with 3, 7, and 10 predictor sites, respectively. As illustrated in Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e9\u003c/span\u003eb and c, increasing the number of predictor sites clearly benefits this model, reducing errors (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{R}\\text{M}\\text{S}\\text{E}\\)\u003c/span\u003e\u003c/span\u003e from 1.57 to 0.1 cm/min) and improving efficiency (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{L}\\text{C}\\text{E}\\)\u003c/span\u003e\u003c/span\u003e from \u0026minus;\u0026thinsp;5.87 to 0.62), as supported by the Taylor diagram's convergence towards the reference point. The MF\u0026thinsp;+\u0026thinsp;Horton model exhibits more nuanced behaviour. While its \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{\\text{R}}^{2}\\)\u003c/span\u003e\u003c/span\u003e starts well (0.87 with 3 sites), it dips slightly (0.84 with 10 sites) with more predictors, suggesting potential overfitting.
However, this is outweighed by a substantial decrease in errors (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{R}\\text{M}\\text{S}\\text{E}\\)\u003c/span\u003e\u003c/span\u003e from 1.0 to 0.19 cm/min) and a clear improvement in efficiency (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{L}\\text{C}\\text{E}\\)\u003c/span\u003e\u003c/span\u003e from \u0026minus;\u0026thinsp;3.71 to 0.47) as the number of sites increases. The Taylor diagram aligns with this, showing the model better matching observations with more predictors. Finally, the MF\u0026thinsp;+\u0026thinsp;Kostiakov model thrives with increasing predictor sites. The \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{\\text{R}}^{2}\\)\u003c/span\u003e\u003c/span\u003e steadily increases from 0.88 to 0.9, indicating a stronger grasp of the data's variability. Similarly, the \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{R}\\text{M}\\text{S}\\text{E}\\)\u003c/span\u003e\u003c/span\u003e descends (from 1.59 to 0.12 cm/min), highlighting a considerable error reduction. The \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{L}\\text{C}\\text{E}\\)\u003c/span\u003e\u003c/span\u003e showcases a remarkable improvement as well (from \u0026minus;\u0026thinsp;6.31 to 0.69), demonstrating enhanced model efficiency and accuracy. 
As with the other models, the Taylor diagram confirms this positive trend.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec18\" class=\"Section3\"\u003e \u003ch2\u003e4.3.3 Hybrid ANN and hydrological models keeping K4 as a target station\u003c/h2\u003e \u003cp\u003eAnalysing predicted infiltration rates at K4 reveals that incorporating more predictor sites (7 and 10) significantly improves the accuracy of hybrid ANN and hydrological models, as evident in Fig.\u0026nbsp;\u003cspan refid=\"Fig10\" class=\"InternalRef\"\u003e10\u003c/span\u003ea, where models produce prediction curves closely aligned with observed data (particularly ANN\u0026thinsp;+\u0026thinsp;Philip, whose curves progressively approach observed rates). This is further supported by the bar plot of error measures (Fig.\u0026nbsp;\u003cspan refid=\"Fig10\" class=\"InternalRef\"\u003e10\u003c/span\u003eb): ANN\u0026thinsp;+\u0026thinsp;Philip exhibits a substantial increase in \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{\\text{R}}^{2}\\)\u003c/span\u003e\u003c/span\u003e (0.72 to 0.80), a drastic reduction in \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{R}\\text{M}\\text{S}\\text{E}\\)\u003c/span\u003e\u003c/span\u003e (1.05 to 0.11 cm/min), and an improved \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{L}\\text{C}\\text{E}\\)\u003c/span\u003e\u003c/span\u003e (\u0026minus;\u0026thinsp;5.38 to 0.57), indicating enhanced efficiency and accuracy. Improvements in the ANN\u0026thinsp;+\u0026thinsp;Horton and ANN\u0026thinsp;+\u0026thinsp;Kostiakov models are less pronounced: some metrics improve with more predictor sites, while others show contrasting behaviour.
For instance, ANN\u0026thinsp;+\u0026thinsp;Horton's \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{\\text{R}}^{2}\\)\u003c/span\u003e\u003c/span\u003e decreases from 0.59 (3 sites) to 0.54 (10 sites), its \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{R}\\text{M}\\text{S}\\text{E}\\)\u003c/span\u003e\u003c/span\u003e increases from 0.16 to 0.33 cm/min, and its \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{L}\\text{C}\\text{E}\\)\u003c/span\u003e\u003c/span\u003e worsens from \u0026minus;\u0026thinsp;0.05 to \u0026minus;\u0026thinsp;0.92. ANN\u0026thinsp;+\u0026thinsp;Kostiakov's \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{\\text{R}}^{2}\\)\u003c/span\u003e\u003c/span\u003e also falls, from 0.72 to 0.67, but its \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{R}\\text{M}\\text{S}\\text{E}\\)\u003c/span\u003e\u003c/span\u003e decreases from 0.34 to 0.11 cm/min and its \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{L}\\text{C}\\text{E}\\)\u003c/span\u003e\u003c/span\u003e improves from \u0026minus;\u0026thinsp;0.49 to 0.63. The Taylor diagram (Fig.\u0026nbsp;\u003cspan refid=\"Fig10\" class=\"InternalRef\"\u003e10\u003c/span\u003ec) reinforces this, demonstrating that model points move closer to the reference point with more data, signifying better agreement with observed values.
Overall, for Kavalur site 4 (K4), increasing the number of predictor sites generally enhances model performance, with ANN\u0026thinsp;+\u0026thinsp;Philip exhibiting the most significant improvements.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec19\" class=\"Section3\"\u003e \u003ch2\u003e4.3.4 Hybrid MF and hydrological models keeping K4 as a target station\u003c/h2\u003e \u003cp\u003eFig.\u0026nbsp;\u003cspan refid=\"Fig11\" class=\"InternalRef\"\u003e11\u003c/span\u003e illustrates the prediction accuracy of hybrid MF and hydrological models for infiltration rates at K4. The results reveal a clear benefit for the MF\u0026thinsp;+\u0026thinsp;Philip model, which exhibits a substantial decrease in \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{R}\\text{M}\\text{S}\\text{E}\\)\u003c/span\u003e\u003c/span\u003e from 0.87 to 0.20 cm/min, indicating a significant reduction in prediction error. Additionally, the \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{L}\\text{C}\\text{E}\\)\u003c/span\u003e\u003c/span\u003e values show a marked improvement from \u0026minus;\u0026thinsp;3.58 to 0.13, reflecting enhanced model efficiency and accuracy. While \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{\\text{R}}^{2}\\)\u003c/span\u003e\u003c/span\u003e values show some variability, the overall trend suggests improvement for MF\u0026thinsp;+\u0026thinsp;Philip with more predictor sites.\u003c/p\u003e \u003cp\u003eThe \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{\\text{R}}^{2}\\)\u003c/span\u003e\u003c/span\u003e values exhibit inconsistent changes in both the MF\u0026thinsp;+\u0026thinsp;Horton and MF\u0026thinsp;+\u0026thinsp;Kostiakov models.
\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{R}\\text{M}\\text{S}\\text{E}\\)\u003c/span\u003e\u003c/span\u003e generally improves in MF\u0026thinsp;+\u0026thinsp;Horton (0.31 to 0.22 cm/min), with a slight increase at 10 sites. \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{L}\\text{C}\\text{E}\\)\u003c/span\u003e\u003c/span\u003e shows mixed changes, suggesting some performance gains but also inconsistencies. Similarly, MF\u0026thinsp;+\u0026thinsp;Kostiakov's \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{\\text{R}}^{2}\\)\u003c/span\u003e\u003c/span\u003e shows a slight decrease, while \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{R}\\text{M}\\text{S}\\text{E}\\)\u003c/span\u003e\u003c/span\u003e improves (0.51 to 0.47 cm/min). \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{L}\\text{C}\\text{E}\\)\u003c/span\u003e\u003c/span\u003e initially improves but worsens slightly with more data. The Taylor diagram (Fig.\u0026nbsp;\u003cspan refid=\"Fig11\" class=\"InternalRef\"\u003e11\u003c/span\u003ec) reinforces these findings. MF\u0026thinsp;+\u0026thinsp;Philip displays the most consistent movement towards the reference point, indicating improved agreement with observed values. Conversely, the MF\u0026thinsp;+\u0026thinsp;Horton and MF\u0026thinsp;+\u0026thinsp;Kostiakov models show less consistent trends.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe study aims to predict infiltration rates using minimal data, such as soil parameters (% gravel, % sand, % silt\u0026thinsp;+\u0026thinsp;clay, % moisture), by leveraging datasets from accessible locations. This approach is particularly valuable for locations where direct measurement is impractical.
However, some models (e.g., ANN\u0026thinsp;+\u0026thinsp;Horton and MF\u0026thinsp;+\u0026thinsp;Horton) exhibit mixed results with increasing predictor sites. This could be due to overfitting and the inherent complexity of soil-water interactions not fully captured by the models. To comprehensively evaluate model performance, we employed different error measures.\u003c/p\u003e \u003cp\u003ePrevious research has demonstrated significant improvements in predictive accuracy by integrating ML techniques with hydrological models. This study aligns with the trend in data-driven hydrological modeling, emphasizing the importance of combining physical process-based empirical models with ML approaches for enhanced prediction. As shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig12\" class=\"InternalRef\"\u003e12\u003c/span\u003e, standalone infiltration models (Horton's, Philip's, Kostiakov's) exhibited limited success in predicting infiltration rates compared to the superior performance achieved by the hybrid models (particularly ANN\u0026thinsp;+\u0026thinsp;Philip) at both target sites. Althoff et al. (2021), Xu et al. (2024) and Young et al. (2017) also highlighted that hybrid models can effectively capture complex hydrological processes that traditional models cannot. Similarly, Zubelzu et al. (2024) underscored the benefits of ML algorithms in improving the accuracy of hydrological predictions, especially in data-scarce regions. Furthermore, the adaptability of ML-based hybrid models to the inherent variability and uncertainties within hydrological data is particularly valuable for predicting infiltration rates across diverse soil and climatic conditions, where conventional models often struggle (Parchami-Araghi et al. 
2013).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eAmong the evaluated models, hybrids incorporating Philip's equation (ANN\u0026thinsp;+\u0026thinsp;Philip and MF\u0026thinsp;+\u0026thinsp;Philip) demonstrated superior performance in predicting infiltration rates at remote sites. Philip's model, a well-established hydrological approach, describes infiltration as a function of time and initial soil moisture. Its robust mathematical foundation provides a solid representation of infiltration processes, making it a strong foundation for integration with ML models. Furthermore, the superior performance of these Philip's model-based hybrids can be attributed to their ability to leverage the strengths of both approaches. The combination of an ANN or MF with Philip's model enhances predictive accuracy. While Philip's model offers a strong foundation, ANN and MF can capture complex, non-linear relationships in the data that the physical model alone might miss (Sy 2006). This synergy between physical process understanding and data-driven techniques ultimately leads to improved prediction capabilities.\u003c/p\u003e \u003cp\u003eIn contrast, the Horton and Kostiakov models, while useful, have limitations that might explain their relatively poorer performance when combined with ML models. Both Horton and Kostiakov models are primarily empirical and might not generalize well across different environments, which can limit the hybrid models' ability to capture the full complexity of infiltration processes (Parchami-Araghi et al. 2013). To overcome these limitations, the integration of ML techniques can be tailored to enhance the empirical models' flexibility and generalizability. By using ANN and MF, these hybrid models can identify and adjust for site-specific factors and non-linear relationships that the empirical models alone may miss. 
Additionally, incorporating techniques such as cross-validation and sensitivity analysis can help quantify and reduce uncertainties, ensuring more robust and reliable predictions across diverse environments. This combined approach not only leverages the empirical models' simplicity and ease of use but also enriches them with the adaptability and precision of machine learning, leading to better performance in various hydrological contexts.\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e"},{"header":"5. Conclusions","content":"\u003cp\u003eThis study successfully integrated traditional hydrological models with ML techniques to predict infiltration rates in semi-arid regions of southern India. Hybrid models, particularly those based on Philip's equation, outperformed standalone traditional methods by leveraging both physical understanding and ML's predictive power. For instance, the ANN-Philip model achieved impressive accuracy (R\u0026sup2;, RMSE, and LCE of 0.91, 0.08 cm/min, and 0.71 at target site H4 and 0.92, 0.1 cm/min, and 0.62 at K4). A robust dataset with detailed soil characteristics from various depths and locations was crucial for model training. The study further demonstrates the value of spatially diverse data, as models trained with data from more sites exhibited higher performance. Accounting for the inherent variability in semi-arid soils was essential for robust predictions. Importantly, the hybrid models can predict infiltration rates at remote sites with minimal data, making them a valuable tool for areas with limited measurement capabilities. By integrating theory-guided data science with physics-informed ML, these hybrid models offer interpretable and accurate predictions, overcoming a major limitation of traditional approaches in hydrology. This novel approach has the potential to significantly improve water resource management and soil conservation in semi-arid and data-scarce regions. 
However, the study observed that hybrid models struggled to predict chaotic patterns, such as sudden dips and peaks in infiltration rates, especially at the start of the process. These fluctuations, likely due to initial soil conditions, water repellency, or micro-variations in soil texture, were not well captured. To overcome these limitations and address the concerns regarding the integration of empirical models with ML, future research should aim to improve the models' ability to handle these chaotic changes by incorporating more detailed data, utilizing advanced ML techniques, and enhancing training algorithms. Specifically, integrating cross-validation and sensitivity analysis can help quantify and reduce uncertainties, ensuring more robust and reliable predictions. Furthermore, exploring advanced ML techniques such as recurrent neural networks (RNNs) or long short-term memory (LSTM) networks could better capture temporal dynamics and sudden variations in infiltration rates. Additionally, using high-resolution temporal and spatial data can help in understanding and modeling initial soil conditions and micro-variations more accurately. This combined approach not only leverages the empirical models' simplicity but also enriches them with the adaptability and precision of machine learning, leading to better performance in various hydrological contexts. By addressing these challenges, the prediction accuracy and reliability of hybrid models can be further enhanced across diverse conditions.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eAcknowledgements\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors would like to express their sincere gratitude to the Director of the Indian Institute of Astrophysics, Government of India, for granting access to the campus to perform the infiltration tests. 
This support was instrumental in the successful completion of our study.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConflict of Interest\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eData availability statement\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe data and code used in this study will be made available upon request.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eAdhikary PP, Chakraborty D, Kalra N, et al (2008) Pedotransfer functions for predicting the hydraulic properties of Indian soils. Aust J Soil Res 46:476\u0026ndash;484. https://doi.org/10.1071/SR07042\u003c/li\u003e\n\u003cli\u003eAhmed AA, Sayed S, Abdoulhalik A, et al (2024) Applications of machine learning to water resources management: A review of present status and future opportunities. J Clean Prod 441:140715. https://doi.org/10.1016/j.jclepro.2024.140715\u003c/li\u003e\n\u003cli\u003eAllaire SE, Roulier S, Cessna AJ (2009) Quantifying preferential flow in soils: A review of different techniques. J Hydrol 378:179\u0026ndash;204. https://doi.org/10.1016/j.jhydrol.2009.08.013\u003c/li\u003e\n\u003cli\u003eAlthoff D, Bazameb HC, Nascimentob JG (2021) Untangling hybrid hydrological models with explainable artificial intelligence. H2Open J 4:13\u0026ndash;28. https://doi.org/10.2166/H2OJ.2021.066\u003c/li\u003e\n\u003cli\u003eArya LM, Leij FJ, Shouse PJ, van Genuchten MT (1999) Relationship between the Hydraulic Conductivity Function and the Particle‐Size Distribution. Soil Sci Soc Am J 63:1063\u0026ndash;1070. 
https://doi.org/10.2136/sssaj1999.6351063x\u003c/li\u003e\n\u003cli\u003eASCE Task Committee on Application of Artificial Neural Networks in Hydrology (2000a) Artificial Neural Networks in Hydrology. I: Preliminary Concepts. J Hydrol Eng 5:115\u0026ndash;123. https://doi.org/10.1061/(ASCE)1084-0699(2000)5:2(115)\u003c/li\u003e\n\u003cli\u003eASCE Task Committee on Application of Artificial Neural Networks in Hydrology (2000b) Artificial Neural Networks in Hydrology. II: Hydrologic Applications. J Hydrol Eng 5:124\u0026ndash;137. https://doi.org/10.1061/(ASCE)1084-0699(2000)5:2(124)\u003c/li\u003e\n\u003cli\u003eASTM Standard D2216\u0026ndash;19 (2019) Standard Test Methods for Laboratory Determination of Water (Moisture) Content of Soil and Rock by Mass. West Conshohocken, PA\u003c/li\u003e\n\u003cli\u003eBik\u0026scaron;e J, Retike I, Haaf E, Kalvāns A (2023) Assessing automated gap imputation of regional scale groundwater level data sets with typical gap patterns. J Hydrol 620:129424. https://doi.org/10.1016/j.jhydrol.2023.129424\u003c/li\u003e\n\u003cli\u003eBonanomi G, Motti R, Abd-ElGawad AM, Idbella M (2024) Soil water repellency along elevation gradients: The role of climate, land use and soil chemistry. Geoderma 443:116847. https://doi.org/10.1016/j.geoderma.2024.116847\u003c/li\u003e\n\u003cli\u003eBreiman L (2001) Random forests. Mach Learn 45:5\u0026ndash;32. https://doi.org/10.1023/A:1010933404324\u003c/li\u003e\n\u003cli\u003eCeballos A, Mart\u0026iacute;nez-Fern\u0026aacute;ndez J, Santos F, Alonso P (2002) Soil-water behaviour of sandy soils under semi-arid conditions in the Duero Basin (Spain). J Arid Environ 51:501\u0026ndash;519. https://doi.org/10.1006/jare.2002.0973\u003c/li\u003e\n\u003cli\u003eChen G, Hou J, Liu Y, et al (2024) Urban inundation rapid prediction method based on multi-machine learning algorithm and rain pattern analysis. J Hydrol 633:131059.
https://doi.org/10.1016/j.jhydrol.2024.131059\u003c/li\u003e\n\u003cli\u003eChristiaens K, Feyen J (2001) Analysis of uncertainties associated with different methods to determine soil hydraulic properties and their propagation in the distributed hydrological MIKE SHE model. J Hydrol 246:63\u0026ndash;81. https://doi.org/10.1016/S0022-1694(01)00345-6\u003c/li\u003e\n\u003cli\u003eDexter AR, Richard G (2009) The saturated hydraulic conductivity of soils with n-modal pore size distributions. Geoderma 154:76\u0026ndash;85. https://doi.org/10.1016/j.geoderma.2009.09.015\u003c/li\u003e\n\u003cli\u003eDubey SR, Singh SK, Chaudhuri BB (2022) Activation functions in deep learning: A comprehensive survey and benchmark. Neurocomputing 503:92\u0026ndash;108. https://doi.org/10.1016/j.neucom.2022.06.111\u003c/li\u003e\n\u003cli\u003eGjettermann B, Nielsen KL, Petersen CT, et al (1997) Preferential flow in sandy loam soils as affected by irrigation intensity. Soil Technol 11:139\u0026ndash;152. https://doi.org/10.1016/S0933-3630(97)00001-9\u003c/li\u003e\n\u003cli\u003eHaverkamp R, Kutilek M, Parlange J-Y, et al (1988) Infiltration under ponded conditions: 2. Infiltration equations tested for parameter time-dependence and predictive use. Soil Sci 145:317\u0026ndash;329\u003c/li\u003e\n\u003cli\u003eHe Y, Wang Y, Liu Y, et al (2024) Focus on the nonlinear infiltration process in deep vadose zone. Earth-Science Rev 252:104719. https://doi.org/10.1016/j.earscirev.2024.104719\u003c/li\u003e\n\u003cli\u003eHeber Green W, Ampt GA (1911) Studies on Soil Physics. J Agric Sci 4:1\u0026ndash;24. https://doi.org/10.1017/S0021859600001441\u003c/li\u003e\n\u003cli\u003eHoltan HN (1961) A concept for infiltration estimates in watershed engineering, 41st edn. Agricultural Research Service, US Department of Agriculture\u003c/li\u003e\n\u003cli\u003eHorton RE (1941) An Approach Toward a Physical Interpretation of Infiltration‐Capacity. Soil Sci Soc Am J 5:399\u0026ndash;417.
https://doi.org/10.2136/sssaj1941.036159950005000C0075x\u003c/li\u003e\n\u003cli\u003eIspirova G, Eftimov T, Seljak BK (2020) Evaluating missing value imputation methods for food composition databases. Food Chem Toxicol 141:111368. https://doi.org/10.1016/j.fct.2020.111368\u003c/li\u003e\n\u003cli\u003eJia Y, Culver TB (2006) Bootstrapped artificial neural networks for synthetic flow generation with a small data sample. J Hydrol 331:580\u0026ndash;590. https://doi.org/10.1016/j.jhydrol.2006.06.005\u003c/li\u003e\n\u003cli\u003eKostiakov AN (1932) On the dynamics of the coefficient of water-percolation in soils and on the necessity of studying it from a dynamic point of view for purposes of amelioration. Trans 6th Cong Int Soil Sci Russ Part A 17\u0026ndash;21\u003c/li\u003e\n\u003cli\u003eLado M, Paz A, Ben-Hur M (2004) Organic Matter and Aggregate‐Size Interactions in Saturated Hydraulic Conductivity. Soil Sci Soc Am J 68:234\u0026ndash;242. https://doi.org/10.2136/sssaj2004.2340\u003c/li\u003e\n\u003cli\u003eLegates DR, McCabe GJ (1999) Evaluating the use of \u0026ldquo;goodness-of-fit\u0026rdquo; measures in hydrologic and hydroclimatic model validation. Water Resour Res 35:233\u0026ndash;241. https://doi.org/10.1029/1998WR900018\u003c/li\u003e\n\u003cli\u003eMahapatra S, Jha MK, Biswal S, Senapati D (2020) Assessing Variability of Infiltration Characteristics and Reliability of Infiltration Models in a Tropical Sub-humid Region of India. Sci Rep 10:1\u0026ndash;18. https://doi.org/10.1038/s41598-020-58333-8\u003c/li\u003e\n\u003cli\u003eManns HR, Jiang Y, Parkin G (2024) Soil pores in preferential flow terminology and permeability equations. Vadose Zo J 1\u0026ndash;12. https://doi.org/10.1002/vzj2.20365\u003c/li\u003e\n\u003cli\u003eMantoglou A, Gelhar LW (1987) Effective hydraulic conductivities of transient unsaturated flow in stratified soils. Water Resour Res 23:57\u0026ndash;67. 
https://doi.org/10.1029/WR023i001p00057\u003c/li\u003e\n\u003cli\u003eMattar MA, Alazba AA, Zin El-Abedin TK (2015) Forecasting furrow irrigation infiltration using artificial neural networks. Agric Water Manag 148:63\u0026ndash;71. https://doi.org/10.1016/j.agwat.2014.09.015\u003c/li\u003e\n\u003cli\u003eMosavi A, Ozturk P, Chau KW (2018) Flood prediction using machine learning models: Literature review. Water (Switzerland) 10:1\u0026ndash;40. https://doi.org/10.3390/w10111536\u003c/li\u003e\n\u003cli\u003eNaranjo-Fern\u0026aacute;ndez N, Guardiola-Albert C, Aguilera H, et al (2020) Clustering groundwater level time series of the exploited Almonte-Marismas aquifer in southwest Spain. Water (Switzerland) 12:1\u0026ndash;20. https://doi.org/10.3390/W12041063\u003c/li\u003e\n\u003cli\u003eOverton D (1964) Mathematical refinement of an infiltration equation for watershed engineering. Agricultural Research Service, US Department of Agriculture\u003c/li\u003e\n\u003cli\u003eParchami-Araghi F, Mirlatifi SM, Ghorbani Dashtaki S, Mahdian MH (2013) Point estimation of soil water infiltration process using Artificial Neural Networks for some calcareous soils. J Hydrol 481:35\u0026ndash;47. https://doi.org/10.1016/j.jhydrol.2012.12.007\u003c/li\u003e\n\u003cli\u003ePhilip JR (1969) Theory of Infiltration. In: Advances in Hydroscience. Academic Press, Inc., pp 215\u0026ndash;296\u003c/li\u003e\n\u003cli\u003eQiu Y, Fu B, Wang J, Chen L (2001) Soil moisture variation in relation to topography and land use in a hillslope catchment of the Loess Plateau, China. J Hydrol 240:243\u0026ndash;263. https://doi.org/10.1016/S0022-1694(00)00362-0\u003c/li\u003e\n\u003cli\u003eRichards LA (1931) Capillary conduction of liquids through porous mediums. J Appl Phys 1:318\u0026ndash;333. https://doi.org/10.1063/1.1745010\u003c/li\u003e\n\u003cli\u003eRosenbom AE, Therrien R, Refsgaard JC, et al (2009) Numerical analysis of water and solute transport in variably-saturated fractured clayey till. 
J Contam Hydrol 104:137\u0026ndash;152. https://doi.org/10.1016/j.jconhyd.2008.09.001\u003c/li\u003e\n\u003cli\u003eSalvadore E, Bronders J, Batelaan O (2015) Hydrological modelling of urbanized catchments: A review and future directions. J Hydrol 529:62\u0026ndash;81. https://doi.org/10.1016/j.jhydrol.2015.06.028\u003c/li\u003e\n\u003cli\u003eSayari S, Mahdavi-Meymand A, Zounemat-Kermani M (2021) Irrigation water infiltration modeling using machine learning. Comput Electron Agric 180:105921. https://doi.org/10.1016/j.compag.2020.105921\u003c/li\u003e\n\u003cli\u003eShanafield M, Cook PG (2014) Transmission losses, infiltration and groundwater recharge through ephemeral and intermittent streambeds: A review of applied methods. J Hydrol 511:518\u0026ndash;529. https://doi.org/10.1016/j.jhydrol.2014.01.068\u003c/li\u003e\n\u003cli\u003eSidhu RK, Kumar R, Rana PS (2020) Machine learning based crop water demand forecasting using minimum climatological data. Multimed Tools Appl 79:13109\u0026ndash;13124. https://doi.org/10.1007/s11042-019-08533-w\u003c/li\u003e\n\u003cli\u003eSihag P, Singh B, Sepah Vand A, Mehdipour V (2020) Modeling the infiltration process with soft computing techniques. ISH J Hydraul Eng 26:138\u0026ndash;152. https://doi.org/10.1080/09715010.2018.1464408\u003c/li\u003e\n\u003cli\u003eSihag P, Singh VP, Angelaki A, et al (2019) Modelling of infiltration using artificial intelligence techniques in semi-arid Iran. Hydrol Sci J 64:1647\u0026ndash;1658. https://doi.org/10.1080/02626667.2019.1659965\u003c/li\u003e\n\u003cli\u003eSingh B, Sihag P, Singh K (2017) Modelling of impact of water quality on infiltration rate of soil by random forest regression. Model Earth Syst Environ 3:999\u0026ndash;1004. https://doi.org/10.1007/s40808-017-0347-3\u003c/li\u003e\n\u003cli\u003eSmith RE (1972) The infiltration envelope: Results from a theoretical infiltrometer. J Hydrol 17:1\u0026ndash;22. 
https://doi.org/10.1016/0022-1694(72)90063-7\u003c/li\u003e\n\u003cli\u003eSmith RE, Parlange J‐Y (1978) A parameter‐efficient hydrologic infiltration model. Water Resour Res 14:533\u0026ndash;538. https://doi.org/10.1029/WR014i003p00533\u003c/li\u003e\n\u003cli\u003eStekhoven DJ, B\u0026uuml;hlmann P (2012) MissForest: non-parametric missing value imputation for mixed-type data. Bioinformatics 28:112\u0026ndash;118. https://doi.org/10.1093/bioinformatics/btr597\u003c/li\u003e\n\u003cli\u003eSy NL (2006) Modelling the infiltration process with a multi-layer perceptron artificial neural network. Hydrol Sci J 51:3\u0026ndash;20. https://doi.org/10.1623/hysj.51.1.3\u003c/li\u003e\n\u003cli\u003eTaylor KE (2001) Summarizing multiple aspects of model performance in a single diagram. J Geophys Res Atmos 106:7183\u0026ndash;7192. https://doi.org/10.1029/2000JD900719\u003c/li\u003e\n\u003cli\u003eTeshome FT, Bayabil HK, Schaffer B, et al (2024) Simulating soil hydrologic dynamics using crop growth and machine learning models. Comput Electron Agric 224:109186. https://doi.org/10.1016/j.compag.2024.109186\u003c/li\u003e\n\u003cli\u003eWang Q, Shao M, Horton R (1999) Modified Green and Ampt models for layered soil infiltration and muddy water infiltration. Soil Sci 164:445\u0026ndash;453\u003c/li\u003e\n\u003cli\u003eXu W, Chen J, Corzo G, et al (2024) Coupling Deep Learning and Physically Based Hydrological Models for Monthly Streamflow Predictions. Water Resour Res 60:1\u0026ndash;25. https://doi.org/10.1029/2023WR035618\u003c/li\u003e\n\u003cli\u003eYoung CC, Liu WC, Wu MC (2017) A physically based and machine learning hybrid approach for accurate rainfall-runoff modeling during extreme typhoon events. Appl Soft Comput J 53:205\u0026ndash;216. 
https://doi.org/10.1016/j.asoc.2016.12.052\u003c/li\u003e\n\u003cli\u003eYuan J, Yao Y, Guan Y, et al (2024) Effects of land use patterns on soil properties and nitrous oxide flux on a semi-arid environmental conditions of Loess Plateau China. Glob Ecol Conserv 51:e02899. https://doi.org/10.1016/j.gecco.2024.e02899\u003c/li\u003e\n\u003cli\u003eZhang Y, Schaap MG (2019) Estimation of saturated hydraulic conductivity with pedotransfer functions: A review. J Hydrol 575:1011\u0026ndash;1030. https://doi.org/10.1016/j.jhydrol.2019.05.058\u003c/li\u003e\n\u003cli\u003eZubelzu S, Ghalkha A, Ben Issaid C, et al (2024) Coupling machine learning and physical modelling for predicting runoff at catchment scale. J Environ Manage 354:120404. https://doi.org/10.1016/j.jenvman.2024.120404\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":true,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"acta-geophysica","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"agph","sideBox":"Learn more about [Acta Geophysica](http://link.springer.com/journal/11600)","snPcode":"11600","submissionUrl":"https://www.editorialmanager.com/agph/default2.aspx","title":"Acta Geophysica","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"em","reportingPortfolio":"Springer Hybrid","inReviewEnabled":true,"inReviewRevisionsEnabled":false},"keywords":"Soil Infiltration prediction, Infiltration models, Artificial Neural Network (ANN), MissForest, Hybrid hydrological model","lastPublishedDoi":"10.21203/rs.3.rs-4869876/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-4869876/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eDespite the success of machine learning (ML) in many disciplines, its application in hydrology, especially in water-scarce regions, faces challenges due to the lack of interpretability and physical consistency. This study addresses these challenges by integrating established empirical hydrological models with ML techniques to predict infiltration rates in water-scarce regions of southern India. Data from 199 observations across 11 sites, including soil characteristics and infiltration measurements, were used to parameterize traditional models like Philip's, Horton's, and Kostiakov's, which were then combined with Artificial Neural Networks (ANN) and the MissForest (MF) algorithm to form hybrid models. The results demonstrate that hybrid models, particularly those based on Philip's model, significantly improve prediction accuracy (R\u0026sup2;: 0.76\u0026ndash;0.92, RMSE: 0.08\u0026ndash;0.2 cm/min, and LCE: 0.11\u0026ndash;0.71 with more predictors) across all target sites while retaining interpretability. 
This approach leverages the strengths of both empirical models and machine learning, addressing the limitations of each. The study highlights that while empirical models are data-driven and may introduce uncertainties, combining them with ML techniques can enhance predictive power and provide a more robust understanding of infiltration dynamics. This is particularly valuable in regions where direct measurement is challenging. The hybrid models facilitate accurate predictions using minimal data from readily accessible locations, offering a practical solution for effective water resource management and soil conservation in semi-arid and data-scarce regions. By blending empirical knowledge with machine learning algorithms, this approach not only improves accuracy but also enhances the physical meaningfulness of hydrological models, providing a balanced and innovative solution to hydrological modeling challenges.\u003c/p\u003e","manuscriptTitle":"Enhancing Infiltration Rate Predictions with Hybrid Machine Learning and Empirical Models: Addressing Challenges in Southern India","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2024-09-12 19:18:33","doi":"10.21203/rs.3.rs-4869876/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"Major revisions","date":"2024-10-17T01:14:42+00:00","index":"","fulltext":""},{"type":"reviewerAgreed","content":"","date":"2024-08-16T13:51:36+00:00","index":0,"fulltext":""},{"type":"reviewersInvited","content":"","date":"2024-08-16T13:47:28+00:00","index":"","fulltext":""},{"type":"editorInvited","content":"Acta Geophysica","date":"2024-08-15T20:26:52+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2024-08-09T18:48:39+00:00","index":"","fulltext":""},{"type":"submitted","content":"Acta Geophysica","date":"2024-08-06T12:24:56+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"acta-geophysica","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"agph","sideBox":"Learn more about [Acta Geophysica](http://link.springer.com/journal/11600)","snPcode":"11600","submissionUrl":"https://www.editorialmanager.com/agph/default2.aspx","title":"Acta Geophysica","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"em","reportingPortfolio":"Springer Hybrid","inReviewEnabled":true,"inReviewRevisionsEnabled":false}}],"origin":"","ownerIdentity":"79b98040-904c-416c-ae6a-613a61956232","owner":[],"postedDate":"September 12th, 2024","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"published-in-journal","subjectAreas":[],"tags":[],"updatedAt":"2025-02-24T16:02:55+00:00","versionOfRecord":{"articleIdentity":"rs-4869876","link":"https://doi.org/10.1007/s11600-025-01535-3","journal":{"identity":"acta-geophysica","isVorOnly":false,"title":"Acta Geophysica"},"publishedOn":"2025-02-22 15:57:41","publishedOnDateReadable":"February 22nd, 2025"},"versionCreatedAt":"2024-09-12 19:18:33","video":"","vorDoi":"10.1007/s11600-025-01535-3","vorDoiUrl":"https://doi.org/10.1007/s11600-025-01535-3","workflowStages":[]},"version":"v1","identity":"rs-4869876","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-4869876","identity":"rs-4869876","version":["v1"]},"buildId":"qtupq5eGEP_6zYnWcrvyt","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}