Metrics Comparison of Machine Learning Algorithms used to classify Noiler Chicken Egg from Egg Quality Trait | Research Square

Research Article

Iyabode Dudusola, Hameed Bashiru, Christopher Adetola, Fatimoh Egbinola, and 1 more

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-8431208/v1 This work is licensed under a CC BY 4.0 License. Status: Under Review. Version 1 posted 15

Abstract

This study evaluated the predictive performance metrics of four machine learning algorithms (Support Vector Machine (SVM), Random Forest (RF), Logistic Regression (LR), and Linear Regression (LRG)) for classifying egg size based on internal and external egg quality traits of Noiler chickens. Three hundred freshly laid eggs (100 per plumage variety) were collected at young laying age (26 weeks) and old laying age (46 weeks), and assessed for various quality parameters.
External traits included egg weight, egg width, egg length, shell surface area, percentage of shell thickness, and shell weight, while internal traits included albumen height, Haugh unit, yolk height, yolk index, yolk and albumen weight, and yolk width. Data were analysed using Python-based implementations of the four algorithms. Among the models, the Random Forest algorithm achieved the highest classification accuracy (98%), with perfect precision (1.00) and a recall of 0.98, indicating strong predictive ability. SVM and Logistic Regression both recorded accuracies of 95%, while Linear Regression recorded 92%. Therefore, the model developed from the Random Forest algorithm can be effectively used for automated egg grading and selection in poultry breeding programs. Future research could incorporate additional features such as computer vision and deep learning techniques to further enhance prediction accuracy.

Keywords: Egg size, Machine learning, Egg quality traits, Noiler chicken, Predictive modeling

INTRODUCTION

In Nigeria, animal agriculture is the second largest subsector of the country's agricultural sector, with a 9.2% average contribution to agricultural GDP between 1960 and 2020 (Uzonwanne et al., 2023). Animal agriculture also directly provides animal protein, including dairy and poultry products. Furthermore, it generates employment, income, and food security, highlighting its critical economic importance (Ojiako & Olayode, 2008). The poultry industry plays a crucial role in providing affordable and nutritious protein sources, such as meat and eggs, to the growing global population. Among the various poultry breeds, the Noiler chicken, an improved Nigerian indigenous dual-purpose breed, has gained significant attention for its desirable meat and egg production characteristics (Dogara et al., 2021; Ajayi et al., 2020; Bamidele et al., 2020; Yakubu et al., 2020).
Eggs are reservoirs of nutrition for developing embryos and are also a source of protein for human beings (Liswaniso et al., 2021; Tyasi et al., 2024). The quality of eggs is determined by various traits, including egg weight, shell thickness, yolk colour, and Haugh unit (Dogara et al., 2021; Xiao et al., 2023). Egg weight is one of the important parameters in marketing and it determines egg size; in addition, it plays a significant role in determining quality indexes such as albumen ratio, eggshell thickness, and hatchability (Asadi & Raoufat, 2010; Alkan et al., 2010; Kul & Şeker, 2004). Traditionally, egg sizing has relied on manual, intuition-based grading, which is stressful, labour-intensive, and subjective, and therefore leads to inconsistencies. Machine learning algorithms can instead be trained on various phenotypic traits to classify egg sizes (small, large) with high recall and precision, reducing labour cost and improving the consistency of egg grading in large-scale poultry operations.

Previous studies predicted egg weight and classified egg size using image processing and machine learning techniques (Thipakorn et al., 2017). Çımen and Yabanova (2018) studied the classification of dynamic egg weight using an artificial neural network. Similarly, Soltani et al. (2015) proposed an egg volume prediction method using machine vision techniques based on Pappus' theorem and an artificial neural network. The present research seeks to develop a predictive model that leverages machine learning techniques to estimate egg weight from egg quality traits, promoting more efficient and accurate egg sorting and grading practices.

In artificial intelligence (AI), machine learning is the study of teaching computers to learn from data without the need for explicit programming. It entails the broad application of data to build machines that are either fully or partially autonomous. According to Tyasi et al.
(2020), machine learning techniques including Support Vector Machine (SVM), Random Forest (RF), and Logistic Regression (LR) have demonstrated encouraging outcomes in estimating egg size based on egg quality parameters. These algorithms do not require rigid assumptions about the distribution of the data, and they can handle intricate non-linear interactions and produce precise predictions (Çiftsüren & Akkol, 2018). The application of these advanced machine learning techniques in the context of Noiler chickens can lead to the development of robust predictive models, which can enhance breeding programs and production management.

Four machine learning algorithms were employed to classify egg size based on internal and external egg quality traits, namely Support Vector Machine (SVM), Logistic Regression, Random Forest, and Linear Regression. According to Aruna and Rajagopalan (2011), SVM is considerably more successful than other ML approaches at identifying subtle patterns in large datasets. Because the SVM incorporates elements and methods from machine learning, statistics, functional analysis, and convex optimization, it has raised expectations recently due to its performance in classification, regression, and forecasting problems. SVMs are also appropriate for classifying small samples of data, in addition to having excellent adaptability, global optimization, and good generalization performance (Ccoicca, 2013).

Logistic regression belongs to a class of models known as generalized linear models, in which certain essential linear assumptions are relaxed. Because linear regression requires that outcomes be measured on a continuous scale, logistic regression is a useful tool for modeling relationships with categorical outcomes. Classification modeling, which uses logistic regression, is frequently used to model the likelihood that observations will fall into one of several classes of a categorical outcome.
The outcome can be binomial (two classes, e.g., active/inactive, promoted/not promoted), multinomial (multiple unordered classes, e.g., job family, location), or ordinal (multiple ordered classes, e.g., survey items measured on a Likert scale, performance level, education level). Logistic regression uses a link function to generalize the linear model to non-continuous outcomes; it is still a type of regression analysis because it returns a numeric outcome, a probability, regardless of the classes of the outcome variable (Starbuck, 2023).

Random Forest (RF) is a machine learning approach that combines multiple decision trees while limiting the correlation among them. RF is computationally practical for large datasets: the cost of building each tree grows roughly as n log n in the number of samples n, and the trees can be built in parallel, which further improves speed. To reduce correlation between decision trees, RF employs random selection of both samples and features. Initially, a randomly selected subset of data is drawn from the original training set (Salman et al., 2024). Moreover, a random subset of features is chosen for constructing each decision tree. This dual-randomization strategy lowers inter-tree correlation, thereby reducing the risk of overfitting and improving model accuracy (Cutler et al., 2011).

Linear regression is the statistical method used to ascertain whether there is a linear relationship between the independent variables (X) and the dependent variable (Y). The goal is to determine the optimal linear function, that is, the set of coefficients (weights) that enables the most accurate possible prediction of the value of the dependent variable. Formally, the linear regression model can be expressed as (Hall & Horowitz, 2007):

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_n X_n + \varepsilon$$

where β0 is the intercept, β1 to βn are the regression coefficients, X1 to Xn are the independent variables, and ε is the error term (Qu, 2024).
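As a minimal illustration of the linear model above, the following sketch estimates the intercept and coefficients by ordinary least squares on synthetic data. The data, the coefficient values, and the function name `fit_linear_regression` are hypothetical and are not taken from the study.

```python
import numpy as np


def fit_linear_regression(X, y):
    """Return the least-squares estimate [beta_0, beta_1, ..., beta_n]."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept beta_0.
    design = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


# Synthetic example (hypothetical values): Y = 1.5 + 2.0*X1 - 0.5*X2 + small noise.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 2))
true_beta = np.array([1.5, 2.0, -0.5])
noise = rng.normal(scale=0.01, size=200)
y_demo = true_beta[0] + X_demo @ true_beta[1:] + noise

beta_hat = fit_linear_regression(X_demo, y_demo)
print(beta_hat)  # estimates of beta_0, beta_1, beta_2
```

With little noise, the estimated coefficients should lie close to the values used to generate the data, which is the sense in which the fitted function is "optimal".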
The objectives of this study, therefore, were to develop predictive models using machine learning algorithms (SVM, LR, RF and Linear Regression) to classify egg size based on egg quality traits and to assess the performance metrics of the developed predictive models in estimating egg size.

MATERIALS AND METHODS

3.1. Description of the Study Area
This experiment was carried out in the Poultry Unit of the Teaching and Research Farm of Obafemi Awolowo University, Ile-Ife, Osun State, Nigeria.

3.2. Experimental birds and management
Two hundred and forty (240) day-old chicks, comprising 80 each from black, brown and barred plumage varieties, were used for the study. The chicks were brooded together for three weeks and subsequently raised on deep litter in labelled pens within a semi-open housing system. The floor temperature was about 32-35 °C in the first week and was decreased by 2-3 °C each week until it reached room temperature, with optimum lighting conditions. The experimental birds were fed ad libitum for the first 12 weeks. Thereafter, the birds were sexed and transferred into standard galvanized battery cages, where they were managed uniformly throughout the study period. The birds were sorted into pens based on their plumage colour, and the pens were assigned using a completely randomized design. The birds were raised under an intensive production system. They were fed a standard commercial starter diet containing 19% crude protein (CP) and 2,750 kcal ME/kg from hatch to 4 weeks of age, followed by a grower diet containing 15% CP and 2,600 kcal ME/kg, offered ad libitum. Clean water was also provided ad libitum.

3.4 Data collection
A total of three hundred (300) freshly laid eggs were collected from three Noiler plumage varieties (black, brown and barred) at two laying ages: 26 weeks (young) and 46 weeks (old). For each plumage variety, one hundred (100) eggs were collected.
The external egg quality parameters measured included egg weight, egg width, egg length, shell surface area, percentage of shell thickness, and shell weight. The internal egg quality parameters assessed were albumen height, Haugh unit, yolk height, yolk index, yolk and albumen weight, and yolk width. Egg weight, yolk and albumen weight, and shell weight were recorded in grams using the KERRO® electronic compact scale (model number BL50001) with a maximum capacity of 5000 g and sensitivity of 0.1 g. Egg length (EL), egg width (EW), yolk height (YH), albumen height (AH) and yolk width (YW) were measured in centimetres using a Vernier calliper, while shell thickness (ST) was measured in millimetres using a micrometre screw gauge. Haugh units were calculated by the formula described by Khaleel (2019) and Wahyuni et al. (2023) as follows:

$$HU = 100 \log_{10}\left(H + 7.57 - 1.7\,W^{0.37}\right)$$

where H is albumen height in millimetres and W is the observed weight of the egg in grams.

$$\text{Shape index} = \frac{\text{Egg width}}{\text{Egg length}} \times 100$$

$$\text{Yolk index} = \frac{\text{Yolk height}}{\text{Yolk width}} \times 100$$

The surface area of an egg was calculated using the formula of Paganelli et al. (1974):

$$P = 4.835 \times W^{0.662}$$

where W is egg weight.

3.5 Data and Statistical Analysis
The data collected from the egg quality traits were analyzed using the Python programming language (Python Software Foundation, 2023). The analysis involved statistical methods, visualizations, and machine learning algorithms to classify egg size based on internal and external egg quality traits. The complete codebase for data exploration is publicly available on GitHub at the link below: https://github.com/chrisseub/ML-Practice-Rep/blob/main/EGG%20SIZE%20MODELS/EGG_Size_PREDICTION.ipynb

By outlining the relationship between variables in a sample or population, descriptive statistics help to organize and synthesize data (Kaur et al.
2018). Descriptive statistics were calculated to provide a summary of the data, encompassing measures of dispersion (standard deviation, range) and central tendency (mean, median). These figures revealed the distribution of egg quality characteristics among plumage variants. To investigate the links between internal and external egg quality features, a correlation matrix was created using Pearson's correlation coefficients. The matrix was visualized as a Seaborn heatmap to identify significant positive or negative correlations between variables such as egg weight, albumen height, yolk weight, and shell thickness. For class distribution, the distribution of the 'Outcome' variable (Small vs. Large eggs) was examined using value_counts(), which helped to understand the class balance in the dataset. Finally, a grouped analysis was done: the mean of the numeric features was calculated for each 'Outcome' group (Small and Large). This allowed for a comparison of the average trait values between the two egg size categories.

3.6 Machine Learning

3.6.1 Data Collection and Pre-processing
The dataset consisted of measurements of internal and external traits of Noiler chicken eggs. These measurements were used as features (features: egg length, egg width, yolk height, yolk width, albumen height, yolk and albumen weight, shell weight, shell thickness, Haugh unit, shell surface area, yolk index, shape index; target: size class [0, 1]).

3.6.2 Pre-processing
3.6.2.1 Handling Missing Values: Hyphen values in the dataset were treated as missing values (NaN) and then imputed using the mean of the respective columns, i.e., missing values were imputed using SimpleImputer (the mean was used for numerical features).
3.6.2.2 Data Splitting: The dataset was split into training and testing sets to evaluate performance on unseen data. A stratified 80:20 train-test split (train_test_split) was used to maintain the proportion of Small and Large eggs in both sets.
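The pre-processing steps above can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' notebook: the column names, the toy data, and the helper `preprocess_and_split` are hypothetical. The text does not say whether imputation preceded the split; this sketch fits the imputer on the training portion only, a common precaution against information leaking from the test set.

```python
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split


def preprocess_and_split(df, target="Outcome", seed=42):
    """Treat hyphens as NaN, make a stratified 80:20 split, mean-impute."""
    # Non-numeric entries such as "-" are coerced to NaN.
    features = df.drop(columns=[target]).apply(pd.to_numeric, errors="coerce")
    y = df[target]

    # stratify=y keeps the Small/Large proportions equal in both sets.
    X_train, X_test, y_train, y_test = train_test_split(
        features, y, test_size=0.2, stratify=y, random_state=seed
    )

    # Column means are learned from the training rows only (assumption).
    imputer = SimpleImputer(strategy="mean")
    X_train = pd.DataFrame(
        imputer.fit_transform(X_train),
        columns=features.columns, index=X_train.index,
    )
    X_test = pd.DataFrame(
        imputer.transform(X_test),
        columns=features.columns, index=X_test.index,
    )
    return X_train, X_test, y_train, y_test


# Toy data with hypothetical column names; "-" marks a missing measurement.
demo = pd.DataFrame({
    "egg_length": ["5.8", "-", "6.1", "5.5"] * 5,
    "shell_weight": [7.5, 7.9, 8.2, 6.4] * 5,
    "Outcome": [1, 1, 1, 0] * 5,  # 1 = Large, 0 = Small
})
print(demo["Outcome"].value_counts())  # class balance check
X_train, X_test, y_train, y_test = preprocess_and_split(demo)
```

On this 20-row toy frame the split yields 16 training and 4 test rows, and the test set keeps the 3:1 Large-to-Small ratio of the full data.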
3.6.2.3 Hyperparameter Tuning: This was employed only for the SVM model to optimize model performance, because its settings (such as C and gamma) have a greater influence on how it draws the boundary that separates the data. Small changes in these settings can significantly change the SVM's performance. Less complex models, such as Linear and Logistic Regression, have fewer parameters that have a significant impact on the result. Despite being more complex, Random Forests are generally less sensitive to tuning than SVMs. Hence, hyperparameter tuning was not utilized in the other models to avoid overfitting.

3.6.3 Machine Learning Models
This study employed four machine learning models (Support Vector Machine, Logistic Regression, Random Forest, and Linear Regression) to classify egg outcome based on measured traits. The performance of each model was evaluated using various metrics to determine its effectiveness in predicting whether an egg is small or large.
3.6.3.1 Support Vector Machine (SVM): An SVM classifier with a linear kernel was implemented using the svm.SVC class from the scikit-learn library.
3.6.3.2 Logistic Regression: This linear model was applied to predict the binary outcome (small or large) based on the input egg characteristics. Its performance in classifying the egg size was assessed through standard evaluation metrics.
3.6.3.3 Random Forest: An ensemble learning method consisting of multiple decision trees, this model was utilized to improve the robustness and accuracy of egg size classification. Its predictions were aggregated from the individual trees to determine the final outcome.
3.6.3.4 Linear Regression: Although this is primarily a regression model, it was included in the analysis to provide a comparative perspective on the relationship between egg traits and the numerical representation of the outcome. While not a classification model, its output can be interpreted in the context of predicting the scaled egg size.
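The modeling setup described above can be sketched as follows. This is a hedged sketch, not the authors' code: the data are synthetic (`make_classification`), and the grid values, the number of trees, and the random seeds are assumptions. Only C is tuned here because gamma has no effect on a linear-kernel SVC; it matters for non-linear kernels such as RBF. The 0.5 cut-off applied to the Linear Regression output follows the threshold named in the paper's comparison table.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the egg data: 300 samples, 12 features, imbalanced.
X, y = make_classification(
    n_samples=300, n_features=12, n_informative=6,
    weights=[0.24, 0.76], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

models = {
    # Only the SVM is tuned, mirroring the text; grid values are assumptions.
    "SVM": GridSearchCV(SVC(kernel="linear"), {"C": [0.1, 1, 10]}, cv=5),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Linear Regression": LinearRegression(),
}

predictions = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    raw = model.predict(X_test)
    if name == "Linear Regression":
        # Continuous output is converted to a class label at 0.5.
        raw = (raw >= 0.5).astype(int)
    predictions[name] = np.asarray(raw)
```

Each entry of `predictions` then holds 0/1 labels for the held-out eggs, so the same classification metrics can be applied to all four models.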
3.6.4 Model Evaluation
3.6.4.1 Metrics: For classification, the following metrics were evaluated: accuracy, precision, recall, and F1-score. For regression, R-squared and mean squared error (MSE) were used. For validation, train_test_split with test_size = 0.2 was utilized, and for visualization, decision trees from the RF model were plotted using plot_tree.

3.6.5. Software
Python 3.11, libraries: scikit-learn (1.6.1), pandas, numpy, matplotlib. The complete notebook for modeling is available on GitHub: https://github.com/chrisseub/ML-Practice-Rep/blob/main/EGG%20SIZE%20MODELS/EGG%20SIZE%20CLASSIFICATION%20MODELS.ipynb

RESULTS

Descriptive statistics
The descriptive statistics of Noiler chickens' internal and external egg quality characteristics are shown in Table 1. Three hundred eggs in all were assessed for weight and associated parameters. With a mean of 63.85 g and a standard deviation of 6.30 g, the egg weights varied from 49.7 g to 78.9 g. Eggs were divided into large and small categories based on a 55 g criterion. A total of 71 eggs were classed as small (< 55 g), while 229 eggs were classified as large (≥ 55 g). Among the external traits, the mean egg length was 5.80 cm and the mean egg width was 4.34 cm, reflecting the elongated shape typical of poultry eggs. The samples showed consistent shell structure, with the shell thickness recording a mean of 0.323 mm and the shell weight having a mean of 7.59 g, with values ranging from 5.90 g to 9.90 g. For internal quality features, the majority of the total egg mass was made up of the yolk and albumen, which had a mean weight of 55.61 g, a minimum weight of 43.00 g, and a maximum weight of 70.20 g. The albumen height had a lower mean value of 0.83 mm, which reflected the flatness of the thick albumen layer, whereas the yolk height and yolk breadth averaged 1.93 mm and 4.04 mm, respectively.
The samples all exhibited good internal egg quality, as indicated by the high mean value of 84.54 for the Haugh unit, a common indicator of albumen quality and egg freshness. Additional calculated indices that support the normal ovality of eggs in Noiler hens are the shape index (mean: 75.56%) and the yolk index (mean: 48.00%). Egg size is correlated with the shell surface area, which ranged from 61.51 cm² to 83.44 cm² with an average of 72.48 cm².

Table 1 Overall descriptive statistics of internal and external egg quality traits for Noiler chicken

VARIABLE                       N     MEAN     SD      MIN      MAX
Egg weight (g)                 300   63.849   6.304   49.700   78.900
Egg length (cm)                300    5.801   0.307    5.100    7.000
Egg width (cm)                 300    4.341   0.291    0.300    4.900
Yolk height (mm)               296    1.932   0.140    0.800    2.300
Yolk width (mm)                297    4.040   0.274    3.200    4.800
Albumen height (mm)            299    0.834   0.134    0.350    1.200
Yolk and albumen weight (g)    300   55.607   5.841   43.000   70.200
Shell weight (g)               300    7.594   0.705    5.900    9.900
Shell thickness (mm)           300    0.323   0.043    0.200    0.420
Haugh unit                     299   84.540   0.744   81.801   86.421
Shell surface area (cm²)       300   72.484   4.740   61.505   83.443
Yolk index (%)                 296   48.002   4.481   21.622   61.765
Shape index (%)                300   75.556   5.982    5.263   89.091

Correlation matrix
The correlation matrix illustrating the connections between several internal and external egg quality characteristics in Noiler chicken eggs is shown in Fig. 1. A number of statistically significant relationships that provide insight into the co-variation of several attributes are revealed by the matrix. Key egg quality parameters, such as egg length, egg width, yolk and albumen weight, and shell weight, showed strong positive correlations with egg weight. According to these positive correlations, larger eggs typically have heavier shells and more interior components (yolk and albumen). Interestingly, there was a particularly strong link between egg weight and yolk-albumen weight, suggesting that the interior makeup of the egg has a significant role in total egg weight.
The relationship between external parameters and shell weight was further supported by the positive association that shell weight demonstrated with both egg length and egg width. On the other hand, most other traits showed weak or even slightly negative correlations with shell thickness, suggesting that an increase in shell thickness does not always translate into an increase in egg weight or size. An established indicator of albumen quality and freshness, the Haugh unit, showed poor relationships with most characteristics, suggesting that it is mainly unaffected by yolk quantity, egg size, and shell weight. Egg shape is comparatively consistent regardless of other physical parameters, as seen by the shape index's little association with egg weight and other size-related characteristics. Moderate correlations were also observed between shell surface area and both egg weight and egg dimensions, which is expected since surface area expands proportionally with overall egg size. The correlation matrix concludes by indicating that egg weight can be a good predictor of a number of internal and external characteristics of egg quality, especially those related to size and content. However, some parameters, like shell thickness and Haugh unit, seem to be controlled separately from the characteristics that affect total egg weight. Comparison of Performance metrics among the four Machine Learning models Out of the four models that were assessed: Support Vector Machine (SVM), Random Forest (RF), Logistic Regression (LR), and Linear Regression (LRG), the Random Forest model had the highest accuracy (98.3%). The next two models were SVM (95%) and Logistic Regression (95%), which were nearly comparable in terms of total accuracy. Additionally, Random Forest achieved perfect precision (1.00) and a recall of 0.98 for the 'Large' class, and perfect recall (1.00) for the 'Small' class, demonstrating its ability to discern egg sizes with few errors. 
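The Random Forest figures just reported can be cross-checked with a short calculation. The sketch below is an inference and not something the authors report: it assumes that the stratified 80:20 split of 300 eggs (71 Small, 229 Large) gives a 60-egg test set with 14 Small (class 0) and 46 Large (class 1) eggs, and that the model misclassified exactly one Large egg as Small. Under those assumptions the standard scikit-learn metrics reproduce the stated accuracy, precision, and recall.

```python
from sklearn.metrics import (
    accuracy_score, f1_score, precision_score, recall_score,
)

# ASSUMPTION: 60 test eggs = 14 Small (class 0) + 46 Large (class 1).
y_true = [0] * 14 + [1] * 46
# ASSUMPTION: one Large egg is predicted Small; all other eggs are correct.
y_pred_rf = [0] * 14 + [0] * 1 + [1] * 45

rf_metrics = {
    "accuracy": accuracy_score(y_true, y_pred_rf),                        # 59/60
    "precision_small": precision_score(y_true, y_pred_rf, pos_label=0),   # 14/15
    "recall_small": recall_score(y_true, y_pred_rf, pos_label=0),         # 14/14
    "precision_large": precision_score(y_true, y_pred_rf, pos_label=1),   # 45/45
    "recall_large": recall_score(y_true, y_pred_rf, pos_label=1),         # 45/46
    "f1_macro": f1_score(y_true, y_pred_rf, average="macro"),
}
for name, value in rf_metrics.items():
    print(f"{name}: {value:.3f}")
```

The calculation also shows why a single error moves accuracy by about 1.7 percentage points on a test set of this size, so differences of a few points between models correspond to only two or three eggs.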
Although its overall accuracy was the same as that of Logistic Regression, SVM can handle non-linearity through kernel functions, which makes it a dependable choice for classification tasks involving complex relationships between egg traits. Compared to the classification models, Linear Regression, a regression-based model, would not be the best fit for modeling the complex interactions between egg size and quality parameters, as indicated by its R-squared value of 0.45. It achieved 91.7% classification accuracy when its continuous predictions were applied to threshold-based classification. From the perspective of interpretability, Logistic Regression provides a straightforward and understandable model, which makes it well suited to understanding how various egg characteristics affect classification through its coefficients. Even though SVM and Random Forest are typically better at identifying intricate patterns, they might not be as directly interpretable as Logistic Regression and frequently behave more like "black box" models. Nonetheless, Random Forest's ensemble-based methodology naturally improves its robustness and lowers overfitting, which helped it achieve the highest accuracy (98.3%) of all the models tested in this task. While Linear Regression did not perform as well as the classification models in terms of its primary regression metric (R-squared of 0.45), it is important to remember that it is not designed for discrete classification and is less reliable for this specific purpose than models built for classification. Based on the classification performance observed, the tuned SVM and the Logistic Regression models were the next best alternatives, both achieving 95% accuracy.
Considering both performance and practical application for tasks like automated egg grading, Random Forest emerges as the most effective model, followed closely by the tuned SVM, while Logistic Regression remains a strong and more interpretable alternative for simpler implementations.

Table 2 Comparison of Machine Learning algorithm metrics for Noiler chicken egg size classification

Metric                 SVM    Logistic Regression   Random Forest   Linear Regression (with 0.5 threshold)
Accuracy               0.95   0.95                  0.98            0.92
Precision (Class 0)    0.82   0.82                  0.93            0.76
Precision (Class 1)    1.00   1.00                  1.00            0.98
Recall (Class 0)       1.00   1.00                  1.00            0.93
Recall (Class 1)       0.93   0.93                  0.98            0.91
F1-score (Macro)       0.93   0.93                  0.98            0.89
R-squared              0.72   0.72                  0.91            0.45
Mean Squared Error     0.05   0.05                  0.02            0.09

DISCUSSION

This study demonstrates the ability of machine learning algorithms to effectively classify egg size in Noiler chickens using internal and external egg quality traits, with the Random Forest (RF) model showing superior predictive performance. The high accuracy (98.3%), near-perfect precision, and strong recall achieved by RF indicate its robustness in handling complex, non-linear relationships among egg quality traits. This performance advantage is consistent with the ensemble nature of RF, which reduces variance and overfitting by aggregating multiple decision trees trained on randomly selected features and samples. Similar findings have been reported in poultry and livestock studies where RF outperformed linear and kernel-based models in predicting production traits due to its ability to model hierarchical interactions among correlated predictors (Tyasi et al., 2020; Salman et al., 2024). Support Vector Machine (SVM) and Logistic Regression (LR) also demonstrated strong and comparable classification performance (95% accuracy), suggesting that egg size is largely separable based on measured phenotypic traits.
However, the higher recall and F1-score achieved by RF indicate a more balanced performance, particularly in the presence of class imbalance between small and large eggs. Logistic Regression offered greater interpretability through model coefficients, making it useful for understanding the directional influence of specific egg traits, whereas SVM and RF functioned as more complex, data-driven classifiers. These results align with previous egg classification studies that reported improved performance of ensemble and non-linear models over traditional linear approaches when multiple correlated egg quality traits are involved (Çımen & Yabanova, 2018 ; Thipakorn et al., 2017 ). Although Linear Regression achieved relatively high threshold-based classification accuracy, its low R-squared value confirms its limited suitability for modeling the complex biological relationships underlying egg size determination. Linear models assume additive and linear relationships, which are unlikely to fully capture the interactions between internal traits (such as yolk–albumen composition) and external traits (such as shell surface area). Therefore, Linear Regression should be regarded only as a baseline comparator rather than a practical classification tool. Finally, the findings highlight the suitability of Random Forest models for automated egg grading systems and decision support in poultry breeding programs. Future studies could improve generalizability by addressing class imbalance, incorporating larger datasets, and integrating image-based or deep learning features to further enhance predictive accuracy. CONCLUSION This study demonstrates the effectiveness of machine learning approaches for classifying Noiler chicken eggs into size categories using internal and external egg quality traits. 
Among the evaluated models, the Random Forest algorithm consistently outperformed Support Vector Machine, Logistic Regression, and Linear Regression, achieving the highest accuracy, precision, recall, and F1-score. Its superior performance highlights the ability of ensemble-based methods to capture complex, non-linear relationships among correlated biological traits, which are characteristic of egg quality parameters. The findings also indicate that while traditional statistical and linear models can provide baseline insights, they are limited in their capacity to model the multifactorial nature of egg size determination. In contrast, machine learning models, particularly Random Forest, offer a scalable framework for automated egg grading and decision support in poultry production systems. The application of such models can contribute to improved production efficiency, standardized egg classification, and informed breeding strategies. Despite these promising results, the study is constrained by class imbalance and reliance on manually measured egg traits. Future research should incorporate larger and more diverse datasets, address imbalance through resampling techniques, and explore the integration of image-based features and deep learning methods.

Declarations

Competing interests
The authors declare that they have no competing interests or other interests that might be perceived to influence the results and/or discussion reported in this paper.

Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Funding
This research received no external funding or grant.

Data availability
The code used to generate the results presented in this study is available in a public GitHub repository. Specific implementation details are provided in the Materials and Methods section of the manuscript.
The data used with this code are available from the corresponding authors upon reasonable request.

Ethics approval
All experimental procedures were approved by the Animal Welfare and Ethics Committee, Department of Animal Science, Obafemi Awolowo University, and were conducted following the ethical guidelines for the use of animals in biomedical research established by the Obafemi Awolowo University, Ile-Ife. This study involved no clinical trial; hence, clinical trial registration is not applicable. Clinical trial number: not applicable.

Consent to Participate
Not applicable.

Consent for Publication
Not applicable.

Author Contribution
I.D., H.B., C.A., F.E., and R.O. wrote the main manuscript text and prepared figures.

References
Adeboye O, Schultz B, Adekalu K, Prasad K. Soil water storage, yield, water productivity and transpiration efficiency of soybeans (Glycine max L. Merr) as affected by soil surface management in Ile-Ife, Nigeria. International Soil Water Conserv Research. 2017;5(1):10. https://doi.org/10.1016/j.iswcr.2017.04.006.
Alkan S, Karabag K, Galic A, Karsli T, Balcıoğlu M. Effects of selection for body weight and egg production on egg quality traits in Japanese quails (Coturnix coturnix japonica) of different lines and relationships between these traits. Kafkas Universitesi Veteriner Fakultesi Dergisi. 2010;16:239–44.
Ajayi FO, Bamidele O, Hassan WA, Ogundu U, Yakubu A, Alabi OO. Production performance and survivability of six dual-purpose breeds of chicken under smallholder farmers' management practices in Nigeria. Arch Anim Breed. 2020;63:387–408. 10.5194/aab-63-387-2020.
Aruna S, Rajagopalan SP.
A novel SVM-based CSSFFS feature selection algorithm for detecting breast cancer. Int J Comput Appl. 2011;31(8):14–20.
Asadi V, Raoufat MH. Estimation of egg weight by machine vision and neural networks technique. Int J Nat Eng Sci (IJNES). 2010;4(2):1–4.
Bamidele O, Sonaiya EB, Adebambo OA, Dessie T. On-station performance evaluation of improved tropically adapted chicken breeds for smallholder poultry production systems in Nigeria. Trop Anim Health Prod. 2020;52:1541–8. https://doi.org/10.1007/s11250-019-02158-9
Çiftsüren MN, Akkol S. Prediction of internal egg quality characteristics and variable selection using regularization methods: ridge, LASSO and elastic net. Archives Anim Breed. 2018;61:279–84.
Çımen H, Yabanova İ. (2018). Classification of dynamic egg weight using artificial neural network. In Proceedings of the 7th International Conference on Computers Communications and Control (ICCCC) (pp. 302–305). Oradea, Romania. https://doi.org/10.1109/ICCCC.2018.8390475
Ccoicca YJ. Applications of Support Vector Machines in the Exploratory Phase of Petroleum and Natural Gas: a Survey. Int J Eng Technol. 2013;2(2):113–25. https://doi.org/10.14419/ijet.v2i2.834
Cutler A, Cutler DR, Stevens JR. (2011). Random forest. Machine learning magazine. https://doi.org/10.1007/978-1-4419-9326-7_5
Dogara UM, Kalla D, Mancha Y. Evaluation of egg production and egg quality traits of Noiler chickens. Trop J Agricultural Sci. 2021;23(1):100–13. https://www.ajol.info/index.php/tjas/article/view/219041
Hall P, Horowitz JL. Methodology and convergence rates for functional linear regression. Annals Stat. 2007;35(1):70–91. https://doi.org/10.1214/009053606000000957
Hayward DF, Oguntoyinbo JS. The climatology of West Africa. New York: Rowman and Littlefield; 1987.
Kaur P, Stoltzfus J, Yellapu V. Descriptive statistics. Int J Acad Med. 2018;4(1):60–3. https://doi.org/10.4103/IJAM.IJAM_7_18
Khaleel RMT. Prediction of Haugh Unit through albumen height and egg weight. Mesop J Agric.
2019;47:37–43.
Liswaniso S, Qin N, Tyasi TL, Chimbaka IM. Use of data mining algorithms CHAID and CART in predicting egg weight from egg quality traits of indigenous free-range chickens in Zambia. Adv Anim Veterinary Sci. 2021;9(2):215–20. https://doi.org/10.17582/journal.aavs/2021/9.2.215.220
Ojiako IA, Olayode GO. Analysis of trends in livestock production in Nigeria: 1970–2005. J Agric Social Res (JASR). 2008;8(1):114–20.
Paganelli CV, Olszowka A, Ar A. The Avian Egg: Surface Area, Volume, and Density. Condor. 1974;76(3):319–25. https://doi.org/10.2307/1366345
Python Software Foundation. (2023). Python (Version 3.11) [Computer software]. https://www.python.org/
Qu K. (2024). Research on linear regression algorithm. MATEC Web of Conferences, 395, 01046. https://doi.org/10.1051/matecconf/202439501046
Salman M, Khan S, Ullah R, Khan MA. Comparative evaluation of machine learning algorithms for poultry production and egg quality classification. Artif Intell Agric. 2024;9:100–12. https://doi.org/10.1016/j.aiia.2024.100112
Salman H, Kalakech A, Steiti A. Random forest algorithm overview. Babylon J Mach Learn. 2024;2024(1):69–79. https://doi.org/10.58496/BJML/2024/007
Soltani M, Omid M, Alimardani R. Egg volume prediction using machine vision technique based on Pappus' theorem and artificial neural network. J Food Sci Technol. 2015;52(5):3065–71. https://doi.org/10.1007/s13197-014-1350-6
Starbuck C. Logistic Regression. The Fundamentals of People Analytics. Cham: Springer; 2023. https://doi.org/10.1007/978-3-031-28674-2_12
Thipakorn J, Waranusast R, Riyamongkol P. (2017). Egg weight prediction and egg size classification using image processing and machine learning. In Proceedings of the 14th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON) (pp. 477–480). Phuket, Thailand. https://doi.org/10.1109/ECTICon.2017.8096278
Thipakorn T, Duangjinda M, Tumwasorn S.
Prediction of egg weight and quality traits using data mining techniques. Comput Electron Agric. 2017;135:203–10. https://doi.org/10.1016/j.compag.2017.02.014
Tyasi TL, Eyduran E, Celik S. Comparison of tree-based regression tree methods for predicting live body weight from morphological traits in Hy-line Silver Brown commercial layer and indigenous Potchefstroom Koekoek breeds raised in South Africa. Trop Anim Health Prod. 2020;53(1):7. https://doi.org/10.1007/s11250-020-02443-y
Tyasi TL, Qin N, Jing Y, Mu F, Zhu H, Liu D, Xu R. Prediction of egg production and egg quality traits using artificial neural networks in poultry. Poult Sci. 2020;99(10):5372–80. https://doi.org/10.1016/j.psj.2020.07.022
Tyasi TL, Ngorima L, Hlokoe VR. Predicting egg weight from egg quality traits of the Lohmann Brown chicken breed using stepwise regression. Adv Anim Veterinary Sci. 2024;12(3):436–40. https://doi.org/10.17582/journal.aavs/2024/12.3.436.440
Uzonwanne C, Onyedibe F, Nwokoye M. Impact of livestock production on gross domestic product in Nigeria. Int J Adv Econ. 2023;5(5):107–18. https://doi.org/10.51594/ijae.v5i5.477
Wahyuni H, Yudiarti T, Widiastuti E, Sartono T, Agusetyaningsih I, Sugiharto S. Dietary supplementation of Spirulina platensis and Saccharomyces cerevisiae on egg quality, physiological condition and ammonia emission of hens at the late laying period. J Indonesian Trop Anim Agric. 2023;48(1):47–57. https://doi.org/10.14710/jitaa.48.1.47-57b
Yakubu A, Bamidele O, Hassan WA, Ajayi FO, Ogundu UE, Alabi O. Farmers’ choice of genotypes and trait preferences in tropically adapted chickens in five agroecological zones in Nigeria. Trop Anim Health Prod. 2020;52:95–107. https://doi.org/10.1007/s11250-019-01993-0

Additional Declarations
No competing interests reported.
Status: Under Review
Version 1 posted
Editorial decision: Revision requested 06 Feb, 2026
Reviews received at journal 22 Jan, 2026
Reviews received at journal 19 Jan, 2026
Reviews received at journal 18 Jan, 2026
Reviews received at journal 12 Jan, 2026
Reviewers agreed at journal 08 Jan, 2026
Reviewers agreed at journal 08 Jan, 2026
Reviewers agreed at journal 08 Jan, 2026
Reviewers agreed at journal 08 Jan, 2026
Reviewers agreed at journal 08 Jan, 2026
Reviewers invited by journal 08 Jan, 2026
Editor invited by journal 30 Dec, 2025
Editor assigned by journal 30 Dec, 2025
Submission checks completed at journal 29 Dec, 2025
First submitted to journal 29 Dec, 2025
Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-8431208","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":572461250,"identity":"7754406e-ab10-4065-9676-fa2b3a7b4239","order_by":0,"name":"Iyabode Dudusola","email":"","orcid":"","institution":"Obafemi Awolowo University","correspondingAuthor":false,"prefix":"","firstName":"Iyabode","middleName":"","lastName":"Dudusola","suffix":""},{"id":572461251,"identity":"ca9e581e-af05-4371-9190-57e6451e4b86","order_by":1,"name":"Hameed Bashiru","email":"","orcid":"","institution":"Obafemi Awolowo University","correspondingAuthor":false,"prefix":"","firstName":"Hameed","middleName":"","lastName":"Bashiru","suffix":""},{"id":572461252,"identity":"e1f8df70-cb66-4f67-a903-2645fd54fbaa","order_by":2,"name":"Christopher Adetola","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAABO0lEQVRIie3Sv2qDQBzA8Z8EdPkV1wNL8gonglSQ9lVOAnaRUhCc+kcQdPEBGshLZCkdLUJcJLMlS0OHLA4NhRKo0J5pC0UbcCzFL+iB8OF+dwjQ1/dHI/w5BMnni/j9jQ537/2EAQImgl8T/CRaB0LYTwL7iRwFq2KzNVGerNdP6JlnJzC4f8FzPpkU3BLh7rK1Qz7XjBtmI1k6aoAL20UQxwpSPhnOPSLkWZNQwnQFWYqwdITgIEytGJByUlk+cXQihPM2OX1VKk5GD9mKk/eaaG98l2t/VO4hjq4AJ7QAlZOkJno9GAOCNblonyV3jdi2Uc0ddTJdjK04FW1jSqkaou0eWWHSvrFoVmxNczjMssfn0ju2oihIi7KiI1lKZ8UmvPrtohsNvtbdn8Ag7UAaddmlr6+v73/3ASPFZf/+k+a1AAAAAElFTkSuQmCC","orcid":"","institution":"Obafemi Awolowo 
University","correspondingAuthor":true,"prefix":"","firstName":"Christopher","middleName":"","lastName":"Adetola","suffix":""},{"id":572461254,"identity":"b11738cf-a7d1-4c28-9920-8c47ba8ebd99","order_by":3,"name":"Fatimoh Egbinola","email":"","orcid":"","institution":"Obafemi Awolowo University","correspondingAuthor":false,"prefix":"","firstName":"Fatimoh","middleName":"","lastName":"Egbinola","suffix":""},{"id":572461260,"identity":"007b5b5b-759a-4c19-b681-7423f2c22705","order_by":4,"name":"Rosemary Ojo","email":"","orcid":"","institution":"The Federal Polytechnic Ilaro","correspondingAuthor":false,"prefix":"","firstName":"Rosemary","middleName":"","lastName":"Ojo","suffix":""}],"badges":[],"createdAt":"2025-12-23 07:23:26","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-8431208/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-8431208/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":100076594,"identity":"612c50a9-5ade-498c-a218-cb9530c12cf0","added_by":"auto","created_at":"2026-01-12 17:30:42","extension":"docx","order_by":0,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":5094503,"visible":true,"origin":"","legend":"","description":"","filename":"MetricsComparisonofMLAlgorithmsusedtoclassifyNoilerChickenEggfromEggQualityTrait.docx","url":"https://assets-eu.researchsquare.com/files/rs-8431208/v1/211a6a4dd98e44b9e7140a16.docx"},{"id":100076590,"identity":"82a2a7d9-9d61-47a6-9287-a43244c1fd2a","added_by":"auto","created_at":"2026-01-12 17:30:42","extension":"json","order_by":1,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":6540,"visible":true,"origin":"","legend":"","description":"","filename":"6ebe2bd66c4a460c8d5f41cfdc6a6b6a.json","url":"https://assets-eu.researchsquare.com/files/rs-8431208/v1/1d13f8999818a50ca96cb87f.json"},{"id":100364950,"identity":"ce9ecc3c-2ed7-4613-8148-cb22b43df382","added_by":"auto","created_at":"2026-01-16 
07:54:30","extension":"xml","order_by":2,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":97996,"visible":true,"origin":"","legend":"","description":"","filename":"6ebe2bd66c4a460c8d5f41cfdc6a6b6a1enriched.xml","url":"https://assets-eu.researchsquare.com/files/rs-8431208/v1/626d6cf2f600bcf61880e480.xml"},{"id":100076588,"identity":"664ccd4a-1874-448e-82c8-57e2f2b18666","added_by":"auto","created_at":"2026-01-12 17:30:42","extension":"png","order_by":4,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":16493,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-8431208/v1/5f306f941735df07aa5784ab.png"},{"id":100364929,"identity":"280dfaf2-a541-4683-972d-fcb1b481b9a1","added_by":"auto","created_at":"2026-01-16 07:54:29","extension":"xml","order_by":5,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":95653,"visible":true,"origin":"","legend":"","description":"","filename":"6ebe2bd66c4a460c8d5f41cfdc6a6b6a1structuring.xml","url":"https://assets-eu.researchsquare.com/files/rs-8431208/v1/cd461edeff59c0f550cae97d.xml"},{"id":100076591,"identity":"9ed788f7-cd8e-438b-9303-a14497e5ed73","added_by":"auto","created_at":"2026-01-12 17:30:42","extension":"html","order_by":6,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":105535,"visible":true,"origin":"","legend":"","description":"","filename":"earlyproof.html","url":"https://assets-eu.researchsquare.com/files/rs-8431208/v1/d10b921590e0a4d946f1d6c8.html"},{"id":100364789,"identity":"75fcc190-7bb8-4743-82e5-8fa82321a3f0","added_by":"auto","created_at":"2026-01-16 07:54:19","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":67739,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eCorrelation matrix of the internal and external egg quality traits of Noiler 
chicken\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"floatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-8431208/v1/68c210c8cf6aa9463469abb2.png"},{"id":100382063,"identity":"2490bdaf-0f05-420d-9d92-61ef36b8133f","added_by":"auto","created_at":"2026-01-16 10:40:51","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":883485,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-8431208/v1/9b294638-7d7d-4800-8079-6ea1c1ebcf11.pdf"}],"financialInterests":"No competing interests reported.","formattedTitle":"Metrics Comparison of Machine Learning Algorithms used to classify Noiler Chicken Egg from Egg QualityTrait","fulltext":[{"header":"INTRODUCTION","content":"\u003cp\u003eIn Nigeria, animal agriculture is the second largest subsector of the country\u0026rsquo;s agricultural sector, with a 9.2% average contribution to agricultural GDP between 1960 and 2020. (Uzonwanne et al., \u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e2023\u003c/span\u003e). Animal agriculture also directly provides animal protein, including dairy and poultry products. Furthermore, it generates employment, income, and food security, highlighting its critical economic importance (Ojiako \u0026amp; Olayode, \u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e2008\u003c/span\u003e). The poultry industry plays a crucial role in providing affordable and nutritious protein sources, such as meat and eggs, to the growing global population. 
Among the various poultry breeds, the Noiler chicken, an improved Nigerian indigenous dual-purpose breed, has gained significant attention for its desirable meat and egg production characteristics (Dogara et al., \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e2021\u003c/span\u003e; Ajayi et al., \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Bamidele et al., \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Yakubu et al., \u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e2020\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eEggs are reservoirs of nutrition for embryos that are still developing and are also a source of protein for human beings (Liswaniso et al., \u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e2021\u003c/span\u003e; Tyasi et al., \u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e2024\u003c/span\u003e). The quality of eggs is determined by various traits, including egg weight, shell thickness, yolk colour, and Haugh unit (Dogara et al., \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e2021\u003c/span\u003e; Xiao et al., 2023). Egg weight is one of the important parameters in marketing and it determines egg size; in addition to this, it has a significant role in determining quality indexes such as albumen ratio, eggshell thickness, and hatchability (Asadi et al., 2010; Alkan et al., \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2010\u003c/span\u003e; Kul \u0026amp; Şeker, 2004). Traditionally, egg sizing methods have always relied on manual and instinctive grading which is stressful and labour-intensive which is subjective thereby leading to inconsistencies. Therefore, machine learning algorithms can be trained using various phenotypic traits to accurately classify several egg sizes (small, large) with high recall and precision leading to reduced labour cost and improved consistency of egg grading for large-scale poultry operation. 
Previous studies predicted egg weight and classified egg size using image processing and machine learning techniques (Thipakorn et al., \u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e2017\u003c/span\u003e). \u0026Ccedil;ımen \u003cem\u003eet al.\u003c/em\u003e (2018) studied the Classification of dynamic egg weight using an artificial neural network. Similarly, Soltani et al. (\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e2015\u003c/span\u003e) proposed an egg volume prediction method using machine vision techniques based on Pappus theorem and Artificial Neural Network. Their research seeks to develop a predictive model that leverages machine learning techniques to estimate egg weight from egg quality traits, promoting more efficient and accurate egg sorting and grading practices.\u003c/p\u003e \u003cp\u003eIn artificial intelligence (AI), machine learning is the study of teaching computers to learn from data without the need for explicit programming. It entails the broad application of data to build machines that are either fully or partially autonomous. According to Tyasi et al. (\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e2020\u003c/span\u003e), machine learning techniques including Support Vector Machine (SVM), Random Forest (RF), and Logistic Regression (LR) have demonstrated encouraging outcomes in estimating egg size based on egg quality parameters. These algorithms don't require rigid assumptions about the distribution of the data, and they can handle intricate non-linear interactions and produce precise predictions (\u0026Ccedil;ifts\u0026uuml;ren \u0026amp; Akkol, \u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e2018\u003c/span\u003e). 
The application of these advanced machine learning techniques in the context of Noiler chickens can lead to the development of robust predictive models, which can enhance breeding programs and production management.\u003c/p\u003e \u003cp\u003eFour machine learning algorithms were employed to classify egg size based on internal and external egg quality traits namely Support Vector Machine (SVM), Logistic Regression, Random Forest, and Linear Regression.\u003c/p\u003e \u003cp\u003eAccording to Aruna and Rajagopalan (\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e2011\u003c/span\u003e), SVM is considerably more successful than other ML approaches at identifying minute patterns in big datasets. Since the SVM incorporates elements and methods from machine learning, statistics, functional analysis, and convex optimization, it has raised expectations recently due to its performance in classification problems, regression, and forecasting. The SVMs are appropriate for classifying tiny samples of data in addition to having excellent adaptability, global optimization, and good generalization performance (Ccoicca, \u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e2013\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eA class of models known as generalized linear models, for which certain essential linear assumptions are loosened, includes logistic regression. Since linear regression requires that outcomes be measured on a continuous scale, logistic regression is a great tool for modeling relationships with such outcomes. Classification modeling, which uses logistic regression, is frequently used to model the likelihood that observations will fall into one of several classes of a categorical outcome. 
Two classes (e.g., active/inactive, promoted/not promoted), multiple unordered classes (e.g., job family, location), or multiple ordered classes (e.g., survey items measured on a Likert scale, performance level, education level) can have a binomial classification context. As logistic regression uses a link function to generalize the linear model for non-continuous outcomes, it is a type of regression analysis that, by definition, returns a numeric outcome, and probabilities are numeric, regardless of the classes of the outcome variable (Starbuck, \u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e2023\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eRandom Forest (RF) is a machine learning approach that integrates multiple decision trees to minimize feature data correlation. The computational complexity of RF is O(n), where n represents the number of samples, making it efficient for processing large datasets. Additionally, its parallel execution capability enhances computational speed.\u003c/p\u003e \u003cp\u003eTo reduce correlation between decision trees, RF employs random selection of both samples and features. Initially, a randomly selected subset of data is drawn from the original training set (Salman et al., \u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e2024\u003c/span\u003e). Moreover, a random subset of features is chosen for constructing each decision tree. This dual-randomization strategy lowers inter-tree correlation, thereby reducing the risk of overfitting and improving model accuracy (Cutler et al., \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e2011\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eLinear regression is the statistical method used to ascertain whether there is a linear relationship between the independent variable (X) and the dependent variable (Y). 
Determining the optimal linear function, or a collection of coefficients (weights) that enable the function to make the most accurate possible prediction about the value of the dependent variable. Formally, the linear regression model can be expressed as\u003c/p\u003e \u003cp\u003eY\u0026thinsp;=\u0026thinsp;β0+β1X1+β2X2+⋯+βnXn+ε (Hall \u0026amp; Horowitz, \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e2007\u003c/span\u003e)\u003c/p\u003e \u003cp\u003ewhere β0 is the intercept, β1 to βn is the regression coefficient, X1 to Xn is the independent\u003c/p\u003e \u003cp\u003evariable, and ε is the error term (Qu, \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e2024\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eThe objectives of this study, therefore, were to develop predictive models using machine learning algorithms (SVM, LR, RF and Linear Regression) to classify egg size based on egg quality traits and to assess the metrics of the developed predictive models in estimating egg size.\u003c/p\u003e"},{"header":"MATERIALS AND METHODS","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e\n \u003ch2\u003e3.1.Description of the Study Area\u003c/h2\u003e\n \u003cp\u003eThis experiment was carried out in the Poultry Unit of the Teaching and Research Farm of Obafemi Awolowo University, Ile-Ife, Osun State, Nigeria. Ile-Ife.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec4\" class=\"Section2\"\u003e\n \u003ch2\u003e3.2. Experimental birds and management\u003c/h2\u003e\n \u003cp\u003eTwo hundred and forty (240) day-old chicks, comprising 80 each from black, brown and barred plumage varieties were used for the study. The chicks were brooded together for three weeks and subsequently raised on deep litter in labelled pens within a semi-open housing system at around temperature of 32-35\u003csup\u003e0\u003c/sup\u003eC on the floor at the first week and decreased by 2-30C each week until it reaches room temperature with optimum lighting conditions. 
The experimental birds were fed \u003cem\u003ead libitum\u003c/em\u003e for the first 12 weeks. Thereafter, the birds were sexed and transferred into standard galvanized battery cages, where they were managed uniformly throughout the study period. The birds were sorted into pens based on their plumage colour, and the pens were assigned using a completely randomized design. The birds were raised under an intensive production system. They were fed a standard commercial starter diet containing 19% crude protein (CP) and 2,750 kcal ME/kg from hatch to 4 weeks of age, followed by a grower diet containing 15% CP and 2,600 kcal ME/kg, offered ad libitum. Clean water was also provided \u003cem\u003ead libitum\u003c/em\u003e.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec5\" class=\"Section2\"\u003e\n \u003ch2\u003e3.4 Data collection\u003c/h2\u003e\n \u003cp\u003eA total of three hundred (300) freshly laid eggs were collected from three Noiler plumage varieties (black, brown and barred) at two laying ages: 26 weeks (young) or 46 weeks (old). For each plumage variety, one hundred (100) eggs were collected. The external egg quality parameters measured included egg weight, egg width, egg length, shell surface area, percentage of shell thickness, and shell weight. The internal egg quality parameters assessed were albumen height, haugh unit, yolk height, yolk index, yolk and albumen weight, and yolk width.\u003c/p\u003e\n \u003cp\u003eEgg weight, yolk and albumen weight and shell weights were recorded in grams using the KERRO\u0026reg; electronic compact scale (model number BL50001) with a maximum capacity of 5000 g and sensitivity of 0.1 g. Egg length (EL), egg width (EW), yolk height (YH), albumen height (AH) and yolk width (YW) were measured in centimetres using a Vernier calliper while shell thickness (ST) was measured in millimetres using a micrometre screw gauge. Haugh units were calculated by the formula described by Khaleel et al. (2019) and Wahyuni et al. 
(\u003cspan class=\"CitationRef\"\u003e2023\u003c/span\u003e) as follows:\u003c/p\u003e\n \u003cdiv id=\"Equa\" class=\"Equation\"\u003e\n \u003cdiv class=\"mathdisplay\" id=\"FileID_Equa\" name=\"EquationSource\"\u003e$$\\:HU\\:=\\:100\\text{log}\\left(H\\:+\\:7.57\\:-\\:1.7\\:W0.37\\right)$$\u003c/div\u003e\n \u003c/div\u003e\n \u003cp\u003eWhere, H is albumen height in millimeters and W is observed weight of the egg in grams.\u003c/p\u003e\n \u003cdiv id=\"Equb\" class=\"Equation\"\u003e\n \u003cdiv class=\"mathdisplay\" id=\"FileID_Equb\" name=\"EquationSource\"\u003e$$\\:Shape\\:Index\\:=\\:Egg\\:width/\\:Egg\\:length\\:\\times\\:\\:100$$\u003c/div\u003e\n \u003c/div\u003e\n \u003cdiv id=\"Equc\" class=\"Equation\"\u003e\n \u003cdiv class=\"mathdisplay\" id=\"FileID_Equc\" name=\"EquationSource\"\u003e$$\\:Yolk\\:index\\:=\\:Yolk\\:height\\:/Yolk\\:width\\:\\times\\:\\:100$$\u003c/div\u003e\n \u003c/div\u003e\n \u003cp\u003eThe surface of an egg was calculated using the formula of Paganelli et al., (\u003cspan class=\"CitationRef\"\u003e1974\u003c/span\u003e): P\u0026thinsp;=\u0026thinsp;4.835 \u0026times; W\u003csup\u003e0.662\u003c/sup\u003e, where: W - egg weight.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec6\" class=\"Section2\"\u003e\n \u003ch2\u003e3.5 Data and Statistical Analysis\u003c/h2\u003e\n \u003cp\u003eThe data collected from the egg quality traits were analyzed using Python programming language. The analysis involved statistical methods, visualizations, and machine learning algorithms to classify egg size based on internal and external egg quality traits. All data were analyzed using the Python programming language (Python Software Foundation, \u003cspan class=\"CitationRef\"\u003e2023\u003c/span\u003e). 
The complete codebase for data exploration is publicly available on GitHub, the link below:\u003c/p\u003e\n \u003cp\u003e\u003cspan class=\"ExternalRef\"\u003e\u0026nbsp;\u003cspan class=\"RefSource\"\u003ehttps://github.com/chrisseub/ML-Practice-Rep/blob/main/EGG%20SIZE%20MODELS/EGG_Size_PREDICTION.ipynb\u003c/span\u003e \u0026nbsp;\u003c/span\u003e\u003c/p\u003e\n \u003cp\u003eBy outlining the relationship between variables in a sample or population, descriptive statistics help to organize and synthesize data (Kaur et al. \u003cspan class=\"CitationRef\"\u003e2018\u003c/span\u003e). This was calculated to provide a summary of the data, encompassing measurements of dispersion (standard deviation, range) and central tendency (mean, median). The distribution of egg quality characteristics among plumage variants was revealed by these figures.\u003c/p\u003e\n \u003cp\u003eTo investigate the links between internal and external egg quality features, a correlation matrix was created using Pearson\u0026apos;s correlation coefficients. To find significant positive or negative correlations between variables like egg weight, albumen height, yolk weight, and shell thickness, this matrix was shown using Seaborn\u0026apos;s heatmap. For Class Distribution, the distribution of the \u0026apos;Outcome\u0026apos; variable (Small vs. Large eggs) was examined using value_counts(). This helped to understand the class balance in the dataset. Finally, Grouped Analysis was done, the mean of the numeric features was calculated for each \u0026apos;Outcome\u0026apos; group (Small and Large). 
This allowed for a comparison of the average trait values between the two egg size categories.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec7\" class=\"Section2\"\u003e\n \u003ch2\u003e3.6 Machine Learning\u003c/h2\u003e\n \u003cdiv id=\"Sec8\" class=\"Section3\"\u003e\n \u003ch2\u003e3.6.1 Data Collection and Pre-processing\u003c/h2\u003e\n \u003cp\u003eThe dataset used are the measurements of internal and external trait of Noiler chicken eggs. These datasets were utilized as features (features: egg length, egg width, yolk height, yolk width, albumin height, yolk albumin weight, shell weight, shell thickness, Haugh Unit, shell surface area, yolk index, shape index; target: size class [0, 1]).\u003c/p\u003e\n \u003c/div\u003e\n \u003cdiv id=\"Sec9\" class=\"Section3\"\u003e\n \u003ch2\u003e3.6.2 Pre-processing\u003c/h2\u003e\u003cspan\u003e\n \u003cp\u003e\u003cstrong\u003e3.6.2.1\u003c/strong\u003e Handling Missing Values: Hyphen values in the dataset were treated as missing values (NaN) and then imputed using the mean of the respective columns i.e, Missing values imputed using SimpleImputer (mean was used for numerical features).\u003c/p\u003e\n \u003c/span\u003e \u003cspan\u003e\n \u003cp\u003e\u003cstrong\u003e3.6.2.2\u003c/strong\u003e Data Splitting: the dataset was split into training and testing sets to evaluate performance on unseen data. A stratify split was used to maintain the proportion of Small and Large eggs in both sets. Train-test split (80:20) with stratification (train_test_split).\u003c/p\u003e\n \u003c/span\u003e \u003cspan\u003e\n \u003cp\u003e3.6.2.3 Hyperparameter Tuning: this was employed for only the SVM model to optimize the model performance because their settings (like C and gamma) have a greater influence on how they draw the line to separate the data. Small changes in these settings can significantly change the SVM\u0026apos;s performance. 
Less complex models, such as Linear and Logistic Regression, have fewer parameters that have a significant impact on the result. Despite being more complex, Random Forests are generally less sensitive to tuning than SVMs. Hence, hyperparameter tuning was not utilized in the other models to avoid overfitting.\u003c/p\u003e\n \u003c/span\u003e\n \u003c/div\u003e\n \u003cdiv id=\"Sec10\" class=\"Section3\"\u003e\n \u003ch2\u003e3.6.3 Machine Learning Models\u003c/h2\u003e\n \u003cp\u003eThis study employed three machine learning models - Logistic Regression, Random Forest, and Linear Regression - to classify egg outcome based on measured traits. The performance of each model was evaluated using various metrics to determine their effectiveness in predicting whether an egg is small or large.\u003c/p\u003e\u003cspan\u003e\n \u003cp\u003e\u003cstrong\u003e3.6.3.1 Support Vector Machine (SVM)\u003c/strong\u003e: an SVM classifier with a linear kernel was implemented using svm.SVC class from the scikit-learn library.\u003c/p\u003e\n \u003c/span\u003e \u003cspan\u003e\n \u003cp\u003e\u003cstrong\u003e3.6.3.2 Logistic Regression\u003c/strong\u003e: This linear model was applied to predict the binary outcome (small or large) based on the input egg characteristics. Its performance in classifying the egg size was assessed through standard evaluation metrics.\u003c/p\u003e\n \u003c/span\u003e \u003cspan\u003e\n \u003cp\u003e\u003cstrong\u003e3.6.3.3 Random Forest\u003c/strong\u003e: An ensemble learning method consisting of multiple decision trees, this model was utilized to improve the robustness and accuracy of egg size classification. 
Its predictions were aggregated from the individual trees to determine the final outcome.\u003c/p\u003e\n \u003c/span\u003e \u003cspan\u003e\n \u003cp\u003e\u003cstrong\u003e3.6.3.4 Linear Regression\u003c/strong\u003e: Although this is primarily a regression model, it was included in the analysis to provide a comparative perspective on the relationship between egg traits and the numerical representation of the outcome. While it is not a classification model, its continuous output can be interpreted as a prediction of the numerically coded egg size class.\u003c/p\u003e\n \u003c/span\u003e\n \u003c/div\u003e\n \u003cdiv id=\"Sec11\" class=\"Section3\"\u003e\n \u003ch2\u003e3.6.4 Model Evaluation\u003c/h2\u003e\n \u003cdiv id=\"Sec12\" class=\"Section4\"\u003e\n \u003ch2\u003e\u003cstrong\u003e3.6.4.1 Metrics\u003c/strong\u003e:\u003c/h2\u003e\n \u003cp\u003eFor classification, the following metrics were evaluated: accuracy, precision, recall, and F1-score. For regression, R-squared and mean squared error (MSE) were reported. For validation, train_test_split with test_size\u0026thinsp;=\u0026thinsp;0.2 was used. For visualization, decision trees from the Random Forest were plotted using plot_tree.\u003c/p\u003e\n \u003c/div\u003e\n \u003c/div\u003e\n \u003cdiv id=\"Sec13\" class=\"Section3\"\u003e\n \u003ch2\u003e3.6.5. 
Software\u003c/h2\u003e\n \u003cp\u003ePython 3.11, libraries: scikit-learn (1.6.1), pandas, numpy, matplotlib.\u003c/p\u003e\n \u003cp\u003eThe complete notebook for modeling is available on GitHub: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://github.com/chrisseub/ML-Practice-Rep/blob/main/EGG%20SIZE%20MODELS/EGG%20SIZE%20CLASSIFICATION%20MODELS.ipynb\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e\n \u003c/div\u003e\n\u003c/div\u003e"},{"header":"RESULTS","content":"\u003cp\u003e \u003cb\u003eDescriptive statistics\u003c/b\u003e \u003c/p\u003e \u003cp\u003eThe descriptive statistics of Noiler chickens' internal and external egg quality characteristics are shown in Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e. A total of three hundred eggs were assessed for weight and associated parameters. Egg weights ranged from 49.7 g to 78.9 g, with a mean of 63.85 g and a standard deviation of 6.30 g. Eggs were divided into large and small categories based on a 55 g criterion. A total of 71 eggs were classed as small (\u0026lt; 55 g), while 229 eggs were classified as large (≥ 55 g).\u003c/p\u003e \u003cp\u003eAmong the external traits, the mean egg length was 5.80 cm and the mean egg width was 4.34 cm, reflecting the elongated shape typical of poultry eggs. The samples showed consistent shell structure, with a mean shell thickness of 0.323 mm and a mean shell weight of 7.59 g (range: 5.90 g to 9.90 g).\u003c/p\u003e \u003cp\u003eAmong the internal quality traits, the yolk and albumen made up the majority of the total egg mass, with a mean combined weight of 55.61 g, a minimum of 43.00 g, and a maximum of 70.20 g. 
The albumen height had a lower mean value of 0.83 mm, which reflected the flatness of the thick albumen layer, whereas the yolk height and yolk breadth averaged 1.93 mm and 4.04 mm, respectively. The samples all exhibited good internal egg quality as indicated by the high mean value of 84.54 for the Haugh unit, a common indicator of albumen quality and egg freshness.\u003c/p\u003e \u003cp\u003eAdditional calculated indices that support the normal ovality of eggs in Noiler hens are the shape index (mean: 75.56%) and the yolk index (mean: 48.00%). Egg size is correlated with the shell surface area, which ranged from 61.51 cm² to 83.44 cm² with an average of 72.48 cm².\u003c/p\u003e \u003cp\u003e \u003c/p\u003e\u003cdiv class=\"gridtable\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eOverall descriptive statistics of internal and external egg quality traits for Noiler chicken\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e\u003ccolgroup cols=\"6\"\u003e\u003c/colgroup\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eVARIABLE\u003c/p\u003e \u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eN\u003c/p\u003e \u003c/th\u003e\u003cth 
align=\"left\" colname=\"c3\"\u003e \u003cp\u003eMEAN\u003c/p\u003e \u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eSD\u003c/p\u003e \u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eMIN\u003c/p\u003e \u003c/th\u003e\u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003eMAX\u003c/p\u003e \u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eEgg weight (g)\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e300\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e63.849\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e6.304\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e49.700\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e78.900\u003c/p\u003e \u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eEgg length (cm)\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e300\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e5.801\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.307\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e5.100\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e7.000\u003c/p\u003e 
\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eEgg width (cm)\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e300\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e4.341\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.291\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.300\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e4.900\u003c/p\u003e \u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eYolk height (mm)\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e296\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e1.932\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.140\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.800\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e2.300\u003c/p\u003e \u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eYolk width (mm)\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e297\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e4.040\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.274\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e3.200\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e4.800\u003c/p\u003e 
\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAlbumin height (mm)\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e299\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.834\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.134\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.350\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e1.200\u003c/p\u003e \u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eYolk and albumin\u0026nbsp;weight (g)\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e300\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e55.607\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e5.841\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e43.000\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e70.200\u003c/p\u003e \u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eShell weight (g)\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e300\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e7.594\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.705\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e5.900\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e9.900\u003c/p\u003e 
\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eShell thickness (mm)\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e300\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.323\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.043\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.200\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.4200\u003c/p\u003e \u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eHaugh unit\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e299\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e84.540\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.744\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e81.801\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e86.421\u003c/p\u003e \u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eShell surface area (cm\u003csup\u003e2\u003c/sup\u003e)\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e300\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e72.484\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e4.740\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e61.505\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e83.443\u003c/p\u003e 
\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eYolk index (%)\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e296\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e48.002\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e4.481\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e21.622\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e61.765\u003c/p\u003e \u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eShape Index (%)\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e300\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e75.556\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e5.982\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e5.263\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e89.091\u003c/p\u003e \u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/table\u003e\u003c/div\u003e \u003cp\u003e\u003c/p\u003e \u003cp\u003e \u003cb\u003eCorrelation matrix\u003c/b\u003e \u003c/p\u003e \u003cp\u003eThe correlation matrix illustrating the connections between several internal and external egg quality characteristics in Noiler chicken eggs is shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e. 
The matrix reveals a number of statistically significant relationships that provide insight into the co-variation of several traits.\u003c/p\u003e \u003cp\u003eKey egg quality parameters, such as egg length, egg width, yolk and albumen weight, and shell weight, showed strong positive correlations with egg weight. These correlations indicate that larger eggs typically have heavier shells and more interior content (yolk and albumen). Notably, the correlation between egg weight and yolk-albumen weight was particularly strong, suggesting that the interior content of the egg plays a major role in total egg weight. Shell weight was also positively associated with both egg length and egg width, further linking it to the external dimensions. On the other hand, most other traits showed weak or even slightly negative correlations with shell thickness, suggesting that an increase in shell thickness does not always translate into an increase in egg weight or size.\u003c/p\u003e \u003cp\u003eThe Haugh unit, an established indicator of albumen quality and freshness, showed weak correlations with most traits, suggesting that it is largely unaffected by yolk quantity, egg size, and shell weight. The shape index was only weakly associated with egg weight and other size-related traits, indicating that egg shape is comparatively consistent regardless of other physical parameters. Moderate correlations were also observed between shell surface area and both egg weight and egg dimensions, which is expected since surface area increases with overall egg size.\u003c/p\u003e \u003cp\u003eOverall, the correlation matrix indicates that egg weight can be a good predictor of a number of internal and external characteristics of egg quality, especially those related to size and content. 
However, some parameters, like shell thickness and Haugh unit, seem to be controlled separately from the characteristics that affect total egg weight.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003cb\u003eComparison of Performance metrics among the four Machine Learning models\u003c/b\u003e \u003c/p\u003e \u003cp\u003eOf the four models assessed, Support Vector Machine (SVM), Random Forest (RF), Logistic Regression (LR), and Linear Regression (LRG), the Random Forest model had the highest accuracy (98.3%). SVM (95%) and Logistic Regression (95%) followed, with equal overall accuracy. Additionally, Random Forest achieved perfect precision (1.00) and a recall of 0.98 for the 'Large' class, and perfect recall (1.00) for the 'Small' class, demonstrating its ability to discern egg sizes with few errors. Although its overall accuracy matched that of Logistic Regression, SVM may be a more dependable choice for classification tasks involving complex relationships between egg traits, since kernel functions allow it to handle non-linearity. Compared with the classification models, Linear Regression, a regression-based model, was not well suited to modeling the relationship between egg size and quality parameters, as indicated by its R-squared value of 0.45. When its continuous predictions were converted to classes using a 0.5 threshold, it achieved 91.7% classification accuracy.\u003c/p\u003e \u003cp\u003eFrom the perspective of interpretability, Logistic Regression provides a straightforward and understandable model, which makes it well suited to understanding how various egg characteristics affect classification through its coefficients. Even though SVM and Random Forest are typically better at identifying intricate patterns, they might not be as directly interpretable as Logistic Regression and frequently behave more like \"black box\" models. 
Nonetheless, Random Forest's ensemble-based methodology naturally improves its robustness and reduces overfitting, which helped it achieve the highest accuracy (98.3%) of all the models tested in this study.\u003c/p\u003e \u003cp\u003eLinear Regression did not perform as well as the classification models, as shown by its primary regression metric (R-squared of 0.45). However, it is not designed for discrete classification and is therefore less reliable for this purpose than models built for classification. Based on the classification performance observed, the tuned SVM and Logistic Regression models were the next best alternatives, both achieving 95% accuracy. Considering both performance and practical application for tasks like automated egg grading, Random Forest emerges as the most effective model, followed closely by the tuned SVM, while Logistic Regression remains a strong and more interpretable alternative for simpler implementations.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e\u003cdiv class=\"gridtable\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eComparison of Machine Learning algorithm metrics for Noiler chicken egg size classification\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e\u003ccolgroup 
cols=\"5\"\u003e\u003c/colgroup\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMetric\u003c/p\u003e \u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eSVM\u003c/p\u003e \u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eLogistic Regression\u003c/p\u003e \u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eRandom Forest\u003c/p\u003e \u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eLinear Regression (with 0.5 threshold)\u003c/p\u003e \u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAccuracy\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.95\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.95\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.98\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.92\u003c/p\u003e \u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003ePrecision (Class 0)\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.82\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.82\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.93\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.76\u003c/p\u003e \u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003ePrecision (Class 1)\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e1.00\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" 
colname=\"c3\"\u003e \u003cp\u003e1.00\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e1.00\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.98\u003c/p\u003e \u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRecall (Class 0)\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e1.00\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e1.00\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e1.00\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.93\u003c/p\u003e \u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRecall (Class 1)\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.93\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.93\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.98\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.91\u003c/p\u003e \u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eF1-score (Macro)\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.93\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.93\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.98\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.89\u003c/p\u003e \u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e 
\u003cp\u003eR-squared\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.72\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.72\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.91\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.45\u003c/p\u003e \u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMean Squared Error\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.05\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.05\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.02\u003c/p\u003e \u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.09\u003c/p\u003e \u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/table\u003e\u003c/div\u003e \u003cp\u003e\u003c/p\u003e "},{"header":"DISCUSSION","content":"\u003cp\u003eThis study demonstrates the ability of machine learning algorithms to effectively classify egg size in Noiler chickens using internal and external egg quality traits, with the Random Forest (RF) model showing superior predictive performance. The high accuracy (98.3%), near-perfect precision, and strong recall achieved by RF indicate its robustness in handling complex, non-linear relationships among egg quality traits. This performance advantage is consistent with the ensemble nature of RF, which reduces variance and overfitting by aggregating multiple decision trees trained on randomly selected features and samples. 
Similar findings have been reported in poultry and livestock studies where RF outperformed linear and kernel-based models in predicting production traits due to its ability to model hierarchical interactions among correlated predictors (Tyasi et al., \u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Salman et al., \u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e2024\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eSupport Vector Machine (SVM) and Logistic Regression (LR) also demonstrated strong and comparable classification performance (95% accuracy), suggesting that egg size is largely separable based on measured phenotypic traits. However, the higher recall and F1-score achieved by RF indicate a more balanced performance, particularly in the presence of class imbalance between small and large eggs. Logistic Regression offered greater interpretability through model coefficients, making it useful for understanding the directional influence of specific egg traits, whereas SVM and RF functioned as more complex, data-driven classifiers. These results align with previous egg classification studies that reported improved performance of ensemble and non-linear models over traditional linear approaches when multiple correlated egg quality traits are involved (Çımen \u0026amp; Yabanova, \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e2018\u003c/span\u003e; Thipakorn et al., \u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e2017\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eAlthough Linear Regression achieved relatively high threshold-based classification accuracy, its low R-squared value confirms its limited suitability for modeling the complex biological relationships underlying egg size determination. Linear models assume additive and linear relationships, which are unlikely to fully capture the interactions between internal traits (such as yolk–albumen composition) and external traits (such as shell surface area). 
Therefore, Linear Regression should be regarded only as a baseline comparator rather than a practical classification tool. Finally, the findings highlight the suitability of Random Forest models for automated egg grading systems and decision support in poultry breeding programs. Future studies could improve generalizability by addressing class imbalance, incorporating larger datasets, and integrating image-based or deep learning features to further enhance predictive accuracy.\u003c/p\u003e"},{"header":"CONCLUSION","content":"\u003cp\u003eThis study demonstrates the effectiveness of machine learning approaches for classifying Noiler chicken eggs into size categories using internal and external egg quality traits. Among the evaluated models, the Random Forest algorithm consistently outperformed Support Vector Machine, Logistic Regression, and Linear Regression, achieving the highest accuracy, precision, recall, and F1-score. Its superior performance highlights the ability of ensemble-based methods to capture complex, non-linear relationships among correlated biological traits, which are characteristic of egg quality parameters.\u003c/p\u003e\u003cp\u003eThe findings also indicate that while traditional statistical and linear models can provide baseline insights, they are limited in their capacity to model the multifactorial nature of egg size determination. In contrast, machine learning models, particularly Random Forest, offer a scalable framework for automated egg grading and decision support in poultry production systems. The application of such models can contribute to improved production efficiency, standardized egg classification, and informed breeding strategies.\u003c/p\u003e\u003cp\u003eDespite these promising results, the study is constrained by class imbalance and reliance on manually measured egg traits. 
Future research should incorporate larger and more diverse datasets, address imbalance through resampling techniques, and explore the integration of image-based features and deep learning methods.\u003c/p\u003e"},{"header":"Declarations","content":"\u003ch2\u003eDeclarations\u003c/h2\u003e \u003cp\u003e \u003cstrong\u003eCompeting interests\u003c/strong\u003e \u003cp\u003eThe authors declare that they have no competing interests or other interests that might be perceived to influence the results and/or discussion reported in this paper.\u003c/p\u003e \u003c/p\u003e \u003cp\u003e \u003cstrong\u003eConflict of interest statement\u003c/strong\u003e \u003cp\u003eThe authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.\u003c/p\u003e \u003c/p\u003e\u003ch2\u003eFunding\u003c/h2\u003e \u003cp\u003eThis research received no external funding or grant.\u003c/p\u003e \u003cp\u003eData availability: The code used to generate the results presented in this study is available in a public GitHub repository. Specific implementation details are provided in the Materials and Methods section of the manuscript. The data used with this code are available from the corresponding authors upon reasonable request.\u003c/p\u003e \u003cp\u003e Ethics approval: All experimental procedures were approved by the Animal Welfare and Ethics Committee, Department of Animal Science, Obafemi Awolowo University, and were conducted following the ethical guidelines for the use of animals in biomedical research established by the Obafemi Awolowo University, Ile-Ife. 
This study involved no clinical trial; hence, clinical trial registration is not applicable.\u003c/p\u003e \u003cp\u003eClinical trial number: not applicable\u003c/p\u003e \u003cp\u003eConsent to Participate Not applicable.\u003c/p\u003e \u003cp\u003eConsent for Publication Not applicable.\u003c/p\u003e\u003ch2\u003eAuthor Contribution\u003c/h2\u003e\u003cp\u003eI.D., H.B., C.A., F.E.,\u0026nbsp;and\u0026nbsp;R.O\u0026nbsp;wrote the main manuscript text and prepared figures\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eAdeboye O, Schultz B, Adekalu K, Prasad K. Soil water storage, yield, water productivity and transpiration efficiency of soybeans (Glycine max L. Merr) as affected by soil surface management in Ile-Ife, Nigeria. International Soil Water Conserv Research. 2017;5(1):10. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.iswcr.2017.04.006\u003c/span\u003e\u003cspan address=\"10.1016/j.iswcr.2017.04.006\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAlkan S, Karabag K, Galic A, Karsli T, Balcıoğlu M. Effects of selection for body weight and egg production on egg quality traits in Japanese quails (Coturnix coturnix japonica) of different lines and relationships between these traits. Kafkas Universitesi Veteriner Fakultesi Dergisi. 
2010;16:239\u0026ndash;44.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAjayi FO, Bamidele O, Hassan WA, Ogundu U, Yakubu A, Alabi OO. Production performance and survivability of six dual-purpose breeds of chicken under smallholder farmers\u0026rsquo; management practices in Nigeria. Arch Anim Breed. 2020;63:387\u0026ndash;408. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.5194/aab-63-387-2020\u003c/span\u003e\u003cspan address=\"10.5194/aab-63-387-2020\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAruna S, Rajagopalan SP. A novel SVM-based CSSFFS feature selection algorithm for detecting breast cancer. Int J Comput Appl. 2011;31(8):14\u0026ndash;20.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAsadi V, Raoufat MH. Estimation of egg weight by machine vision and neural networks technique. Int J Nat Eng Sci (IJNES). 2010;4(2):1\u0026ndash;4.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBamidele O, Sonaiya EB, Adebambo OA, Dessie T. On-station performance evaluation of improved tropically adapted chicken breeds for smallholder poultry production systems in Nigeria. Trop Anim Health Prod. 2020;52:1541\u0026ndash;8. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/s11250-019-02158-9\u003c/span\u003e\u003cspan address=\"10.1007/s11250-019-02158-9\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003e\u0026Ccedil;ifts\u0026uuml;ren MN, Akkol S. Prediction of internal egg quality characteristics and variable selection using regularization methods: ridge, LASSO and elastic net. Archives Anim Breed. 2018;61:279\u0026ndash;84.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003e\u0026Ccedil;ımen H, Yabanova İ. (2018). 
Classification of dynamic egg weight using artificial neural network. In \u003cem\u003eProceedings of the 7th International Conference on Computers Communications and Control\u003c/em\u003e (ICCCC) (pp. 302\u0026ndash;305). Oradea, Romania. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/ICCCC.2018.8390475\u003c/span\u003e\u003cspan address=\"10.1109/ICCCC.2018.8390475\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCcoicca YJ. Applications of Support Vector Machines in the Exploratory Phase of Petroleum and Natural Gas: a Survey. Int J Eng Technol. 2013;2(2):113\u0026ndash;25. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.14419/ijet.v2i2.834\u003c/span\u003e\u003cspan address=\"10.14419/ijet.v2i2.834\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCutler A, Cutler DR, Stevens JR. (2011). Random forest. \u003cem\u003eMachine learning magazine\u003c/em\u003e. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/978-1-4419-9326-7_5\u003c/span\u003e\u003cspan address=\"10.1007/978-1-4419-9326-7_5\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDogara UM, Kalla D, Mancha Y. Evaluation of egg production and egg quality traits of Noiler chickens. Trop J Agricultural Sci. 2021;23(1):100\u0026ndash;13. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.ajol.info/index.php/tjas/article/view/219041\u003c/span\u003e\u003cspan address=\"https://www.ajol.info/index.php/tjas/article/view/219041\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHall P, Horowitz JL. Methodology and convergence rates for functional linear regression. Annals Stat. 2007;35(1):70\u0026ndash;91. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1214/009053606000000957\u003c/span\u003e\u003cspan address=\"10.1214/009053606000000957\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHayward DF, Oguntoyinbo JS. The climatology of West Africa. New York: Rowman and Littlefield; 1987.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKaur P, Stoltzfus J, Yellapu V. Descriptive statistics. Int J Acad Med. 2018;4(1):60\u0026ndash;3. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.4103/IJAM.IJAM_7_18\u003c/span\u003e\u003cspan address=\"10.4103/IJAM.IJAM_7_18\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKhaleel RMT. Prediction of Haugh Unit through albumen height and egg weight. Mesop J Agric. 2019;47:37\u0026ndash;43.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLiswaniso S, Qin N, Tyasi TL, Chimbaka IM. Use of data mining algorithms CHAID and CART in predicting egg weight from egg quality traits of indigenous free-range chickens in Zambia. Adv Anim Veterinary Sci. 2021;9(2):215\u0026ndash;20. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.17582/journal.aavs/2021/9.2.215.220\u003c/span\u003e\u003cspan address=\"10.17582/journal.aavs/2021/9.2.215.220\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eOjiako IA, Olayode GO. Analysis of trends in livestock production in Nigeria: 1970\u0026ndash;2005. J Agric Social Res (JASR). 2008;8(1):114\u0026ndash;20.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePaganelli CV, Olszowka A, Ar A. The Avian Egg: Surface Area, Volume, and Density. Condor. 1974;76(3):319\u0026ndash;25. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.2307/1366345\u003c/span\u003e\u003cspan address=\"10.2307/1366345\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePython Software Foundation. (2023). \u003cem\u003ePython (Version 3.11)\u003c/em\u003e [Computer software]. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.python.org/\u003c/span\u003e\u003cspan address=\"https://www.python.org/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eQu K. (2024). Research on linear regression algorithm. \u003cem\u003eMATEC Web of Conferences, 395\u003c/em\u003e, 01046. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1051/matecconf/202439501046\u003c/span\u003e\u003cspan address=\"10.1051/matecconf/202439501046\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSalman M, Khan S, Ullah R, Khan MA. Comparative evaluation of machine learning algorithms for poultry production and egg quality classification. 
Artif Intell Agric. 2024;9:100\u0026ndash;12. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.aiia.2024.100112\u003c/span\u003e\u003cspan address=\"10.1016/j.aiia.2024.100112\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSalman H, Kalakech A, Steiti A. Random forest algorithm overview. Babylon J Mach Learn. 2024;2024(1):69\u0026ndash;79. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.58496/BJML/2024/007\u003c/span\u003e\u003cspan address=\"10.58496/BJML/2024/007\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSoltani M, Omid M, Alimardani R. Egg volume prediction using machine vision technique based on Pappus' theorem and artificial neural network. J Food Sci Technol. 2015;52(5):3065\u0026ndash;71. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s13197-014-1350-6\u003c/span\u003e\u003cspan address=\"10.1007/s13197-014-1350-6\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eStarbuck C. Logistic Regression. The Fundamentals of People Analytics. Cham: Springer; 2023. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/978-3-031-28674-2_12\u003c/span\u003e\u003cspan address=\"10.1007/978-3-031-28674-2_12\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eThipakorn J, Waranusast R, Riyamongkol P. (2017). Egg weight prediction and egg size classification using image processing and machine learning. 
In \u003cem\u003eProceedings of the 14th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology\u003c/em\u003e (ECTI-CON) (pp. 477\u0026ndash;480). Phuket, Thailand. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/ECTICon.2017.8096278\u003c/span\u003e\u003cspan address=\"10.1109/ECTICon.2017.8096278\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eThipakorn T, Duangjinda M, Tumwasorn S. Prediction of egg weight and quality traits using data mining techniques. Comput Electron Agric. 2017;135:203\u0026ndash;10. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.compag.2017.02.014\u003c/span\u003e\u003cspan address=\"10.1016/j.compag.2017.02.014\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTyasi TL, Eyduran E, Celik S. Comparison of tree-based regression tree methods for predicting live body weight from morphological traits in Hy-line Silver Brown commercial layer and indigenous Potchefstroom Koekoek breeds raised in South Africa. Trop Anim Health Prod. 2020;53(1):7. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s11250-020-02443-y\u003c/span\u003e\u003cspan address=\"10.1007/s11250-020-02443-y\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTyasi TL, Qin N, Jing Y, Mu F, Zhu H, Liu D, Xu R. Prediction of egg production and egg quality traits using artificial neural networks in poultry. Poult Sci. 2020;99(10):5372\u0026ndash;80. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.psj.2020.07.022\u003c/span\u003e\u003cspan address=\"10.1016/j.psj.2020.07.022\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTyasi TL, Ngorima L, Hlokoe VR. Predicting egg weight from egg quality traits of the Lohmann Brown chicken breed using stepwise regression. Adv Anim Veterinary Sci. 2024;12(3):436\u0026ndash;40. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.17582/journal.aavs/2024/12.3.436.440\u003c/span\u003e\u003cspan address=\"10.17582/journal.aavs/2024/12.3.436.440\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eUzonwanne C, Onyedibe F, Nwokoye M. Impact of livestock production on gross domestic product in Nigeria. Int J Adv Econ. 2023;5(5):107\u0026ndash;18. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.51594/ijae.v5i5.477\u003c/span\u003e\u003cspan address=\"10.51594/ijae.v5i5.477\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWahyuni H, Yudiarti T, Widiastuti E, Sartono T, Agusetyaningsih I, Sugiharto S. Dietary supplementation of Spirulina platensis and Saccharomyces cerevisiae on egg quality, physiological condition and ammonia emission of hens at the late laying period. J Indonesian Trop Anim Agric. 2023;48(1):47\u0026ndash;57. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.14710/jitaa.48.1.47-57b\u003c/span\u003e\u003cspan address=\"10.14710/jitaa.48.1.47-57b\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYakubu A, Bamidele O, Hassan WA, Ajayi FO, Ogundu UE, Alabi O. Farmers\u0026rsquo; choice of genotypes and trait preferences in tropically adapted chickens in five agroecological zones in Nigeria. Trop Anim Health Prod. 2020;52:95\u0026ndash;107. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/s11250-019-01993-0\u003c/span\u003e\u003cspan address=\"10.1007/s11250-019-01993-0\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"discover-applied-sciences","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"","sideBox":"Learn more about [Discover Applied Sciences](https://link.springer.com/journal/42452)","snPcode":"42452","submissionUrl":"https://submission.springernature.com/new-submission/42452/3","title":"Discover Applied Sciences","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"Discover Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"Egg size, Machine learning, Egg quality traits, Noiler chicken, Predictive modeling","lastPublishedDoi":"10.21203/rs.3.rs-8431208/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-8431208/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eThis study evaluated the predictive performance metrics of four machine learning algorithms (Support Vector Machine (SVM), Random Forest (RF), Logistic Regression (LR), and Linear Regression (LRG)) for classifying egg size based on internal and external egg quality traits of Noiler chickens. Three hundred freshly laid eggs (100 per plumage variety) were collected at young laying age (26 weeks) and old laying age (46 weeks), and assessed for various quality parameters. External traits included egg weight, egg width, egg length, shell surface area, percentage of shell thickness, and shell weight) while internal traits included albumen height, haugh unit, yolk height, yolk index, yolk and albumen weight and yolk width. Data were analysed using Python-based implementations of the four algorithms. Among the models, the Random Forest algorithm achieved the highest classification accuracy (98%), with perfect precision (1.00) and a recall of 0.98 which indicated exceptional predictive ability. 
SVM and Logistic Regression both recorded accuracies of 95%, while linear regression recorded 92% Therefore, the model developed from the Random Forest algorithm can be effectively used for automated egg grading and selection in poultry breeding programs. Future research could incorporate additional features such as computer vision and deep learning techniques to further enhance prediction accuracy.\u003c/p\u003e","manuscriptTitle":"Metrics Comparison of Machine Learning Algorithms used to classify Noiler Chicken Egg from Egg QualityTrait","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2026-01-12 17:30:34","doi":"10.21203/rs.3.rs-8431208/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"Revision requested","date":"2026-02-06T17:05:45+00:00","index":"","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2026-01-22T06:05:04+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2026-01-19T06:55:29+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2026-01-18T17:39:49+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2026-01-12T21:15:10+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"8858504704524519420361997026198255384","date":"2026-01-08T18:17:16+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"117041462744318510583622892312927015900","date":"2026-01-08T08:41:33+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"182288712739658792814555763644574035391","date":"2026-01-08T08:34:40+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"287072364185990030761748177425950558581","date":"2026-01-08T08:31:28+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"52723346088427419308683218949173377053","date":"2026-01-08T08:25:03+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited
","content":"","date":"2026-01-08T08:07:20+00:00","index":"","fulltext":""},{"type":"editorInvited","content":"","date":"2025-12-30T15:08:05+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2025-12-30T05:34:30+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2025-12-30T04:55:38+00:00","index":"","fulltext":""},{"type":"submitted","content":"Discover Applied Sciences","date":"2025-12-30T04:49:26+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"discover-applied-sciences","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"","sideBox":"Learn more about [Discover Applied Sciences](https://link.springer.com/journal/42452)","snPcode":"42452","submissionUrl":"https://submission.springernature.com/new-submission/42452/3","title":"Discover Applied Sciences","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"Discover Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"2603b524-12cd-48a6-82e1-5c947141a2a6","owner":[],"postedDate":"January 12th, 2026","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"under-review","subjectAreas":[],"tags":[],"updatedAt":"2026-03-30T16:53:18+00:00","versionOfRecord":[],"versionCreatedAt":"2026-01-12 17:30:34","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-8431208","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-8431208","identity":"rs-8431208","version":["v1"]},"buildId":"XKTyCvWXoU3ODBz1xrDgd","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}