Improving interpretability and applicability of welfare management decisions in dairy cows through explainable artificial intelligence

Research Square | Research Article

Pablo Augusto Souza Fonseca, Aroa Suárez-Vega, Beatriz Gutiérrez-Gil, and 4 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-9082135/v1
This work is licensed under a CC BY 4.0 License
Status: Under Review
Version 1 posted 12

Abstract

Background

Improving animal welfare and health is a key objective in modern dairy systems. However, translating the now commonly available high-frequency bovine-dedicated sensor instrument data into transparent, actionable decisions remains challenging, particularly when using complex machine learning (ML) models.
In this study, we used ML to predict several welfare indicators (WIs) from routinely recorded milk-related data in dairy cows, and explainable AI (XAI) to interpret and visualise ML models for practical use.

Results

Monthly individual WIs for mastitis, subclinical acidosis, subclinical ketosis, longevity, and reproduction were predicted with random forest models trained on routinely available test-day traits in dairy cows (milk acetone, fat, lactose, protein, urea, and electrical conductivity). The individual WIs were then used to derive an overall welfare score, with both individual and overall scores classified into good (G), intermediate (I), and risk (B) classes. SHapley Additive exPlanations (SHAP) values quantified feature importance and interactions, revealing relationships largely consistent with known physiology (e.g., extreme high and low values of milk components and urea associated with increased welfare risk) and highlighting class-specific and U-shaped effects that call for context-dependent interpretation. Counterfactual explanations were used to identify minimal changes in milk traits required to shift predictions from B to G class, thereby translating model outputs into candidate management adjustments. While most counterfactuals followed biologically plausible patterns, occasional non-intuitive suggestions underscored the need for expert oversight.

Conclusions

This study illustrates how SHAP and counterfactual explanations can be layered on top of ML models to generate interpretable, customizable decision-support tools for precision dairy cattle welfare, while emphasizing the need to field-validate their usability, economic value, and ethical implications.
Keywords: Dairy cattle, Animal welfare, Machine learning, Data-driven decisions, XAI, Welfare scores

Background

The effective and sustained improvement of animal welfare is a global demand in the livestock sector, with increasingly relevant health, ethical, legal, and economic implications in livestock production [1–3]. The welfare of farmed animals is one of the objectives pursued by the Common Agricultural Policy in the European Union, including Italy. Enhanced welfare is associated with better health, productivity, longevity, and sustainability in the production chain [4–7]. Therefore, monitoring animal welfare is a crucial process in the current context of the livestock production chain. Due to their complexity, animal welfare traits must be improved using complementary approaches that use both breeding and environmental data, plus animal-specific data (ketosis, acidosis, mastitis) [8]. Therefore, strategies that allow early detection of dairy animals with higher genetic merit for a better welfare profile are important. Additionally, forecasting stressful conditions that pose risks to animal welfare might help shape management conditions and reduce the impact of these conditions on the animal's health and productivity. In the last few decades, the growth of precision livestock farming technologies has allowed the constant monitoring of animals and the measurement of a broad range of parameters that can be used to track and forecast stressful conditions and welfare risk [9, 10]. The dimensionality and complexity of the data generated by such technologies can make them challenging to fit with traditional statistical models.
Consequently, in recent years, the use of machine learning (ML) models has become popular in the analysis of precision farming data and in the creation of predictive models for different traits in livestock species, including welfare-related traits [11–14]. The use of ML models offers a robust alternative to handle dimensionality-related problems, uncover hidden patterns, and better model non-linear relationships in the data, among other advantages. In this context, ML models offer a promising approach to predict complex traits, enabling earlier and more efficient animal selection. However, data-driven decisions based exclusively on the ML model's output are still challenging. The higher the complexity of ML models (high dimensionality, multicollinearity, nonlinear relationships, etc.), the lower their interpretability, which gives rise to the notion of black-box models [15]. In the livestock sector, models that are easy to interpret and explain are crucial not only for selecting animals but also for informed decisions, such as adjusting diets or infrastructure to enhance animal welfare. Despite the absence of a clear legal requirement for interpretability in predictive models used in the livestock sector, the decision-maker, ultimately the farmer, who relies on these models and is affected by their outcomes, should be able to understand the reasons behind the decisions. The increasing adoption of AI in livestock farming offers important opportunities to improve productivity, sustainability, and animal welfare through more precise and data-driven management. However, these advances also raise important concerns related to data quality, privacy, trust, and the risk of transferring critical decisions from farmers to opaque automated systems [16]. In this context, the explanation of ML models is a crucial step to understand the relationship between the variables in a model and how these variables contribute to the final output [17, 18].
Explainable artificial intelligence (XAI) is a collective term for the methods and techniques that make the decisions and inner workings of ML models understandable to humans, rather than leaving them as "black boxes". The main goal of XAI techniques is to show why a model made a specific prediction or decision, in a way that is transparent, trustworthy, and useful for users [19, 20]. The application of XAI techniques to predict welfare-related traits is a promising approach to enhance the efficiency of detecting welfare risk events. Additionally, XAI techniques can also help refine individual management decisions aimed at improving animal welfare at individual and herd levels. In light of the above, the current study aimed to develop a proof of concept for the use of ML models for predicting welfare indicators in dairy cows and to integrate XAI techniques, such as Shapley values and counterfactual explanations, to identify patterns in the features used for prediction that might help better define management decisions to reduce potential welfare risks.

Methods

Sample and welfare indicators

All the information used in the current study was retrieved from the Italian Breeders Association (AIA) Livestock Environment Opendata (LEO - https://opendata.leo-italy.eu/portale/home ). Five individual cow welfare indicators (WIs) were selected: longevity, mastitis, subclinical ketosis, subclinical acidosis, and reproduction. These five WIs are used to compute an overall WI score for dairy cows. The five individual WIs are on a scale from 0–30, where values from 0–10 indicate good (G) welfare, values from 10–20 indicate intermediate (I) welfare, and values from > 20–30 indicate a risk (B) to welfare. A complete description of how each WI is estimated is available in manual 14 of the AIA ( http://www.sialleva.it/Media/Default/document//ManR14%20-%20REPORT%20E%20GRAFICI%20RISCHIO%20BENESSERE.pdf ).
The overall welfare score (G, I, and B) is computed based on the class of each individual WI. Cows with all five individual WIs in the good class are classified in the good (G) overall welfare class; cows without any individual WI in the risk class but with at least one intermediate WI are classified in the intermediate (I) overall welfare class; and all remaining cases are classified in the risk (B) overall welfare class. Monthly means for these five individual WIs were calculated for 843 cows, resulting in 227,603 observations. After a careful evaluation of potential features to be included in the models to predict the individual WIs, among the traits available in the LEO database, the mean monthly values of acetone (mM/l), milk fat content (g/100g), milk lactose content (g/100g), milk protein content (g/100g), milk urea content (mg/dl), and milk average conductivity were chosen as features. The selection of these features was based on their availability for animals with WI records and on reducing collinearity among the features.

Prediction of welfare indicators and welfare scores using random forest

Before fitting the predictive models, quality control was performed on the feature dataset, and all feature records with values outside the feature mean ± 5 standard deviations were removed. A threshold of ± 5 standard deviations was adopted to avoid excluding biologically plausible extreme observations, which are expected in large longitudinal datasets and may reflect true individual variability rather than measurement or recording errors. After this step, the dataset consisted of 798 cows and 125,285 observations (i.e. individual cow monthly means). These records were split into training (70%) and test (30%) sets for each overall welfare score (G, I, and B), yielding 87,699 and 37,586 observations in the training and test sets, respectively.
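The scoring rules described in this section can be written down as a small Python sketch. It is only an illustration and not the AIA or the authors' implementation: the function names are hypothetical, the handling of the boundary values (10 counted as G, 20 counted as I) is an assumption based on the "score higher than 20" definition of B used later in the Methods, and cases with at least one risk WI are read as the risk (B) overall class, consistent with the three classes reported in Tables 1 and 2.

```python
from typing import Iterable


def wi_class(score: float) -> str:
    """Map an individual WI score (0-30 scale) to G, I or B."""
    if not 0 <= score <= 30:
        raise ValueError("WI scores are defined on a 0-30 scale")
    if score <= 10:   # assumption: a score of exactly 10 counts as good
        return "G"
    if score <= 20:   # assumption: a score of exactly 20 counts as intermediate
        return "I"
    return "B"        # strictly above 20 is the risk class


def overall_class(scores: Iterable[float]) -> str:
    """Combine the five individual WI scores into the overall welfare class."""
    classes = [wi_class(s) for s in scores]
    if "B" in classes:   # at least one WI at risk
        return "B"
    if "I" in classes:   # no risk WI, at least one intermediate WI
        return "I"
    return "G"           # all WIs in the good class


# Hypothetical monthly record: longevity, mastitis, ketosis, acidosis, reproduction
print(overall_class([8.0, 12.5, 3.0, 4.0, 9.5]))
```

With these assumptions, a single intermediate indicator is enough to move an otherwise good record into the I class, which is why the overall G class is rare compared with the individual G classes.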
The h2o R package [21] was used to fit Random Forest (RF) models for the prediction of each individual WI. A random discrete grid search was performed in the hyperparameter tuning stage using root mean squared error (RMSE) as the stopping metric, a stopping tolerance of 0.005, and a maximum of 100 training rounds. The hyperparameter space explored in this study was intentionally limited to a small set of biologically and computationally relevant values. Therefore, a random discrete grid search was preferred over a more extensive search because it allowed efficient identification of suitable hyperparameter combinations while reducing computational burden and avoiding unnecessary evaluation of a large number of similar models. The hyperparameter grid was defined by combining numbers of trees from 10 to 100, in increments of 10, with row sample rates of 0.8 and 1. Ten-fold cross-validation was used during hyperparameter tuning. Moreover, to predict a WI value in a given month, the closest previous monthly mean for each feature was used. The individual WIs were treated as continuous scores, and the best model for each individual WI was defined based on the smallest RMSE. The models were applied to the test set to predict the individual WIs, and the prediction accuracy was evaluated based on the RMSE between the predicted and observed individual WIs. Subsequently, the predicted individual WIs were used to reconstruct the overall welfare score in the G, I, and B classes. The reconstruction accuracy was assessed using balanced accuracy and weighted versions of precision, recall, and F1-score. The weighted versions of precision, recall, and F1-score were estimated using a support-weighted average based on the following formula:

$$M_{w}=\frac{\sum_{i=1}^{n}\left(M_{i}N_{i}\right)}{N}$$

where $M_{w}$ is the weighted version of the metric, $M_{i}$ is the value of the metric for class $i$, $N_{i}$ is the number of observations of class $i$, $n$ is the number of classes, and $N$ is the total number of observations.
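As a worked illustration of the support-weighted average, the short Python sketch below recomputes the summary metrics from the per-class true positive, false positive and false negative counts reported in Table 2 of this preprint. The function and variable names are hypothetical, and the code is not the authors' R pipeline; balanced accuracy is taken here as the unweighted mean of the per-class recalls, which is the usual multi-class definition.

```python
# Per-class confusion counts from Table 2: (TP, FP, FN)
COUNTS = {
    "B": (14656, 37, 2403),
    "G": (355, 0, 179),
    "I": (19956, 2582, 37),
}


def per_class_metrics(tp, fp, fn):
    """Return (precision, recall, F1) for one class."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def summarize(counts):
    """Support-weighted precision/recall/F1 and balanced accuracy."""
    support = {c: tp + fn for c, (tp, fp, fn) in counts.items()}  # N_i
    total = sum(support.values())                                  # N
    metrics = {c: per_class_metrics(*v) for c, v in counts.items()}
    weighted = tuple(
        sum(metrics[c][k] * support[c] for c in counts) / total  # M_w
        for k in range(3)
    )
    balanced_accuracy = sum(m[1] for m in metrics.values()) / len(metrics)
    return weighted, balanced_accuracy


(w_precision, w_recall, w_f1), bal_acc = summarize(COUNTS)
print(round(w_precision, 3), round(w_recall, 3), round(w_f1, 3), round(bal_acc, 3))
```

Rounded to three decimals, these values agree with the weighted precision (0.938), recall (0.930), F1-score (0.930) and balanced accuracy (0.841) reported in the Results. The gap between the weighted metrics and the balanced accuracy comes from the small G class, whose recall of about 0.665 carries little weight in the support-weighted averages but counts for a full third of the balanced accuracy.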
Feature importance and relationships evaluation through Shapley values

The R package shapviz [22] was used to obtain SHAP values for each feature included in the selected models in the test set. The values were analyzed based on the relationship between SHAP values and feature distribution by importance plots and the interaction between features and the SHAP values by dependence plots. Additionally, the SHAP values for the five individual WIs were compared across contrasting overall welfare scores (G vs B) observed in the same individuals (considering only true positive predictions). These comparisons were contrasted with the individual WI and the overall welfare scores to assess the certainty of the feature importance.

Counterfactual explanations for enhanced management decisions

The R package counterfactuals ( https://github.com/dandls/counterfactuals ) was used to assess the minimum changes required in the feature composition to transform a B individual WI (score higher than 20) to a G individual WI (score lower than or equal to five). The distribution of all counterfactual explanations of each feature obtained for the cows predicted as B welfare for each individual WI was compared with the observed feature distributions in the observed G welfare scores. The comparison was performed to assess the certainty of the suggested changes. These comparisons were performed only for correctly predicted B scores, excluding false negatives.

Results

Sample, welfare indicators, and features distribution

The observed number of records in each class for the five individual WIs (longevity, mastitis, subclinical ketosis, subclinical acidosis, and reproduction) and for the overall welfare score across the 125,285 monthly observations from the 798 cows analyzed is shown in Table 1. The number of cows per breed is shown in Supplementary Table S1.
The distribution of the feature values (acetone, fat, lactose, protein, urea, and milk conductivity) used for the prediction of individual WIs and the correlation across the features are shown in Supplementary Figs. 1 and 2. The analysis of the distribution of welfare classes indicated an unbalanced scenario for some individual WIs. Therefore, it was decided to predict the individual welfare score rather than the welfare class.

Table 1 Distribution of monthly observations for the good (G), intermediate (I) and risk (B) welfare classes for individual welfare scores and the overall welfare indicator.

Welfare indicator | G | I | B
Longevity | 10,233 | 86,815 | 28,237
Mastitis | 74,913 | 26,221 | 24,151
Subclinical ketosis | 111,314 | 0 | 13,971
Subclinical acidosis | 115,985 | 24 | 9,276
Reproduction | 51,913 | 50,957 | 22,415
Overall welfare indicator | 2,132 | 55,252 | 67,901

Predictive performance of individual welfare indicators and overall welfare score

The Random Forest (RF) models used to predict each individual WI showed strong performance. The Pearson correlation coefficient (r) and the RMSE obtained for the individual WIs in the test set indicated good accuracy for the selected models: Longevity (r = 0.982; RMSE = 0.937); Mastitis (r = 0.987; RMSE = 1.476); Subclinical ketosis (r = 0.971; RMSE = 1.246); Subclinical acidosis (r = 0.979; RMSE = 0.839); and Reproduction (r = 0.978; RMSE = 1.597). Table 2 shows the reconstruction performance for the overall welfare scores based on the predicted individual WIs. A balanced accuracy of 0.841 was achieved for the reconstruction of the overall welfare score, with a precision of 0.938, a recall of 0.930, and an F1-score of 0.930, suggesting good predictive performance. However, results varied between the three classes [good (G), intermediate (I), and risk (B); Table 2]. For the G class, the model achieved a precision equal to 1, indicating that whenever the model predicts G, it is correct (false positives = 0), and a recall equal to 0.665.
The latter implies that only ~67% of true G records were classified as G (with 179 false negatives, all predicted as I class). A slight change in the classification threshold for the individual WI scores, where the predicted values were rounded (without decimals), allowed an increase in the true positive rate to ~80.5% for the G class. However, this increase resulted in 70 false positive classifications (I-class cows classified as G). Figure 1A shows the RMSE obtained for the individual WI predictions in the cases where the overall welfare score was misclassified as I instead of G. Higher RMSE values can be observed for the longevity, mastitis, and reproduction WIs. Additionally, most errors were caused by misclassification of one individual WI, with the most frequently misclassified WI being reproduction, followed by subclinical ketosis and longevity (Fig. 1B). The distribution of the features used for the prediction of the individual WIs indicated that the misclassified G records (classified as I) have larger differences for acetone and fat when compared with the cows correctly classified in the overall welfare G class (Fig. 1C). For class I, a precision equal to 0.885 and a recall equal to 0.998 were observed (false positives = 2,582 and false negatives = 37). These results indicate that the model misclassifies some cows belonging to the G and B classes as belonging to the I class. For the B class, a precision equal to 0.997 and a recall equal to 0.859 were observed (false positives = 37 and false negatives = 2,403). It is important to highlight that all the false-negative predictions were misclassifications between the B and I classes. Therefore, no B score was misclassified as G class.

Table 2 Predictive performance for the reconstruction of the overall welfare score classes (good (G), intermediate (I), and risk (B)) based on the predicted individual welfare indicators.

Class | Obs | TP | FP | FN | TN | Precision | Recall | F1-score
B | 17,059 | 14,656 | 37 | 2,403 | 20,490 | 0.997 | 0.859 | 0.923
G | 534 | 355 | 0 | 179 | 37,052 | 1.000 | 0.665 | 0.799
I | 19,993 | 19,956 | 2,582 | 37 | 15,011 | 0.885 | 0.998 | 0.938

Obs: Number of records observed for each class; TP: True Positive; FP: False Positive; FN: False Negative; TN: True Negative.

Feature importance and feature interactions in the influence on predictive outcomes

The analysis of beeswarm plots for the SHAP (SHapley Additive exPlanations) values obtained for the features used in the prediction of the five individual WIs is summarised in Fig. 2. It should be noted that to predict a WI value in a given month, the closest previous monthly mean for each of the six features selected was used. Therefore, the SHAP values correspond to the contribution of the previously available feature values to the predicted WI scores. For the mastitis WI, lower lactose, protein, and fat are associated with higher SHAP values, especially in the B class, suggesting a higher welfare score (Fig. 2A). In the B class of the mastitis WI, lower levels of urea and acetone were associated with smaller SHAP values, consequently reducing the welfare score. Regarding milk conductivity, in the B and I classes for the mastitis WI, higher milk conductivity was associated with higher SHAP values, while in the G class, higher values decreased the SHAP values. In the G class, milk solid content (mainly lactose) tends to be higher, which may also increase milk conductivity (Fig. 3). Higher lactose levels appear to be associated with lower milk conductivity, but there is no clear association with SHAP values in the G and I classes of the mastitis WI (Supplementary Figs. 3 and 4). The SHAP values obtained for the ketosis WI showed a varied pattern.
The beeswarm plots indicated that higher protein and acetone values are associated with increases in SHAP values, whereas lower milk conductivity, lactose, and urea (especially for the G class) content are associated with higher SHAP values (Fig. 2B). The association between milk urea concentration and the SHAP values is presented in Fig. 4A and B. For both G and B classes, higher SHAP values were associated with urea values below 20 and above 30 (this pattern was intensified for the B class). No specific interaction with other features was observed for urea. The acetone values showed an increasing pattern in association with the SHAP values in the B class, which was not found in the G class (Fig. 4C and D). However, it is important to highlight that the majority of the acetone values observed in the test set were below 0.3 (Supplementary Fig. 5). The beeswarm plot of the SHAP values for the features used in the acidosis WI showed that, especially for cows classified in the G class, lower milk conductivity and protein are associated with smaller SHAP values (Fig. 2C). On the other hand, higher values of urea and acetone were associated with smaller SHAP values. Figure 5 presents a clear visualization of the points associated with the G and B classes for all the features, except for lactose, where the dots representing the B class are located in areas with higher SHAP values. Despite a clear interaction pattern with other features, the low-fat and high-protein milk contents also showed a relationship with high SHAP values for acidosis in the B class. An interaction between lactose content and milk conductivity was observed, where higher milk conductivity and lower lactose content were associated with higher SHAP values (Fig. 5).
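For readers less familiar with what a SHAP value represents, the following Python sketch computes exact Shapley values for a toy scoring function by averaging marginal contributions over all feature orderings. It is a simplified illustration under explicit assumptions: the scoring function, feature values and single reference point are hypothetical, and the study itself obtained SHAP values for random forest models with the shapviz R package rather than with this brute-force procedure.

```python
from itertools import permutations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to one baseline point."""
    names = list(x)
    phi = dict.fromkeys(names, 0.0)
    for order in permutations(names):
        current = dict(baseline)        # start from the reference record
        previous = predict(current)
        for name in order:              # switch features to x one at a time
            current[name] = x[name]
            value = predict(current)
            phi[name] += value - previous   # marginal contribution
            previous = value
    n_orders = factorial(len(names))
    return {name: total / n_orders for name, total in phi.items()}


def toy_score(f):
    """Hypothetical risk score with a U-shaped urea effect and an interaction."""
    score = 10.0
    score += 0.5 * abs(f["urea"] - 25.0)      # both low and high urea raise risk
    score += 20.0 * f["acetone"]              # monotonic acetone effect
    if f["lactose"] < 4.6:                    # conductivity only matters
        score += 2.0 * f["conductivity"]      # when lactose is low
    return score


# Hypothetical feature values, chosen only for illustration
cow = {"urea": 38.0, "acetone": 0.2, "lactose": 4.4, "conductivity": 5.0}
reference = {"urea": 25.0, "acetone": 0.05, "lactose": 4.8, "conductivity": 4.5}

phi = shapley_values(toy_score, cow, reference)
for name, value in phi.items():
    print(f"{name}: {value:+.2f}")
```

The contributions always sum to the difference between the prediction for the cow and the prediction for the reference record, which is the additivity property that makes beeswarm and dependence plots interpretable. In this toy case the credit for the lactose and conductivity interaction is shared between the two features, which is one reason why dependence plots coloured by a second feature are needed to read interactions such as the one described for lactose and milk conductivity.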
The analysis of SHAP values for the longevity WI indicated that the extreme values for milk conductivity, urea, lactose, fat, and protein were associated with an absolute increase of SHAP values, in both positive and negative zones (Fig. 6A). In addition, an inverse pattern was observed between the G (dots below zero on the y-axis) and B classes (dots above zero on the y-axis). For example, smaller protein values in the G class were associated with smaller SHAP values. In contrast, higher values in the B class were associated with higher SHAP values (Fig. 6A). Urea values above 40 were associated with higher SHAP values in both classes. For the G class, higher acetone levels (below 0.3) were associated with smaller SHAP values. The beeswarm plot for the reproduction WI indicated that higher protein, urea, and fat values were associated with higher SHAP values in the B class, while smaller milk conductivity and higher lactose were associated with smaller SHAP values in the G class (Fig. 2E). Additionally, Fig. 6B indicates that for the B class (dots above zero on the y-axis), values in both extremes for milk conductivity, urea, lactose, fat, and acetone are associated with higher SHAP values. The SHAP values can also be used to evaluate the feature contribution within the individual WIs used to reconstruct the overall welfare score. One of the animals with the largest number of individual WIs classified in the risk welfare class (B) in a month was selected as an example. The cow with identification number "-9223372036849293312" in the LEO database had a monthly record with three individual WIs above 20 (Fig. 7A): longevity (20.5), mastitis (23.6), and reproduction (21.5). For this cow, a monthly record with a good overall welfare score was selected for comparison. As expected, the SHAP values for the individual WIs showed opposite contribution patterns for longevity (Fig. 7B), mastitis (Fig. 7C), and reproduction (Fig. 7D) in the B (positive SHAP values) and G (negative SHAP values) classes. In addition, the SHAP values for the WI of subclinical acidosis in the B class have a mix of positive and negative contributions (Fig. 7E). In contrast, for the WI for subclinical ketosis, all features had negative SHAP values (Fig. 7F), which corroborates the absence of contribution of these individual WIs to the overall welfare score.

Counterfactual explanations for individualized management decisions

Counterfactual explanations add interpretability to the ML model and can thereby provide important information for management purposes on farms. Figure 8 illustrates the minimal changes in the input features required to convert a welfare risk outcome (class B) into either class G (left-hand side) or class I (right-hand side) for the longevity, mastitis, and reproduction WIs. For the longevity and mastitis WI, increases in fat, lactose, protein, and urea, together with a decrease in milk conductivity, were identified as the minimal changes needed to shift the prediction from B to G class. Similar patterns were observed for changing B to I class, with the main difference being the absence of changes in milk conductivity and protein (based on the raw, unscaled values). For the mastitis WI, the same pattern of changes as for longevity was observed when converting from B to G class. The shift from B to I class for mastitis differed only in fat content, which was instead suggested to decrease. For the reproduction WI, increases in acetone, lactose, protein, and urea, combined with reductions in fat and milk conductivity, were suggested as the minimal changes required to convert a B outcome to a G outcome. In contrast, to change the prediction from B class to I class, only a decrease in fat content and an increase in lactose and milk conductivity were indicated.
It is important to note that, for some features, the counterfactual explanations for the reproduction WI exhibited patterns opposite to those observed for the longevity and mastitis WIs. Notably, most of the proposed minimal changes in feature values align with the observed values for the same features in a month when the same animal was classified in the G class for overall welfare score (green lines in Fig. 8). Overall, the mean counterfactual explanations used to shift WI classes from B to G indicated that, in most cases, the suggested feature values lay between the mean values observed in the true B and G classes (Fig. 9). Exceptions were found for specific WIs: for mastitis, the counterfactuals suggested higher milk conductivity than in the B group and lower urea than in the G group; for subclinical acidosis, they suggested lower lactose than in both B and G; for subclinical ketosis, they indicated higher lactose but lower milk conductivity than in both B and G; and for reproduction, they suggested lower milk conductivity than in both B and G (Fig. 9). Summary statistics of the counterfactual explanations used to shift WI classes from B to G are reported in Supplementary Table S2.

Discussion

The interpretability of a predictive ML model is a key component to enable applicability and efficiency of data-driven decisions. The value of information, i.e., the benefit gained from using information to improve a decision, often by reducing uncertainty, is crucial in the context of precision livestock farming to help the development of better, scalable, and transparent decision-making frameworks [23, 24]. Interpretability and visualization have been reported as key features in the livestock sector [24, 25]. The inclusion of XAI techniques in these frameworks has the potential to improve not only the interpretability of the ML models but also the efficiency of predictions, and to provide better guidance for decision-making [19, 20].
Additionally, the interpretability of automated decision-making systems is already a requirement in different areas, such as in human health [20, 26]. XAI techniques can be applied in different stages of a framework, including pre-modeling, interpretation, and post-modeling [20]. The methodology used here applies XAI in these three areas. Therefore, XAI tools (Shapley values and counterfactual explanations) were used to evaluate model performance and feature importance, and to provide management insights from the predictive outcomes. In general, the performance of the models predicting the individual WIs was good (r > 0.97 for all the individual WIs). Regarding the reconstruction of the overall welfare score, the RF classifier was accurate, especially for the larger classes (B and I), but it under-detected the minority class G. When each class was given equal importance, the RF performance decreased (~0.84 balanced accuracy). However, this decrease in classification accuracy was a consequence of a conservative decision to avoid false positive predictions for the G class. Indeed, this is confirmed by a slight change applied to the predicted WIs (values were rounded to the nearest integer), which improved the true positive rate to ~80.5%, but also allowed for false positives. The selection of the threshold must thus be considered in each specific case, based on the consequences of including false-positive classifications within each overall welfare class. Another relevant point is the high RMSE observed in the reproduction and longevity WIs due to G class misclassification. Based on the original WI estimates obtained from AIA, the reproduction and longevity WIs exhibit lower monthly variability, which might have impacted the predictive performance of the trained models.
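The effect of rounding predicted scores before thresholding can be illustrated with a minimal Python sketch. The predicted values below are hypothetical, the G threshold (scores up to and including 10) and the half-up rounding rule are assumptions, and the function names are not taken from the study.

```python
import math

G_UPPER = 10  # assumption: scores up to and including 10 count as good (G)


def round_half_up(value: float) -> int:
    """Round to the nearest integer, with .5 always rounded upwards."""
    return math.floor(value + 0.5)


def overall_is_good(predicted_wis, rounded=False):
    """True when all five predicted WIs fall in the G class."""
    values = [round_half_up(v) if rounded else v for v in predicted_wis]
    return all(v <= G_UPPER for v in values)


# Hypothetical predictions for one monthly record; one WI sits just above 10
predicted = [4.2, 9.8, 10.3, 6.1, 7.7]
print(overall_is_good(predicted))                # strict threshold on raw predictions
print(overall_is_good(predicted, rounded=True))  # threshold after rounding
```

Under these assumptions, rounding effectively widens the G interval by half a unit, so records whose predictions narrowly exceed the threshold are recovered as G, at the cost of admitting some records whose true score lies in the I class. This is the same trade-off between true positives and false positives for the G class that is described above.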
The contributions of each feature to the predictive outcome were analysed to validate the accuracy of the predictions and to ensure the absence of non-logical patterns, such as a hypothetical increase in milk somatic cell counts to improve mastitis welfare scores. The analysis of the feature value distribution and the association with SHAP values indicated that for all the individual WIs, extreme values (in the observed distribution) of fat, protein, lactose, and urea were associated with higher SHAP values, especially in the B (risk) overall welfare class. Biologically, these results are supported by the fact that disturbances in energy balance can have negative effects in dairy cows in both positive and negative energy balance [27–33]. These results were especially reinforced by the pattern observed for the urea SHAP values. Both excessively high and low milk urea concentrations are associated with metabolic disorders, reproductive performance, and other health and welfare issues in dairy cattle, related to nitrogen metabolism [32, 34]. For example, low milk urea concentrations are associated with lameness [35, 36], while high milk urea concentrations are one of the main indicators of ketosis [33] in dairy cattle. Acetone also showed a direct association with higher ketosis welfare risk (higher SHAP values). These results are reinforced by the physiological alterations that occur after calving, when the cow often enters a state of negative energy balance, in which glucose demand exceeds supply, triggering the mobilization of adipose tissue and the conversion of fatty acids into ketone bodies, particularly acetone and β-hydroxybutyrate [37]. Indeed, previous studies have reported that high levels of milk acetone are an indicator of ketosis in dairy cows [38, 39] and suggested the in-line evaluation of acetone levels for subclinical ketosis risk assessment [40].
For the acidosis WI, fat and protein concentrations in the milk were associated with changes in the SHAP values for welfare risk. Ruminal acidosis is another common metabolic condition in dairy cattle and is associated with alterations in milk components, such as fat and protein [ 41 – 43 ]. Subacute ruminal acidosis can cause a reduction in fat percentage and an increase in protein percentage in milk [ 44 , 45 ]. In our study, lower milk fat and higher milk protein percentages in the B class for the acidosis WI were associated with higher SHAP values. This is in agreement with the known relationship between the fat-to-protein ratio and metabolic status in dairy cattle, including acidosis (a low fat-to-protein ratio is an indicator of subacute ruminal acidosis) [ 46 – 49 ]. An interaction between lactose (lower values, see Supplementary Fig. 5) and milk conductivity (higher values) was observed in the increase of SHAP values and, consequently, of the acidosis welfare risk. Lactose content in milk has shown negative correlations with energy balance measures (condition score, live weight in kilograms, and the product of condition score and live weight) in dairy cows [ 29 ]. A previous study reported that milk conductivity tends to rise several days before dairy cows are diagnosed with acidosis or ketosis [ 50 ].

Mastitis WI

As expected, the risk of mastitis was associated with lower levels of milk components such as lactose, fat, and protein, in line with current evidence [ 51 – 53 ]. As for the other WIs, urea levels were also associated with changes in the SHAP values for the mastitis WI: both low and high urea values in the milk were associated with a higher mastitis welfare risk. Milk urea levels were previously associated with pathogen-specific clinical mastitis [ 52 ]. The interaction of low lactose levels and high milk conductivity with higher SHAP values was also observed for the mastitis WI.
Milk electrical conductivity is a well-established biomarker for mastitis in dairy cows and is largely determined by the concentration of the main milk ions [ 50 , 54 – 56 ], especially Na+, K+, and Cl− [ 57 ]. After intramammary infection, disruption of the mammary epithelial tight junctions and of the ion-pumping system promotes the movement of Na+ and Cl− into milk and the leakage of lactose and K+ toward the extracellular fluid, resulting in increased milk electrical conductivity [ 58 ]. This mechanism is consistent with the pattern observed here, in which higher mastitis risk was associated with the combination of increased conductivity and reduced lactose levels. In Gyr cows, milk conductivity is negatively associated with lactose content and positively associated with somatic cell counts, while lactose content itself is negatively related to somatic cell counts, reinforcing the pattern observed here [ 54 ].

Longevity WI

For the longevity WI, both low and high urea levels were associated with higher welfare risk. Zhao et al. [ 32 ] suggested that both low and high milk urea concentrations may be directly or indirectly related to cow longevity through their association with health issues and, consequently, with culling. The same pattern of contribution to the increase in SHAP values was observed in the ketosis, acidosis, and mastitis welfare indicators, reinforcing this association. For the G class of the longevity WI, inverted U-shapes were observed for milk conductivity, lactose, and protein content. These results need to be interpreted carefully, as this pattern is observed within a narrow interval for a specific group and might reflect the absence of intermediate points caused by sample size or measurement biases. In the G class, higher milk acetone levels (up to 0.4) were associated with lower longevity welfare risk.
This relationship is difficult to explain, but in low-productivity dairy herds, an increase in acetone level was associated with higher productivity [ 59 ]. For the reproduction WI, an increase in welfare risk was associated with higher fat, protein, and lactose content (despite a slight U-shaped pattern) in the B class. This might be explained by the association observed between highly productive cows and fertility problems [ 60 ]. Interestingly, the WIs for reproduction and longevity exhibit opposite U-shaped patterns between the G and B classes for the SHAP values of almost all features. These results reinforce the need to evaluate feature importance using group-specific approaches. This pattern should therefore be examined further to understand the relationship between these variables and welfare risk across different animal classes.

SHAP values

The SHAP values were also used to assess the models' reliability. The SHAP values for the individual WIs were compared between contrasting overall welfare scores (B vs G classes) for the same cow in different months. In the G scenario, all features across all individual WIs were associated with negative SHAP values. In contrast, in the B scenario, only the longevity, mastitis, and reproduction WIs were associated with negative SHAP values. In the G scenario, negative SHAP values for all features were observed for the subclinical ketosis WI, reinforcing the impact of this indicator on the overall welfare score. In contrast, for the subclinical acidosis WI, positive SHAP values were observed for urea, lactose, protein, and acetone. This suggests that urea, lactose, protein, and acetone should be carefully monitored to avoid an increasing risk of subclinical acidosis later in life.

Precision farming aims to enhance production efficiency by using advanced information and communication technologies that optimize resource use and precisely control the production process, while maintaining high standards of animal welfare [ 10 ].
The evaluation of individual requirements for customized management decisions may be an important tool for achieving this goal, since it allows the identification of animal-specific needs and supports targeted strategies to reduce somatic cell counts, increase milk lactose content, decrease milk urea concentration, and ultimately improve production efficiency, udder health, and animal welfare [ 23 , 61 – 64 ]. XAI techniques can be a powerful means of leveraging customized, data-driven decisions in the livestock sector. In this context, the use of techniques such as counterfactual explanations provides a useful tool for vets and farm consultants to reduce the risk posed by consequential automated decisions, increase strategic behavior, and better understand model recommendations [ 65 – 68 ]. In the present study, the comparison of contrasting overall welfare scores (B vs G classes) for the same animal was used as a proof of concept. In general, the shifts in feature values suggested by the counterfactual explanations followed a logical pattern, with proposed values lying between the values observed in the true G and B predicted classes. Nevertheless, for the mastitis WI an increase in milk conductivity was suggested, which is not logical because, as mentioned above, higher milk conductivity is associated with mastitis cases [ 50 , 54 – 56 ]. Thus, counterfactual explanations should be interpreted with caution. However, an increase in fat and lactose and a decrease in urea compared with the B class were also proposed, which coincides with previously reported patterns of association with mastitis in dairy cows [ 52 , 54 , 69 ]. In addition, counterfactual explanations are best applied at the level of individual cows, where they can support data-driven management decisions while accounting for biological plausibility and feature interactions.
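The idea of a counterfactual that stays biologically plausible can be sketched as a constrained search: among candidate trait values restricted to a plausible range, find the smallest change that flips the predicted class from B to G. The sketch below is an assumption-laden illustration, not the algorithm used in this study: the classifier is a hand-written stand-in for a trained model, the plausible ranges and units are hypothetical, and a simple grid search replaces the optimisation used by dedicated counterfactual libraries.

```python
from itertools import product
from typing import Dict, List, Optional

Traits = Dict[str, float]

# Hypothetical plausible ranges, e.g. spanning values seen in healthy (G) cows.
PLAUSIBLE = {
    "lactose": (4.4, 5.0),        # illustrative, in %
    "conductivity": (4.0, 5.5),   # illustrative, arbitrary units
}


def predict_class(x: Traits) -> str:
    """Toy stand-in for a trained classifier: 'B' (risk) or 'G' (good)."""
    risk = 2.0 * (4.8 - x["lactose"]) + 1.0 * (x["conductivity"] - 5.0)
    return "B" if risk > 0.5 else "G"


def candidates(name: str, original: float, steps: int = 13) -> List[float]:
    """Evenly spaced values inside the plausible range, plus 'no change'."""
    lo, hi = PLAUSIBLE[name]
    values = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    if lo <= original <= hi:
        values.append(original)
    return values


def counterfactual(x: Traits, target: str = "G") -> Optional[Traits]:
    """Smallest range-normalised L1 change that yields the target class."""
    names = list(PLAUSIBLE)
    best, best_cost = None, float("inf")
    for values in product(*(candidates(n, x[n]) for n in names)):
        cand = dict(zip(names, values))
        if predict_class(cand) != target:
            continue
        cost = sum(abs(cand[n] - x[n]) / (PLAUSIBLE[n][1] - PLAUSIBLE[n][0])
                   for n in names)
        if cost < best_cost:
            best, best_cost = cand, cost
    return best


cow = {"lactose": 4.5, "conductivity": 5.4}   # hypothetical at-risk record
cf = counterfactual(cow)

if __name__ == "__main__":
    print("original:", cow, "->", predict_class(cow))
    print("counterfactual:", cf, "->", predict_class(cf) if cf else None)
```

Because candidate values are drawn only from the plausible ranges, the search cannot propose, for instance, a conductivity above the upper bound set for healthy animals. This is one simple way to encode the expert constraints that the text argues should accompany counterfactual suggestions.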
For example, if a counterfactual suggests increasing milk conductivity above the levels typically observed in the G (healthy) class, this recommendation should not be followed blindly; instead, its impact on other traits (such as milk composition) should be evaluated, and any adjustments should remain within realistic ranges for healthy animals. In this sense, counterfactuals should be interpreted as tools to explore feasible management scenarios rather than as rigid prescriptions. Nevertheless, their contribution to decision-making in dairy cow production systems needs to be tested in practice to assess their true applicability, limitations, and potential to improve on-farm management. In addition, the inclusion of other biomarkers, such as β-hydroxybutyrate, specific fatty acids, and lactate dehydrogenase, among others, might help to improve the predictive performance and explainability of the models proposed here and enhance the accuracy of management decisions.

Conclusions

The results shown here demonstrate how integrating XAI tools into predictive frameworks for dairy cow welfare has the potential to enhance both the transparency and the practical usefulness of data-driven decisions on the farm. The models developed here achieved high predictive performance for individual WIs and acceptable reconstruction of the overall welfare score. The SHAP-based analyses revealed feature–outcome relationships that are largely consistent with established physiological and metabolic knowledge, thereby reinforcing confidence in the biological plausibility of the predictions. At the same time, patterns such as U-shaped effects and class-specific differences in feature importance highlight the need for group- and context-specific interpretation rather than one-size-fits-all rules. In addition, counterfactual explanations proved valuable for translating model output into actionable management scenarios.
However, occasional non-intuitive suggestions, such as increased milk conductivity to improve mastitis scores, underline that counterfactuals must be interpreted critically, in light of domain knowledge and potential interactions among traits, and used as decision support rather than as prescriptive recommendations. Overall, our findings suggest that the combined use of SHAP and counterfactual explanations offers a promising avenue to build more interpretable, robust, and customizable decision-support systems in precision livestock farming. However, their on-farm applicability, economic impact, and usability for farmers and advisors still need to be systematically evaluated in real production environments with appropriate health and economic assessment.

Declarations

Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Availability of data and materials
The datasets supporting the results of this article are available in the Zenodo repository (https://doi.org/10.5281/zenodo.18834616).

Competing interests
The authors declare that they have no competing interests.

Funding
This work was supported by the Veterinary Health Innovation Engine sponsored by Zoetis.

Authors' contributions
PASF participated in the conceptualization, data retrieval, formal analysis, implementation of the ML and XAI algorithms, and writing and review of the manuscript. ASV participated in the conceptualization, data retrieval, and writing and review of the manuscript. BGG participated in the conceptualization, data retrieval, and writing and review of the manuscript. JCD participated in the data retrieval and in the writing and review of the manuscript. NG participated in the data retrieval and in the writing and review of the manuscript. ADW provided funding and contributed to the writing and review of the manuscript. JJA participated in the conceptualization, data retrieval, formal analysis, and writing and review of the manuscript.

Acknowledgements
Project LEO funded under Submeasure 16.2 – PSRN 2014/2020.
References

1. Grethe H. The Economics of Farm Animal Welfare. Annual Rev Resource Econ. 2017;9. https://doi.org/10.1146/annurev-resource-100516-053419
2. Tuyttens FAM, Molento CFM, Benaissa S. Twelve Threats of Precision Livestock Farming (PLF) for Animal Welfare. Front Veterinary Sci. 2022;9. https://doi.org/10.3389/fvets.2022.889623
3. Verbeke WAJ, Viaene J. Ethical challenges for livestock production: Meeting consumer concerns about meat safety and animal welfare. J Agric Environ Ethics. 2000;12. https://doi.org/10.1023/A:1009538613588
4. Petherick JC. Animal welfare issues associated with extensive livestock production: The northern Australian beef cattle industry. Appl Anim Behav Sci. 2005. https://doi.org/10.1016/j.applanim.2005.05.009
5. Fraser D. Animal Welfare and the Intensification of Animal Production. In: International Library of Environmental, Agricultural and Food Ethics. 2008. https://doi.org/10.1007/978-1-4020-8722-6_12
6. Llonch P, Haskell MJ, Dewhurst RJ, Turner SP. Current available strategies to mitigate greenhouse gas emissions in livestock systems: An animal welfare perspective. Animal. 2017;11. https://doi.org/10.1017/S1751731116001440
7. Broom DM. Animal welfare complementing or conflicting with other sustainability issues. Appl Anim Behav Sci. 2019;219. https://doi.org/10.1016/j.applanim.2019.06.010
8. Lawrence AB, Conington J, Simm G. Breeding and animal welfare: Practical and theoretical advantages of multi-trait selection. Anim Welf. 2004;13 SUPPL. https://doi.org/10.1017/s0962728600014585
9. Papakonstantinou GI, Voulgarakis N, Terzidou G, Fotos L, Giamouri E, Papatsiros VG. Precision livestock farming technology: applications and challenges of animal welfare and climate change. Agriculture. 2024;14:620.
10. Banhazi TM, Babinszky L, Halas V, Tscharke M. Precision livestock farming: Precision feeding technologies and sustainable livestock production. Int J Agricultural Biol Eng. 2012;5:54–61. https://doi.org/10.3965/j.ijabe.20120504.006
11. Schillings J, Bennett R, Rose DC. Exploring the Potential of Precision Livestock Farming Technologies to Help Address Farm Animal Welfare. Front Anim Sci. 2021;2. https://doi.org/10.3389/fanim.2021.639678
12. Neethirajan S, Scott S, Mancini C, Boivin X, Strand E. Human-computer interactions with farm animals—enhancing welfare through precision livestock farming and artificial intelligence. Front Vet Sci. 2024;11:1490851.
13. García R, Aguilar J, Toro M, Pinto A, Rodríguez P. A systematic literature review on the use of machine learning in precision livestock farming. Comput Electron Agric. 2020. https://doi.org/10.1016/j.compag.2020.105826
14. Benjamin M, Yik S. Precision livestock farming in swine welfare: A review for swine practitioners. Animals. 2019;9:1–21. https://doi.org/10.3390/ani9040133
15. Mahinpei A, Clark J, Lage I, Doshi-Velez F, Pan W. Promises and pitfalls of black-box concept learning models. arXiv preprint arXiv:2106.13314. 2021.
16. Rosati A. Guiding principles of AI: application in animal husbandry and other considerations. Anim Front. 2024;14. https://doi.org/10.1093/af/vfae045
17. Azodi CB, Tang J, Shiu SH. Opening the Black Box: Interpretable Machine Learning for Geneticists. Trends Genet. 2020;36. https://doi.org/10.1016/j.tig.2020.03.005
18. Rudin C. Why black box machine learning should be avoided for high-stakes decisions, in brief. Nat Reviews Methods Primers. 2022;2. https://doi.org/10.1038/s43586-022-00172-0
19. Angelov PP, Soares EA, Jiang R, Arnold NI, Atkinson PM. Explainable artificial intelligence: an analytical review. Wiley Interdiscip Rev Data Min Knowl Discov. 2021;11. https://doi.org/10.1002/widm.1424
20. Minh D, Wang HX, Li YF, Nguyen TN. Explainable artificial intelligence: a comprehensive review. Artif Intell Rev. 2022;55. https://doi.org/10.1007/s10462-021-10088-y
21. Candel A, Parmar V, LeDell E, Arora A. Deep learning with H2O. H2O.ai Inc. 2016.
22. Mayer M. shapviz: SHAP Visualizations. R package version 0.9.3. 2024.
23. Rojo-Gimeno C, van der Voort M, Niemi JK, Lauwers L, Kristensen AR, Wauters E. Assessment of the value of information of precision livestock farming: A conceptual framework. NJAS - Wageningen J Life Sci. 2019;90–1. https://doi.org/10.1016/j.njas.2019.100311
24. Elliott KC, Werkheiser I. A Framework for Transparency in Precision Livestock Farming. Animals. 2023;13. https://doi.org/10.3390/ani13213358
25. Mouzakitis S, Tsapelas G, Pelekis S, Ntanopoulos S, Askounis D, Osinga S, et al. Investigation of Common Big Data Analytics and Decision-Making Requirements Across Diverse Precision Agriculture and Livestock Farming Use Cases. In: IFIP Advances in Information and Communication Technology. 2020. https://doi.org/10.1007/978-3-030-39815-6_14
26. Albrecht JP. How the GDPR Will Change the World. Eur Data Prot Law Rev. 2017;2. https://doi.org/10.21552/edpl/2016/3/4
27. Gunn PJ, Schoonmaker JP, Lemenager RP, Bridges GA. Feeding excess crude protein to gestating and lactating beef heifers: Impact on parturition, milk composition, ovarian function, reproductive efficiency and pre-weaning progeny growth. Livest Sci. 2014;167. https://doi.org/10.1016/j.livsci.2014.05.010
28. Sinclair KD, Garnsworthy PC, Mann GE, Sinclair LA. Reducing dietary protein in dairy cow diets: Implications for nitrogen utilization, milk production, welfare and fertility. Animal. 2014;8. https://doi.org/10.1017/S1751731113002139
29. Friggens NC, Ridder C, Løvendahl P. On the use of milk composition measures to predict the energy balance of dairy cows. J Dairy Sci. 2007;90. https://doi.org/10.3168/jds.2006-821
30. Carlsson J, Pehrson B. The influence of the dietary balance between energy and protein on milk urea concentration. Experimental trials assessed by two different protein evaluation systems. Acta Vet Scand. 1994;35.
31. Kirchgessner M, Kreuzer M, Roth-Maier DA. Milk urea and protein content to diagnose energy and protein malnutrition of dairy cows. Arch Tierernahr. 1986;36. https://doi.org/10.1080/17450398609425260
32. Zhao X, Zheng N, Zhang Y, Wang J. The role of milk urea nitrogen in nutritional assessment and its relationship with phenotype of dairy cows: A review. Anim Nutr. 2025;20:33–41.
33. Rius AG, McGilliard ML, Umberger CA, Hanigan MD. Interactions of energy and predicted metabolizable protein in determining nitrogen efficiency in the lactating dairy cow. J Dairy Sci. 2010;93. https://doi.org/10.3168/jds.2008-1777
34. Dreyer C, Losand B, Spiekers H, Hummel J. Influence of fat-to-protein ratio and udder health parameters on the milk urea content of dairy cows. J Dairy Sci. 2025;108. https://doi.org/10.3168/jds.2024-25492
35. Necula DC, Warren HE, Taylor-Pickard J, Simiz E, Stef L. Associations of Lameness with Indicators of Nitrogen Metabolism and Excretion in Dairy Cows. Agric (Switzerland). 2022;12. https://doi.org/10.3390/agriculture12122109
36. Slovák P, Hisira V, Marčeková P, Mudroň P. The relationship between claw diseases of dairy cows and the protein and urea content of the milk. Acta Fytotechnica et Zootechnica. 2021;24. https://doi.org/10.15414/AFZ.2021.24.MI-PRAP.102-104
37. David Baird G. Primary Ketosis in the High-Producing Dairy Cow: Clinical and Subclinical Disorders, Treatment, Prevention, and Outlook. J Dairy Sci. 1982;65. https://doi.org/10.3168/jds.S0022-0302(82)82146-2
38. Heuer C, Luinge HJ, Lutz ETG, Schukken YH, Van Der Maas JH, Wilmink H, et al. Determination of acetone in cow milk by Fourier transform infrared spectroscopy for the detection of subclinical ketosis. J Dairy Sci. 2001;84. https://doi.org/10.3168/jds.S0022-0302(01)74510-9
39. Klein SL, Scheper C, May K, König S. Genetic and nongenetic profiling of milk β-hydroxybutyrate and acetone and their associations with ketosis in Holstein cows. J Dairy Sci. 2020;103. https://doi.org/10.3168/jds.2020-18339
40. Abeni F, Ferla A, Negrini R, Galli A. Large scale subclinical ketosis risk assessment in dairy herds using predicted milk acetone and β-hydroxybutyrate via MIR technology. J Dairy Res. 2025;92:45–51.
41. Stone W. The effect of subclinical rumen acidosis on milk components. 1999.
42. Plaizier JC, Krause DO, Gozho GN, McBride BW. Subacute ruminal acidosis in dairy cows: The physiological causes, incidence and consequences. Vet J. 2008;176. https://doi.org/10.1016/j.tvjl.2007.12.016
43. Alzahal O, Or-Rashid MM, Greenwood SL, McBride BW. Effect of subacute ruminal acidosis on milk fat concentration, yield and fatty acid profile of dairy cows receiving soybean oil. J Dairy Res. 2010;77. https://doi.org/10.1017/S0022029910000294
44. Fairfield AM, Plaizier JC, Duffield TF, Lindinger MI, Bagg R, Dick P, et al. Effects of prepartum administration of a monensin controlled release capsule on rumen pH, feed intake, and milk production of transition dairy cows. J Dairy Sci. 2007;90. https://doi.org/10.3168/jds.S0022-0302(07)71577-1
45. Khafipour E, Krause DO, Plaizier JC. Induction of subacute ruminal acidosis (SARA) by replacing alfalfa hay with alfalfa pellets does not stimulate inflammatory response in lactating dairy cows. In: Journal of Dairy Science; 2007. p. 654.
46. Antanaitis R, Džermeikaitė K, Januškevičius V, Šimonytė I, Baumgartner W. In-Line Registered Milk Fat-to-Protein Ratio for the Assessment of Metabolic Status in Dairy Cows. Animals. 2023;13. https://doi.org/10.3390/ani13203293
47. Atalay H. Milk fat/protein ratio in ketosis and acidosis. Balıkesir Sağlık Bilimleri Dergisi. 2019;8:143–6.
48. Zschiesche M, Mensching A, Sharifi AR, Hummel J. The Milk Fat-to-Protein Ratio as Indicator for Ruminal pH Parameters in Dairy Cows: A Meta-Analysis. Dairy. 2020;1. https://doi.org/10.3390/dairy1030017
49. Oetzel GR. Subacute ruminal acidosis in dairy herds: physiology, pathophysiology, milk fat responses, and nutritional management. In: 40th Annual Conference, American Association of Bovine Practitioners. 2007. pp. 89–119.
50. Inzaghi V, Zucali M, Thompson PD, Penry JF, Reinemann DJ. Changes in electrical conductivity, milk production rate and milk flow rate prior to clinical mastitis confirmation. Ital J Anim Sci. 2021;20. https://doi.org/10.1080/1828051X.2021.1984852
51. Antanaitis R, Juozaitienė V, Jonike V, Baumgartner W, Paulauskas A. Milk lactose as a biomarker of subclinical mastitis in dairy cows. Animals. 2021;11. https://doi.org/10.3390/ani11061736
52. Kayano M, Itoh M, Kusaba N, Hayashiguchi O, Kida K, Tanaka Y, et al. Associations of the first occurrence of pathogen-specific clinical mastitis with milk yield and milk composition in dairy cows. J Dairy Res. 2018;85. https://doi.org/10.1017/S0022029918000456
53. Costa A, Bovenhuis H, Egger-Danner C, Fuerst-Waltl B, Boutinaud M, Guinard-Flament J, et al. Mastitis has a cumulative and lasting effect on milk yield and lactose content in dairy cows. J Dairy Sci. 2025;108. https://doi.org/10.3168/jds.2024-25467
54. Boas DFV, Filho AEV, Pereira MA, Junior LCR, El Faro L. Association between electrical conductivity and milk production traits in dairy Gyr cows. J Appl Anim Res. 2017;45. https://doi.org/10.1080/09712119.2016.1150849
55. Norberg E. Electrical conductivity of milk as a phenotypic and genetic indicator of bovine mastitis: A review. Livest Prod Sci. 2005;96. https://doi.org/10.1016/j.livprodsci.2004.12.014
56. Norberg E, Hogeveen H, Korsgaard IR, Friggens NC, Sloth KHMN, Løvendahl P. Electrical conductivity of milk: Ability to predict mastitis status. J Dairy Sci. 2004;87. https://doi.org/10.3168/jds.S0022-0302(04)73256-7
57. Dadousis C, Pegolo S, Rosa GJM, Gianola D, Bittante G, Cecchinato A. Pathway-based genome-wide association analysis of milk coagulation properties, curd firmness, cheese yield, and curd nutrient recovery in dairy cattle. J Dairy Sci. 2017;100. https://doi.org/10.3168/jds.2016-11587
58. Dadousis C, Pegolo S, Rosa GJM, Bittante G, Cecchinato A. Genome-wide association and pathway-based analysis using latent variables related to milk protein composition and cheesemaking traits in dairy cattle. J Dairy Sci. 2017;100. https://doi.org/10.3168/jds.2017-13219
59. Heuer C, Wangler A, Schukken YH, Noordhuizen JPTM. Variability of Acetone in Milk in a Large Low-Production Dairy Herd: A Longitudinal Case Study. Vet J. 2001;161. https://doi.org/10.1053/tvjl.2000.0562
60. Dobson H, Smith RF, Royal MD, Knight CH, Sheldon IM. The high-producing dairy cow and its reproductive performance. Reproduction in Domestic Animals. 2007;42 SUPPL. 2. https://doi.org/10.1111/j.1439-0531.2007.00906.x
61. Werkheiser I. Precision Livestock Farming and Farmers’ Duties to Livestock. J Agric Environ Ethics. 2018;31. https://doi.org/10.1007/s10806-018-9720-0
62. Kleen JL, Guatteo R. Precision Livestock Farming: What Does It Contain and What Are the Perspectives? Animals. 2023;13. https://doi.org/10.3390/ani13050779
63. Tekın K, Yurdakök-Dıkmen B, Kanca H, Guatteo R. Precision livestock farming technologies: Novel direction of information flow. Ankara Universitesi Veteriner Fakultesi Dergisi. 2021;68. https://doi.org/10.33988/auvfd.837485
64. Jiang B, Tang W, Cui L, Deng X. Precision Livestock Farming Research: A Global Scientometric Review. Animals. 2023;13. https://doi.org/10.3390/ani13132096
65. VanNostrand PM, Hofmann DM, Ma L, Genin B, Huang R, Rundensteiner EA. Counterfactual explanation analytics: Empowering lay users to take action against consequential automated decisions. Proceedings of the VLDB Endowment. 2024;17:4349–52.
66. Tsirtsis S, Gomez-Rodriguez M. Decisions, counterfactual explanations and strategic behavior. In: Advances in Neural Information Processing Systems. 2020.
67. Tan J, Xu S, Ge Y, Li Y, Chen X, Zhang Y. Counterfactual Explainable Recommendation. In: International Conference on Information and Knowledge Management, Proceedings. 2021. https://doi.org/10.1145/3459637.3482420
68. Shang R, Feng KJK, Shah C. Why Am I Not Seeing It? Understanding Users’ Needs for Counterfactual Explanations in Everyday Recommendations. In: ACM International Conference Proceeding Series. 2022. https://doi.org/10.1145/3531146.3533189
69. Zandkarimi F, Vanegas J, Fern X, Maier CS, Bobe G. Metabotypes with elevated protein and lipid catabolism and inflammation precede clinical mastitis in prepartal transition dairy cows. J Dairy Sci. 2018;101. https://doi.org/10.3168/jds.2017-13977

Editorial history at journal: first submitted and submission checks completed 16 Mar, 2026; editor invited and assigned 17 Mar, 2026; reviewers invited 26 Mar, 2026; reviewers agreed 26 Mar, 28 Mar, and 27 Apr, 2026; reviews received 14 Apr and 07 May, 2026.
Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-9082135","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":612707957,"identity":"cab1c44a-1428-4f5a-9115-75dc1cac6a6c","order_by":0,"name":"Pablo Augusto Souza Fonseca","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAAzUlEQVRIiWNgGAWjYFACxgcQmr2BgSGBOC3MBhCa5wDJWiSIVM9gzsDM+Ljgj408/8zHjz88qGGQk28goMWygZnZeGZbmuGM22lmEgnHGIwNDhDQYnCA/5g0b8PhBIbbOWwMiQ0MiRsIOczgADObNM+f/wnyN88wfwBqqZ9PyGEQLWwHEgxu8DBIALUkMBBymGUz0C+8bcmGG8+A/SJhuIGQFnP2ZsbHPH/s5OWOH3788UeNjTzBEDNgRuVLEFAP0kJYySgYBaNgFIx4AADlmjjVtkdTrQAAAABJRU5ErkJggg==","orcid":"","institution":"Instituto de Ganadería de Montaña (CSIC-Univ. 
de León)","correspondingAuthor":true,"prefix":"","firstName":"Pablo","middleName":"Augusto Souza","lastName":"Fonseca","suffix":""},{"id":612707958,"identity":"84045186-ed55-4091-9ec1-0ceacd8e30f4","order_by":1,"name":"Aroa Suárez-Vega","email":"","orcid":"","institution":"Universidad de León","correspondingAuthor":false,"prefix":"","firstName":"Aroa","middleName":"","lastName":"Suárez-Vega","suffix":""},{"id":612707959,"identity":"877acaad-65d1-49c0-8e94-834ce801768c","order_by":2,"name":"Beatriz Gutiérrez-Gil","email":"","orcid":"","institution":"Universidad de León","correspondingAuthor":false,"prefix":"","firstName":"Beatriz","middleName":"","lastName":"Gutiérrez-Gil","suffix":""},{"id":612707960,"identity":"b172a6e5-e03b-4a19-8ab5-4258678573e5","order_by":3,"name":"Juan Christos Dadousis","email":"","orcid":"","institution":"University of Surrey","correspondingAuthor":false,"prefix":"","firstName":"Juan","middleName":"Christos","lastName":"Dadousis","suffix":""},{"id":612707961,"identity":"f53a3cf5-fe73-4e56-b278-b537a314faf6","order_by":4,"name":"Nophar Geifman","email":"","orcid":"","institution":"University of Surrey","correspondingAuthor":false,"prefix":"","firstName":"Nophar","middleName":"","lastName":"Geifman","suffix":""},{"id":612707962,"identity":"295f2a93-548e-4f89-9bc1-808b2a347a11","order_by":5,"name":"Anthony D. 
Whetton","email":"","orcid":"","institution":"University of Surrey","correspondingAuthor":false,"prefix":"","firstName":"Anthony","middleName":"D.","lastName":"Whetton","suffix":""},{"id":612707963,"identity":"7902042a-7564-4022-a184-757c2a210b2e","order_by":6,"name":"Juan José Arranz","email":"","orcid":"","institution":"Universidad de León","correspondingAuthor":false,"prefix":"","firstName":"Juan","middleName":"José","lastName":"Arranz","suffix":""}],"badges":[],"createdAt":"2026-03-10 09:39:11","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-9082135/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-9082135/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":105607601,"identity":"12632e3a-fa56-495f-a0eb-87899ffc0e38","added_by":"auto","created_at":"2026-03-28 00:41:09","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":122396,"visible":true,"origin":"","legend":"\u003cp\u003eEvaluation of the misclassification of true good (G) overall welfare scores. A) Root mean squared errors (RMSE) for the individual welfare indicators in the subset of true G overall welfare score misclassified. B) Number of misclassified individual welfare indicators in the subset of true G overall welfare score misclassified (left-hand side) and number of records of individual welfare indicators that are true G class (right-hand side). 
C) Feature distribution values in the subset composed of true G overall welfare scores misclassified (red), true G overall welfare scores classified properly (green), true intermediary (I) overall welfare scores classified properly (grey), and true B overall welfare scores classified properly (orange).\u003c/p\u003e","description":"","filename":"image1.png","url":"https://assets-eu.researchsquare.com/files/rs-9082135/v1/96bd1aac3bb779f7eff28bb7.png"},{"id":105607604,"identity":"b81adccc-ca80-4b6b-9507-e5209ad24435","added_by":"auto","created_at":"2026-03-28 00:41:09","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":1016007,"visible":true,"origin":"","legend":"\u003cp\u003eBeeswarm plots presenting the relationship between the SHAP values and the feature distribution for the subset of risk (B), intermediary (I) and good (G) individual welfare indicators properly classified. A) Mastitis, B) Subclinical ketosis, C) Subclinical acidosis, D) Longevity, and E) Reproduction.\u003c/p\u003e","description":"","filename":"image2.png","url":"https://assets-eu.researchsquare.com/files/rs-9082135/v1/5b3e2990db65b017a279f8c3.png"},{"id":105728357,"identity":"078dbf31-38ad-494d-baac-5c7251946165","added_by":"auto","created_at":"2026-03-30 11:11:34","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":99857,"visible":true,"origin":"","legend":"\u003cp\u003eInteractions between SHAP values and pairs of features employed in the predictions of mastitis welfare indicators for the class B (risk welfare). The dot colors represent the scale of milk conductivity (milk cond.), where darker dots represent smaller milk cond. values. The histogram at the bottom, on the right-hand side, shows the distribution of milk cond. in the whole sample. 
To improve visualization, based on the milk conductivity distribution, values below 600 and above 1200 were truncated at these levels.\u003c/p\u003e","description":"","filename":"image3.png","url":"https://assets-eu.researchsquare.com/files/rs-9082135/v1/4b395f5778c1ce8b20966860.png"},{"id":105728330,"identity":"69413f9e-3386-4754-b7d3-ffe7fa62fea9","added_by":"auto","created_at":"2026-03-30 11:11:28","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":321307,"visible":true,"origin":"","legend":"\u003cp\u003eInteractions between urea and acetone across other features and SHAP values observed in the subsets of good (G) and risk (B) classes for subclinical ketosis welfare indicator. A) Urea values in the B class, B) Urea values in the G class, C) Acetone values in the G class, D) Acetone values in the B class.\u003c/p\u003e","description":"","filename":"image4.png","url":"https://assets-eu.researchsquare.com/files/rs-9082135/v1/c56756031a1567623e62a832.png"},{"id":105728719,"identity":"e515758e-7b0f-4f20-8ca9-5a4f8b478470","added_by":"auto","created_at":"2026-03-30 11:12:32","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":92869,"visible":true,"origin":"","legend":"\u003cp\u003eInteractions between milk conductivity (Milk cond.), urea, acetone, fat, and protein values with the SHAP value in the good and risk classes for acidosis welfare indicator.\u003c/p\u003e","description":"","filename":"image5.png","url":"https://assets-eu.researchsquare.com/files/rs-9082135/v1/3e178357baa8fc8b37ba8670.png"},{"id":105728391,"identity":"aae7281a-760b-4333-8279-f5387cba2f31","added_by":"auto","created_at":"2026-03-30 11:11:42","extension":"png","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":79718,"visible":true,"origin":"","legend":"\u003cp\u003eRelationship between milk conductivity (Milk cond.), urea, lactose, fat, protein, and acetone 
values with the SHAP value in the good and risk classes for longevity (A) and reproduction (B) welfare indicators.\u003c/p\u003e","description":"","filename":"image6.png","url":"https://assets-eu.researchsquare.com/files/rs-9082135/v1/eb26549621d483415b40aabb.png"},{"id":105752054,"identity":"f19dd732-f998-4e4b-8a86-3f9ae0ca93ca","added_by":"auto","created_at":"2026-03-30 15:53:54","extension":"png","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":340156,"visible":true,"origin":"","legend":"\u003cp\u003eComparison of feature contributions for the individual welfare indicators (WIs) between a good (G) and risk (B) overall welfare score for the same cow (\"-9223372036849293312\"). A) Bar plots for the five individual WI scores between G and B classes. The SHAP values obtained for the features used to predict the individual WIs for longevity (B), mastitis (C), reproduction (D), subclinical acidosis (E), and subclinical ketosis (F) in the B and G classes, upper and lower plots, respectively.\u003c/p\u003e","description":"","filename":"image7.png","url":"https://assets-eu.researchsquare.com/files/rs-9082135/v1/02a313eb2172d8ce5770f93a.png"},{"id":105607605,"identity":"5fd5cd44-8ed8-4759-82b8-43cab9333b9b","added_by":"auto","created_at":"2026-03-28 00:41:09","extension":"png","order_by":8,"title":"Figure 8","display":"","copyAsset":false,"role":"figure","size":234136,"visible":true,"origin":"","legend":"\u003cp\u003eCounterfactual explanations obtained as minimal changes (orange lines) in the feature values to shift the risk overall welfare score (B) prediction for the animal \"-9223372036849293312\" to a good (G, left-hand side) and intermediary (I, right-hand side) overall welfare score. Changes were proposed for Longevity (A), Mastitis (B), and Reproduction (C) individual welfare indicators, as those were classified as B classes for the same animal. 
The proposed feature values were compared with the observed values in a monthly record originally classified as G (green line) and B (red line).\u003c/p\u003e","description":"","filename":"image8.png","url":"https://assets-eu.researchsquare.com/files/rs-9082135/v1/d45daf72eb6f0d4c73724a88.png"},{"id":105607603,"identity":"2b94faa1-3851-4cb4-ad2e-7522023f4819","added_by":"auto","created_at":"2026-03-28 00:41:09","extension":"png","order_by":9,"title":"Figure 9","display":"","copyAsset":false,"role":"figure","size":116999,"visible":true,"origin":"","legend":"\u003cp\u003eMean minimal changes obtained across all the counterfactual explanations (orange lines) obtained to shift the risk welfare predictions to good welfare for the mastitis (A), subclinical acidosis (B), subclinical ketosis (C), longevity (D), and reproduction (E) individual welfare indicators. The proposed feature values were compared with the observed values in the mean monthly records originally classified as G (green line) and B (red line).\u003c/p\u003e","description":"","filename":"image9.png","url":"https://assets-eu.researchsquare.com/files/rs-9082135/v1/5309c6be984f376d09815839.png"},{"id":105752615,"identity":"149f9418-7936-476f-8a26-d47171160638","added_by":"auto","created_at":"2026-03-30 16:03:02","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":2926862,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-9082135/v1/08d88e72-9e21-446f-b5f0-dab4a2ab1280.pdf"}],"financialInterests":"No competing interests reported.","formattedTitle":"Improving interpretability and applicability of welfare management decisions in dairy cows through explainable artificial intelligence","fulltext":[{"header":"Background","content":"\u003cp\u003eThe effective and sustained improvement of animal welfare is a global demand in the livestock sector, with increasingly relevant health, 
ethical, legal, and economic implications in livestock production [\u003cspan additionalcitationids=\"CR2\" citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e]. The welfare of farmed animals is one of the objectives pursued by the Common Agricultural Policy in the European Union, including Italy. An enhanced welfare is associated with better health, productivity, longevity, and sustainability in the production chain [\u003cspan additionalcitationids=\"CR5 CR6\" citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e]. Therefore, monitoring animal welfare is a crucial process in the current context of the livestock production chain. Due to their complexity, animal welfare traits must be improved using complementary approaches that use both breeding and environmental data, plus animal-specific data (ketosis, acidosis, mastitis) [\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e]. Therefore, strategies that allow early detection of dairy animals with higher genetic merits for a better welfare profile are important. Additionally, forecasting stressful conditions resulting in risks to animal welfare might help shape management conditions and reduce the impact of these conditions on the animal\u0026rsquo;s health and productivity. 
In the last few decades, the growth of precision livestock farming technologies has allowed the constant monitoring of animals and the measurement of a broad range of parameters that can be used to track and forecast stressful conditions and welfare risk [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e, \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eThe dimensionality and complexity of the data generated by such technologies might be challenging to fit into traditional statistical models. Consequently, in recent years, the use of machine learning (ML) models has become popular in the analysis of precision farming data and in the creation of predictive models for different traits in livestock species, including welfare-related traits [\u003cspan additionalcitationids=\"CR12 CR13\" citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e]. The use of ML models offers a robust alternative to handle dimensionality-related problems, uncover hidden patterns, and better model non-linear relationships in the data, among other advantages. In this context, ML models offer a promising approach to predict complex traits, enabling earlier and more efficient animal selection. However, data-driven decisions based exclusively on the ML model\u0026rsquo;s output are still challenging. 
The higher the complexity of ML models (high dimensionality, multicollinearity, nonlinear relationships, etc.), the lower their interpretability, which gives rise to the concept of black-box models [\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eIn the livestock sector, models that are easy to interpret and explain are crucial not only for selecting animals but also for making informed decisions, such as adjusting diets or infrastructure to enhance animal welfare. Despite the absence of a clear legal requirement for interpretability in predictive models used in the livestock sector, the decision-maker, ultimately the farmer, who relies on these models and is affected by their outcomes, should be able to understand the reasons behind the decisions. The increasing adoption of AI in livestock farming offers important opportunities to improve productivity, sustainability, and animal welfare through more precise and data-driven management. However, these advances also raise important concerns related to data quality, privacy, trust, and the risk of transferring critical decisions from farmers to opaque automated systems [\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e]. In this context, the explanation of ML models is a crucial step towards understanding the relationship between the variables in a model and how these variables contribute to the final output [\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e, \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e]. Explainable artificial intelligence (XAI) is a collective term for the methods and techniques that make the decisions and inner workings of ML models understandable to humans, rather than leaving them as \u0026ldquo;black boxes\u0026rdquo;. 
The main goal of XAI techniques is to show why a model made a specific prediction or decision, in a way that is transparent, trustworthy, and useful for users [\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e, \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e]. The application of XAI techniques to the prediction of welfare-related traits is a promising approach to enhance the efficiency of detecting welfare risk events. Additionally, XAI techniques can help refine individual management decisions aimed at improving animal welfare at individual and herd levels.\u003c/p\u003e \u003cp\u003eIn light of the above, the current study aimed to develop a proof of concept for the use of ML models for predicting welfare indicators in dairy cows and to integrate XAI techniques, such as Shapley values and counterfactual explanations, to identify patterns in the features used for prediction that might help better define management decisions to reduce potential welfare risks.\u003c/p\u003e"},{"header":"Methods","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003eSample and welfare indicators\u003c/h2\u003e \u003cp\u003eAll the information used in the current study was retrieved from the Italian Breeders Association (AIA) Livestock Environment Opendata (LEO - \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://opendata.leo-italy.eu/portale/home\u003c/span\u003e\u003cspan address=\"https://opendata.leo-italy.eu/portale/home\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e). Five individual cow welfare indicators (WIs) were selected: longevity, mastitis, subclinical ketosis, subclinical acidosis, and reproduction. These five WIs are used to compute an overall WI score for dairy cows. 
The five individual WIs are scored on a scale from 0\u0026ndash;30, where values from 0\u0026ndash;10 indicate G (Good) welfare, values from 10\u0026ndash;20 indicate I (Intermediate) welfare, and values from \u0026gt;\u0026thinsp;20\u0026ndash;30 indicate B (Risk) welfare. A complete description of how each WI is estimated is available in manual 14 of the AIA (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://www.sialleva.it/Media/Default/document//ManR14%20-%20REPORT%20E%20GRAFICI%20RISCHIO%20BENESSERE.pdf\u003c/span\u003e\u003cspan address=\"http://www.sialleva.it/Media/Default/document//ManR14%20-%20REPORT%20E%20GRAFICI%20RISCHIO%20BENESSERE.pdf\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eThe overall welfare score (G, I, and B) is computed based on the individual WI within each of the previously mentioned classes. Cows with all five individual WI in the good class are classified in the good overall welfare class; cows without any individual bad WI but with at least one intermediary WI are classified in the intermediary overall welfare class; and the rest of the cases (i.e., cows with at least one individual WI in the B class) are classified in the risk (B) overall welfare class.\u003c/p\u003e \u003cp\u003eMonthly means for these five individual WIs were calculated for 843 cows, resulting in 227,603 observations. After a careful evaluation of potential features to be included in the models to predict the individual WI, among the traits available on the LEO database, the mean monthly values of acetone (mM/l), milk fat content (g/100g), milk lactose content (g/100g), milk protein content (g/100g), milk urea content (mg/dl), and milk average conductivity were chosen as features. 
The selection of these features was based on their availability for animals with WI records and on reducing collinearity among the features.\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003ePrediction of welfare indicators and welfare scores using random forest\u003c/h3\u003e\n\u003cp\u003ePrior to fitting the predictive models, quality control was performed on the feature dataset, and all feature records with values outside the feature mean \u0026plusmn; 5 standard deviations were removed. A threshold of \u0026plusmn;\u0026thinsp;5 standard deviations was adopted to avoid excluding biologically plausible extreme observations, which are expected in large longitudinal datasets and may reflect true individual variability rather than measurement or recording errors. After this step, the dataset consisted of 798 cows and 125,285 observations (i.e. individual cow monthly means). These records were split into training (70%) and test (30%) sets for each overall welfare score (G, I, and B), yielding 87,699 and 37,586 observations in the training and test sets, respectively.\u003c/p\u003e \u003cp\u003eThe h2o R package [\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e] was used to fit Random Forest (RF) models for the prediction of each individual WI. A random discrete grid search was performed in the hyperparameter tuning stage using root mean squared error (RMSE) as the stopping metric, a stopping tolerance of 0.005, and a maximum of 100 training rounds. The hyperparameter space explored in this study was intentionally limited to a small set of biologically and computationally relevant values. Therefore, a random discrete grid search was preferred over a more extensive search because it allowed efficient identification of suitable hyperparameter combinations while reducing computational burden and avoiding unnecessary evaluation of a large number of similar models. 
The hyperparameter grid was defined by combining a vector of numbers of trees from 10 to 100, incremented by 10, and row sample rates of 0.8 and 1. Ten-fold cross-validation was used in the hyperparameter tuning. Moreover, to predict a WI value in a given month, the closest previous monthly mean for each feature was used.\u003c/p\u003e \u003cp\u003eThe individual WIs were treated as continuous scores, and the best model for each individual WI was defined based on the smallest RMSE. The models were applied to the test set to predict the individual WI, and the prediction accuracy was evaluated based on the RMSE between the predicted and observed individual WI. Subsequently, the predicted individual WIs were used to reconstruct the overall welfare score in the G, I, and B classes. The reconstruction accuracy was assessed using balanced accuracy and weighted versions of precision, recall, and F1-score. The weighted versions of precision, recall, and F1-score were estimated using a support-weighted average based on the following formula:\u003cdiv id=\"Equa\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equa\" name=\"EquationSource\"\u003e\n$${M}_{w}=\\frac{\\sum_{i=1}^{n}\\left({M}_{i}{N}_{i}\\right)}{N}$$\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003ewhere M\u003csub\u003ew\u003c/sub\u003e is the weighted version of the metric, M\u003csub\u003ei\u003c/sub\u003e is the value of the metric for class i, N\u003csub\u003ei\u003c/sub\u003e is the number of observations of class i, n is the number of classes, and N is the total number of observations.\u003c/p\u003e\n\u003ch3\u003eFeature importance and relationships evaluation through Shapley values\u003c/h3\u003e\n\u003cp\u003eThe R package shapviz [\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e] was used to obtain SHAP values for each feature included in the selected models in the test set. 
The values were analyzed based on the relationship between SHAP values and feature distribution by importance plots and the interaction between features and the SHAP values by dependence plots. Additionally, the SHAP values for the five individual WIs were compared across contrasting overall welfare scores (G vs B) observed in the same individuals (considering only true positive predictions). These comparisons were contrasted with the individual WI and the overall welfare scores to assess the certainty of the feature importance.\u003c/p\u003e\n\u003ch3\u003eCounterfactual explanations for enhanced management decisions\u003c/h3\u003e\n\u003cp\u003eThe R package counterfactuals (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://github.com/dandls/counterfactuals\u003c/span\u003e\u003cspan address=\"https://github.com/dandls/counterfactuals\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e) was used to assess the minimum changes required in the feature composition to transform a B individual WI (score higher than 20) to a G individual WI (score lower than or equal to five). The distribution of all counterfactual explanations of each feature obtained for the cows predicted as B welfare for each individual WI was compared with the observed feature distributions in the observed G welfare scores. The comparison was performed to assess the certainty of the suggested changes. 
These comparisons were performed only for correctly predicted B scores, excluding false negatives.\u003c/p\u003e"},{"header":"Results","content":"\u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003eSample, welfare indicators, and features distribution\u003c/h2\u003e \u003cp\u003eThe observed number of the individual WI for each of five classes (longevity, mastitis, subclinical ketosis, subclinical acidosis, and reproduction) and overall welfare scores across the 125,285 monthly observations from the 798 cows analyzed are shown in Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e. The number of cows per breed is shown in Supplementary Table S1. The distribution of the feature values (acetone, fat, lactose, protein, urea, and milk conductivity) used for the prediction of individual WI and the correlation across the features are shown in Supplementary Figs.\u0026nbsp;1 and 2. The analysis of the distribution of welfare classes indicated an unbalanced scenario for some individual WI. 
Therefore, the individual welfare score, rather than the welfare class, was chosen as the prediction target.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eDistribution of monthly observations for the good (G), intermediate (I) and risk (B) welfare classes for individual welfare scores and the overall welfare indicator.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"4\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eWelfare indicator\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"3\" nameend=\"c4\" namest=\"c2\"\u003e \u003cp\u003eWelfare classes\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eG\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eI\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eB\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eLongevity\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e10,233\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e86,815\u003c/p\u003e 
\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e28,237\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMastitis\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e74,913\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e26,221\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e24,151\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSubclinical ketosis\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e111,314\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e13,971\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSubclinical acidosis\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e115,985\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e24\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e9,276\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eReproduction\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e51,913\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e50,957\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e22,415\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" 
colname=\"c1\"\u003e \u003cp\u003eOverall welfare indicator\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e2,132\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e55,252\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e67,901\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003ePredictive performance of individual welfare indicators and overall welfare score\u003c/h3\u003e\n\u003cp\u003eThe Random Forest (RF) models used to predict each individual WI showed strong performance. The Pearson correlation coefficient (r) and the RMSE obtained for the individual WI in the test set indicated a good accuracy for the selected models: Longevity (r\u0026thinsp;=\u0026thinsp;0.982; RMSE\u0026thinsp;=\u0026thinsp;0.937); Mastitis (r\u0026thinsp;=\u0026thinsp;0.987; RMSE\u0026thinsp;=\u0026thinsp;1.476); Subclinical ketosis (r\u0026thinsp;=\u0026thinsp;0.971; RMSE\u0026thinsp;=\u0026thinsp;1.246); Subclinical acidosis (r\u0026thinsp;=\u0026thinsp;0.979; RMSE\u0026thinsp;=\u0026thinsp;0.839); and, Reproduction (r\u0026thinsp;=\u0026thinsp;0.978; RMSE\u0026thinsp;=\u0026thinsp;1.597). Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e shows the reconstructive performance for the overall welfare scores based on the predicted individual WI. A balanced accuracy of 0.841 was achieved for the reconstruction of the overall welfare score with 0.938 precision, recall of 0.930, and an F1-score of 0.930, suggesting a good predictive performance. However, results varied between the three classes [good (G), intermediary (I), and risk (B) class; Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e]. 
For the G class, the model achieved a precision equal to 1, indicating that whenever the model predicts G, it is correct (false positives\u0026thinsp;=\u0026thinsp;0), and a recall equal to 0.665. The latter implies that only\u0026thinsp;~\u0026thinsp;67% of true G were classified as G (with 179 false negatives, all predicted as I class). A slight change in the classification threshold for the individual overall welfare score, where the values were rounded (without decimals), allowed an increase in the true positive rate to ~\u0026thinsp;80.5% for the G class. However, this increase resulted in 70 false positive classifications (I-class cows classified as G). Figure\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eA shows the RMSE obtained for the individual WI predictions in the cases where the overall welfare score was misclassified as I instead of G. Higher RMSE values were observed for the longevity, mastitis, and reproduction WIs. Additionally, most errors were caused by misclassification of one individual WI, with the most frequently misclassified WI being reproduction, followed by subclinical ketosis and longevity (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eB). The distribution of the features used for the prediction of the individual WIs indicated that the misclassified G classes (classified as I) have larger differences for acetone and fat when compared with the cows correctly classified as the overall welfare G class (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eC). For class I, a precision equal to 0.885 and a recall equal to 0.998 were observed (false positives\u0026thinsp;=\u0026thinsp;2,582 and false negatives\u0026thinsp;=\u0026thinsp;37). These results indicate that the model misclassifies some cows belonging to the G and B classes as belonging to the I class. 
For the B class, a precision equal to 0.997 and a recall equal to 0.859 were observed (false positives\u0026thinsp;=\u0026thinsp;37 and false negatives\u0026thinsp;=\u0026thinsp;2,403). It is important to highlight that all the false-negative predictions were misclassifications between the B and I classes. Therefore, no B score was misclassified as G class.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003ePredictive performance for the reconstruction of the overall welfare score classes (Good (G), Intermediary (I), and risk (B)) based on the predicted individual welfare indicators.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"10\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c8\" colnum=\"8\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c9\" colnum=\"9\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c10\" colnum=\"10\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth 
align=\"left\" colname=\"c1\"\u003e \u003cp\u003eIndividual classes\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eObs\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eTP\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eFP\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003eFN\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c7\"\u003e \u003cp\u003eTN\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c8\"\u003e \u003cp\u003ePrecision\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c9\"\u003e \u003cp\u003eRecall\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c10\"\u003e \u003cp\u003eF1-score\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eB\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e17,059\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e14,656\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e37\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e2,403\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e20,490\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c8\"\u003e \u003cp\u003e0.997\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c9\"\u003e \u003cp\u003e0.859\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c10\"\u003e \u003cp\u003e0.923\u003c/p\u003e \u003c/td\u003e 
\u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eG\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e534\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e355\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e179\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e37,052\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c8\"\u003e \u003cp\u003e1.000\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c9\"\u003e \u003cp\u003e0.665\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c10\"\u003e \u003cp\u003e0.799\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eI\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e19,993\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e19,956\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e2,582\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e37\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e15,011\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c8\"\u003e \u003cp\u003e0.885\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c9\"\u003e \u003cp\u003e0.998\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" 
colname=\"c10\"\u003e \u003cp\u003e0.938\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003ctfoot\u003e \u003ctr\u003e\u003ctd colspan=\"10\"\u003eObs: Number of records observed for each class; TP: True Positive; FP: False Positive; FN: False Negative; TN: True Negative.\u003c/td\u003e\u003c/tr\u003e \u003c/tfoot\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e\n\u003ch3\u003eFeature importance and feature interactions influencing predictive outcomes\u003c/h3\u003e\n\u003cp\u003eThe beeswarm plots of the SHAP (SHapley Additive exPlanations) values obtained for the features used in the prediction of the five individual WIs are summarised in Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e. It should be noted that, to predict a WI value in a given month, the closest previous monthly mean for each of the six selected features was used. Therefore, the SHAP values correspond to the contribution of the previously available feature values to the predicted WI scores.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eFor the mastitis WI, lower lactose, protein, and fat were associated with higher SHAP values, especially in the B class, suggesting a higher welfare score (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eA). In the B class of the mastitis WI, lower levels of urea and acetone were associated with smaller SHAP values, consequently reducing the welfare score. Regarding milk conductivity, in the B and I classes for the mastitis WI, higher milk conductivity was associated with higher SHAP values, while in the G class, higher values decreased the SHAP values. In the G class, milk solid content (mainly lactose) tends to be higher, which may also increase milk conductivity (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e). 
Higher lactose levels appear to be associated with lower milk conductivity, but there is no clear association with SHAP values in the G and I classes of mastitis WI (Supplementary Figs.\u0026nbsp;3 and 4).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe SHAP values obtained for the ketosis WI showed a varied pattern. The beeswarm plots indicated that higher protein and acetone values are associated with increases in SHAP values, whereas lower milk conductivity, lactose, and urea (especially for the G class) content are associated with higher SHAP values (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eB). Results of the urea concentration in the milk in association with the SHAP values are presented in Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eA and B. For both G and B classes, higher SHAP values were associated with higher urea values below 20 and above 30 (this pattern was intensified for the B class). No specific interaction with other features was observed for the urea WI. The acetone values showed an increasing pattern in association with the SHAP values in the B class, which was not found in the G class (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eC and D). However, it is important to highlight that the majority of the acetone values observed in the test set were below 0.3 (Supplementary Fig.\u0026nbsp;5).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe beeswarm plot of the SHAP values for the features used in the acidosis WI showed that, especially for cows classified in the G class, lower milk conductivity and protein are associated with smaller SHAP values (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eC). On the other hand, higher values of urea and acetone were associated with smaller SHAP values. 
Figure\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e presents a clear visualization of the points associated with the G and B classes for all the features, except for lactose, where the dots representing the B class are located in areas with higher SHAP values. Despite a clear interaction pattern with other features, the low-fat and high-protein milk contents also showed a relationship with high SHAP values for acidosis in the B class. An interaction between lactose content and milk conductivity was observed, where higher milk conductivity and lower lactose content were associated with higher SHAP values (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe analysis of SHAP values for the longevity WI indicated that the extreme values for milk conductivity, urea, lactose, fat, and protein were associated with an absolute increase of SHAP values, in both positive and negative zones (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003eA). In addition, an inverse pattern was observed between the G (dots below zero on the y-axis) and B classes (dots above zero on the y-axis). For example, smaller protein values in the G class were associated with smaller SHAP values. In contrast, higher values in the B class were associated with higher SHAP values (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003eA). The urea values above 40 were associated with higher SHAP values in both classes. 
For the G class, higher acetone levels (below 0.3) were associated with smaller SHAP values.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe beeswarm plot for the reproduction WI indicated that higher protein, urea, and fat values were associated with higher SHAP values in the B class, while smaller milk conductivity and higher lactose were associated with smaller SHAP values in the G class (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eE). Additionally, Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003eB indicates that for the B class (dots above zero on the y-axis), values in both extremes for milk conductivity, urea, lactose, fat, and acetone are associated with higher SHAP values.\u003c/p\u003e \u003cp\u003eThe SHAP values can also be used to evaluate the feature contribution within individual WIs used to reconstruct the overall welfare score. One of the animals with the largest number of individual WIs classified in the risk welfare (B class) in a month was selected as an example. The cow with identification number \"-9223372036849293312\" in the LEO database had a monthly record composed of three individual WIs above 20 (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003eA): longevity (20.5), mastitis (23.6), and reproduction (21.5). For this cow, a good overall welfare score was selected from another monthly record for comparison. As expected, the SHAP values for the individual WIs showed opposite contribution patterns for longevity (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003eB), mastitis (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003eC), and reproduction (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003eD) in the B (positive SHAP value) and G (negative SHAP values) classes. 
In addition, the SHAP values for the WI of subclinical acidosis in the B class have a mix of positive and negative contributions (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003eE). In contrast, for the WI for subclinical ketosis, all features had negative SHAP values (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003eF), which corroborates the absence of contribution of these individual WIs to the overall welfare score.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003eCounterfactual explanations for individualized management decisions\u003c/h2\u003e \u003cp\u003eCounterfactual explanations add interpretability to the ML model and thereby can provide important information for management purposes in the farms. Figure\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e8\u003c/span\u003e illustrates the minimal changes in the input features required to convert a welfare risk outcome (class B) into either class G (left-hand side) or class I (right-hand side) for the longevity, mastitis, and reproduction WIs. For the longevity and mastitis WI, increases in fat, lactose, protein, and urea, together with a decrease in milk conductivity, were identified as the minimal changes needed to shift the prediction from B to G class. Similar patterns were observed for changing B to I class, with the main difference being the absence of changes in milk conductivity and protein (based on the raw, unscaled values). For the mastitis WI, the same pattern of changes as for longevity was observed, converting from B to G class. The shift from B to I class for mastitis differed only in fat content, which was instead suggested to decrease. For the reproduction WI, increases in acetone, lactose, protein, and urea, combined with reductions in fat and milk conductivity, were suggested as the minimal changes required to convert a B outcome to a G outcome. 
In contrast, to change the prediction from B class to I class, only a decrease in fat content and an increase in lactose and milk conductivity were indicated. It is important to note that, for some features, the counterfactual explanations for the reproduction WI exhibited patterns opposite to those observed for the longevity and mastitis WIs. Notably, most of the proposed minimal changes in feature values align with the observed values for the same features in a month when the same animal was classified in the G class for overall welfare score (green lines in Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e8\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eOverall, the mean counterfactual explanations used to shift WI classes from B to G indicated that, in most cases, the suggested feature values lay between the mean values observed in the true B and G classes (Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e9\u003c/span\u003e). Exceptions were found for specific WIs: for mastitis, the counterfactuals suggested higher milk conductivity than in the B group and lower urea than in the G group; for subclinical acidosis, they suggested lower lactose than in both B and G; for subclinical ketosis, they indicated higher lactose but lower milk conductivity than in both B and G; and for reproduction, they suggested lower milk conductivity than in both B and G (Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e9\u003c/span\u003e). Summary statistics of the counterfactual explanations used to shift WIs classes from B to G are reported in Supplementary Table S2.\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e"},{"header":"Discussion","content":"\u003cp\u003eThe interpretability of a predictive ML model is a key component to enable applicability and efficiency of data-driven decisions. 
The value of information, i.e., the benefit gained from using information to improve a decision, often by reducing uncertainty, is crucial in precision livestock farming to support the development of better, scalable, and transparent decision-making frameworks [\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e, \u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e]. Interpretability and visualization have been reported as key features in the livestock sector [\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e, \u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e]. The inclusion of XAI techniques in these frameworks has the potential to improve not only the interpretability of the ML models but also the efficiency of predictions, and to provide better guidance for decision-making [\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e, \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e]. Additionally, the interpretability of automated decision-making systems is already a requirement in different areas, such as in human health [\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e, \u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e]. XAI techniques can be applied at different stages of a framework, including pre-modeling, interpretation, and post-modeling [\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e]. 
The methodology used here applies XAI in these three areas. Specifically, XAI tools (Shapley values and counterfactual explanations) were used to evaluate model performance, to assess feature importance, and to provide management insights from the predictive outcomes.\u003c/p\u003e \u003cp\u003eIn general, the models predicting the individual WIs performed well (r\u0026thinsp;\u0026gt;\u0026thinsp;0.97 for all the individual WIs). Regarding the reconstruction of the overall welfare score, the RF classifier was accurate, especially for the larger classes (B and I), but it under-detected the minority class G. When each class was given equal importance, the RF performance decreased (~\u0026thinsp;0.84 balanced accuracy). However, this decrease in classification accuracy was a consequence of a conservative decision to avoid false positive predictions for the G class. Indeed, this is confirmed by a slight change applied to the predicted WI (values were rounded to the nearest integer), which improved the true positive rate to ~\u0026thinsp;80.5%, but also allowed for false negatives. The selection of the threshold must thus be considered in each specific case, based on the consequences of including false-positive classifications within each overall welfare class. Another relevant point is the high RMSE observed in the reproduction and longevity WIs due to G class misclassification. Based on the original WI estimates obtained from AIA, the reproduction and longevity WIs exhibit lower monthly variability, which might have impacted the predictive performance of the trained models.\u003c/p\u003e \u003cp\u003eThe contributions of each feature to the predictive outcome were analysed to validate the accuracy of the predictions and to ensure the absence of illogical patterns, such as a hypothetical increase in milk somatic cell counts to improve mastitis welfare scores. 
The analysis of the feature value distribution and the association with SHAP values indicated that for all the individual WIs, extreme values (in the observed distribution) of fat, protein, lactose, and urea were associated with higher SHAP values, especially in the B (risk) overall welfare class. Biologically, these results are supported by the fact that disturbances in energy balance can have negative effects in dairy cows in both positive and negative energy balance [\u003cspan additionalcitationids=\"CR28 CR29 CR30 CR31 CR32\" citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e]. These results were especially reinforced by the pattern observed for the urea SHAP values. Both excessively high and low milk urea concentrations are associated with metabolic disorders, reproductive performance, and other health and welfare issues in dairy cattle, related to the nitrogen metabolism [\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e, \u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e]. For example, low milk urea concentrations are associated with lameness [\u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e, \u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e] while high milk urea concentrations are one of the main indicators of ketosis [\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e] in dairy cattle. Acetone also showed a direct association with higher ketosis welfare risk (higher SHAP values). 
These results are reinforced by the physiological alterations that occur after calving, where the cow often enters a state of negative energy balance, in which glucose demand exceeds supply, triggering the mobilization of adipose tissue and the conversion of fatty acids into ketone bodies, particularly acetone and β-hydroxybutyrate [\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e]. Indeed, previous studies have reported that high levels of milk acetone are an indicator of ketosis in dairy cows [\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e, \u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e] and suggested the in-line evaluation of acetone levels for subclinical ketosis risk assessment [\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e]. For the acidosis WI, fat and protein concentrations in the milk were associated with changes in the SHAP values for welfare risk.\u003c/p\u003e \u003cp\u003eRuminal acidosis is another common metabolic condition in dairy cattle and is associated with alterations in milk components, such as fat and protein [\u003cspan additionalcitationids=\"CR42\" citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e43\u003c/span\u003e]. Subacute ruminal acidosis can cause a reduction in fat percentage and an increase in protein percentage in milk [\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e, \u003cspan citationid=\"CR45\" class=\"CitationRef\"\u003e45\u003c/span\u003e]. In our study, lower milk fat and higher milk protein percentages in the B class for the acidosis WI were associated with higher SHAP values. 
This agrees with the known relationship between the fat-to-protein ratio and the metabolic status in dairy cattle, including acidosis (a low fat-to-protein ratio is an indicator of subacute ruminal acidosis) [\u003cspan additionalcitationids=\"CR47 CR48\" citationid=\"CR46\" class=\"CitationRef\"\u003e46\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e49\u003c/span\u003e]. An interaction between lactose (lower values; see Supplementary Fig.\u0026nbsp;5) and milk conductivity (higher values) was observed in the increase of SHAP values and, consequently, the acidosis welfare risk. The lactose content in the milk showed negative correlations with energy balance measures (condition score, live weight in kilograms, and the product of condition score and live weight) in dairy cows [\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e]. A previous study reported that milk conductivity tends to rise several days before dairy cows are diagnosed with acidosis or ketosis [\u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e50\u003c/span\u003e].\u003c/p\u003e \u003cdiv id=\"Sec13\" class=\"Section2\"\u003e \u003ch2\u003eMastitis WI\u003c/h2\u003e \u003cp\u003eAs expected, the risk of mastitis was associated with lower levels of milk components such as lactose, fat, and protein, in line with current evidence [\u003cspan additionalcitationids=\"CR52\" citationid=\"CR51\" class=\"CitationRef\"\u003e51\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e53\u003c/span\u003e]. Similar to other WIs, urea levels were also associated with changes in the SHAP values for the mastitis WI. Both low and high milk urea values were associated with a higher mastitis welfare risk. Milk urea levels were previously associated with pathogen-specific clinical mastitis [\u003cspan citationid=\"CR52\" class=\"CitationRef\"\u003e52\u003c/span\u003e]. 
The interaction between low lactose levels and high milk conductivity with higher SHAP values was also observed for mastitis WI. The milk electrical conductivity is a well-defined biomarker for mastitis in dairy cows and is largely determined by the concentration of the main milk ions [\u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e50\u003c/span\u003e, \u003cspan additionalcitationids=\"CR55\" citationid=\"CR54\" class=\"CitationRef\"\u003e54\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR56\" class=\"CitationRef\"\u003e56\u003c/span\u003e]. The milk conductivity is especially determined by Na+, K+, and Cl\u0026minus; ions [\u003cspan citationid=\"CR57\" class=\"CitationRef\"\u003e57\u003c/span\u003e]. After intramammary infection, disruption of mammary epithelial tight junctions and of the ion-pumping system promotes the movement of Na\u0026thinsp;+\u0026thinsp;and Cl\u0026minus; into milk and the leakage of lactose and K+ toward the extracellular fluid, resulting in increased milk electrical conductivity [\u003cspan citationid=\"CR58\" class=\"CitationRef\"\u003e58\u003c/span\u003e]. This mechanism is consistent with the pattern observed here, in which higher mastitis risk was associated with the combination of increased conductivity and reduced lactose levels. In Gyr cows, milk conductivity is negatively associated with lactose content, positively associated with somatic cell counts, and lactose content itself shows a negative relationship with somatic cell counts, reinforcing the pattern observed here [\u003cspan citationid=\"CR54\" class=\"CitationRef\"\u003e54\u003c/span\u003e].\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec14\" class=\"Section2\"\u003e \u003ch2\u003eLongevity WI\u003c/h2\u003e \u003cp\u003eFor the longevity WI, both low and high urea levels were associated with higher welfare risk. Zhao et al. 
[\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e] suggested that both low and high milk urea concentrations may have a direct or indirect relationship with cow longevity through their association with health issues and, consequently, animal culling. The same pattern of contribution to the increase in SHAP values was observed in the ketosis, acidosis, and mastitis welfare indicators, reinforcing this association. For the G class of longevity welfare, inverse U-shapes were observed for milk conductivity, lactose, and protein content. These results need to be interpreted carefully, as this pattern is observed within a narrow interval for a specific group and might indicate the absence of intermediary points caused by sample size or measurement biases. In the G class, higher milk acetone levels (below\u0026thinsp;0.4) were associated with lower longevity welfare risk. This relationship is complex to explain, but in low-productivity dairy herds, the increase in acetone level was associated with higher productivity [\u003cspan citationid=\"CR59\" class=\"CitationRef\"\u003e59\u003c/span\u003e]. The increase in the risk of reproduction WI was associated with higher fat, protein, and lactose content (despite a slight U-shaped pattern) in the B class. This might be explained by the association observed between highly productive cows and fertility problems [\u003cspan citationid=\"CR60\" class=\"CitationRef\"\u003e60\u003c/span\u003e]. Interestingly, the WIs for reproduction and longevity exhibit opposite U-shaped patterns between the G and B classes for the SHAP values of almost all features. These results reinforce the need to evaluate feature importance using group-specific approaches. 
Therefore, this pattern should be further evaluated to understand the relationship between these variables and welfare risk across different animal classes.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec15\" class=\"Section2\"\u003e \u003ch2\u003eSHAP values\u003c/h2\u003e \u003cp\u003eThe SHAP values were also used to assess the model's reliability. The SHAP values for the individual WIs were compared, contrasting overall welfare scores (B vs G classes) for the same cow in different months. In the G scenario, all features across all individual WIs were associated with negative SHAP values. In contrast, in the B scenario, only the longevity, mastitis, and reproduction WIs were associated with positive SHAP values. In the B scenario, negative SHAP values for all features were observed for the subclinical ketosis WI, reinforcing the absence of contribution of this indicator to the overall welfare score. On the contrary, for the subclinical acidosis WI, positive SHAP values were observed for urea, lactose, protein, and acetone. This suggests that urea, lactose, protein, and acetone should be carefully monitored to avoid an increasing risk of subclinical acidosis later in life.\u003c/p\u003e \u003cp\u003ePrecision farming aims to enhance production efficiency by using advanced information and communication technologies that optimize resource use and precisely control the production process, while maintaining high standards of animal welfare [\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e]. 
The evaluation of individual requirements for customized management decisions may be an important tool for achieving this goal, since it allows the identification of animal-specific needs and supports targeted strategies to reduce somatic cell counts, increase milk lactose content, decrease milk urea concentration, and ultimately improve production efficiency, udder health, and animal welfare [\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e, \u003cspan additionalcitationids=\"CR62 CR63\" citationid=\"CR61\" class=\"CitationRef\"\u003e61\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR64\" class=\"CitationRef\"\u003e64\u003c/span\u003e]. XAI techniques can be a powerful tool to support customized data-driven decisions in the livestock sector. In this context, the use of statistical techniques, such as counterfactual explanations, provides a useful tool for vets and farm consultants to reduce the risk posed by consequential automated decisions, increase strategic behavior, and better understand model recommendations [\u003cspan additionalcitationids=\"CR66 CR67\" citationid=\"CR65\" class=\"CitationRef\"\u003e65\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR68\" class=\"CitationRef\"\u003e68\u003c/span\u003e]. In the present study, the comparison of contrasting overall welfare scores (B vs G classes) for the same animal was used as a proof of concept.\u003c/p\u003e \u003cp\u003eIn general, the shifts in feature values suggested by the counterfactual explanations followed a logical pattern, with proposed values lying between those observed in the true G and B classes. Nevertheless, for the mastitis WI an increase in milk conductivity was suggested, which is not biologically plausible. Thus, counterfactual explanations should be interpreted with caution. 
As mentioned above, higher milk conductivity is associated with mastitis cases [\u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e50\u003c/span\u003e, \u003cspan additionalcitationids=\"CR55\" citationid=\"CR54\" class=\"CitationRef\"\u003e54\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR56\" class=\"CitationRef\"\u003e56\u003c/span\u003e]. However, an increase in fat and lactose, and a decrease in urea compared with the B class were also proposed, which coincides with previously reported patterns of association with mastitis in dairy cows [\u003cspan citationid=\"CR52\" class=\"CitationRef\"\u003e52\u003c/span\u003e, \u003cspan citationid=\"CR54\" class=\"CitationRef\"\u003e54\u003c/span\u003e, \u003cspan citationid=\"CR69\" class=\"CitationRef\"\u003e69\u003c/span\u003e]. In addition, counterfactual explanations are best applied at the level of individual cows, where they can support data-driven management decisions while allowing for biological plausibility and features interactions. For example, if a counterfactual suggests increasing milk conductivity above the levels typically observed in the G (healthy) class, this recommendation should not be followed blindly; instead, its impact on other traits (such as milk composition) should be evaluated, and any adjustments should remain within realistic ranges for healthy animals. In this sense, counterfactuals should be interpreted as tools to explore feasible management scenarios rather than rigid prescriptions. Nevertheless, their contribution to decision-making in dairy cow production systems needs to be tested in practice to assess their true applicability, limitations, and potential to improve on-farm management. 
In addition, the inclusion of other biomarkers, such as β-hydroxybutyrate, specific fatty acids, and lactate dehydrogenase, among others, might help to improve the predictive performance and explainability of the models proposed here, and enhance the accuracy of management decisions.\u003c/p\u003e \u003c/div\u003e"},{"header":"Conclusions","content":"\u003cp\u003eThe results shown here demonstrate how integrating XAI tools into predictive frameworks for dairy cow welfare has the potential to enhance both the transparency and practical usefulness of data-driven decisions on the farm. The models developed here achieved high predictive performance for individual WIs and acceptable reconstruction of the overall welfare score. The SHAP-based analyses revealed feature\u0026ndash;outcome relationships that are largely consistent with established knowledge of physiology and metabolism, thereby reinforcing confidence in the biological plausibility of the predictions. At the same time, patterns such as U-shaped effects and class-specific differences in feature importance highlight the need for group- and context-specific interpretation rather than one-size-fits-all rules. Complementarily, counterfactual explanations proved valuable for translating model output into actionable management scenarios. However, occasional non-intuitive suggestions, such as increased milk conductivity to improve mastitis scores, underline that counterfactuals must be interpreted critically, in light of domain knowledge and potential interactions among traits, and used as decision-support rather than prescriptive recommendations. Overall, our findings suggest that the combined use of SHAP and counterfactual explanations offers a promising avenue to build more interpretable, robust, and customizable decision-support systems in precision livestock farming. 
However, their on-farm applicability, economic impact, and usability for farmers and advisors still need to be systematically evaluated in real production environments with appropriate health/economic assessment.\u003c/p\u003e"},{"header":"Declarations","content":"\u003ch3\u003e\u003cem\u003eEthics approval and consent to participate\u003c/em\u003e\u003c/h3\u003e\n\u003cp\u003eNot applicable\u003c/p\u003e\n\u003ch3\u003e\u003cem\u003eConsent for publication\u003c/em\u003e\u003c/h3\u003e\n\u003cp\u003eNot applicable\u003c/p\u003e\n\u003ch3\u003e\u003cem\u003eAvailability of data and materials\u003c/em\u003e\u003c/h3\u003e\n\u003cp\u003eThe datasets supporting the results of this article are available in the Zenodo repository (https://doi.org/10.5281/zenodo.18834616).\u003c/p\u003e\n\u003ch3\u003e\u003cem\u003eCompeting interests\u003c/em\u003e\u003c/h3\u003e\n\u003cp\u003eThe authors declare that they have no competing interests.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eFunding\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003e\u003cem\u003e\u0026nbsp;\u003c/em\u003eThis work was supported by the Veterinary Health Innovation Engine sponsored by Zoetis.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eAuthors\u0026apos; contributions\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003e\u003cem\u003e\u0026nbsp;\u003c/em\u003ePASF participated in the conceptualization, data retrieval, formal analysis, implementation of ML and XAI algorithms, and writing and review of the manuscript. ASV participated in the conceptualization, data retrieval, and writing and review of the manuscript. BGC participated in the conceptualization, data retrieval, and writing and review of the manuscript. JCD participated in the data retrieval and in the writing and review of the manuscript. NG participated in the data retrieval and in the writing and review of the manuscript. ADW provided funding and contributed to the writing and review of the manuscript. 
JJA participated in the conceptualization, data retrieval, formal analysis, and writing and review of the manuscript.\u003c/p\u003e\n\u003ch3\u003e\u003cem\u003eAcknowledgements\u003c/em\u003e\u003c/h3\u003e\n\u003cp\u003eProject LEO funded under Submeasure 16.2 \u0026ndash; PSRN 2014/2020.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eGrethe H. The Economics of Farm Animal Welfare. Annual Rev Resource Econ. 2017;9. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1146/annurev-resource-100516-053419\u003c/span\u003e\u003cspan address=\"10.1146/annurev-resource-100516-053419\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTuyttens FAM, Molento CFM, Benaissa S. Twelve Threats of Precision Livestock Farming (PLF) for Animal Welfare. Front Veterinary Sci. 2022;9. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3389/fvets.2022.889623\u003c/span\u003e\u003cspan address=\"10.3389/fvets.2022.889623\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eVerbeke WAJ, Viaene J. Ethical challenges for livestock production: Meeting consumer concerns about meat safety and animal welfare. J Agric Environ Ethics. 2000;12. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1023/A:1009538613588\u003c/span\u003e\u003cspan address=\"10.1023/A:1009538613588\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePetherick JC. Animal welfare issues associated with extensive livestock production: The northern Australian beef cattle industry. Appl Anim Behav Sci. 2005. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.applanim.2005.05.009\u003c/span\u003e\u003cspan address=\"10.1016/j.applanim.2005.05.009\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFraser D. Animal Welfare and the Intensification of Animal Production. In: International Library of Environmental, Agricultural and Food Ethics. 2008. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/978-1-4020-8722-6_12\u003c/span\u003e\u003cspan address=\"10.1007/978-1-4020-8722-6_12\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLlonch P, Haskell MJ, Dewhurst RJ, Turner SP. Current available strategies to mitigate greenhouse gas emissions in livestock systems: An animal welfare perspective. Animal. 2017;11. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1017/S1751731116001440\u003c/span\u003e\u003cspan address=\"10.1017/S1751731116001440\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBroom DM. Animal welfare complementing or conflicting with other sustainability issues. Appl Anim Behav Sci. 2019;219. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.applanim.2019.06.010\u003c/span\u003e\u003cspan address=\"10.1016/j.applanim.2019.06.010\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLawrence AB, Conington J, Simm G. Breeding and animal welfare: Practical and theoretical advantages of multi-trait selection. Anim Welf. 2004. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1017/s0962728600014585\u003c/span\u003e\u003cspan address=\"10.1017/s0962728600014585\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. 13 SUPPL.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePapakonstantinou GI, Voulgarakis N, Terzidou G, Fotos L, Giamouri E, Papatsiros VG. Precision livestock farming technology: applications and challenges of animal welfare and climate change. Agriculture. 2024;14:620.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBanhazi TM, Babinszky L, Halas V, Tscharke M. Precision livestock farming: Precision feeding technologies and sustainable livestock production. Int J Agricultural Biol Eng. 2012;5:54\u0026ndash;61. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3965/j.ijabe.20120504.006\u003c/span\u003e\u003cspan address=\"10.3965/j.ijabe.20120504.006\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSchillings J, Bennett R, Rose DC. Exploring the Potential of Precision Livestock Farming Technologies to Help Address Farm Animal Welfare. Front Anim Sci. 2021;2. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3389/fanim.2021.639678\u003c/span\u003e\u003cspan address=\"10.3389/fanim.2021.639678\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNeethirajan S, Scott S, Mancini C, Boivin X, Strand E. Human-computer interactions with farm animals\u0026mdash;enhancing welfare through precision livestock farming and artificial intelligence. Front Vet Sci. 2024;11:1490851.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGarc\u0026iacute;a R, Aguilar J, Toro M, Pinto A, Rodr\u0026iacute;guez P. 
A systematic literature review on the use of machine learning in precision livestock farming. Comput Electron Agric. 2020. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.compag.2020.105826\u003c/span\u003e\u003cspan address=\"10.1016/j.compag.2020.105826\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBenjamin M, Yik S. Precision livestock farming in swine welfare: A review for swine practitioners. Animals. 2019;9:1\u0026ndash;21. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3390/ani9040133\u003c/span\u003e\u003cspan address=\"10.3390/ani9040133\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMahinpei A, Clark J, Lage I, Doshi-Velez F, Pan W. Promises and pitfalls of black-box concept learning models. arXiv preprint arXiv:2106.13314. 2021.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRosati A. Guiding principles of AI: application in animal husbandry and other considerations. Anim Front. 2024;14. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/af/vfae045\u003c/span\u003e\u003cspan address=\"10.1093/af/vfae045\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAzodi CB, Tang J, Shiu SH. Opening the Black Box: Interpretable Machine Learning for Geneticists. Trends Genet. 2020. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.tig.2020.03.005\u003c/span\u003e\u003cspan address=\"10.1016/j.tig.2020.03.005\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. 36.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRudin C. 
Why black box machine learning should be avoided for high-stakes decisions, in brief. Nat Reviews Methods Primers. 2022. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s43586-022-00172-0\u003c/span\u003e\u003cspan address=\"10.1038/s43586-022-00172-0\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. 2.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAngelov PP, Soares EA, Jiang R, Arnold NI, Atkinson PM. Explainable artificial intelligence: an analytical review. Wiley Interdiscip Rev Data Min Knowl Discov. 2021;11. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1002/widm.1424\u003c/span\u003e\u003cspan address=\"10.1002/widm.1424\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMinh D, Wang HX, Li YF, Nguyen TN. Explainable artificial intelligence: a comprehensive review. Artif Intell Rev. 2022;55. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s10462-021-10088-y\u003c/span\u003e\u003cspan address=\"10.1007/s10462-021-10088-y\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCandel A, Parmar V, LeDell E, Arora A. Deep learning with H2O. H2O.ai Inc. 2016.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMayer M. shapviz: SHAP Visualizations. R package version 0.9.3. 2024.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRojo-Gimeno C, van der Voort M, Niemi JK, Lauwers L, Kristensen AR, Wauters E. Assessment of the value of information of precision livestock farming: A conceptual framework. NJAS - Wageningen J Life Sci. 2019;90\u0026ndash;1. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.njas.2019.100311\u003c/span\u003e\u003cspan address=\"10.1016/j.njas.2019.100311\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eElliott KC, Werkheiser I. A Framework for Transparency in Precision Livestock Farming. Animals. 2023;13. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3390/ani13213358\u003c/span\u003e\u003cspan address=\"10.3390/ani13213358\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMouzakitis S, Tsapelas G, Pelekis S, Ntanopoulos S, Askounis D, Osinga S et al. Investigation of Common Big Data Analytics and Decision-Making Requirements Across Diverse Precision Agriculture and Livestock Farming Use Cases. In: IFIP Advances in Information and Communication Technology. 2020. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/978-3-030-39815-6_14\u003c/span\u003e\u003cspan address=\"10.1007/978-3-030-39815-6_14\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAlbrecht JP. How the GDPR Will Change the World. Eur Data Prot Law Rev. 2017. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.21552/edpl/2016/3/4\u003c/span\u003e\u003cspan address=\"10.21552/edpl/2016/3/4\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. 2.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGunn PJ, Schoonmaker JP, Lemenager RP, Bridges GA. Feeding excess crude protein to gestating and lactating beef heifers: Impact on parturition, milk composition, ovarian function, reproductive efficiency and pre-weaning progeny growth. 
Livest Sci. 2014;167. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.livsci.2014.05.010\u003c/span\u003e\u003cspan address=\"10.1016/j.livsci.2014.05.010\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSinclair KD, Garnsworthy PC, Mann GE, Sinclair LA. Reducing dietary protein in dairy cow diets: Implications for nitrogen utilization, milk production, welfare and fertility. Animal. 2014;8. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1017/S1751731113002139\u003c/span\u003e\u003cspan address=\"10.1017/S1751731113002139\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFriggens NC, Ridder C, L\u0026oslash;vendahl P. On the use of milk composition measures to predict the energy balance of dairy cows. J Dairy Sci. 2007;90. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3168/jds.2006-821\u003c/span\u003e\u003cspan address=\"10.3168/jds.2006-821\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCarlsson J, Pehrson B. The influence of the dietary balance between energy and protein on milk urea concentration. Experimental trials assessed by two different protein evaluation systems. Acta Vet Scand. 1994;35.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKirchgessner M, Kreuzer M, Roth-Maier DA. Milk urea and protein content to diagnose energy and protein malnutrition of dairy cows. Arch Tierernahr. 1986;36. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1080/17450398609425260\u003c/span\u003e\u003cspan address=\"10.1080/17450398609425260\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhao X, Zheng N, Zhang Y, Wang J. The role of milk urea nitrogen in nutritional assessment and its relationship with phenotype of dairy cows: A review. Anim Nutr. 2025;20:33\u0026ndash;41.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRius AG, McGilliard ML, Umberger CA, Hanigan MD. Interactions of energy and predicted metabolizable protein in determining nitrogen efficiency in the lactating dairy cow. J Dairy Sci. 2010;93. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3168/jds.2008-1777\u003c/span\u003e\u003cspan address=\"10.3168/jds.2008-1777\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDreyer C, Losand B, Spiekers H, Hummel J. Influence of fat-to-protein ratio and udder health parameters on the milk urea content of dairy cows. J Dairy Sci. 2025;108. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3168/jds.2024-25492\u003c/span\u003e\u003cspan address=\"10.3168/jds.2024-25492\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNecula DC, Warren HE, Taylor-Pickard J, Simiz E, Stef L. Associations of Lameness with Indicators of Nitrogen Metabolism and Excretion in Dairy Cows. Agric (Switzerland). 2022;12. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3390/agriculture12122109\u003c/span\u003e\u003cspan address=\"10.3390/agriculture12122109\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSlov\u0026aacute;k P, Hisira V, Marčekov\u0026aacute; P, Mudroň P. The relationship between claw diseases of dairy cows and the protein and urea content of the milk. Acta Fytotechnica et Zootechnica. 2021;24. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.15414/AFZ.2021.24.MI-PRAP.102-104\u003c/span\u003e\u003cspan address=\"10.15414/AFZ.2021.24.MI-PRAP.102-104\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDavid Baird G. Primary Ketosis in the High-Producing Dairy Cow: Clinical and Subclinical Disorders, Treatment, Prevention, and Outlook. J Dairy Sci. 1982. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3168/jds.S0022-0302(82)82146-2\u003c/span\u003e\u003cspan address=\"10.3168/jds.S0022-0302(82)82146-2\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. 65.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHeuer C, Luinge HJ, Lutz ETG, Schukken YH, Van Der Maas JH, Wilmink H, et al. Determination of acetone in cow milk by Fourier transform infrared spectroscopy for the detection of subclinical ketosis. J Dairy Sci. 2001;84. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3168/jds.S0022-0302(01)74510-9\u003c/span\u003e\u003cspan address=\"10.3168/jds.S0022-0302(01)74510-9\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKlein SL, Scheper C, May K, K\u0026ouml;nig S. 
Genetic and nongenetic profiling of milk β-hydroxybutyrate and acetone and their associations with ketosis in Holstein cows. J Dairy Sci. 2020;103. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3168/jds.2020-18339\u003c/span\u003e\u003cspan address=\"10.3168/jds.2020-18339\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAbeni F, Ferla A, Negrini R, Galli A. Large scale subclinical ketosis risk assessment in dairy herds using predicted milk acetone and β-hydroxybutyrate via MIR technology. J Dairy Res. 2025;92:45\u0026ndash;51.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eStone W. The effect of subclinical rumen acidosis on milk components. 1999.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePlaizier JC, Krause DO, Gozho GN, McBride BW. Subacute ruminal acidosis in dairy cows: The physiological causes, incidence and consequences. Vet J. 2008;176. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.tvjl.2007.12.016\u003c/span\u003e\u003cspan address=\"10.1016/j.tvjl.2007.12.016\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAlzahal O, Or-Rashid MM, Greenwood SL, McBride BW. Effect of subacute ruminal acidosis on milk fat concentration, yield and fatty acid profile of dairy cows receiving soybean oil. J Dairy Res. 2010;77. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1017/S0022029910000294\u003c/span\u003e\u003cspan address=\"10.1017/S0022029910000294\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFairfield AM, Plaizier JC, Duffield TF, Lindinger MI, Bagg R, Dick P, et al. 
Effects of prepartum administration of a monensin controlled release capsule on rumen pH, feed intake, and milk production of transition dairy cows. J Dairy Sci. 2007;90. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3168/jds.S0022-0302(07)71577-1\u003c/span\u003e\u003cspan address=\"10.3168/jds.S0022-0302(07)71577-1\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKhafipoor E, Krause DO, Plaizier JC. Induction of subacute ruminal acidosis (SARA) by replacing alfalfa hay with alfalfa pellets does not stimulate inflammatory response in lactating dairy cows. In: JOURNAL OF DAIRY SCIENCE; 2007. p. 654.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAntanaitis R, Džermeikaitė K, Januškevičius V, Šimonytė I, Baumgartner W. In-Line Registered Milk Fat-to-Protein Ratio for the Assessment of Metabolic Status in Dairy Cows. Animals. 2023;13. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3390/ani13203293\u003c/span\u003e\u003cspan address=\"10.3390/ani13203293\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAtalay H. Milk fat/protein ratio in ketosis and acidosis. Balıkesir Sağlık Bilimleri Dergisi. 2019;8:143\u0026ndash;6.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZschiesche M, Mensching A, Sharifi AR, Hummel J. The Milk Fat-to-Protein Ratio as Indicator for Ruminal pH Parameters in Dairy Cows: A Meta-Analysis. Dairy. 2020;1. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3390/dairy1030017\u003c/span\u003e\u003cspan address=\"10.3390/dairy1030017\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eOetzel GR. 
Subacute ruminal acidosis in dairy herds: physiology, pathophysiology, milk fat responses, and nutritional management. In: 40th Annual Conference, American Association of Bovine Practitioners. 2007. pp. 89\u0026ndash;119.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eInzaghi V, Zucali M, Thompson PD, Penry JF, Reinemann DJ. Changes in electrical conductivity, milk production rate and milk flow rate prior to clinical mastitis confirmation. Ital J Anim Sci. 2021;20. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1080/1828051X.2021.1984852\u003c/span\u003e\u003cspan address=\"10.1080/1828051X.2021.1984852\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAntanaitis R, Juozaitienė V, Jonike V, Baumgartner W, Paulauskas A. Milk lactose as a biomarker of subclinical mastitis in dairy cows. Animals. 2021;11. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3390/ani11061736\u003c/span\u003e\u003cspan address=\"10.3390/ani11061736\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKayano M, Itoh M, Kusaba N, Hayashiguchi O, Kida K, Tanaka Y, et al. Associations of the first occurrence of pathogen-specific clinical mastitis with milk yield and milk composition in dairy cows. J Dairy Res. 2018;85. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1017/S0022029918000456\u003c/span\u003e\u003cspan address=\"10.1017/S0022029918000456\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCosta A, Bovenhuis H, Egger-Danner C, Fuerst-Waltl B, Boutinaud M, Guinard-Flament J, et al. 
Mastitis has a cumulative and lasting effect on milk yield and lactose content in dairy cows. J Dairy Sci. 2025;108. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3168/jds.2024-25467\u003c/span\u003e\u003cspan address=\"10.3168/jds.2024-25467\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBoas DFV, Filho AEV, Pereira MA, Junior LCR, Faro L, El. Association between electrical conductivity and milk production traits in dairy Gyr cows. J Appl Anim Res. 2017;45. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1080/09712119.2016.1150849\u003c/span\u003e\u003cspan address=\"10.1080/09712119.2016.1150849\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNorberg E. Electrical conductivity of milk as a phenotypic and genetic indicator of bovine mastitis: A review. Livest Prod Sci. 2005;96. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.livprodsci.2004.12.014\u003c/span\u003e\u003cspan address=\"10.1016/j.livprodsci.2004.12.014\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNorberg E, Hogeveen H, Korsgaard IR, Friggens NC, Sloth KHMN, L\u0026oslash;vendahl P. Electrical conductivity of milk: Ability to predict mastitis status. J Dairy Sci. 2004;87. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3168/jds.S0022-0302(04)73256-7\u003c/span\u003e\u003cspan address=\"10.3168/jds.S0022-0302(04)73256-7\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDadousis C, Pegolo S, Rosa GJM, Gianola D, Bittante G, Cecchinato A. 
Pathway-based genome-wide association analysis of milk coagulation properties, curd firmness, cheese yield, and curd nutrient recovery in dairy cattle. J Dairy Sci. 2017;100. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3168/jds.2016-11587\u003c/span\u003e\u003cspan address=\"10.3168/jds.2016-11587\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDadousis C, Pegolo S, Rosa GJM, Bittante G, Cecchinato A. Genome-wide association and pathway-based analysis using latent variables related to milk protein composition and cheesemaking traits in dairy cattle. J Dairy Sci. 2017;100. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3168/jds.2017-13219\u003c/span\u003e\u003cspan address=\"10.3168/jds.2017-13219\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHeuer C, Wangler A, Schukken YH, Noordhuizen JPTM. Variability of Acetone in Milk in a Large Low-Production Dairy Herd: A Longitudinal Case Study. Vet J. 2001;161. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1053/tvjl.2000.0562\u003c/span\u003e\u003cspan address=\"10.1053/tvjl.2000.0562\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDobson H, Smith RF, Royal MD, Knight CH, Sheldon IM. The high-producing dairy cow and its reproductive performance. Reproduction in Domestic Animals. 2007;42 SUPPL. 2. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1111/j.1439-0531.2007.00906.x\u003c/span\u003e\u003cspan address=\"10.1111/j.1439-0531.2007.00906.x\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWerkheiser I. Precision Livestock Farming and Farmers\u0026rsquo; Duties to Livestock. J Agric Environ Ethics. 2018;31. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s10806-018-9720-0\u003c/span\u003e\u003cspan address=\"10.1007/s10806-018-9720-0\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKleen JL, Guatteo R. Precision Livestock Farming: What Does It Contain and What Are the Perspectives? Animals. 2023;13. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3390/ani13050779\u003c/span\u003e\u003cspan address=\"10.3390/ani13050779\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTekın K, Yurdak\u0026ouml;k-Dıkmen B, Kanca H, Guatteo R. Precision livestock farming technologies: Novel direction of information flow. Ankara Universitesi Veteriner Fakultesi Dergisi. 2021;68. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.33988/auvfd.837485\u003c/span\u003e\u003cspan address=\"10.33988/auvfd.837485\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eJiang B, Tang W, Cui L, Deng X. Precision Livestock Farming Research: A Global Scientometric Review. Animals. 2023;13. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3390/ani13132096\u003c/span\u003e\u003cspan address=\"10.3390/ani13132096\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eVanNostrand PM, Hofmann DM, Ma L, Genin B, Huang R, Rundensteiner EA. Counterfactual explanation analytics: Empowering lay users to take action against consequential automated decisions. Proceedings of the VLDB Endowment. 2024;17:4349\u0026ndash;52.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTsirtsis S, Gomez-Rodriguez M. Decisions, counterfactual explanations and strategic behavior. In: Advances in Neural Information Processing Systems. 2020.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTan J, Xu S, Ge Y, Li Y, Chen X, Zhang Y. Counterfactual Explainable Recommendation. In: International Conference on Information and Knowledge Management, Proceedings. 2021. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1145/3459637.3482420\u003c/span\u003e\u003cspan address=\"10.1145/3459637.3482420\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eShang R, Feng KJK, Shah C. Why Am I Not Seeing It? Understanding Users\u0026rsquo; Needs for Counterfactual Explanations in Everyday Recommendations. In: ACM International Conference Proceeding Series. 2022. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1145/3531146.3533189\u003c/span\u003e\u003cspan address=\"10.1145/3531146.3533189\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZandkarimi F, Vanegas J, Fern X, Maier CS, Bobe G. 
Metabotypes with elevated protein and lipid catabolism and inflammation precede clinical mastitis in prepartal transition dairy cows. J Dairy Sci. 2018;101. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3168/jds.2017-13977\u003c/span\u003e\u003cspan address=\"10.3168/jds.2017-13977\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"discover-artificial-intelligence","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"diai","sideBox":"Learn more about [Discover Artificial Intelligence](https://www.springer.com/44163)","snPcode":"","submissionUrl":"","title":"Discover Artificial Intelligence","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"Discover Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"Dairy cattle, Animal welfare, Machine learning, data-driven decisions, XAI, welfare scores","lastPublishedDoi":"10.21203/rs.3.rs-9082135/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-9082135/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003ch2\u003eBackground\u003c/h2\u003e \u003cp\u003eImproving animal welfare and health is a key objective in modern dairy systems. However, translating the now commonly available high-frequency bovine-dedicated sensor instrument data into transparent, actionable decisions remains challenging, particularly when using complex machine learning (ML) models. In this study, we used ML to predict several welfare indicators (WIs) from routinely recorded milk-related data in dairy cows, and explainable AI (XAI) to interpret and visualise ML models for practical use.\u003c/p\u003e\u003ch2\u003eResults\u003c/h2\u003e \u003cp\u003eMonthly individual WIs for mastitis, subclinical acidosis, subclinical ketosis, longevity, and reproduction were predicted using random forest models trained on routinely available test-day traits in dairy cows (milk acetone, fat, lactose, protein, urea, and electrical conductivity). The individual WIs were then used to derive an overall welfare score, with both individual and overall scores classified into good (G), intermediate (I), and risk (B) classes.
SHapley Additive exPlanations (SHAP) values quantified feature importance and interactions, revealing relationships largely consistent with known physiology (e.g., extreme high and low values of milk components and urea associated with increased welfare risk) and highlighting class-specific and U-shaped effects that call for context-dependent interpretation. Counterfactual explanations were used to identify minimal changes in milk traits required to shift predictions from B to G class, thereby translating model outputs into candidate management adjustments. While most counterfactuals followed biologically plausible patterns, occasional non-intuitive suggestions underscored the need for expert oversight.\u003c/p\u003e\u003ch2\u003eConclusions\u003c/h2\u003e \u003cp\u003eThis study illustrates how SHAP and counterfactual explanations can be layered on top of ML models to generate interpretable, customizable decision-support tools for precision dairy cattle welfare, while emphasizing the need to field-validate their usability, economic value, and ethical implications.\u003c/p\u003e","manuscriptTitle":"Improving interpretability and applicability of welfare management decisions in dairy cows through explainable artificial intelligence","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2026-03-28 
00:41:04","doi":"10.21203/rs.3.rs-9082135/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"editorInvitedReview","content":"","date":"2026-05-08T01:58:16+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"20119967788856359749030245585470298082","date":"2026-04-27T22:11:24+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"103044492044068887596351184313066913067","date":"2026-04-27T18:07:51+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"323978044194755062947936880251148850811","date":"2026-04-27T17:57:46+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2026-04-14T20:06:33+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"187570957657825483827153906350455082649","date":"2026-03-28T12:27:36+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"151240933992972250964880052136204117705","date":"2026-03-26T13:28:56+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2026-03-26T07:57:39+00:00","index":"","fulltext":""},{"type":"editorInvited","content":"","date":"2026-03-17T18:38:52+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2026-03-17T04:54:23+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2026-03-16T11:21:43+00:00","index":"","fulltext":""},{"type":"submitted","content":"Discover Artificial Intelligence","date":"2026-03-16T08:40:14+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"discover-artificial-intelligence","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"diai","sideBox":"Learn more about [Discover Artificial Intelligence](https://www.springer.com/44163)","snPcode":"","submissionUrl":"","title":"Discover Artificial Intelligence","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"Discover Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"6455817c-ccbc-48c7-9863-49b383477b91","owner":[],"postedDate":"March 28th, 2026","published":true,"recentEditorialEvents":[{"type":"editorInvitedReview","content":"","date":"2026-05-08T01:58:16+00:00","index":194,"fulltext":""}],"rejectedJournal":[],"revision":"","amendment":"","status":"under-review","subjectAreas":[],"tags":[],"updatedAt":"2026-03-28T00:41:04+00:00","versionOfRecord":[],"versionCreatedAt":"2026-03-28 00:41:04","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-9082135","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-9082135","identity":"rs-9082135","version":["v1"]},"buildId":"XKTyCvWXoU3ODBz1xrDgd","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}