PickAMoo: LIDAR-Enhanced Mask R-CNN segmentation for Precision Weight Estimation in Dairy Cattle Using Smartphone Imaging.

doi:10.21203/rs.3.rs-7827424/v1

2025

Oleksiy Guzhva, Emma Ternman, Mikaela Lindberg, Evgenij Telezhenko, and 1 more

This is a preprint; it has not been peer reviewed by a journal. This work is licensed under a CC BY 4.0 License. Status: Under Review. Version 1.

Abstract

Data on body weight, as well as objective measures of body condition and size, are essential for appropriate decision-making at farm level, e.g. for calculations of nutrient requirements, health control and assessments for breeding purposes. Cows with suboptimal body condition score are at higher risk for transition diseases (e.g. metritis, subclinical ketosis, retained placenta) and lameness.
Weighing dairy cattle and assessing their body condition is laborious and is therefore often not performed on farms as frequently as desired for best production results. Despite recent research findings indicating a strong potential for computer vision and image analysis in the automated estimation of dairy cows’ weight, body condition score (BCS) and conformation, current technologies are still not widely applied in everyday practice, and the majority of methods used for BCS or weight estimation in cattle rely on stationary multi-camera setups or 3D cameras, which leads to high computational costs. We propose a new, two-step, AI-based method for easy live weight estimation. The first step uses a Mask R-CNN segmentation network trained on 565 unique cow images (both left and right side) collected at distances varying from 1.90 meters to 2.10 meters, under different lighting conditions and at various angles. The final segmentation accuracy of Mask R-CNN was 0.98 in this first step. In the second step, weight was discretized into nine data-driven categories using a Gaussian Mixture Model (BIC-selected), after which the source weight variable was removed to prevent leakage and a leak-safe pipeline (imputation, robust scaling, fold-internal SMOTE, Extra Trees) was trained with stratified cross-validation and evaluated on an untouched holdout; a PyCaret implementation was used as an independent cross-check. On the 216-animal holdout, the tuned Extra Trees model achieved a macro-F1 of 0.936 (95% CI 0.913–0.956), with a 4.2% error rate composed entirely of adjacent (neighbouring-bin) mistakes. These results were obtained on 1080 images collected using the developed camera app and not used during the Mask R-CNN training. The aim is to further streamline the algorithm so that it can be downscaled and deployed as a smartphone application for on-farm use as an open-source support tool.
Biological sciences/Computational biology and bioinformatics; Physical sciences/Engineering; Physical sciences/Mathematics and computing

Introduction

The competitiveness of dairy production, along with the complexity of management practices aimed at higher profitability and sustainability while maintaining animal health and welfare, creates a demand for automation of standard farming procedures. Currently, methods for weighing dairy cattle range from manual measuring tapes to platform scales or crates equipped with technology that records animal identification and weight data automatically. Traditional weight estimation methods, such as visual assessment and manual measurements using heart girth tapes, are labour-intensive, time-consuming, and often inaccurate [17]. These methods also pose injury risks to animals and farm personnel due to the necessity of close physical contact. Consequently, there is a growing need for automated solutions that can provide quick and accurate weight estimates without stressing animals or endangering workers. Developing camera-based monitoring devices for weight and size assessment is therefore considered vital in the Precision Livestock Farming (PLF) research field.

Recent scientific initiatives have explored the potential of automated methods for estimating body weight, size, and body condition score (BCS) in cattle through computer vision, image analysis, and machine learning techniques [4, 7, 8, 15, 16, 21, 25]. These systems are reported to offer non-invasive and efficient alternatives to traditional weighing methods. For instance, [12] developed an image-based method using 2D images and deep neural networks to estimate cattle weight, achieving a mean absolute percentage error of 5.5% with fully supervised segmentation.
Similarly, [20] used a 3D vision system to automatically measure morphological traits and predict body weight in dairy cattle with a root mean square error of 41.2 kg and a mean absolute percentage error of 5.2%. However, many of these studies rely heavily on 3D cameras or 2D multi-camera setups to perform full-body scans or capture top-down images as cows pass beneath one or more stationary cameras. While this approach can provide valuable information, from body shape and composition to specific measurements like length, width, and angles between relevant anatomical points, it requires controlled conditions for reliable imaging from stationary equipment and involves complex modelling for analysis [19]. Furthermore, the quality of images obtained with 3D cameras is highly dependent on lighting, camera angle, dust particles, lens cleanliness, and the cows' movement or inability to stand still during the data acquisition process. Additionally, analysing images from 3D or multiple cameras simultaneously often demands substantial time and incurs extra costs for hardware setup, data storage, and computational processing [14], which must be considered when evaluating the added value of automated body measurements.

Advancements in artificial intelligence (AI) and computer vision (CV), along with algorithm miniaturization enabling deployment on mobile platforms (as investigated by Zhang et al. [24]), now allow for visual estimation of body mass index (BMI) from facial images in humans. This technology has primarily been applied in the public health domain [9], resulting in smartphone apps that estimate weight gain potential and recommend clothing sizes. Despite research highlighting the strong potential of CV and image analysis for automated estimation of dairy cows’ weight, BCS, and conformation, these technologies are not yet widely used in everyday farming practice.
This reluctance is mainly due to the complexity of image recording procedures, equipment and modelling costs, and the steep learning curve for interpreting results to support farm operations. We envision that by combining state-of-the-art, scalable CV algorithms for mobile platforms (e.g., smartphones) with expertise in dairy production, and by utilizing external assessments and on-farm individual animal data, we can develop an all-in-one solution for daily use. Such a tool would enable farmers to record their cows' weight status using a smartphone, increasing production efficiency, reducing costs, and potentially supporting new management routines in dairy farming. This approach would facilitate the creation of an intermediate data platform where farmers, researchers, and dairy advisors can collaboratively gather and access body weight and size data, developing new routines and guidelines for a sustainable and competitive dairy value chain. Highly relevant individual animal data could then be collected with a simple "one-click" procedure. In the long term, precise measurements of size and weight in cattle would enhance opportunities for assistance in breeding decisions and continuous health assessment. This approach could improve nutrient utilization and farm economics and reduce greenhouse gas emissions [2].

The overall aim of this research project was to investigate the possibilities for accurate weight estimation of dairy cattle with the help of a smartphone camera and computer vision algorithms.

Methods

Ethical Consideration

Prior to the start of the experiment, the procedures and details of the experiment were evaluated by the Board of Ethical use of Animals in Teaching and Research, Swedish University of Agricultural Sciences, Uppsala, Sweden, and an ethical permit was obtained from the Swedish Board of Agriculture, Uppsala, Sweden (ID number: 5.8.18-05598/2021).
All procedures were conducted in accordance with the ethical guidelines proposed by the Ethical Committee of the ISAE (International Society of Applied Ethology [18]) and met the ARRIVE guidelines [10].

Study location and animals

The data were collected at two locations. At the first location, 248 lactating dairy cows of the breeds Swedish Red (n=148) and Swedish Holstein (n=100) were included for observations over twelve distinct days on nine separate occasions from October 2021 to February 2024. The cows were housed in a free-stall system at the Swedish Livestock Research Centre (Swedish University of Agricultural Sciences (SLU), Uppsala, Sweden). At the second location, 22 cows of the Swedish Red breed were included for observations for one day, to extend the pool of available animals and capture as much individual variability as possible. The cows were housed in a free-stall system at Röbäcksdalen’s Research Facility, SLU, Umeå, Sweden.

Evaluating Manual Weight-Estimation Equations as a Ground Truth for Machine-Learning Models

Accurate baseline data are essential for training and validating any automated weight estimation algorithm. Traditional weight estimation equations for cattle rely on manual measurements, such as heart girth and body length taken with a measuring tape, but their precision varies with operator skill, animal posture and breed. Over the years researchers have proposed a spectrum of equations to estimate body weight in cattle, reflecting differences in measurement techniques, breeds and management conditions. At the simplest end are ratio-type formulas such as Agarwal’s formula, which multiplies chest girth by body length and divides by a girth-dependent constant [3], and the widely used Schaeffer equation, which scales the product of body length and the square of chest girth [22].
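To make the two simplest forms concrete, the sketch below implements the versions of the Schaeffer and Agarwal formulas as they are commonly quoted in general animal-husbandry texts. The constants are assumptions taken from that general literature and are not stated in this study: a divisor of 300 with inputs in inches and output in pounds, and girth-dependent divisors of 9.0, 8.5 and 8.0 with output in seers (about 0.93 kg each). The function names and the example measurements are hypothetical.

```python
CM_PER_INCH = 2.54
KG_PER_LB = 0.45359237
KG_PER_SEER = 0.93  # approximate conversion usually quoted alongside Agarwal's formula


def schaeffer_kg(heart_girth_cm: float, body_length_cm: float) -> float:
    """Schaeffer-type estimate: weight (lb) = length * girth^2 / 300, inputs in inches."""
    girth_in = heart_girth_cm / CM_PER_INCH
    length_in = body_length_cm / CM_PER_INCH
    weight_lb = length_in * girth_in ** 2 / 300.0
    return weight_lb * KG_PER_LB


def agarwal_kg(heart_girth_cm: float, body_length_cm: float) -> float:
    """Agarwal-type estimate: weight (seers) = length * girth / Y, Y depends on girth."""
    girth_in = heart_girth_cm / CM_PER_INCH
    length_in = body_length_cm / CM_PER_INCH
    if girth_in < 65:
        divisor = 9.0
    elif girth_in <= 80:
        divisor = 8.5
    else:
        divisor = 8.0
    return length_in * girth_in / divisor * KG_PER_SEER


if __name__ == "__main__":
    # Hypothetical adult cow: 200 cm heart girth, 160 cm body length.
    print(round(schaeffer_kg(200.0, 160.0), 1))
    print(round(agarwal_kg(200.0, 160.0), 1))
```

Both functions take the same two tape measurements, which is why such formulas remain attractive where no scale is available; the gap between their outputs for the same animal illustrates why the equations need to be benchmarked against scale weights.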
More elaborate polynomial models, like the Holstein heifer equation from Heinrichs and colleagues, fit second- and third-order terms of heart girth to capture curvature in the weight–girth relationship [5]. Breed-specific regressions for crossbred heifers and Senegalese dairy cattle substitute alternative landmarks such as hip width or the distance between coxal tuberosities when girth proves unreliable. Studies in low-input systems often favour straightforward heart girth or girth-plus-length equations because they require minimal equipment, whereas intensive dairy operations with a higher degree of automation have adopted multiple regression models incorporating age, withers height or body condition to improve precision [13].

To understand the reliability of these manual weight estimation methods, we evaluated several commonly used equations and compared their outputs against a gold standard of automated weighing obtained from calibrated platform scales. To establish a solid foundation for automated weight classification, we also took manual measurements with a cattle measuring tape (ANIMETER measuring tape, Växa, Jönköping, Sweden) to serve as a reference for the image analysis part of the work (Batches 1 and 2). These measurements included chest girth, stomach circumference, back length (diagonal and direct), height (back and front), and rump width. This comparison allowed us to quantify the bias and variance inherent in manual measurements and to identify which body dimensions correlate best with true body weight and which similar features could be extracted from images. Because both manual tapes and automated scales can be affected by calibration drift, animal movement and human error, there remains an unavoidable margin of uncertainty; nonetheless, establishing the degree of agreement between the two methods provides a solid starting point for training machine learning models.
By anchoring the models to weights from the automated scales and understanding the limitations of manual tape measurements, analogous features could be extracted from images and used to build image-based weight prediction models that emulate, and eventually replace, manual and mechanical measurements.

Image acquisition protocol

A well-performing CV model depends on high-quality images, which in turn require a standardized image acquisition protocol. In our study, the protocol specified (1) a stand-off distance that ensured full-body coverage without occlusion by other animals or obstacles, (2) a near-perpendicular camera angle to reduce perspective distortion, and (3) operator positioning that maintained safety and avoided disturbing normal animal behaviour. A commercial laser distance meter (LDM; Bosch GLM 40 Professional, Robert Bosch Power Tools GmbH, Leinfelden-Echterdingen, Germany; measurement accuracy ±0.15 mm) was used to control stand-off distance. We evaluated three configurations to quantify practical trade-offs and to prepare for a seamless transition to smartphone-only distance estimation: (a) handheld LDM with a handheld camera/smartphone; (b) tripod-mounted LDM and camera aligned to the cow’s midline and targeting the thoracolumbar region; and (c) a rigid bar (camera plus two LDMs) on a tripod to simultaneously sample distances to the cranial and caudal body, providing a simple approximation of animal depth. Across trials, stand-off distances ranged from 1.44 to 2.72 meters; empirically, ~2.00–2.05 meters yielded the most consistent full-body coverage with minimal occlusion and stable focus and was therefore adopted when barn layout and animal flow permitted. To further stabilise image quality, operators were instructed to avoid strong backlighting, minimise motion during exposure, keep lenses clean, and favour perpendicular incidence to mitigate measurement and future segmentation errors associated with dark, glossy or wet coat patches.
Mobile application development

After standardising the image-acquisition protocol, we initiated development of a dedicated mobile application to provide a complete solution: robust distance estimation, image acquisition and, potentially, weight estimation. Design and implementation were contracted to a digital interaction agency with expertise in app prototyping and user-interface engineering. Given smartphone platform constraints, Apple’s iPhone Pro/Pro Max devices (iPhone 12 Pro and newer; Apple Inc., Cupertino, CA, USA) were selected because they provided application-level access (Software Development Kit or SDK functionality) to the LiDAR sensor, enabling precise stand-off estimation, which is critical for image quality control and scale normalisation. The app development process was built around four coordinated workstreams: (1) Camera module: a configurable capture pipeline supporting controlled exposure, focus and resolution to obtain full-body frames with minimal motion blur; (2) Distance module: real-time LiDAR-based stand-off estimation with on-screen guidance to keep the operator within a target distance window suitable for analysis; (3) Classifier integration: ingestion of CV outputs (e.g., instance masks) and computation of image-derived features for on-device weight classification or deferred processing; (4) User interface and data layer: workflows for animal selection, capture confirmation, and longitudinal review of predicted weight to support management decisions. Development followed an iterative cycle with weekly internal builds delivered to the project Principal Investigator for field testing and feedback. The development cycle included seven different app versions, aiming to refine capture guidance, robustness to barn lighting and operator ergonomics.
The final data-acquisition app, PickAMoo (Figure 1), contained: (i) a burst mode that captured four consecutive images of the same animal to reduce pose-related variance; (ii) LiDAR-based distance estimation and a depth-map preview to verify framing and scale; and (iii) a gyroscope-driven tilt indicator to help the operator maintain a near-perpendicular view and minimise perspective distortion.

LiDAR/LDM distance measurement consistency

A central methodological challenge in developing a reliable image acquisition protocol was to guarantee that images were consistently obtained at the correct distance and angle. Such control was crucial for ensuring both user-friendliness of the procedure and reproducibility of the approach. The transition from manual LDMs to smartphone-integrated LiDAR was therefore a decisive step, as it allowed for direct, automated, and accurate distance estimation. During the preliminary phase, several factors known to affect the performance of LDMs were considered in detail. Previous studies have shown that reflectivity of the target surface strongly influences measurement precision: highly reflective materials typically return stable signals, whereas darker or absorptive surfaces decrease reliability [23]. Ambient illumination is another factor; strong direct sunlight can reduce laser spot contrast and hinder detection, while diffuse or low-light conditions are generally more favourable [1]. Furthermore, atmospheric factors such as dust and humidity, both common in livestock housing, may scatter the beam and degrade measurement precision [11]. Finally, geometric factors, particularly the angle of incidence, were also recognized as a limitation [1]. Taken together, these considerations indicated that while LDMs are suitable for controlled settings, their susceptibility to environmental and geometric variability renders them less reliable in barns, where surfaces, lighting, and air quality cannot be standardized.
For this reason, smartphone LiDAR was adopted as an alternative solution. By integrating distance estimation directly into the image acquisition device, LiDAR minimizes operator error and maintains robust measurement accuracy under suboptimal field conditions [6]. The importance of distance accuracy became even more evident when the first version of the mobile application was tested. While the LiDAR system demonstrated an overall measurement accuracy of ±0.2 mm, occasional deviations were observed, where the app either underestimated or overestimated the distance substantially. Given the known sensitivity of smartphone LiDAR-based optical systems to surface reflectivity, a worst-case scenario test was performed (Figure 2). For this purpose, a toy cow with highly contrasting white and black fur was selected as the test object. The deep black fur posed a challenge for the LiDAR-based distance estimation system, as it absorbed rather than reflected the incident infrared light. The results confirmed that measurement errors were more pronounced on the darkest patches of fur, where accurate image capture with the desired parameters became problematic. This information was subsequently used to recalibrate the LiDAR sensor of the test device (iPhone 14 Pro Max), and the image acquisition protocol was adjusted accordingly. During the final round of data collection, a total of 1300 images were captured at consistent distances without any notable measurement errors.

Data collection and image annotations

Data (images, BW and manual measures) were gathered on 12 separate days spread over three and a half years, with different animals included at each session to capture a representative cross-section of the herd (Table 1). To avoid confounding our analyses with duplicated observations, we ensured that no individual cow was weighed or photographed more than once on the same day.
Thus, while some cows were sampled multiple times across the study period, each imaging session provided a unique combination of animals, contributing to a diverse dataset spanning a wide range of body weights and sizes.

Table 1. Overview of the data collection occasions (Batches) with number of images, number of animals for each occasion, and whether manual measurements complementary to automatic weighing were taken.

Batch #   Date                        # of Images   # of Animals Photographed   Manual measurements
1a        May 2021                    256           34                          Yes
1b                                    258           28                          Yes
2         July 2021                   55            22                          Yes
3a        October and November 2021   232           92                          No
3b                                    166           54                          No
3c                                    100           52                          No
4*        October 2022                14            5                           No
5*        November 2022               305           37                          No
6*        November 2022               125           25                          No
7*        February 2023               110           17                          No
8**       May 2023                    104           26                          No
9**       December 2023               1122          145                         No
Total                                 2847          537

* Updated protocol for distance measurement (two reference points instead of one).
** Data collected using the PickAMoo app with LiDAR functionality for distance estimation.

Across nine data collection rounds (Batches), we assembled a dataset consisting of 2,847 images suitable for analysis. Imaging was performed with an Olympus μ-700 digital camera (Olympus Corporation; Hachioji, Tokyo) and a Motorola smartphone (Motorola Mobility LLC, Chicago, IL, USA; standard camera app) for Batches 1–6, and with an iPhone 14 Pro Max running our experimental application for Batches 7–9. The dataset comprised 248 unique cows photographed on multiple days and under varying conditions, yielding repeated observations per animal. Although more than 200 lactating dairy cows were repeatedly weighed and photographed throughout the study, each animal’s body weight and conformation can change appreciably over time, and the imaging conditions (distance, angle and lighting) varied between data collection rounds. Consequently, images of the same cow taken days, weeks or months apart often differed in appearance and weight to such an extent that they effectively represented different individuals from a modelling perspective.
This temporal variation, combined with subtle posture- and camera-related differences, created a large pool of individual variability within the dataset and reduced the risk that the models simply memorised specific animals. Manual body measurements (e.g., heart girth, body length, height, rump width) were collected for Batches 1–2 and paired with ground-truth body weights from the barn’s automated scales (Batches 1–9), providing a reference set to benchmark manual equations and to anchor subsequent image-based modelling. Together, these multi-device, multi-session data constituted a heterogeneous basis for developing and stress-testing our CV and machine learning pipelines.

To create a high-quality basis for the image segmentation model, 567 images were manually annotated using the VIA Image Annotator (Visual Geometry Group, University of Oxford, Oxford, UK). Annotators drew precise polygons around each focal cow to produce instance masks suitable for training a Mask R-CNN model (Figure 3). Annotation guidelines prioritised full-body contours when visible and resolved partial occlusions by following the visible outline only; images with severe occlusion or motion blur were excluded from the gold-standard set. These masks formed the basis for training and validating our instance-segmentation models and for deriving silhouette-based features (e.g., projected area, aspect ratios and contour descriptors) used in weight classification.

Development of object detection/segmentation model

Object localisation for animal studies typically follows two paradigms: bounding-box (BB) detectors (e.g., Faster R-CNN) return rectangular boxes that roughly circumscribe each object and are well-suited to counting and coarse localisation tasks, whereas segmentation models assign labels at the pixel level.
Within the segmentation approach, semantic segmentation labels all pixels of a class without separating instances, whereas instance segmentation (e.g., Mask R-CNN) both detects objects and predicts a per-object mask. For weight estimation from single images, pixel-accurate silhouettes provide richer morphometric information than boxes in terms of projected mask area, aspect ratio, hull and contour-based measures, as well as pose-dependent ones. The reference dataset of manually annotated images (n = 567) was split for model training and evaluation as follows: 75% training (n = 425), 10% testing (n = 57) and 15% validation (n = 85). We compared two widely used segmentation architectures to evaluate the potential for transferring models into a smartphone application: U-Net (semantic segmentation; efficient and often favoured for edge deployment) and Mask R-CNN (instance segmentation extending Faster R-CNN with a mask head). In preliminary tests, U-Net produced noticeably lower segmentation quality on cow images with clutter and partial occlusions, leading to leakage into background regions and unstable silhouettes (F1 score of 0.56). By contrast, Mask R-CNN was consistently able to delineate the cow from nearby animals and fixtures, providing masks that were robust enough to derive reliable image-based features for subsequent weight classification. Given these results and prior evidence that instance masks improve morphology-derived predictions in related human BMI/anthropometry work, we focused model development on Mask R-CNN (F1 score of 0.98 for the final model). Mask R-CNN comprises a CNN backbone for feature extraction, a region proposal network for candidate detections, and classification/box-regression heads augmented with a parallel mask head for pixel-wise instance segmentation. We evaluated four pretrained backbones (EfficientNet-B7, MobileNetV3, ResNet-101 and DenseNet-201) to probe the accuracy–latency–capacity trade-off.
MobileNetV3 offers a compact, mobile-oriented option; EfficientNet-B7 and DenseNet-201 provide high representational capacity at greater compute costs; ResNet-101 is a strong, well-balanced baseline with stable training behaviour. For each backbone we trained a Mask R-CNN model with the following common settings: two images per GPU, 50 epochs with 1,000 steps per epoch, and 50 validation steps per epoch; two classes (cow, background); initial learning rate 0.001; momentum 0.9; weight decay 0.0001. Early stopping was applied (PyTorch callback) to prevent model overfitting. Models were trained and tested on an AI workstation with an AMD Ryzen 5950X CPU, 128 GB DDR4-3600 RAM, and an NVIDIA RTX 4080 (16 GB VRAM); the best performing model converged in 7 hours 36 minutes, with average power draw of 240 watts. Performance was monitored using standard detection and segmentation metrics (mean average precision, mAP, and mean intersection over union, IoU) alongside qualitative inspection of failure modes (occlusions, dark coat patches, specular highlights). Inference was also run with each model on an additional 150 images randomly sampled from the complete dataset to assess real-world performance and to discover potential segmentation issues in varying scenarios. The ResNet-101 backbone yielded the most reliable instance masks on held-out images and on images unseen during training, while maintaining acceptable inference speed for near-real-time processing on workstation-class hardware. Its stable optimisation and superior delineation of extremities (head, tail, distal limbs) translated into more consistent mask-derived features for weight modelling. Although MobileNetV3 was attractive for eventual on-device deployment, its segmentation accuracy on our data lagged behind the deeper backbones (F1 score 0.87). Future mobile deployment can potentially recover latency/size via backbone distillation, quantisation (INT8), and structured pruning once accuracy targets are locked.
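As a minimal illustration of the mask-level metrics mentioned above, the following sketch computes intersection over union and the F1 (Dice) score for one predicted binary mask against its annotation. It assumes pixel-wise counting on masks stored as nested lists of 0/1; the source does not state exactly how its F1 scores were aggregated, and the function names and the toy masks are hypothetical.

```python
def mask_counts(pred, truth):
    """Count true-positive, false-positive and false-negative pixels."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
    return tp, fp, fn


def mask_iou(pred, truth):
    """Intersection over union; two empty masks count as a perfect match."""
    tp, fp, fn = mask_counts(pred, truth)
    union = tp + fp + fn
    return tp / union if union else 1.0


def mask_f1(pred, truth):
    """Pixel-wise F1 (identical to the Dice coefficient for binary masks)."""
    tp, fp, fn = mask_counts(pred, truth)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


if __name__ == "__main__":
    truth = [[0, 1, 1, 0],
             [0, 1, 1, 0],
             [0, 0, 0, 0]]
    pred = [[0, 1, 1, 1],
            [0, 1, 0, 0],
            [0, 0, 0, 0]]
    print(mask_iou(pred, truth), mask_f1(pred, truth))  # 0.6 0.75
```

Because F1 counts the overlap twice, it is never lower than IoU for the same pair of masks, which is worth keeping in mind when comparing scores reported with different metrics.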
Choosing instance segmentation over bounding-box detection ensured that downstream features reflected object shape rather than just extent. Bounding boxes could inflate with pose and background clutter, whereas masks permit computation of projected area, contour length, convexity, and other silhouette descriptors that are more directly related to body volume proxies. This alignment between the image analysis output and the biological quantity of interest (weight) reduces information loss and helps the subsequent machine-learning estimator generalise across breeds, poses, and acquisition conditions.

Weight classification model

At inference, Mask R-CNN returns, for each detected cow, both a bounding box (BB) and a per-pixel segmentation mask (SM) (Figure 4). From these outputs we derived a compact set of silhouette features designed to correlate with body volume and weight proxies: BB width/height and area; SM (mask) area; extent (mask area ÷ BB area); convex-hull area; elongation (major/minor axis ratio); contour length; and simple moment-based descriptors. Binary masks were cleaned with light morphological operations (hole filling, small-component removal) to stabilise measurements across poses and backgrounds.

Because all pixel-based features scale with stand-off distance, we applied a per-image scale normalisation. Using the recorded LDM distance, we computed a pixels-to-millimetres factor and re-expressed all length and area features to a common reference of 2.00 m stand-off. Practically, this rescales BB and SM measurements so that an animal photographed at 1.6 m or 2.4 m is made comparable to one photographed at 2.0 m, mitigating distance-induced variance without altering shape information. To improve generalisability, we also included a single external scalar covariate reflecting body girth. When manual measurements were unavailable for a given image, we used a constant prior equal to the cohort’s average heart-girth (≈ 200 cm).
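The scale normalisation and the girth covariate described above can be sketched as follows. This is a simplified illustration under a pinhole-camera assumption, in which apparent lengths shrink in proportion to distance and areas in proportion to its square; it rescales pixel measurements to the 2.00 m reference rather than converting to millimetres, because the focal length needed for that conversion is not given in the text. The function name, argument names and example values are hypothetical.

```python
from typing import Optional

REFERENCE_DISTANCE_M = 2.00  # common reference stand-off named in the text
DEFAULT_GIRTH_CM = 200.0     # cohort-average heart girth used as a constant prior


def normalise_features(bb_width_px: float, bb_height_px: float,
                       mask_area_px: float, distance_m: float,
                       heart_girth_cm: Optional[float] = None) -> dict:
    """Re-express box and mask measurements as if taken at the reference distance."""
    # Pinhole assumption: an object seen from farther away appears smaller,
    # so lengths are multiplied by distance / reference and areas by its square.
    scale = distance_m / REFERENCE_DISTANCE_M
    width = bb_width_px * scale
    height = bb_height_px * scale
    mask_area = mask_area_px * scale ** 2
    return {
        "bb_width": width,
        "bb_height": height,
        "bb_area": width * height,
        "mask_area": mask_area,
        # Extent is a ratio of two areas, so the scale factor cancels out.
        "extent": mask_area / (width * height),
        "heart_girth_cm": DEFAULT_GIRTH_CM if heart_girth_cm is None else heart_girth_cm,
    }


if __name__ == "__main__":
    # Hypothetical detection photographed at 2.4 m, no tape measurement available.
    for name, value in normalise_features(1000, 600, 400_000, 2.4).items():
        print(name, round(value, 3))
```

Shape ratios such as extent are unchanged by the rescaling, which matches the statement that normalisation removes distance-induced variance without altering shape information. When no tape measurement is passed, the sketch falls back to the constant heart-girth prior of about 200 cm.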
This constant prior anchors the model when segmentation masks differ slightly in the inclusion of extremities (e.g., head/tail), while leaving the image-derived features to carry most of the predictive signal. The dataset used for developing and testing the final weight estimation model contained 1080 entries, based on features calculated from 1080 different cow images from Batch 9, each matched with the body weight recorded by the automatic scale system. Because downstream decision-making on farms often relies on weight bands rather than exact kilograms, the continuous weight was transformed into a categorical outcome in a data-driven manner. A one-dimensional Gaussian Mixture Model (GMM) was fitted to the weight distribution, the number of components K was selected by minimizing the Bayesian Information Criterion over K = 3…10, and clusters with fewer than 25 animals were merged into the nearest cluster in mean weight. The resulting clusters were ordered by mean weight and relabeled 1…K. This derived target is referred to as AutoWeightCategory and was fixed for all subsequent analyses.

Supervised learning for weight-category classification was conducted in Python. The primary workflow was implemented with scikit-learn and imbalanced-learn, while PyCaret was used as an independent cross-check on the same training split. All experiments ran on a high-performance workstation (Ryzen 7950X, 64 GB DDR5-6000, NVIDIA RTX 4090 24 GB). A train/holdout split was created once using stratification on the derived weight categories (80% train, 20% holdout; fixed random seed). The holdout set was kept untouched until the final evaluation. All model development was performed on the training portion within a single, end-to-end pipeline so that every transformation was estimated inside cross-validation folds.
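The idea of estimating every transformation inside the cross-validation folds can be illustrated with a small standard-library sketch. It is not the scikit-learn/imbalanced-learn pipeline used in the study: it shows only one preprocessing step (median imputation), uses a simple round-robin stratification, and all function names and toy data are hypothetical. The point it demonstrates is that the imputation statistics are learned from the training part of each fold and then applied unchanged to the validation part.

```python
from statistics import median


def stratified_folds(labels, k):
    """Return k lists of sample indices, dealing each class out round-robin."""
    folds = [[] for _ in range(k)]
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    for indices in by_class.values():
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def fit_medians(rows):
    """Per-column medians learned from the given rows only (None marks missing)."""
    return [median(v for v in column if v is not None) for column in zip(*rows)]


def impute(rows, medians):
    """Replace missing values with previously fitted medians."""
    return [[m if v is None else v for v, m in zip(row, medians)] for row in rows]


def leak_safe_folds(features, labels, k):
    """Yield (train_X, train_y, valid_X, valid_y) with imputation fitted per fold."""
    folds = stratified_folds(labels, k)
    for held_out in range(k):
        valid_idx = folds[held_out]
        train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        # Statistics come from the training part of the fold only.
        medians = fit_medians([features[i] for i in train_idx])
        yield (impute([features[i] for i in train_idx], medians),
               [labels[i] for i in train_idx],
               impute([features[i] for i in valid_idx], medians),
               [labels[i] for i in valid_idx])


if __name__ == "__main__":
    X = [[1.0], [None], [3.0], [100.0], [5.0], [None]]
    y = [0, 0, 0, 1, 1, 1]
    for train_X, train_y, valid_X, valid_y in leak_safe_folds(X, y, k=3):
        print(valid_X, valid_y)
```

In a full pipeline the same pattern applies to scaling, feature removal and oversampling: each step is fitted on the training part of the fold, so validation samples never influence the parameters they are later evaluated with.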
The pipeline comprised median imputation for missing values, robust scaling to stabilize feature ranges, removal of zero-variance features, class-imbalance handling with SMOTE applied within each training fold, and an Extra Trees classifier (500 trees, class_weight="balanced", parallel execution, fixed seed). SMOTE (Synthetic Minority Oversampling Technique) synthesizes additional minority-class examples by interpolating between each minority sample and its nearest minority neighbors in feature space. By enriching sparse regions locally, SMOTE gives the classifier a more balanced view of the decision surface without altering the holdout data; applying it only within folds prevents information from leaking into validation partitions.

Model selection used 10-fold stratified cross-validation on the training split. Cross-validation was adopted to obtain a reliable estimate of out-of-sample performance while preserving the class distribution in each fold and keeping preprocessing, SMOTE, and model fitting strictly confined to the training portion of each fold. The macro-averaged F1 score (macro-F1) was specified a priori as the primary metric because it weights each class equally and thus reflects performance on minority categories; weighted-F1 and accuracy were recorded as complementary summaries.

After cross-validation, the pipeline was refitted on the entire training split and evaluated once on the untouched holdout set. Because weight bands are intrinsically ordinal, errors were characterized not only as correct/incorrect but also by distance across categories (absolute difference between true and predicted class). Adjacent errors (distance = 1) and non-adjacent errors (distance > 1) were quantified, along with the mean, median, and maximum distance. Uncertainty on holdout macro-F1 was quantified with bootstrap resampling (2000 replicates).
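The ordinal error summary and the bootstrap interval described above can be sketched with NumPy alone. The percentile form of the bootstrap interval is an assumption (the text gives only the number of replicates), the function names are hypothetical, and the label vectors are toy values rather than study data.

```python
import numpy as np


def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of per-class F1; an undefined class F1 counts as 0."""
    scores = []
    for c in classes:
        tp = np.sum((y_true == c) & (y_pred == c))
        fp = np.sum((y_true != c) & (y_pred == c))
        fn = np.sum((y_true == c) & (y_pred != c))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(scores))


def ordinal_error_summary(y_true, y_pred):
    """Summarise errors by absolute distance between ordered categories."""
    d = np.abs(y_true - y_pred)
    return {
        "error_rate": float(np.mean(d > 0)),
        "adjacent": float(np.mean(d == 1)),
        "non_adjacent": float(np.mean(d > 1)),
        "mean_distance": float(d.mean()),
        "median_distance": float(np.median(d)),
        "max_distance": int(d.max()),
    }


def bootstrap_ci(y_true, y_pred, classes, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for macro-F1 (resampling animals)."""
    rng = np.random.default_rng(seed)
    n = len(y_true)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # sample with replacement
        stats[b] = macro_f1(y_true[idx], y_pred[idx], classes)
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)


# Toy labels for illustration only.
y_true = np.array([1, 1, 2, 2, 3, 3, 4, 4, 5, 5])
y_pred = np.array([1, 1, 2, 3, 3, 3, 4, 4, 5, 2])
summary = ordinal_error_summary(y_true, y_pred)
ci = bootstrap_ci(y_true, y_pred, classes=range(1, 6))
print(summary, ci)
```

In the toy example one error is adjacent (2 predicted as 3) and one is three categories away (5 predicted as 2), which is the distinction the distance-based summary is meant to expose.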
Finally, the calibration of predicted probabilities from the sklearn pipeline was assessed using the multiclass Brier score, expected calibration error (ECE; top-1), and a reliability diagram.

For the independent confirmation step, PyCaret was run on the same training data with internal resampling disabled to avoid fold misalignment. A class-weighted Extra Trees model was created, tuned using PyCaret's built-in procedures, finalized on the full training split, and evaluated on the same holdout set. Agreement between this PyCaret model and the primary sklearn/imbalanced-learn pipeline was taken as evidence that the findings did not hinge on a single software implementation.

This pipeline ensured that (i) the vision output aligned with the biological quantity of interest (weight) via shape-aware features, (ii) scale effects from variable stand-off distance were neutralised, and (iii) the final classifier was selected on the basis of systematic, reproducible comparisons rather than ad hoc choice.

Results and discussion

General comparative accuracy of weight estimation equations

Linear weight estimation equations derived from combinations of heart girth (HG), body length (BL), and age were compared across both breeds pooled and within each breed (Swedish Red and Swedish Holstein); accuracy varied considerably between equations (Table 2). Each row of the table shows the equation, its coefficient of determination (R²), the associated p-value from the regression, and the mean absolute percentage error (MAPE) observed when the formula was applied to the dataset (manual measurement data from Batches 1 and 2). For the mixed-breed sample, the equation that combines HG, BL, and age (BW = 5.41×HG + 2.41×BL + 11.24×Age − 911.44) produced the highest R² (0.89) and lowest MAPE (~4.5 %), indicating that incorporating multiple measurements yields more accurate predictions than using HG alone.
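As a worked illustration of how such an equation and its MAPE are evaluated, the sketch below applies the mixed-breed three-parameter equation quoted above (rounded coefficients). The animals and reference weights are invented for the example and are not study data; units are assumed to follow the paper's measurement protocol, which this section does not restate.

```python
def predict_bw_mixed(hg, bl, age):
    """Mixed-breed equation quoted in the text (rounded coefficients)."""
    return 5.41 * hg + 2.41 * bl + 11.24 * age - 911.44


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(a - p) / a for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical animals: (heart girth, body length, age).
animals = [(200.0, 160.0, 4.0), (190.0, 155.0, 3.0)]
predicted = [predict_bw_mixed(*a) for a in animals]

# Made-up scale weights used only to demonstrate the MAPE calculation.
scale_weights = [620.0, 540.0]
error_pct = mape(scale_weights, predicted)
print(predicted, error_pct)
```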
Breed-specific models revealed that body weight was more difficult to predict for the Swedish Red cows in our study: formulas using only HG or HG + BL achieved R² values of 0.69–0.71 and MAPE of 13–14 %, suggesting that heart girth correlates less strongly with body weight in this breed. Adding age improved performance only marginally (R² = 0.72, MAPE ≈ 13.3 %) and still left a large error. In contrast, the Swedish Holstein models performed much better; the three-parameter equation (BW = 5.76×HG + 2.13×BL + 8.62×Age − 928.26) attained R² = 0.93 and MAPE ≈ 3.6 %. All regressions were highly significant (p < 2.2 × 10⁻¹⁶), implying that the predictors reliably explain variation in weight. These results highlight the value of multi-measure formulas and the importance of accounting for breed differences when selecting manual weight estimation equations.

In addition, cows weighing over 800 kg had a higher MAPE than cows weighing less than 800 kg. In our dataset, these heavy cows were predominantly Swedish Red, which skews our data on heavy cows and could explain the poorer fit of the equations for this breed. Breed differences in body conformation could be another explanation for the difficulty in finding a good equation. The BW–HG correlation was lower for Swedish Red (0.79) than for Swedish Holstein (0.90), a pattern also seen in the BL:BW ratio.

Table 2. Comparison of equations for body weight estimation in cattle and their estimated accuracy when applied to Swedish Red and Swedish Holstein breeds (BW: body weight; HG: heart girth; BL: body length).
| Breed | Formula | R² | p-value | MAPE |
|---|---|---|---|---|
| Mixed breeds (SR and SH) | BW = 7.3827×HG − 878.3134 | 0.85 | < 2.2e-16 | 5.42 % |
| Mixed breeds (SR and SH) | BW = 6.2570×HG + 2.3311×BL − 1035.94 | 0.87 | < 2.2e-16 | 4.89 % |
| Mixed breeds (SR and SH) | BW = 5.4143×HG + 2.4066×BL + 11.2416×Age − 911.4412 | 0.89 | < 2.2e-16 | 4.47 % |
| Swedish Red (SR) | BW = 7.1984×HG − 851.8974 | 0.69 | < 2.2e-16 | 14.12 % |
| Swedish Red (SR) | BW = 6.1404×HG + 2.0569×BL − 975.8509 | 0.71 | < 2.2e-16 | 13.67 % |
| Swedish Red (SR) | BW = 5.4592×HG + 2.0203×BL + 11.8829×Age − 869.3609 | 0.72 | < 2.2e-16 | 13.34 % |
| Swedish Holstein (SH) | BW = 7.455×HG − 891.57 | 0.90 | < 2.2e-16 | 4.07 % |
| Swedish Holstein (SH) | BW = 6.7215×HG + 1.9233×BL − 1065.8118 | 0.91 | < 2.2e-16 | 3.59 % |
| Swedish Holstein (SH) | BW = 5.7649×HG + 2.1257×BL + 8.6201×Age − 928.2627 | 0.93 | < 2.2e-16 | 3.55 % |

Object detection/segmentation and Mask R-CNN performance

In head-to-head comparisons, U-Net underperformed on cluttered, partially occluded barn images (F1 ≈ 0.56), whereas Mask R-CNN with a ResNet-101 backbone yielded pixel-accurate cow silhouettes (final F1 ≈ 0.98) and near-perfect detection/segmentation accuracy on held-out and out-of-session imagery (≈ 99.7–99.9 %). This difference mattered in practice: small, systematic mask errors (e.g., inconsistent inclusion of the head, tail, or a distal limb) propagated to area- and contour-based features and degraded regression stability. This was mitigated with a capture-side burst mode and by augmenting the feature vector with a single, low-variance scalar prior (cohort-average heart girth) that stabilised predictions without re-introducing the brittleness of full tape-based formulas. However, transferring the approach to other breeds or age groups might require extensive additional rounds of data collection and model re-training/re-evaluation to account for unknown features affecting the final weight classification.

As can be seen in Figure 5, depending on the position of the cow when photographed, the model produced an SM with or without the head, and also added or removed other small body parts such as the tail, ears, or an obscured leg.
This proved to be a real-world challenge: it changed the size of the final SM and could therefore affect weight classification accuracy. One potential way to address this is a burst function, in which four or more images are taken in a single burst, an SM is produced for each image, and the values averaged across the burst are passed to the weight classification model. Calculating additional image-based features and adding an average chest circumference value to the weight classification model mitigated this issue.

Image-based weight classification model performance

Nine ordered weight categories were produced for the 1080 individual animal images (Batch 9), with a minimum category size of 25 animals. Ten-fold stratified cross-validation on the training split yielded stable results. The pipeline achieved a macro-F1 of 0.930 ± 0.020 (mean ± SD) and a weighted-F1 of 0.962 ± 0.011, indicating consistent performance across folds and classes when all preprocessing and SMOTE were confined within folds.

On the untouched holdout set (n = 216), the sklearn pipeline attained a macro-F1 of 0.912 with a 95% bootstrap confidence interval of 0.879–0.941. Weighted-F1 and accuracy were 0.952 and 0.954, respectively. Error structure reflected the ordinal nature of the task: the overall error rate was 9.7%, of which 7.4% were adjacent misclassifications and 2.3% were non-adjacent. The mean absolute class distance was 0.130, the median was 0, and the maximum was 3 categories. Probability calibration for this pipeline showed a multiclass Brier score of 0.2114 and an ECE of 0.2112, with the reliability curve suggesting some over-confidence at higher predicted probabilities.

The independently tuned PyCaret Extra Trees model improved holdout performance. A macro-F1 of 0.936 was obtained with a 95% bootstrap confidence interval of 0.913–0.956; weighted-F1 and accuracy were 0.967 and 0.969, respectively.
The overall error rate dropped to 4.2%, and all errors were adjacent to the true category. The mean absolute distance was 0.042, the median remained 0, and the maximum distance was 1 category. These error profiles show that residual mistakes occurred almost exclusively at bin boundaries and that large misclassifications were rare.

Strengths, limitations, implications, and future directions

A transparent, leak-safe pipeline was assembled to predict data-driven weight categories from image-derived features, and good performance was demonstrated on a holdout set. By deriving categories from the observed weight distribution using a GMM with BIC selection, bin definitions were grounded in the population rather than imposed a priori. Enforcing a minimum cluster size ensured that each class had enough animals for stable estimation. Most importantly, the source variable from which the label was created (Weight) was removed from the features. This removal prevented target leakage, where a model would otherwise learn a near-deterministic mapping from weight to its own discretized categories, resulting in deceptively high scores that would not generalize. In livestock terms, it is the difference between recognizing meaningful conformation or gait patterns that correlate with body mass, and simply being told the mass itself in disguise.

The use of cross-validation was central to reliable inference. By partitioning the training data into stratified folds, fitting all preprocessing and SMOTE only on each fold's internal training partition, and evaluating on its validation partition, an unbiased estimate of generalization was obtained while preserving the natural class balance in each fold. This is especially important when imbalance exists, because naive validation can overstate performance by over-representing majority classes or by letting information seep across folds.
The strong agreement between cross-validation estimates and holdout performance, together with narrow bootstrap confidence intervals, supports the stability of the model.

SMOTE was applied for a practical reason: weight bands are not uniformly populated. When minority classes are severely under-represented, tree ensembles can learn boundaries that favor the larger classes. SMOTE augments the local neighborhoods of minority classes by interpolating additional points between each minority instance and its nearest minority neighbors in feature space. When performed inside the cross-validation folds (as done here), SMOTE improves the classifier's view of the decision surface while preserving the integrity of validation. In contrast, performing SMOTE before cross-validation would leak information into validation partitions and inflate performance. The combination of fold-internal SMOTE and class-weighted Extra Trees therefore provided two complementary safeguards against imbalance.

The errors observed were biologically sensible. Nearly all misclassifications were adjacent to the true category, which is exactly where uncertainty is expected when thresholds are drawn across a continuous trait. Small fluctuations in pose, image capture, or true live weight can move an animal across a cut-point. The absence of distant errors in the tuned model indicates that the learned patterns align with real weight differences rather than spurious artifacts. Probability outputs from the sklearn pipeline showed moderate over-confidence; if calibrated probabilities are required for decision thresholds (e.g., routing animals to pens by risk), simple post-hoc calibration such as temperature scaling or isotonic regression on a validation split is recommended.

Strengths. Methodologically, three aspects stand out.
First, the image capture protocol was engineered for field realism (explicit stand-off guidance, tilt control, and operator ergonomics) rather than for a controlled environment; this lowered the barrier to practical translation. Second, the model choice (instance segmentation over boxes; ResNet-101 over mobile backbones) was empirically justified on failure modes that matter for downstream regression, not only on abstract detection metrics. Third, the data regime (multi-device, multi-session, two sites, and repeated animals over time) reduces the risk of identity memorisation, an often-overlooked confounder in computer-vision-for-PLF studies [26], and supports out-of-distribution robustness.

Limitations. The work is intentionally scoped as a proof-of-concept. The dataset, while heterogeneous, is geographically constrained to two Swedish research herds and dominated by two breeds; the observed breed asymmetries (e.g., weaker girth–mass coupling and higher MAPE in Swedish Reds, especially >800 kg) require explicit handling before broad deployment. External validity has not yet been established; shifts in season, breed body composition, farm environment, or sensor setup could alter the relationship between image features and weight. Although the task is ordinal, a standard multi-class loss was used; explicit ordinal objectives or cost-sensitive training that penalize distant mistakes more than adjacent ones could further reduce boundary errors. SMOTE assumes locally smooth class structure; if a minority class occupies a distinct, non-convex region of feature space, interpolation may be less appropriate, although confining SMOTE to folds and using class weights mitigates this risk. Although LiDAR stabilises scale, reflectivity edge cases can still degrade depth quality in principle; cross-device calibration (between iPhone generations and Android ToF sensors) and continuous self-calibration in the app will be essential.
Finally, Mask R-CNN with a deep backbone is not yet mobile-ready; on-device inference will require distillation, structured pruning, and INT8 quantisation, and such compression can introduce subtle, class-conditional biases that must be audited before release.

Implications for practice. Scale ambiguity is a principal reason that single-view image-based anthropometry has struggled to translate to farm practice. Here, LiDAR-derived stand-off enabled a per-image pixel-to-millimetre factor and rescaling to a common 2.00 m reference, effectively removing distance-induced variance while preserving shape. We stress-tested LiDAR in a worst-case reflectivity scenario (a high-contrast black-and-white toy cow) to probe the known susceptibility of infrared depth sensing to dark, absorptive patches; the resulting calibration adjustments, together with minor protocol refinements (perpendicular incidence, glare avoidance), eliminated distance failures in the final field round (>1,300 images at consistent stand-off without notable errors). This was an important development: it combines precision close to that of classical LDMs with a more user-oriented workflow, creating a middle ground between precision and ease of measurement.

Despite these caveats, the system already supports valuable use cases: rapid triage into weight bands for ration adjustment or drug dosing; longitudinal monitoring of weight trajectories with minimal animal stress; and creation of a shared, intermediate data layer for advisors and producers. Crucially, the method demands only a phone and an easy image capture protocol, avoiding the infrastructure cost and operational friction of multi-camera or top-down 3D systems, which is an adoption determinant for commercial farms.
As argued in the Introduction, normalising and scaling the routine capture of size and weight also opens a data channel for breeding and management decisions, including selection for more efficient, potentially smaller cows; while those sustainability claims remain prospective here, the enabling measurement substrate now exists in a practical form.

Future directions. Three lines of work would elevate this from a promising prototype to a deployable standard.
(1) External validation and fairness: prospective, preregistered trials across countries, housing types, floorings, and breeds (dairy and beef), with pre-specified non-inferiority margins versus calibrated scales, and subgroup reporting to surface any systematic under- or over-estimation.
(2) Model and capture co-design: mask consistency can be enforced with capture-time cues (automatic "full-silhouette" checks), and residual pose variance can be attenuated with short, guided bursts whose embeddings are fused via attention pooling; domain adaptation and self-supervised pretraining on unlabelled barn video should further harden features to lighting and coat variation.
(3) Edge deployment and privacy: end-to-end on-device inference (segmentation + estimation) with encrypted, opt-in telemetry for periodic recalibration would minimise connectivity dependence and address data-governance concerns from the outset.

Conclusions

The overall aim of this research was to investigate the possibilities for accurate estimation of body size and weight in dairy cattle using a smartphone camera together with computer-vision and machine-learning algorithms. The results demonstrate that accurate, non-contact estimation of dairy cattle body weight is achievable when two long-standing imaging bottlenecks are addressed jointly: (i) reliable scale normalisation at capture time, and (ii) segmentation quality sufficient to extract shape-aware features that correlate with mass.
By coupling iPhone-class LiDAR for stand-off control with a Mask R-CNN (ResNet-101) instance-segmentation pipeline and a lightweight, feature-based estimator, robust weight estimation in real-world conditions was possible.

Declarations

Author contributions

OG: Funding Acquisition, Project Administration, Conceptualization, Data Collection, Formal Analysis, Investigation, Methodology, Resources, Visualization, Writing - Original Draft, Writing - Review & Editing.
EmmaT: Funding Acquisition, Conceptualization, Data Collection, Methodology, Writing - Original Draft, Writing - Review & Editing.
ET: Funding Acquisition, Conceptualization, Writing - Review & Editing.
ML: Funding Acquisition, Conceptualization, Data Collection, Resources, Writing - Review & Editing.
CK: Funding Acquisition, Conceptualization, Data Collection, Resources, Writing - Review & Editing.

Data availability statement

The data and custom code that support this study are available from the corresponding author on reasonable request. Public deposition is temporarily restricted due to a pending patent investigation. We will release the data and code in a public repository once the patent review is complete. During peer review we will provide editors and reviewers with all necessary data and code on request.

Competing Interests Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI disclosure

The author(s) verify and take full responsibility for the use of generative AI in the preparation of this manuscript. The author(s) declare that generative AI (ChatGPT-5, OpenAI) was used to assist in improving the language and clarity of the manuscript, as well as reviewing analytical code.

Funding: This work was funded by the Swedish farmers' foundation for agricultural research (Grant O-20-20-448).

References

Baltsavias, E. P. (1999).
Airborne laser scanning: basic relations and formulas. ISPRS Journal of Photogrammetry and Remote Sensing, 54(2-3), 199-214.
Barwick, S.A., Henzell, A.L., Herd, R.M. et al. Methods and consequences of including reduction in greenhouse gas emission in beef cattle multiple-trait selection. Genet Sel Evol 51, 18 (2019). https://doi.org/10.1186/s12711-019-0459-5
Carrasco-Guzmán, M. E., Barrientos-Medina, R. C., Arcos-Álvarez, D. N., Casanova-Lugo, F., Pozo-Leyva, D., & Chay-Canul, A. J. (2025). Reliability and concordance of Schaeffer and Agarwal formulae for predicting crossbred dairy cattle weight. Ecosistemas y Recursos Agropecuarios, 12(1), e4245. https://doi.org/10.19136/era.a12n1.424
Cominotte, A., et al. (2020). Automated computer vision system to predict body weight and average daily gain in beef cattle during growing and finishing phases. Livest. Sci. 232, 103904.
Costigan, H., Delaby, L., Walsh, S., Lahart, B., & Kennedy, E. (2021). The development of equations to predict live-weight from linear body measurements of pasture-based Holstein-Friesian and Jersey dairy heifers. Livestock Science, 253, Article 104693. https://doi.org/10.1016/j.livsci.2021.104693
Forkuo GO and Borz SA (2023) Accuracy and inter-cloud precision of low-cost mobile LiDAR technology in estimating soil disturbance in forest operations. Front. For. Glob. Change 6:1224575. doi: 10.3389/ffgc.2023.1224575
Gomes, R. A., Monteiro, G. R., Assis, G. J. F., Busato, K. C., Ladeira, M. M., and Chizzotti, M. L. (2016). Estimating body weight and body composition of beef cattle trough digital image analysis. J. Anim. Sci. 94, 5414–5422.
Huang, X., Hu, Z., Wang, X., Yang, X., Zhang, J., and Shi, D. (2019). An improved single shot multibox detector method applied in body condition score for dairy cows. Animals 9, 470.
Jiang, M., Guo, G., and Mu, G. (2020). Visual BMI estimation from face images using a label distribution based method. Comput. Vis. Image Underst., 102985.
Kilkenny, C., Browne, W.
J., Cuthill, I. C., Emerson, M., and Altman, D. G. (2010). Improving Bioscience Research Reporting: The ARRIVE Guidelines for Reporting Animal Research. PLoS Biol. 8, e1000412. doi: 10.1371/journal.pbio.1000412
Lagrosas, N., Okubo, K., Irie, H., Matsumi, Y., Nakayama, T., Sugita, Y., Okada, T., and Shiina, T.: Continuous observations from horizontally pointing lidar, weather parameters and PM2.5: a pre-deployment assessment for monitoring radioactive dust in Fukushima, Japan, Atmos. Meas. Tech., 16, 5937–5951, https://doi.org/10.5194/amt-16-5937-2023, 2023.
Lee C-b, Lee H-s, Cho H-c. Cattle Weight Estimation Using Fully and Weakly Supervised Segmentation from 2D Images. Applied Sciences. 2023; 13(5):2896. https://doi.org/10.3390/app13052896
Lukuyu, M.N., Gibson, J.P., Savage, D.B. et al. Use of body linear measurements to estimate liveweight of crossbred dairy cattle in smallholder farms in Kenya. SpringerPlus 5, 63 (2016). https://doi.org/10.1186/s40064-016-1698-3
Ma W, Qi X, Sun Y, Gao R, Ding L, Wang R, Peng C, Zhang J, Wu J, Xu Z, et al. Computer Vision-Based Measurement Techniques for Livestock Body Dimension and Weight: A Review. Agriculture. 2024; 14(2):306. https://doi.org/10.3390/agriculture14020306
Nir, O., Parmet, Y., Werner, D., Adin, G., and Halachmi, I. (2018). 3D Computer-vision system for automatically estimating heifer height and body mass. Biosyst. Eng. 173, 4–10.
Ozkaya, S., Neja, W., Krezel-Czopek, S., and Oler, A. (2016). Estimation of bodyweight from body measurements and determination of body measurements on Limousin cattle using digital image analysis. Anim. Prod. Sci. 56, 2060–2063.
Qiao, Y., Kong, H., Clark, C., Lomax, S., Su, D., Eiffert, S., & Sukkarieh, S. (2021). Intelligent perception for cattle monitoring: A review for cattle identification, body condition score evaluation, and weight estimation. Computers and Electronics in Agriculture, 185, 106143.
Sherwin, C. M., Christiansen, S. B., Duncan, I. J., Erhard, H. W., Lay, D.
C., Mench, J. A., et al. (2003). Guidelines for the ethical use of animals in applied ethology studies. Appl. Anim. Behav. Sci. 81, 291–305. doi: 10.1016/S0168-1591(02)00288-5
Silwal, A., Parhar, T., Yandun, F., Baweja, H., & Kantor, G. (2021, September). A robust illumination-invariant camera system for agricultural applications. In 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 3292-3298). IEEE.
Song X, Bokkers EAM, van der Tol PPJ, Groot Koerkamp PWG, van Mourik S. Automated body weight prediction of dairy cows using 3-dimensional vision. J Dairy Sci. 2018 May;101(5):4448-4459. doi: 10.3168/jds.2017-13094. Epub 2018 Feb 22. PMID: 29477535.
Spoliansky, R., Edan, Y., Parmet, Y., and Halachmi, I. (2016). Development of automatic body condition scoring using a low-cost 3-dimensional Kinect camera. J. Dairy Sci. 99, 7714–7725.
Tong, A. K. W., Kennedy, B. W., and Moxley, J. E. 1976. A dairy records study of the effects of feeding levels on milk yield and composition. Canadian Journal of Animal Science. 56(3): 513-522. https://doi.org/10.4141/cjas76-063
Wagner, W., Ullrich, A., Ducic, V., Melzer, T., & Studnicka, N. (2006). Gaussian decomposition and calibration of a novel small-footprint full-waveform digitising airborne laser scanner. ISPRS Journal of Photogrammetry and Remote Sensing, 60(2), 100-112.
Zhang, X., Wang, Y., and Shi, W. (2018). pcamp: Performance comparison of machine learning packages on the edges. In USENIX Workshop on Hot Topics in Edge Computing (HotEdge 18).
Zin, T. T., Tin, P., Kobayashi, I., and Horii, Y. (2018). An automatic estimation of dairy cow body condition score using analytic geometric image features. In 2018 IEEE 7th Global Conference on Consumer Electronics (GCCE), 775–776.
Rumala, D. J. (2023, October). How you split matters: data leakage and subject characteristics studies in longitudinal brain MRI analysis. In Workshop on clinical image-based procedures (pp. 235-245).
Cham: Springer Nature Switzerland. https://doi.org/10.1007/978-3-031-45249-9_23

Additional Declarations

No competing interests reported.

Editorial decision: Revision requested 16 Apr, 2026
Reviews received at journal 12 Apr, 2026
Reviewers agreed at journal 07 Apr, 2026
Reviews received at journal 03 Mar, 2026
Reviewers agreed at journal 26 Feb, 2026
Reviewers invited by journal 31 Oct, 2025
Editor invited by journal 16 Oct, 2025
Editor assigned by journal 12 Oct, 2025
Submission checks completed at journal 12 Oct, 2025
First submitted to journal 10 Oct, 2025
08:47:07","extension":"html","order_by":14,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":118187,"visible":true,"origin":"","legend":"","description":"","filename":"earlyproof.html","url":"https://assets-eu.researchsquare.com/files/rs-7827424/v1/b346705e18589d2692d57df9.html"},{"id":95805927,"identity":"d0fe8b0f-2e84-427f-9976-812b4d6885b7","added_by":"auto","created_at":"2025-11-13 08:47:06","extension":"jpeg","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":442190,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cem\u003eExamples of user interface screenshots from the final version of the PickAMoo app (left to right): a) app selection on the main screen b) welcoming screen where user is requested to enter a cow ID to start the measurement c) LIDAR-based distance and perspective view, where optimal distance for image acquisition is highlighted with green, and changes to red when sub-optimal conditions are detected d) the confirmation of the image being saved, with additional metadata assigned to each photographed animal.\u003c/em\u003e\u003c/p\u003e","description":"","filename":"floatimage1.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-7827424/v1/34c88412cd583ccac575b886.jpeg"},{"id":95805894,"identity":"4d73e534-ec8d-4913-97d9-7fea13250408","added_by":"auto","created_at":"2025-11-13 08:47:03","extension":"jpeg","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":72989,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cem\u003eExperimental setup for worst-case reflectivity testing. A toy cow with contrasting black and white fur was used as a surrogate object to evaluate LiDAR performance under challenging surface conditions. 
Dark fur patches acted as low-reflectivity regions, allowing assessment of the system’s ability to maintain distance accuracy on non-uniform targets.\u003c/em\u003e\u003c/p\u003e","description":"","filename":"floatimage2.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-7827424/v1/cadfc8514cf26d23bcf0be28.jpeg"},{"id":95805918,"identity":"ea28fb6e-501d-446a-998d-37f54d453c17","added_by":"auto","created_at":"2025-11-13 08:47:06","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":789553,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cem\u003eExample image of a cow in one of the typical barn contexts. Yellow line represents a manually drawn polygon which served as the Ground Truth for training the segmentation model.\u003c/em\u003e\u003c/p\u003e","description":"","filename":"floatimage3.png","url":"https://assets-eu.researchsquare.com/files/rs-7827424/v1/4620178fcc9b76d774c83367.png"},{"id":95805843,"identity":"dd5f67e2-b407-48f5-b0fd-e0fdf3a29424","added_by":"auto","created_at":"2025-11-13 08:46:59","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":568520,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cem\u003eResults of an object detection by Mask R-CNN (rectangular bounding box around the detected object, here cow) and a segmentation mask of said object.\u003c/em\u003e\u003c/p\u003e","description":"","filename":"floatimage4.png","url":"https://assets-eu.researchsquare.com/files/rs-7827424/v1/150a418d1886039e626b0a6c.png"},{"id":95805866,"identity":"349d1a93-8cc3-4428-baf1-d9b2a9ee702f","added_by":"auto","created_at":"2025-11-13 08:47:01","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":1878495,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cem\u003eModel detection/segmentation real-world 
variability\u003c/em\u003e\u003c/p\u003e","description":"","filename":"floatimage5.png","url":"https://assets-eu.researchsquare.com/files/rs-7827424/v1/972b5fa56abe8e62b4034e33.png"},{"id":95819178,"identity":"c6f641e2-5410-484b-9a08-cb288dc75fa6","added_by":"auto","created_at":"2025-11-13 10:38:25","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":5063301,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7827424/v1/82feeec0-7b4a-4cbe-883a-f5a961ae6b8f.pdf"}],"financialInterests":"No competing interests reported.","formattedTitle":"PickAMoo: LIDAR-Enhanced Mask R-CNN segmentation for Precision Weight Estimation in Dairy Cattle Using Smartphone Imaging.","fulltext":[{"header":"Introduction","content":"\u003cp\u003eThe competitiveness of dairy production, along with the complexity of management practices aimed at higher profitability and sustainability while maintaining animal health and welfare, creates a demand for automation of standard farming procedures. Currently, methods for weighing dairy cattle range from manual measuring tapes to platform scales or crates equipped with technology that records animal identification and weight data automatically. Traditional weight estimation methods, such as visual assessment and manual measurements using heart girth tapes, are labour-intensive, time-consuming, and often inaccurate [\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e]. These methods also pose injury risks to animals and farm personnel due to the necessity of close physical contact. 
Consequently, there is a growing need for automated solutions that can provide quick and accurate weight estimates without stressing animals or endangering workers.\u003c/p\u003e\u003cp\u003eDeveloping camera-based monitoring devices for weight and size assessment is therefore considered vital in the Precision Livestock Farming (PLF) research field. Recent scientific initiatives have explored the potential of automated methods for estimating body weight, size, and body condition score (BCS) in cattle through computer vision, image analysis, and machine learning techniques [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e, \u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e, \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e, \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e, \u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e, \u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e, \u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e]. These systems claim to offer non-invasive and efficient alternatives to traditional weighing methods. For instance, [\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e] developed an image-based method using 2D images and deep neural networks to estimate cattle weight, achieving a mean absolute percentage error of 5.5% with fully supervised segmentation. 
Similarly, [\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e] utilized a 3D vision system to automatically measure morphological traits and predict body weight in dairy cattle with a root mean square error of 41.2 kg and a mean absolute percentage error of 5.2%.\u003c/p\u003e\u003cp\u003eHowever, many of these studies rely heavily on 3D cameras or 2D multi-camera setups to perform full-body scans or capture top-down images as cows pass beneath one or more stationary cameras. While this approach can provide valuable information, from body shape and composition to specific measurements like length, width, and angles between relevant anatomical points, it requires controlled conditions for reliable imaging from stationary equipment and involves complex modelling for analysis [\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e]. Furthermore, the quality of images obtained with 3D cameras is highly dependent on lighting, camera angle, dust particles, lens cleanliness, and the cows' movement or inability to stand still during the data acquisition process. Additionally, analysing images from 3D or multiple cameras simultaneously often demands substantial time and incurs extra costs for hardware setup, data storage, and computational processing [\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e], which must be considered when evaluating the added value of automated body measurements.\u003c/p\u003e\u003cp\u003eAdvancements in artificial intelligence (AI) and computer vision (CV), along with algorithm miniaturization enabling deployment on mobile platforms (as investigated by Zhang et al., [\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e]), now allow for visual estimation of body mass index (BMI) from facial images in humans. 
This technology has primarily been applied in the public health domain [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e], resulting in smartphone apps that estimate weight gain potential and recommend clothing sizes. Despite research highlighting the strong potential of CV and image analysis for automated estimation of dairy cows\u0026rsquo; weight, BCS, and conformation, these technologies are not yet widely used in everyday farming practice. This reluctance is mainly due to the complexity of image recording procedures, equipment and modelling costs, and the steep learning curve for interpreting results to support farm operations.\u003c/p\u003e\u003cp\u003eWe envision that by combining state-of-the-art, scalable CV algorithms for mobile platforms (e.g., smartphones) with expertise in dairy production, and by utilizing external assessments and on-farm individual animal data, we can develop an all-in-one solution for daily use. Such a tool would enable farmers to record their cows' weight status using a smartphone, increasing production efficiency, reducing costs, and potentially supporting new management routines in dairy farming. This approach would facilitate the creation of an intermediate data platform where farmers, researchers, and dairy advisors can collaboratively gather and access body weight and size data, developing new routines and guidelines for a sustainable and competitive dairy value chain. Highly relevant individual animal data could then be collected with a simple \"one-click\" procedure. In the long term, precise measurements of size and weight in cattle would enhance opportunities for assistance in breeding decisions and continuous health assessment. 
This approach could improve nutrient utilization, farm economics and reduce greenhouse gas emissions [\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e].\u003c/p\u003e\u003cp\u003eThe overall aim of this research project was to investigate the possibilities for accurate weight estimation of dairy cattle with the help of a smartphone camera and computer vision algorithms.\u003c/p\u003e"},{"header":"Methods","content":"\u003cp\u003eEthical Consideration\u003c/p\u003e\n\u003cp\u003ePrior to the start of the experiment, the procedures and details of the experiment were evaluated by the Board of Ethical use of Animals in Teaching and Research, Swedish University of Agricultural Sciences, Uppsala, Sweden, and an ethical permit was obtained from the Swedish Board of Agriculture, Uppsala, Sweden ID number: 5.8.18-05598/2021. All procedures were conducted in accordance with the ethical guidelines proposed by the Ethical Committee of the ISAE (International Society of Applied Ethology [18]) and met the ARRIVE guidelines [10].\u003c/p\u003e\n\u003cp\u003eStudy location and animals\u003c/p\u003e\n\u003cp\u003eThe data was collected at two locations. At the first location, 248 lactating dairy cows of the breeds Swedish Red (n=148) and Swedish Holstein (n=100) were included for observations over twelve distinct days on nine separate occasions from October 2021 to February 2024. The cows were housed in a free-stall system at the Swedish Livestock Research Centre (Swedish University of Agricultural Sciences (SLU), Uppsala, Sweden). At the second location, 22 cows of the Swedish Red breed were included for observations for one day, to extend the pool of available animals and capture as much individual variability as possible. 
The cows were housed in a free-stall system at R\u0026ouml;b\u0026auml;cksdalen\u0026rsquo;s Research Facility, SLU, Ume\u0026aring;, Sweden.\u003c/p\u003e\n\u003cp\u003eEvaluating Manual Weight‑Estimation Equations as a Ground Truth for Machine‑Learning Models\u003c/p\u003e\n\u003cp\u003eAccurate baseline data are essential for training and validating any automated weight estimation algorithms. Traditional weight estimation equations for cattle rely on manual measurements, such as heart girth and body length taken with a measuring tape, but their precision varies with operator skill, animal posture and breed.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eOver the years researchers have proposed a spectrum of equations to estimate body weight in cattle, reflecting differences in measurement techniques, breeds and management conditions. At the simplest end are linear ratios such as Agarwal\u0026rsquo;s formula, which multiplies chest girth by body length and divides by a girth dependent constant [3]; and the widely used Schaeffer equation, which scales the product of body length and the square of chest girth [22]. More elaborate polynomial models, like the Holstein heifer equation from Heinrichs and colleagues, fit second and third order terms of heart girth to capture curvature in the weight\u0026ndash;girth relationship [5]. Breed-specific regressions for crossbred heifers and Senegalese dairy cattle substitute alternative landmarks such as hip width or the distance between coxal tuberosities when girth proves unreliable. 
Studies in low-input systems often favour straightforward heart girth or girth-plus-length equations because they require minimal equipment, whereas intensive dairy operations with a higher degree of automation have adopted multiple regression models incorporating age, withers height or body condition to improve precision [13].\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eTo understand the reliability of these manual weight estimation methods, we evaluated several commonly used equations and compared their outputs against a gold standard of automated weighing obtained from calibrated platform scales. To establish a solid foundation for automated weight classification, we also took manual measurements with a cattle measuring tape (ANIMETER measuring tape, V\u0026auml;xa, J\u0026ouml;nk\u0026ouml;ping, Sweden) to serve as a reference for the image analysis part of the work (Batches 1 and 2). These measurements included chest girth, stomach circumference, back length (diagonal and direct), height (back and front), and rump width.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eThis comparison allowed us to quantify the bias and variance inherent in manual measurements and to identify which body dimensions correlate best with true body weight and which similar features could be extracted from images. Because both manual tapes and automated scales can be affected by calibration drift, animal movement and human error, there remains an unavoidable margin of uncertainty; nonetheless, establishing the degree of agreement between the two methods provides a solid starting point for training machine learning models. 
By anchoring the models to weights from the automated scales and understanding the limitations of manual tape measurements, analogous features from images could be extracted and used for building image-based weight prediction models that emulate, and eventually replace, manual and mechanical measurements.\u003c/p\u003e\n\u003cp\u003eImage acquisition protocol\u003c/p\u003e\n\u003cp\u003eHigh-quality images for a well-performing CV model are dependent on a standardized image acquisition protocol. In our study, the protocol specified (1) a stand-off distance that ensured full-body coverage without occlusion by other animals or obstacles, (2) a near-perpendicular camera angle to reduce perspective distortion, and (3) operator positioning that maintained safety and avoided disturbing normal animal behaviour. A commercial Bosch laser distance meter (Bosch GLM 40 Professional, Robert Bosch Power Tools GmbH, Leinfelden-Echterdingen, Germany; measurement accuracy \u0026plusmn;0.15 mm) was used to control stand-off distance. We evaluated three configurations to quantify practical trade-offs and to guarantee seamless transition to a smartphone-only distance estimation: (a) handheld LDM with a handheld camera/smartphone; (b) tripod-mounted LDM and camera aligned to the cow\u0026rsquo;s midline and targeting the thoracolumbar region; and (c) a rigid bar (camera plus two LDMs) on a tripod to simultaneously sample distances to the cranial and caudal body, providing a simple approximation of animal depth. Across trials, stand-off distances ranged from 1.44 to 2.72 meters; empirically, ~2.00\u0026ndash;2.05 meters yielded the most consistent full-body coverage with minimal occlusion and stable focus and was therefore adopted when barn layout and animal flow permitted. 
To further stabilise image quality, operators were instructed to avoid strong backlighting, minimise motion during exposure, keep lenses clean, and favour perpendicular incidence to mitigate measurement and future segmentation errors associated with dark, glossy or wet coat patches.\u003c/p\u003e\n\u003cp\u003eMobile application development \u0026nbsp;\u003c/p\u003e\n\u003cp\u003eAfter standardising the image-acquisition protocol, we initiated development of a dedicated mobile application to provide a full-package solution: robust distance estimation, image acquisition and, potentially, weight estimation. Design and implementation were contracted to a digital interaction agency with expertise in app prototyping and user-interface engineering. Given smartphone platform constraints, Apple\u0026rsquo;s iPhone Pro/Pro Max devices (iPhone 12 Pro and newer; Apple Inc., Cupertino, CA, USA) were selected because they provided application-level access (Software Development Kit or SDK functionality) to the LIDAR sensor, enabling precise stand-off estimation, which is critical for image quality control and scale normalisation.\u003c/p\u003e\n\u003cp\u003eThe app development process was built around four coordinated workstreams: (1) Camera module: a configurable capture pipeline supporting controlled exposure, focus and resolution to obtain full-body frames with minimal motion blur; (2) Distance module: real-time LIDAR-based stand-off estimation with on-screen guidance to keep the operator within a target distance window suitable for analysis; (3) Classifier integration: ingestion of CV outputs (e.g., instance masks) and computation of image-derived features for on-device weight classification or deferred processing; (4) User interface and data layer: workflows for animal selection, capture confirmation, and longitudinal review of predicted weight to support management decisions.\u003c/p\u003e\n\u003cp\u003eDevelopment followed an iterative cycle with weekly internal builds 
delivered to the project Principal Investigator for field testing and feedback. The development cycle spanned seven app versions, each aimed at refining capture guidance, robustness to barn lighting and operator ergonomics. The final data-acquisition app, PickAMoo (Figure 1), contained: (i) a burst mode that captured four consecutive images of the same animal to reduce pose-related variance; (ii) LIDAR-based distance estimation and a depth-map preview to verify framing and scale; and (iii) a gyroscope-driven tilt indicator to help the operator maintain a near-perpendicular view and minimise perspective distortion.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eLIDAR/LDM distance measurement consistency\u003c/p\u003e\n\u003cp\u003eA central methodological challenge in developing a reliable image acquisition protocol was to guarantee that images were consistently obtained at the correct distance and angle. Such control was crucial for ensuring both user-friendliness of the procedure and reproducibility of the approach. The transition from manual LDMs to smartphone-integrated LiDAR was therefore a decisive step, as it allowed for direct, automated, and accurate distance estimation.\u003c/p\u003e\n\u003cp\u003eDuring the preliminary phase, several factors known to affect the performance of LDMs were considered in detail. Previous studies have shown that reflectivity of the target surface strongly influences measurement precision: highly reflective materials typically return stable signals, whereas darker or absorptive surfaces decrease reliability [23]. Ambient illumination is another factor; strong direct sunlight can reduce laser spot contrast and hinder detection, while diffuse or low-light conditions are generally more favourable [1]. Furthermore, atmospheric factors such as dust and humidity, both common in livestock housing, may scatter the beam and degrade measurement precision [11]. 
Finally, geometric factors, particularly the angle of incidence, were also recognized as a limitation [1]. Taken together, these considerations demonstrated that while LDMs are suitable for controlled settings, their susceptibility to environmental and geometric variability renders them less reliable in barns, where surfaces, lighting, and air quality cannot be standardized. For this reason, smartphone LiDAR was adopted as an alternative solution. By integrating distance estimation directly into the image acquisition device, LiDAR minimizes operator error and maintains robust measurement accuracy under suboptimal field conditions [6]. The importance of distance accuracy became even more evident when the first version of the mobile application was tested. While the LIDAR system demonstrated an overall measurement accuracy of \u0026plusmn;0.2 mm, occasional deviations were observed, where the app either underestimated or overestimated the distance substantially. Given the known sensitivity of smartphone LIDAR-based optical systems to surface reflectivity, a worst-case scenario test was performed (Figure 2). For this purpose, a toy cow with highly contrasting white and black fur was selected as the test object. The deep black fur posed a challenge for the LiDAR-based distance estimation system, as it absorbed rather than reflected the incident infrared light. The results confirmed that measurement errors were more pronounced on the darkest patches of fur, where accurate image capture with the desired parameters became problematic.\u003c/p\u003e\n\u003cp\u003eThis information was subsequently used to recalibrate the LiDAR sensor of the test device (iPhone 14 Pro Max), and the image acquisition protocol was adjusted accordingly. 
During the final round of data collection, a total of 1300 images were captured at consistent distances without any notable measurement errors.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eData collection and image annotations\u003c/p\u003e\n\u003cp\u003eData (images, BW and manual measures) were gathered on 12 separate days spread over three and a half years, with different animals included at each session to capture a representative cross-section of the herd (Table 1). To avoid confounding our analyses with duplicated observations, we ensured that no individual cow was weighed or photographed more than once on the same day. Thus, while some cows were sampled multiple times across the study period, each imaging session provided a unique combination of animals, contributing to a diverse dataset spanning a wide range of body weights and sizes.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eTable 1. Overview of the data collection occasions (Batches) with number of images, animals for each occasion and whether manual measurements complementary to automatic weighing were taken (\u003c/em\u003e\u003cem\u003e*Updated protocol for distance measurement (two reference points instead of one)\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003e\u003cem\u003e**Data collected using a PickAMoo app with LIDAR-functionality for distance estimation\u003c/em\u003e\u003cem\u003e)\u003c/em\u003e\u003cem\u003e.\u0026nbsp;\u003c/em\u003e\u003c/p\u003e\n\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\u003cstrong\u003eBatch #\u003c/strong\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\u003cstrong\u003eDate\u003c/strong\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\u003cstrong\u003e# of Images\u003c/strong\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\u003cstrong\u003e# of Animals Photographed\u003c/strong\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd 
valign=\"top\"\u003e\u003cstrong\u003eManual measurements\u003c/strong\u003e\u003cbr\u003e\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e1a\u003cbr\u003e\u003c/td\u003e\n \u003ctd rowspan=\"2\" valign=\"top\"\u003eMay 2021\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003e256\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003e34\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003eYes\u003cbr\u003e\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e1b\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003e258\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003e28\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003eYes\u003cbr\u003e\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e2\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003eJuly 2021\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003e55\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003e22\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003eYes\u003cbr\u003e\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e3a\u003cbr\u003e\u003c/td\u003e\n \u003ctd rowspan=\"3\" valign=\"top\"\u003eOctober and November 2021\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003e232\u0026nbsp;\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003e92\u0026nbsp;\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003eNo\u003cbr\u003e\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e3b\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003e166\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003e54\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003eNo\u003cbr\u003e\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e3c\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003e100\u003cbr\u003e\u003c/td\u003e\n \u003ctd 
valign=\"top\"\u003e52\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003eNo\u003cbr\u003e\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e4*\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003eOctober 2022\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003e14\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003e5\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003eNo\u003cbr\u003e\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e5*\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003eNovember 2022\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003e305\u0026nbsp;\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003e37\u0026nbsp;\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003eNo\u003cbr\u003e\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e6*\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003eNovember 2022\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003e125\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003e25\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003eNo\u003cbr\u003e\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e7*\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003eFebruary 2023\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003e110\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003e17\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003eNo\u003cbr\u003e\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e8**\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003eMay 2023\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003e104\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003e26\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003eNo\u003cbr\u003e\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd 
valign=\"top\"\u003e9**\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003eDecember 2023\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003e1122\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003e145\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003eNo\u003cbr\u003e\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"2\" valign=\"top\"\u003e\u003cstrong\u003eTotal\u003c/strong\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\u003cstrong\u003e2847\u003c/strong\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\u003cstrong\u003e537\u003c/strong\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\u003cstrong\u003e\u0026nbsp;\u003c/strong\u003e\u003cbr\u003e\u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\n\u003cp\u003eAcross nine data collection rounds (Batches), we assembled a dataset of 2,847 images suitable for analysis. Imaging was performed with an Olympus \u0026mu;-700 digital camera (Olympus Corporation; Hachioji, Tokyo) and a Motorola smartphone (Motorola Mobility LLC, Chicago, IL, USA; standard camera app) for Batches 1\u0026ndash;6, and with an iPhone 14 Pro Max running our experimental application for Batches 7\u0026ndash;9. The dataset comprised 248 unique cows photographed on multiple days and under varying conditions, yielding repeated observations per animal. Although more than 200 lactating dairy cows were weighed and photographed repeatedly throughout the study, each animal\u0026rsquo;s body weight and conformation can change appreciably over time, and the imaging conditions (distance, angle and lighting) varied between data collection rounds. Consequently, images of the same cow taken days, weeks or months apart often differed in appearance and weight to such an extent that they effectively represented different individuals from a modelling perspective. 
This temporal variation, combined with subtle posture- and camera-related differences, created a large pool of individual variability within the dataset and reduced the risk that the models simply memorised specific animals.\u003c/p\u003e\n\u003cp\u003eManual body measurements (e.g., heart girth, body length, height, rump width) were collected for Batches 1\u0026ndash;2 and paired with ground-truth body weights from the barn\u0026rsquo;s automated scales (available for Batches 1\u0026ndash;9), providing a reference set to benchmark manual equations and to anchor subsequent image-based modelling. Together, these multi-device, multi-session data constituted a heterogeneous basis for developing and stress-testing our CV and machine learning pipelines.\u003c/p\u003e\n\u003cp\u003eTo create a high-quality training basis for the image segmentation model, 567 images were manually annotated using the VGG Image Annotator (VIA; Visual Geometry Group, University of Oxford, Oxford, UK). Annotators drew precise polygons around each focal cow to produce instance masks suitable for training a Mask R-CNN model (Figure 3). Annotation guidelines prioritised full-body contours when visible and resolved partial occlusions by following the visible outline only; images with severe occlusion or motion blur were excluded from the gold-standard set. These masks formed the basis for training and validating our instance-segmentation models and for deriving silhouette-based features (e.g., projected area, aspect ratios and contour descriptors) used in weight classification.\u003c/p\u003e\n\u003cp\u003eDevelopment of object detection/segmentation model\u003c/p\u003e\n\u003cp\u003eObject localisation for animal studies typically follows two paradigms: bounding-box (BB) detectors (e.g., Faster R-CNN) return rectangular boxes that roughly circumscribe each object and are well-suited to counting and coarse localisation tasks, whereas segmentation models assign labels at the pixel level. 
Within the segmentation paradigm, semantic segmentation labels all pixels of a class without separating instances, whereas instance segmentation (e.g., Mask R-CNN) both detects objects and predicts a per-object mask. For weight estimation from single images, pixel-accurate silhouettes provide richer morphometric information than boxes: projected mask area, aspect ratio, hull- and contour-based measures, as well as pose-dependent ones. The reference dataset of manually annotated images (n = 567) was split as follows for model training and evaluation: 75% training (n = 425), 10% testing (n = 57) and 15% validation (n = 85).\u003c/p\u003e\n\u003cp\u003eWe compared two widely used segmentation architectures to evaluate the potential for transferring models into a smartphone application: U-Net (semantic segmentation; efficient and often favoured for edge deployment) and Mask R-CNN (instance segmentation extending Faster R-CNN with a mask head). In preliminary tests, U-Net produced noticeably lower segmentation quality on cow images with clutter and partial occlusions, leading to leakage into background regions and unstable silhouettes (F1 score of 0.56). By contrast, Mask R-CNN was consistently able to delineate the cow from nearby animals and fixtures, providing masks that were robust enough to derive reliable image-based features for subsequent weight classification. Given these results and prior evidence that instance masks improve morphology-derived predictions in related human BMI/anthropometry work, we focused model development on Mask R-CNN (F1 score of 0.98 for the final model).\u003c/p\u003e\n\u003cp\u003eMask R-CNN comprises a CNN backbone for feature extraction, a region proposal network for candidate detections, and classification/box-regression heads augmented with a parallel mask head for pixel-wise instance segmentation. 
We evaluated four pretrained backbones (EfficientNet-B7, MobileNetV3, ResNet-101 and DenseNet-201) to probe the accuracy\u0026ndash;latency\u0026ndash;capacity trade-off. MobileNetV3 offers a compact, mobile-oriented option; EfficientNet-B7 and DenseNet-201 provide high representational capacity at greater compute cost; ResNet-101 is a strong, well-balanced baseline with stable training behaviour.\u003c/p\u003e\n\u003cp\u003eFor each backbone we trained a Mask R-CNN model with the following common settings: two images per GPU, 50 epochs with 1,000 steps per epoch, and 50 validation steps per epoch; two classes (cow, background); initial learning rate 0.001; momentum 0.9; weight decay 0.0001. Early stopping was applied (PyTorch callback) to prevent model overfitting. Models were trained and tested on an AI workstation with an AMD Ryzen 5950X CPU, 128 GB DDR4-3600 RAM, and an NVIDIA RTX 4080 (16 GB VRAM); the best performing model converged in 7 hours 36 minutes, with an average power draw of 240 watts. Performance was monitored using standard detection and segmentation metrics (mean average precision, mAP, and mean intersection over union, IoU) alongside qualitative inspection of failure modes (occlusions, dark coat patches, specular highlights). Inference was also run with each model on an additional 150 images randomly sampled from the complete dataset to assess real-world performance and to uncover potential segmentation issues in varying scenarios.\u003c/p\u003e\n\u003cp\u003eThe ResNet-101 backbone yielded the most reliable instance masks on held-out images and on images unseen during training, while maintaining acceptable inference speed for near-real-time processing on workstation-class hardware. Its stable optimisation and superior delineation of extremities (head, tail, distal limbs) translated into more consistent mask-derived features for weight modelling. 
Although MobileNetV3 was attractive for eventual on-device deployment, its segmentation accuracy on our data lagged behind the deeper backbones (F1 score 0.87). Future mobile deployment can potentially recover latency/size via backbone distillation, quantisation (INT8), and structured pruning once accuracy targets are locked.\u003c/p\u003e\n\u003cp\u003eChoosing instance segmentation over bounding-box detection ensured that downstream features reflected object shape rather than just extent. Bounding boxes can inflate with pose and background clutter, whereas masks permit computation of projected area, contour length, convexity, and other silhouette descriptors that are more directly related to body volume proxies. This alignment between the image analysis output and the biological quantity of interest (weight) reduces information loss and helps the subsequent machine-learning estimator generalise across breeds, poses, and acquisition conditions.\u003c/p\u003e\n\u003cp\u003eWeight classification model\u003c/p\u003e\n\u003cp\u003eAt inference time, Mask R-CNN returns, for each detected cow, both a bounding box (BB) and a per-pixel segmentation mask (SM) (Figure 4). From these outputs we derived a compact set of silhouette features designed to correlate with body volume and weight proxies: BB width/height and area; SM (mask) area; extent (mask area \u0026divide; BB area); convex-hull area; elongation (major/minor axis ratio); contour length; and simple moment-based descriptors. Binary masks were cleaned with light morphological operations (hole filling, small-component removal) to stabilise measurements across poses and backgrounds.\u003c/p\u003e\n\u003cp\u003eBecause all pixel-based features scale with stand-off distance, we applied a per-image scale normalisation. Using the recorded LDM distance, we computed a pixels-to-millimetres factor and re-expressed all length and area features to a common reference of 2.00 m stand-off. 
Practically, this rescales BB and SM measurements so that an animal photographed at 1.6 m or 2.4 m is made comparable to one photographed at 2.0 m, mitigating distance-induced variance without altering shape information. To improve generalisability, we also included a single external scalar covariate reflecting body girth. When manual measurements were unavailable for a given image, we used a constant prior equal to the cohort\u0026rsquo;s average heart girth (\u0026asymp; 200 cm). This anchors the model when segmentation masks differ slightly in the inclusion of extremities (e.g., head/tail), while leaving the image-derived features to carry most of the predictive signal.\u003c/p\u003e\n\u003cp\u003eThe dataset used for developing and testing the final weight estimation model contained 1080 entries, based on features calculated from 1080 different cow images from Batch 9, each matched with the exact body weight recorded by the automatic scale system. Because downstream decision-making on farms often relies on weight bands rather than exact kilograms, the continuous weight was transformed into a categorical outcome in a data-driven manner. A one-dimensional Gaussian Mixture Model (GMM) was fitted to the weight distribution, the number of components K was selected by minimizing the Bayesian Information Criterion over K = 3\u0026hellip;10, and clusters with fewer than 25 animals were merged into the cluster with the nearest mean weight. The resulting clusters were ordered by mean weight and relabeled 1\u0026hellip;K. This derived target is referred to as AutoWeightCategory and was fixed for all subsequent analyses.\u003c/p\u003e\n\u003cp\u003eSupervised learning for weight-category classification was conducted in Python. The primary workflow was implemented with scikit-learn and imbalanced-learn, while PyCaret was used as an independent cross-check on the same training split. 
All experiments ran on a high-performance workstation (Ryzen 7950X, 64 GB DDR5-6000, NVIDIA RTX 4090 24 GB).\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eA train/holdout split was created once using stratification on the derived weight categories (80% train, 20% holdout; fixed random seed). The holdout set was kept untouched until the final evaluation. All model development was performed on the training portion within a single, end-to-end pipeline so that every transformation was estimated inside cross-validation folds. The pipeline comprised median imputation for missing values, robust scaling to stabilize feature ranges, removal of zero-variance features, class-imbalance handling with SMOTE applied within each training fold, and an Extra Trees classifier (500 trees, class_weight=\u0026quot;balanced\u0026quot;, parallel execution, fixed seed). SMOTE (Synthetic Minority Oversampling Technique) synthesizes additional minority-class examples by interpolating between each minority sample and its nearest minority neighbors in feature space. By enriching sparse regions locally, SMOTE exposes the classifier to a more balanced and informative decision surface without altering the holdout data; applying it only within folds prevents information from leaking into validation partitions.\u003c/p\u003e\n\u003cp\u003eModel selection used 10-fold stratified cross-validation on the training split. Cross-validation was adopted to obtain a reliable estimate of out-of-sample performance while preserving the class distribution in each fold and keeping preprocessing, SMOTE, and model fitting strictly confined to the training portion of each fold. The macro-averaged F1 score (macro-F1) was specified a priori as the primary metric because it weighs each class equally and thus reflects performance on minority categories; weighted-F1 and accuracy were recorded as complementary summaries. 
After cross-validation, the pipeline was refitted on the entire training split and evaluated once on the untouched holdout set.\u003c/p\u003e\n\u003cp\u003eBecause weight bands are intrinsically ordinal, errors were characterized not only as correct/incorrect but also by distance across categories (absolute difference between true and predicted class). Adjacent errors (distance = 1) and non-adjacent errors (distance \u0026gt; 1) were quantified, along with the mean, median, and maximum distance. Uncertainty on holdout macro-F1 was quantified with bootstrap resampling (2000 replicates). Finally, the calibration of predicted probabilities from the sklearn pipeline was assessed using the multiclass Brier score, expected calibration error (ECE; top-1), and a reliability diagram.\u003c/p\u003e\n\u003cp\u003eFor the independent confirmation step, PyCaret was run on the same training data with internal resampling disabled to avoid fold misalignment. A class-weighted Extra Trees model was created, tuned using PyCaret\u0026rsquo;s built-in procedures, finalized on the full training split, and evaluated on the same holdout set. 
Agreement between this PyCaret model and the primary sklearn/imbalanced-learn pipeline was taken as evidence that the findings did not hinge on a single software implementation.\u003c/p\u003e\n\u003cp\u003eThis pipeline ensured that (i) the vision output aligned with the biological quantity of interest (weight) via shape-aware features, (ii) scale effects from variable stand-off distance were neutralised, and (iii) the final classifier was selected on the basis of systematic, reproducible comparisons rather than ad-hoc choice.\u003c/p\u003e"},{"header":"Results and discussion","content":"\u003ch2\u003eGeneral comparative accuracy of weight estimation equations\u0026nbsp;\u003c/h2\u003e\n\u003cp\u003eThe comparison of linear weight estimation equations derived from combinations of heart girth (HG), body length (BL) and age measurements, both across the two breeds used in the experiment and within each breed (Swedish Red and Swedish Holstein), showed large variation between equations (Table 2). Each row shows the equation, its coefficient of determination (R\u0026sup2;), the associated p-value from the regression, and the mean absolute percentage error (MAPE) observed when the formula was applied to the dataset (manual measurement data from Batches 1 and 2). 
For the mixed-breed sample, the equation that combines HG, BL and age (BW = 5.41\u0026times;HG + 2.41\u0026times;BL + 11.24\u0026times;Age \u0026ndash; 911.44) produced the highest R\u0026sup2; (0.89) and lowest MAPE (~ 4.5 %), indicating that incorporating multiple measurements yields more accurate predictions than using HG alone.\u003c/p\u003e\n\u003cp\u003eBreed-specific models revealed that the body weight of the Swedish\u0026nbsp;Red cows in our study was more difficult to predict: formulas using only HG or HG\u0026nbsp;+\u0026nbsp;BL achieved R\u0026sup2; values around 0.69\u0026ndash;0.71 and MAPE of 13\u0026ndash;14\u0026nbsp;%, suggesting that heart girth correlates less strongly with body weight in this breed. Adding age marginally improved performance but still left a large error. In contrast, the Swedish\u0026nbsp;Holstein models performed much better in our study; the three-parameter equation (BW\u0026nbsp;=\u0026nbsp;5.76\u0026times;HG\u0026nbsp;+\u0026nbsp;2.13\u0026times;BL\u0026nbsp;+\u0026nbsp;8.62\u0026times;Age\u0026nbsp;\u0026ndash;\u0026nbsp;928.26) attained R\u0026sup2;\u0026nbsp;=\u0026nbsp;0.93 and MAPE ~3.6\u0026nbsp;%. All regressions were highly significant (p\u0026nbsp;\u0026lt;\u0026nbsp;2.2\u0026nbsp;\u0026times;\u0026nbsp;10⁻\u0026sup1;⁶), implying that the predictors reliably explain variation in weight. These results highlight the value of multi-measure formulas and the importance of accounting for breed differences when selecting manual weight estimation equations.\u003c/p\u003e\n\u003cp\u003eIn addition, cows weighing over 800 kg had a higher mean absolute percentage error (MAPE) than cows weighing less than 800 kg. In our dataset, these heavy cows were predominantly Swedish Red, which skewed our data on heavy cows and would explain the lack of fit of the equation. Breed characteristics in body conformation could be another explanation for the difficulties in finding a good equation. 
The BW/HG correlation is lower (0.79) for Swedish Red than Swedish Holstein (0.90), which also is seen in the BL:BW ratio.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eTable 2. Comparison of equations for body weight estimation in cattle and their estimated\u0026nbsp;\u003c/em\u003e\u003cem\u003eaccuracy when applied to Swedish Red and Swedish Holstein breeds (\u003c/em\u003eBW \u0026ndash; Body Weight, HG \u0026ndash; Heart Girth, BL \u0026ndash; Body Length)\u003cem\u003e.\u0026nbsp;\u003c/em\u003e\u003c/p\u003e\n\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd\u003e\n \u003cp\u003e\u003cstrong\u003eBreed\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e\u003cstrong\u003eFormula\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e\u003cstrong\u003eR\u0026sup2;\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e\u003cstrong\u003ep-Value\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e\u003cstrong\u003eMean Absolute Percentage Error\u003c/strong\u003e\u003c/p\u003e\n \u003cp\u003e\u003cstrong\u003e(MAPE)\u003c/strong\u003e\u003c/p\u003e\n \u003cp\u003e\u003cstrong\u003e\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd rowspan=\"3\" valign=\"top\"\u003e\n \u003cp\u003eMixed Breeds (SR and SH)\u003c/p\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003eHG*7.3827-878.3134\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e0.85\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e\u0026lt; 2.2e-16\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e5.42 %\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd\u003e\n \u003cp\u003eHG*6.2570+BL*2.3311-1035.94\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n 
\u003cp\u003e0.87\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e\u0026lt; 2.2e-16\u003c/p\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e4.89 %\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd\u003e\n \u003cp\u003eHG*5.4143+BL*2.4066+AGE*11.2416-911.4412\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e0.89\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e\u0026lt; 2.2e-16\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e4.47 %\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd rowspan=\"3\" valign=\"top\"\u003e\n \u003cp\u003eSwedish Red (SR)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003eBW = HG*7.1984-851.8974\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e0.69\u003c/p\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e\u0026lt; 2.2e-16\u003c/p\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e14.12 %\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd\u003e\n \u003cp\u003eBW = HG*6.1404+BL*2.0569-975.8509\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e0.71\u003c/p\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e\u0026lt; 2.2e-16\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e13.67 %\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd\u003e\n \u003cp\u003eBW = HG*5.4592+BL*2.0203+AGE*11.8829-869.3609\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e0.72\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e\u0026lt; 2.2e-16\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e13.34 %\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd rowspan=\"3\" 
valign=\"top\"\u003e\n \u003cp\u003eSwedish Holstein (SH)\u003c/p\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003eHG*7.455-891.57\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e0.90\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e\u0026lt; 2.2e-16\u003c/p\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e4.07 %\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd\u003e\n \u003cp\u003eHG*6.7215+BL*1.9233-1065.8118\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e0.91\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e\u0026lt; 2.2e-16\u003c/p\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e3.59 %\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd\u003e\n \u003cp\u003eHG*5.7649+BL*2.1257+AGE*8.6201-928.2627\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e0.93\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e\u0026lt; 2.2e-16\u003c/p\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd\u003e\n \u003cp\u003e3.55 %\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\n\u003ch2\u003eObject detection/segmentation and Mask R-CNN performance\u003c/h2\u003e\n\u003cp\u003eIn head-to-head comparisons, U-Net underperformed on cluttered, partially occluded barn images (F1 \u0026asymp;0.56), whereas Mask R-CNN with a ResNet-101 backbone yielded pixel-accurate cow silhouettes (final F1 \u0026asymp;0.98) and near-perfect detection/segmentation accuracy on held-out and out-of-session imagery (\u0026asymp;99.7\u0026ndash;99.9%). This difference mattered in practice: small, systematic mask errors (e.g., inconsistent inclusion of head, tail or a distal limb) propagated to area- and contour-based features and degraded the stability of the downstream weight model. 
This was mitigated with a capture-side burst mode and by augmenting the feature vector with a single, low-variance scalar prior (cohort-average heart girth) that stabilised predictions without re-introducing the brittleness of full tape-based formulas. However, transferring the approach to other breeds or age groups might require extensive additional data collection and model re-training/re-evaluation to account for unknown features affecting the final weight classification.\u003c/p\u003e\n\u003cp\u003eAs can be seen in Figure 5, depending on the position of the cow when photographed, the model produced an SM with or without the head, and also added or removed other small body parts such as the tail, ears, or an obscured leg. This proved to be a real-world challenge: it changed the size of the final SM and could therefore affect weight classification accuracy. One potential way to address this is a burst function in which four or more images are taken in quick succession, an SM is produced for each image, and the averaged values are used as input to the weight classification model. Calculating additional image-based features and adding an average chest circumference value to the weight classification model eliminated this issue.\u003c/p\u003e\n\u003ch2\u003eImage-based weight classification model performance\u003c/h2\u003e\n\u003cp\u003eNine ordered weight categories were produced for the 1080 individual animal images (Batch 9), with each category containing at least 25 animals. Ten-fold stratified cross-validation on the training split yielded stable results. 
The pipeline achieved a macro-F1 of 0.930 \u0026plusmn; 0.020 (mean \u0026plusmn; SD) and a weighted-F1 of 0.962 \u0026plusmn; 0.011, indicating consistent performance across folds and classes when all preprocessing and SMOTE were confined within folds.\u003c/p\u003e\n\u003cp\u003eOn the untouched holdout set (n = 216), the sklearn pipeline attained a macro-F1 of 0.912 with a 95% bootstrap confidence interval of 0.879\u0026ndash;0.941. Weighted-F1 and accuracy were 0.952 and 0.954, respectively. Error structure reflected the ordinal nature of the task: the overall error rate was 9.7%, of which 7.4% were adjacent misclassifications and 2.3% were non-adjacent. The mean absolute class distance was 0.130, the median was 0, and the maximum was 3 categories. Probability calibration for this pipeline showed a multiclass Brier score of 0.2114 and an ECE of 0.2112, with the reliability curve suggesting some over-confidence at higher predicted probabilities.\u003c/p\u003e\n\u003cp\u003eThe independently tuned PyCaret Extra Trees model improved holdout performance. A macro-F1 of 0.936 was obtained with a 95% bootstrap confidence interval of 0.913\u0026ndash;0.956; weighted-F1 and accuracy were 0.967 and 0.969, respectively. The overall error rate dropped to 4.2%, and all errors were adjacent to the true category. The mean absolute distance was 0.042, the median remained 0, and the maximum distance was 1 category. These error profiles show that residual mistakes occurred almost exclusively at bin boundaries and that large misclassifications were rare.\u003c/p\u003e\n\u003ch2\u003e\u003cstrong\u003eStrengths, limitations, implications, and future directions\u003c/strong\u003e\u003c/h2\u003e\n\u003cp\u003eA transparent, leak-safe pipeline was assembled to predict data-driven weight categories from image-derived features, and good performance was demonstrated on a holdout set. 
By deriving categories from the observed weight distribution using a GMM with BIC selection, bin definitions were grounded in the population rather than imposed a priori. Enforcing a minimum cluster size ensured that each class had enough animals for stable estimation. Most importantly, the source variable from which the label was created (Weight) was removed from the features. This removal prevented target leakage, where a model would otherwise learn a near-deterministic mapping from weight to its own discretized categories, resulting in deceptively high scores that would not generalize. In livestock terms, it is the difference between recognizing meaningful conformation or gait patterns that correlate with body mass, and simply being told the mass itself in disguise.\u003c/p\u003e\n\u003cp\u003eThe use of cross-validation was central to reliable inference. By partitioning the training data into stratified folds, fitting all preprocessing and SMOTE only on each fold\u0026rsquo;s internal training partition, and evaluating on its validation partition, an unbiased estimate of generalization was obtained while preserving the natural class balance in each fold. This is especially important when imbalance exists, because naive validation can overstate performance by over-representing majority classes or by letting information seep across folds. The strong agreement between cross-validation estimates and holdout performance - together with narrow bootstrap confidence intervals - supports the stability of the model.\u003c/p\u003e\n\u003cp\u003eSMOTE was applied for a practical reason: weight bands are not uniformly populated. When minority classes are severely under-represented, tree ensembles can learn boundaries that favor the larger classes. SMOTE augments the local neighborhoods of minority classes by interpolating additional points between each minority instance and its nearest minority neighbors in feature space. 
When performed inside the cross-validation folds (as done here), SMOTE improves the classifier\u0026rsquo;s view of the decision surface while preserving the integrity of validation. In contrast, performing SMOTE before cross-validation would leak information into validation partitions and inflate performance. The combination of fold-internal SMOTE and class-weighted Extra Trees therefore provided two complementary safeguards against imbalance.\u003c/p\u003e\n\u003cp\u003eThe errors observed were biologically sensible. Nearly all misclassifications were adjacent to the true category, which is exactly where uncertainty is expected when thresholds are drawn across a continuous trait. Small fluctuations in pose, image capture, or true live weight can move an animal across a cut-point. The absence of distant errors in the tuned model indicates that the learned patterns align with real weight differences rather than spurious artifacts. Probability outputs from the sklearn pipeline showed moderate over-confidence; if calibrated probabilities are required for decision thresholds (e.g., routing animals to pens by risk), simple post-hoc calibration such as temperature scaling or isotonic regression on a validation split is recommended.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eStrengths.\u003c/strong\u003e Methodologically, three aspects stand out. First, the image \u003cem\u003ecapture protocol\u003c/em\u003e was engineered for field realism (explicit stand-off guidance, tilt control, and operator ergonomics) rather than for a controlled environment; this lowered the barrier to practical translation. Second, the \u003cem\u003emodel choice\u003c/em\u003e (instance segmentation over boxes; ResNet-101 over mobile backbones) was empirically justified on failure modes that matter for the downstream weight model, not only on abstract detection metrics. 
Third, the \u003cem\u003edata regime\u003c/em\u003e (multi-device, multi-session, two sites, and repeated animals over time) reduces the risk of identity memorisation, an often-overlooked confounder in computer-vision-for-PLF studies [26], and supports out-of-distribution robustness.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eLimitations.\u003c/strong\u003e The work is intentionally scoped as a proof-of-concept. The dataset, while heterogeneous, is geographically constrained to two Swedish research herds and dominated by two breeds; the observed breed asymmetries (e.g., weaker girth\u0026ndash;mass coupling and higher MAPE in Swedish Reds, especially \u0026gt;800 kg) require explicit handling before broad deployment. External validity has not yet been established; shifts in season, breed body composition, farm environment, or sensor setup could alter the relationship between image features and weight. Although the task is ordinal, a standard multi-class loss was used; explicit ordinal objectives or cost-sensitive training that penalize distant mistakes more than adjacent ones could further reduce boundary errors. SMOTE assumes locally smooth class structure; if a minority class occupies a distinct, non-convex region of feature space, interpolation may be less appropriate, although confining SMOTE to folds and using class weights mitigates this risk. Although LiDAR stabilises scale, reflectivity edge cases can still degrade depth quality in principle; cross-device calibration (between iPhone generations and Android ToF sensors) and continuous self-calibration in the app will be essential. 
Finally, Mask R-CNN with a deep backbone is not yet mobile-ready; on-device inference will require distillation, structured pruning, and INT8 quantisation, and such compression can introduce subtle, class-conditional biases that must be audited before release.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eImplications for practice.\u003c/strong\u003e Scale ambiguity is a principal reason that single-view image-based anthropometry has struggled to translate to farm practice. Here, LiDAR-derived stand-off enabled a per-image pixel-to-millimetre factor and rescaling to a common 2.00 m reference, effectively removing distance-induced variance while preserving shape. We stress-tested LiDAR in a worst-case reflectivity scenario (a high-contrast black-and-white toy cow) to probe the known susceptibility of infrared depth sensing to dark, absorptive patches; the resulting calibration adjustments, together with minor protocol refinements (perpendicular incidence, glare avoidance), eliminated distance failures in the final field round (\u0026gt;1,300 images at consistent stand-off without notable errors). This was an important development: it allows the near-precision of classical LDMs to be combined with a more user-oriented workflow, creating a middle ground between precision and ease of measurement.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eDespite these caveats, the system already supports valuable use cases: rapid triage into weight bands for ration adjustment or drug dosing; longitudinal monitoring of weight trajectories with minimal animal stress; and creation of a shared, intermediate data layer for advisors and producers. Crucially, the method demands only a phone and an easy image capture protocol, avoiding the infrastructure cost and operational friction of multi-camera or top-down 3D systems, which is an adoption determinant for commercial farms. 
As argued in the Introduction, normalising and scaling the routine capture of size and weight also opens a data channel for breeding and management decisions, including selection for more efficient, potentially smaller cows; while those sustainability claims remain prospective here, the enabling measurement substrate now exists in a practical form.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFuture directions.\u003c/strong\u003e Three lines of work would elevate this from promising prototype to deployable standard. (1) \u003cem\u003eExternal validation and fairness:\u003c/em\u003e prospective, preregistered trials across countries, housing types, floorings, and breeds (dairy and beef), with pre-specified non-inferiority margins versus calibrated scales, and subgroup reporting to surface any systematic under- or over-estimation. (2) \u003cem\u003eModel and capture co-design:\u003c/em\u003e mask consistency can be enforced with capture-time cues (automatic \u0026ldquo;full-silhouette\u0026rdquo; checks), and residual pose variance can be attenuated with short, guided bursts whose embeddings are fused via attention pooling; domain adaptation and self-supervised pretraining on unlabelled barn video should further harden features to lighting and coat variation. 
(3) \u003cem\u003eEdge deployment and privacy:\u003c/em\u003e end-to-end on-device inference (segmentation + estimation) with encrypted, opt-in telemetry for periodic recalibration would minimise connectivity dependence and address data-governance concerns from the outset.\u0026nbsp;\u003c/p\u003e"},{"header":"Conclusions","content":"\u003cp\u003eThe overall aim of this research was to investigate the possibilities for accurate body size and weight estimation of dairy cattle with the help of a smartphone camera and computer vision and machine learning algorithms.\u003c/p\u003e\u003cp\u003eThe results of this study demonstrated that accurate, non-contact estimation of dairy cattle body weight is achievable when two long-standing imaging bottlenecks are addressed jointly: (i) reliable scale normalisation at capture time, and (ii) segmentation quality sufficient to extract shape-aware features that correlate with mass. By coupling iPhone-class LiDAR for stand-off control with a Mask R-CNN (ResNet-101) instance-segmentation pipeline and a lightweight, feature-based estimator, robust weight estimation in real-world conditions was possible.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eAuthor contributions\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eOG: Funding Acquisition, Project Administration, Conceptualization, Data Collection, Formal Analysis, Investigation, Methodology, Resources, Visualization, Writing - Original Draft, Writing - Review \u0026amp; Editing. EmmaT: Funding Acquisition, Conceptualization, Data Collection, Methodology, Writing - Original Draft, Writing - Review \u0026amp; Editing. ET: Funding Acquisition, Conceptualization, Writing - Review \u0026amp; Editing. ML: Funding Acquisition, Conceptualization, Data Collection, Resources, Writing - Review \u0026amp; Editing. 
CK: Funding Acquisition, Conceptualization, Data Collection, Resources, Writing - Review \u0026amp; Editing.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eData availability statement\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe data and custom code that support this study are available from the corresponding author on reasonable request. Public deposition is temporarily restricted due to a pending patent investigation. We will release the data and code in a public repository once the patent review is complete. During peer review we will provide editors and reviewers with all necessary data and code on request.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting Interests Statement\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eGenerative AI disclosure\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe author(s) verify and take full responsibility for the use of generative AI in the preparation of this manuscript.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eThe author(s) declare that generative AI (ChatGPT-5, OpenAI) was used to assist in improving the language and clarity of the manuscript, as well as reviewing analytical code.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding:\u0026nbsp;\u003c/strong\u003eThis work was funded by the Swedish farmers\u0026rsquo; foundation for agricultural research (Grant O-20-20-448)\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n \u003cli\u003eBaltsavias, E. P. (1999). Airborne laser scanning: basic relations and formulas. \u003cem\u003eISPRS Journal of photogrammetry and remote sensing\u003c/em\u003e, \u003cem\u003e54\u003c/em\u003e(2-3), 199-214.\u003c/li\u003e\n \u003cli\u003eBarwick, S.A., Henzell, A.L., Herd, R.M. 
\u003cem\u003eet al.\u003c/em\u003e Methods and consequences of including reduction in greenhouse gas emission in beef cattle multiple-trait selection. \u003cem\u003eGenet Sel Evol\u003c/em\u003e 51, 18 (2019). https://doi.org/10.1186/s12711-019-0459-5\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eCarrasco-Guzm\u0026aacute;n, M. E., Barrientos-Medina, R. C., Arcos-\u0026Aacute;lvarez, D. N., Casanova-Lugo, F., Pozo-Leyva, D., \u0026amp; Chay-Canul, A. J. (2025). Reliability and concordance of Schaeffer and Agarwal formulae for predicting crossbred dairy cattle weight. \u003cem\u003eEcosistemas y Recursos Agropecuarios, 12\u003c/em\u003e(1), e4245. https://doi.org/10.19136/era.a12n1.424\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eCominotte, A., et al. (2020). Automated computer vision system to predict body weight and average daily gain in beef cattle during growing and finishing phases. Livest. Sci. 232, 103904.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eCostigan, H., Delaby, L., Walsh, S., Lahart, B., \u0026amp; Kennedy, E. (2021). The development of equations to predict live-weight from linear body measurements of pasture-based Holstein-Friesian and Jersey dairy heifers. \u003cem\u003eLivestock Science, 253\u003c/em\u003e, Article 104693. https://doi.org/10.1016/j.livsci.2021.104693\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eForkuo GO and Borz SA (2023) Accuracy and inter-cloud precision of low-cost mobile LiDAR technology in estimating soil disturbance in forest operations. Front. For. Glob. Change 6:1224575. doi: 10.3389/ffgc.2023.1224575\u003c/li\u003e\n \u003cli\u003eGomes, R. A., Monteiro, G. R., Assis, G. J. F., Busato, K. C., Ladeira, M. M., and Chizzotti, M. L. (2016). Estimating body weight and body composition of beef cattle trough digital image analysis. J. Anim. Sci. 94, 5414\u0026ndash;5422.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eHuang, X., Hu, Z., Wang, X., Yang, X., Zhang, J., and Shi, D. (2019). 
An improved single shot multibox detector method applied in body condition score for dairy cows. Animals 9, 470.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eJiang, M., Guo, G., and Mu, G. (2020). Visual BMI estimation from face images using a label distribution based method. Comput. Vis. Image Underst., 102985.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eKilkenny, C., Browne, W. J., Cuthill, I. C., Emerson, M., and Altman, D. G. (2010). Improving Bioscience Research Reporting: The ARRIVE Guidelines for Reporting Animal Research. \u003cem\u003ePLoS Biol.\u003c/em\u003e 8, e1000412. doi: 10.1371/journal.pbio.1000412\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eLagrosas, N., Okubo, K., Irie, H., Matsumi, Y., Nakayama, T., Sugita, Y., Okada, T., and Shiina, T.: Continuous observations from horizontally pointing lidar, weather parameters and PM\u003csub\u003e2.5\u003c/sub\u003e: a pre-deployment assessment for monitoring radioactive dust in Fukushima, Japan, Atmos. Meas. Tech., 16, 5937\u0026ndash;5951, https://doi.org/10.5194/amt-16-5937-2023, 2023.\u003c/li\u003e\n \u003cli\u003eLee C-b, Lee H-s, Cho H-c. Cattle Weight Estimation Using Fully and Weakly Supervised Segmentation from 2D Images. \u003cem\u003eApplied Sciences\u003c/em\u003e. 2023; 13(5):2896. https://doi.org/10.3390/app13052896\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eLukuyu, M.N., Gibson, J.P., Savage, D.B. \u003cem\u003eet al.\u003c/em\u003e Use of body linear measurements to estimate liveweight of crossbred dairy cattle in smallholder farms in Kenya. \u003cem\u003eSpringerPlus\u003c/em\u003e 5, 63 (2016). https://doi.org/10.1186/s40064-016-1698-3\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eMa W, Qi X, Sun Y, Gao R, Ding L, Wang R, Peng C, Zhang J, Wu J, Xu Z, et al. Computer Vision-Based Measurement Techniques for Livestock Body Dimension and Weight: A Review. \u003cem\u003eAgriculture\u003c/em\u003e. 2024; 14(2):306. 
https://doi.org/10.3390/agriculture14020306\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eNir, O., Parmet, Y., Werner, D., Adin, G., and Halachmi, I. (2018). 3D Computer-vision system for automatically estimating heifer height and body mass. Biosyst. Eng. 173, 4\u0026ndash;10.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eOzkaya, S., Neja, W., Krezel-Czopek, S., and Oler, A. (2016). Estimation of bodyweight from body measurements and determination of body measurements on Limousin cattle using digital image analysis. Anim. Prod. Sci. 56, 2060\u0026ndash;2063.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eQiao, Y., Kong, H., Clark, C., Lomax, S., Su, D., Eiffert, S., \u0026amp; Sukkarieh, S. (2021). Intelligent perception for cattle monitoring: A review for cattle identification, body condition score evaluation, and weight estimation. \u003cem\u003eComputers and electronics in agriculture\u003c/em\u003e, \u003cem\u003e185\u003c/em\u003e, 106143.\u003c/li\u003e\n \u003cli\u003eSherwin, C. M., Christiansen, S. B., Duncan, I. J., Erhard, H. W., Lay, D. C., Mench, J. A., et al. (2003). Guidelines for the ethical use of animals in applied ethology studies. \u003cem\u003eAppl. Anim. Behav. Sci.\u003c/em\u003e 81, 291\u0026ndash;305. doi: 10.1016/S0168-1591(02)00288-5\u003c/li\u003e\n \u003cli\u003eSilwal, A., Parhar, T., Yandun, F., Baweja, H., \u0026amp; Kantor, G. (2021, September). A robust illumination-invariant camera system for agricultural applications. In \u003cem\u003e2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)\u003c/em\u003e (pp. 3292-3298). IEEE.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eSong X, Bokkers EAM, van der Tol PPJ, Groot Koerkamp PWG, van Mourik S. Automated body weight prediction of dairy cows using 3-dimensional vision. J Dairy Sci. 2018 May;101(5):4448-4459. doi: 10.3168/jds.2017-13094. Epub 2018 Feb 22. PMID: 29477535.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eSpoliansky, R., Edan, Y., Parmet, Y., and Halachmi, I. (2016). 
Development of automatic body condition scoring using a low-cost 3-dimensional Kinect camera. J. Dairy Sci. 99, 7714\u0026ndash;7725.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eTong, A. K. W., Kennedy, B. W., and Moxley, J. E. 1976. A dairy records study of the effects of feeding levels on milk yield and composition. \u003cem\u003eCanadian Journal of Animal Science\u003c/em\u003e. 56(3): 513-522.\u0026nbsp;\u003ca href=\"https://doi.org/10.4141/cjas76-063\"\u003ehttps://doi.org/10.4141/cjas76-063\u003c/a\u003e\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eWagner, W., Ullrich, A., Ducic, V., Melzer, T., \u0026amp; Studnicka, N. (2006). Gaussian decomposition and calibration of a novel small-footprint full-waveform digitising airborne laser scanner. \u003cem\u003eISPRS journal of Photogrammetry and Remote Sensing\u003c/em\u003e, \u003cem\u003e60\u003c/em\u003e(2), 100-112.\u003c/li\u003e\n \u003cli\u003eZhang, X., Wang, Y., and Shi, W. (2018). pcamp: Performance comparison of machine learning packages on the edges. In USENIX Workshop on Hot Topics in Edge Computing (HotEdge 18).\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eZin, T. T., Tin, P., Kobayashi, I., and Horii, Y. (2018). An automatic estimation of dairy cow body condition score using analytic geometric image features. In 2018 IEEE 7th Global Conference on Consumer Electronics (GCCE), 775\u0026ndash;776.\u003c/li\u003e\n \u003cli\u003eRumala, D. J. (2023, October). How you split matters: data leakage and subject characteristics studies in longitudinal brain MRI analysis. In Workshop on clinical image-based procedures (pp. 235-245). Cham: Springer Nature Switzerland. 
https://doi.org/10.1007/978-3-031-45249-9_23\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"scientific-reports","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"scirep","sideBox":"Learn more about [Scientific Reports](http://www.nature.com/srep/)","snPcode":"","submissionUrl":"","title":"Scientific Reports","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"Scientific Reports","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"","lastPublishedDoi":"10.21203/rs.3.rs-7827424/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-7827424/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eData on body weight, as well as objective measures of body condition and size, are essential for appropriate decision-making on farm level, e.g. for calculations of nutrient requirements, health control and assessments for breeding purposes. Cows with suboptimal body condition score are at higher risk for transition diseases (e.g. metritis, subclinical ketosis, retained placenta) and lameness. Weighing dairy cattle and assessing their body condition is laborious and therefore often not performed on farms as frequently as desired for best production results. 
Despite recent research findings advocating a strong potential of using computer vision and image analysis for automated estimation of dairy cows\u0026rsquo; weight, body condition score (BCS) and conformation, current technologies are still not widely applied in everyday practice, and the majority of methods used for BCS or weight estimation in cattle utilize multi-camera stationary setups or 3D cameras, which leads to high computational costs. We propose a new, two-step, AI-based method for easy live weight estimation. The first step uses a Mask R-CNN segmentation network trained on 565 unique cow images (both left and right side) collected at distances varying from 1.90 meters to 2.10 meters, under different lighting conditions and at various angles. The final segmentation accuracy of Mask R-CNN was 0.98 in this first step. In the second step, weight was discretized into nine data-driven categories using a Gaussian Mixture Model (BIC-selected), after which the source weight variable was removed to prevent leakage and a leak-safe pipeline (imputation, robust scaling, fold-internal SMOTE, Extra Trees) was trained with stratified cross-validation and evaluated on an untouched holdout; a PyCaret implementation was used as an independent cross-check. On the 216-animal holdout, the tuned Extra Trees model achieved a macro-F1 of 0.936 (95% CI 0.913\u0026ndash;0.956), with a 4.2% error rate composed entirely of adjacent (neighbouring-bin) mistakes. These results were obtained on 1080 images collected using the developed camera app and not used during the Mask R-CNN training. 
The idea is to further streamline the algorithm to allow its downscaling and transition in the form of a smartphone application to be used on-farm as an open-source support tool.\u003c/p\u003e","manuscriptTitle":"PickAMoo: LIDAR-Enhanced Mask R-CNN segmentation for Precision Weight Estimation in Dairy Cattle Using Smartphone Imaging.","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-11-13 07:46:23","doi":"10.21203/rs.3.rs-7827424/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"Revision requested","date":"2026-04-16T14:01:12+00:00","index":"","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2026-04-12T04:20:45+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"106876218311843818894529784880326647458","date":"2026-04-07T08:43:34+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2026-03-04T01:55:33+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"276411199418741462924228095825140168913","date":"2026-02-27T00:01:25+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2025-10-31T11:09:31+00:00","index":"","fulltext":""},{"type":"editorInvited","content":"","date":"2025-10-16T17:11:59+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2025-10-13T01:10:52+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2025-10-13T01:09:58+00:00","index":"","fulltext":""},{"type":"submitted","content":"Scientific Reports","date":"2025-10-10T13:09:04+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"scientific-reports","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"scirep","sideBox":"Learn more about [Scientific Reports](http://www.nature.com/srep/)","snPcode":"","submissionUrl":"","title":"Scientific 
Reports","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"Scientific Reports","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"165e1ff0-b694-4b4a-ae99-cac1e2bfce6f","owner":[],"postedDate":"November 13th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"under-review","subjectAreas":[{"id":57724807,"name":"Biological sciences/Computational biology and bioinformatics"},{"id":57724808,"name":"Physical sciences/Engineering"},{"id":57724809,"name":"Physical sciences/Mathematics and computing"}],"tags":[],"updatedAt":"2026-05-15T07:13:58+00:00","versionOfRecord":[],"versionCreatedAt":"2025-11-13 07:46:23","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-7827424","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-7827424","identity":"rs-7827424","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
