Towards Real-Time Industry-Proof Pork Breed and Boar Taint Classification using Rapid Evaporative Ionisation Mass Spectrometry (REIMS)
Vasiliki Gkarane, Marilyn De Graeve, Clive Stephens, Anneleen I Decloedt, and 7 more
This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-7537051/v1 This work is licensed under a CC BY 4.0 License. Status: Published. Journal publication published 08 Jan, 2026 (published version in npj Science of Food). Version 1 posted 9
Abstract To help counteract food fraud and meet consumer expectations, the pork industry requires reliable quality-monitoring and traceability systems. In this context, Rapid Evaporative Ionisation Mass Spectrometry (REIMS) could be rolled out as a real-time, accurate metabolic fingerprint-based classifier of pork meat characteristics and quality issues such as
genetic origin and taint. Here, fingerprinting of > 3000 pig neck fat samples enabled highly accurate pig breed classification in pairwise comparisons of Commercials (Pietrain x Hampshires x Durocs, Large-Whites, Durocs), Hampshires and Large-Whites: data modelling using Support Vector Machine (SVM, all pairwise comparisons > 89%) and Orthogonal Partial Least Squares - Discriminant Analysis (OPLS-DA, > 90%) outperformed Random Forest (RF, 72.0–79.5%). Boar taint classification showed comparable results between OPLS-DA, RF, and SVM (93.5–96.0%), but strategies to avoid false negatives and positives, including the construction of balanced models (tainted vs. non-tainted), proved imperative.
Subjects: Biological sciences/Biological techniques; Biological sciences/Computational biology and bioinformatics
Keywords: Ambient Ionisation Mass Spectrometry, Meat Authenticity, Meat Quality, Metabolomic Fingerprinting, Pork Production
1. Introduction Consumers have high expectations regarding the authenticity and organoleptic quality of meat products. However, several practices may undermine traceability and promote food fraud, including provenance masking, species substitution or adulteration 1 . Therefore, to combat food fraud and meet consumer expectations, the meat industry is required to install reliable quality-monitoring and traceability systems, consisting of a labelling system that associates each production animal with the final meat product 2 , 3 . Pig breed is one of the most important pre-slaughter factors to consider regarding pork meat quality traits (including eating quality) and muscle fibre characteristics (including e.g. intramuscular fat) 4 . In this context, UK meat traders have secured premium prices for pork products from traditional British livestock breeds such as Duroc 5 . 
Meat producers are however challenged to prove breed authenticity for retail produce 6 , and therefore real-time breed identification would be of great benefit. An increasing number of countries are using livestock traceability control systems to ensure the acceptability of their products on the international market 3 , 6 . Until now, DNA-based technologies such as microsatellite genotyping and single nucleotide polymorphisms (SNPs) have been applied to verify traceability of animal origin. These, however, require a genotypic data repository of many different breeds for comparison (in the case of microsatellites) 7 , or a large number of loci (in the case of SNPs), resulting in high costs 8 , 9 . DNA barcoding, based on analysis of genes responsible for pigmentation that undergo mutation in individual breeds, is deemed another promising technology 10 , and has been applied for species identification 10 , 11 , but appears unsatisfactory for breed identification 12 . Another important pre-slaughter factor for the organoleptic quality of pork is the presence of boar taint. Boar taint is an off-odour that may occur in entire males, and which has long plagued the industry. The increased demand to improve the sensory quality of meat from entire males has underlined the already existing need for real-time screening and sorting of carcasses with or without boar taint at the slaughter line 13 , 14 . In this context, several analytical methods such as GC-MS and Raman spectroscopy have been proposed, but these lack sensitivity and specificity, and are moreover not high-throughput 15 – 18 . Sensory methods (in which the smell is assessed by a trained expert) allow fast and holistic boar taint assessment, but are hampered by interindividual variation, habituation and fatigue 19 . 
As such, there is currently no method that meets all requirements for real-time at-line boar taint detection, for which the high rate at which pigs are slaughtered remains the biggest challenge 20 . The application of metabolomics for food authentication and quality assessment has grown strongly over the past 15 years, with Mass Spectrometry (MS) being one of the most used platforms 21 , 22 . More specifically, MS-based metabolomics has been applied successfully to address authenticity, functionality, quality and safety of raw, semi-processed and finished food products 23 . The advancement of ambient MS-based metabolomics, such as Rapid Evaporative Ionisation MS (REIMS), is of specific interest in this context, due to the possibility of high-throughput, real-time classification, with high sensitivity and selectivity 24 , 25 . The technique relies on the desorption and ionisation of molecules by creating an aerosol from a biological sample, using an electrosurgical device, bipolar forceps or laser 24 , 26 , 27 . The combination with a Q-ToF mass analyser provides the spectral information required, from which chemometric analysis (e.g., Principal Component Analysis (PCA) and Orthogonal Partial Least-Squares Discriminant Analysis (OPLS-DA)) creates statistical models for group evaluation and separation 28 . Alternatively, Machine Learning (ML) classification algorithms (e.g. Support Vector Machines (SVM), Random Forests (RF)) may be applied to process big heterogeneous datasets and generate predictive classification models 29 , 30 . In previous work, REIMS has been used in conjunction with multivariate classification models or ML-based algorithms to assess, e.g., meat quality characteristics 31 , breed in beef 32 , production system in chicken meat products 33 and fish fraud 30 with high accuracy. REIMS has moreover already been applied successfully to model boar taint status in a small-scale laboratory setting 14 . 
The real-time, at-line sampling and analysis of pork characteristics by means of REIMS offers significant potential for monitoring and decision making (Fig. 1). Ideally, as many parameters as possible are assessed in a single analysis, preferably on an economically less valuable part of the carcass. For breed and boar taint specifically, sampling of neck fat is a valid option since differences in breed are reflected in adipose composition 34 , and boar taint compounds are known to accumulate in adipose tissue 14 . In the current study, we hypothesized that REIMS can be used to generate a metabolomic fingerprint-based classifier for the accurate, real-time prediction of pork breed and boar taint status. The first objective of this study was therefore to test the accuracy of REIMS for pig breed classification based on metabolomic fingerprinting of neck fat samples, using either multivariate data analysis (OPLS-DA) or machine learning algorithms (SVM, RF). This was tested for two datasets: (1) a dataset aimed at differentiating three pork categories, i.e., Commercials (COMs, n = 346), Hampshires (HMPs, n = 341) and Large-Whites (L-Ws, n = 359); and (2) a subset of the first dataset with less genetic variability within each category, including Durocs (as subset of the COMs group), Hampshires (HMPs) and Large-Whites (L-Ws). The second objective of this study was to demonstrate the potential of the same technology and methodology for the real-time, at-line screening and classification of tainted vs. untainted boar carcasses. The benefits of using either multivariate data analysis (OPLS-DA) or machine learning algorithms (SVM, RF) were tested in an off-line laboratory setting (n = 2055) as well as online (n = 554), i.e., in a slaughterhouse. 2. Material and methods 2.1. 
Collection of neck fat samples from different pig breeds A total of 1046 neck fat samples from pigs (finishing ages of 22–24 weeks) were collected over 5 days in November 2020 from a large meat processing plant in the UK. The 1046 samples comprised 3 categories (as detailed in Supplementary Table S1): 346 Commercial (COMs) crossbred pigs, 341 Hampshire (HMPs) crossbred pigs, and 359 Large-White (L-Ws) crossbred pigs. To minimise the impact of genetic variability within each category, a subset was made (as detailed in Supplementary Table S2), including 600 samples of the original dataset and only one genetic combination per breed/category, i.e., 200 COMs crossbred pigs, 200 HMPs crossbred pigs and 200 L-Ws crossbred pigs. Killing of the pigs was not carried out by the authors but performed in line with the commercial slaughtering process, including stunning, bleeding, scalding, dehairing, singeing, eviscerating, washing and chilling. Samples were collected post rapid cooling (2 ± 2°C), sealed in plastic bags and frozen in an air-blast freezer, before transfer to the ASSET Technology Centre (Institute for Global Food Security, Queen’s University Belfast, Northern Ireland). Samples were kept at -20°C until analysis, which took place within 8 days following receipt. Prior to analysis, samples were thawed for approximately one hour in a laminar flow safety cabinet and maintained at 2 ± 2°C. 2.2. Collection, sensory and chemical analysis of boar neck fat samples 2.2.1. Phase I: off-line sample selection and analysis 2.2.1.1. Collection of samples and boar taint analysis A total of 2055 boar fat samples (from different batches, different farms, and sampled on different slaughter days as detailed in Supplementary Table S3) were collected at Exportslachthuis Tielt (Belgium) and Sus Campinae Westerlo (Belgium). Pigs were slaughtered commercially, undergoing carbon dioxide stunning, bleeding, scalding, dehairing, singeing, eviscerating, washing and chilling. 
The fat samples were cut at-line or in the slaughterhouse’s chilling rooms (2 ± 2°C), sealed in plastic bags and transported to the LIMET laboratory (Belgium) in a cooler, after which samples were stored at -20°C. Sensory boar taint analysis was done using the soldering iron method 19 , performed by two independent experts who assigned a score from 0 to 4, with 0 = “no taint” and 4 = “strong taint”. If there was no consensus between the two experts, i.e., a difference in score ≥ 2, the sample was scored by a third expert to provide a definite answer as to the sensory “boar taint status” of that specific sample. Based on the average sensory score (and preliminary “boar taint status”), 554 samples were selected for chemical analysis, including 153 samples with an average score < 1.5, and 208 samples with an average score ≥ 1.5. The experts consented to participate in the sensory research. Ethical permission was not required. Chemical analysis entailed analysis of the three known boar taint compounds indole, skatole and androstenone in the fat using UHPLC-HRMS (Ultra High Performance Liquid Chromatography hyphenated to High Resolution Mass Spectrometry), as described previously 35 . 2.2.1.2. Selection of samples for boar taint modelling As opposed to certain other traits, such as breed or type, boar taint status is less clear-cut, because very minor, mild to average and more severe boar taint cases all occur. Therefore, to be able to create a reliable boar taint classification model, results of the sensory and chemical boar taint analyses were examined to make a stricter and unambiguous classification, i.e., samples were considered “tainted” if fat concentrations of indole, skatole and/or androstenone exceeded the limit of 100, 200 and/or 1000 ppb, respectively, regardless of the average sensory score. 
Only samples that were assigned an average sensory score of zero (exclusion from 0.25 onwards) and demonstrated indole, skatole and androstenone concentrations below 50, 100 and 500 ppb respectively, were considered “untainted”. Samples of which the sensory and chemical analysis rendered a contradicting boar taint status were excluded. This resulted in the selection of 1097 samples, with 93 tainted samples. 2.2.2. Phase II: online sample selection and analysis A total of 554 boar fat samples (detailed in Supplementary Table S3) were collected at Sus Campinae (Westerlo, Belgium), cut, stored, and transported as described previously (section 2.2.1.1 .). All samples underwent sensory boar taint analysis (cfr. section 2.2.1.1 .), and based on the mean sensory score, 183 samples were selected for chemical analysis (62 untainted samples < 1.5 and 121 tainted samples ≥ 1.5). Using the sample selection criteria used in Phase I, the total number of tainted samples was deemed too low for successful modelling, and therefore, in Phase II, the selection of tainted samples was less strict, with the aim of increasing statistical power. Logistical constraints did not allow for more extensive sampling and inclusion of more samples. Tainted samples were selected based on indole, skatole and/or androstenone levels exceeding 100, 200 and/or 900 ppb respectively, regardless of the average sensory score. The untainted samples were selected based on an average sensory score of 0, without exceptions. Based on the results of the sensory and chemical analysis combined, a final selection of 65 tainted and 147 untainted samples was made (212 samples in total). Originally, for phase II, plans were made to sample and analyse boar neck fat samples at Exportslachthuis Tielt (Belgium). However, in the weeks following the installation of the REIMS instrument at Exportslachthuis Tielt (see below), no boars were slaughtered there. 
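The Phase I and Phase II labelling rules described above can be summarised in a short sketch. This is an illustrative reading of the stated thresholds, not the authors' code: the function and constant names are hypothetical, and the handling of samples without chemical data in Phase II is an assumption.

```python
from typing import Mapping, Optional

# Concentration limits in ppb as stated in the text. The constant and function
# names are hypothetical and chosen for illustration only.
TAINTED_PHASE_I = {"indole": 100, "skatole": 200, "androstenone": 1000}
UNTAINTED_PHASE_I = {"indole": 50, "skatole": 100, "androstenone": 500}
TAINTED_PHASE_II = {"indole": 100, "skatole": 200, "androstenone": 900}


def exceeds_any(conc_ppb: Mapping[str, float], limits: Mapping[str, float]) -> bool:
    """True if at least one compound exceeds its limit."""
    return any(conc_ppb[name] > limit for name, limit in limits.items())


def below_all(conc_ppb: Mapping[str, float], limits: Mapping[str, float]) -> bool:
    """True if every compound is strictly below its limit."""
    return all(conc_ppb[name] < limit for name, limit in limits.items())


def label_phase_i(sensory_mean: float, conc_ppb: Mapping[str, float]) -> Optional[str]:
    """Strict Phase I rule; returns None for samples that would be excluded."""
    if exceeds_any(conc_ppb, TAINTED_PHASE_I):
        return "tainted"  # regardless of the average sensory score
    if sensory_mean == 0 and below_all(conc_ppb, UNTAINTED_PHASE_I):
        return "untainted"
    return None  # ambiguous or contradictory result


def label_phase_ii(sensory_mean: float,
                   conc_ppb: Optional[Mapping[str, float]] = None) -> Optional[str]:
    """Relaxed Phase II rule.

    Assumption: a sample without chemical data cannot be labelled tainted.
    """
    if conc_ppb is not None and exceeds_any(conc_ppb, TAINTED_PHASE_II):
        return "tainted"
    if sensory_mean == 0:
        return "untainted"
    return None
```

Under these rules, a sample with 950 ppb androstenone would be excluded in Phase I (neither above 1000 ppb nor below 500 ppb) but labelled tainted in Phase II, which is one way the relaxed criterion increases the number of tainted samples.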
Hence, alternatively, samples were collected at Sus Campinae, yet analysed at Exportslachthuis Tielt. 2.3. REIMS analysis 2.3.1. Sample analysis REIMS analysis of neck fat samples was performed using a Waters Xevo G2-XS QToF mass spectrometer and Waters REIMS ion-source (Waters, Wilmslow, UK). For both sections of the study, “burning” of fat samples was performed using a 3D-printed ‘fat probe’ (Waters Research Centre, Budapest, Hungary) powered by an ERBE VIO 50C diathermy generator (Erbe Elektromedizin, Tubingen, Germany) that was designed for at-line application specifically (as demonstrated and shown in Supplementary Figures S1 and S2). Instrument settings are provided in Supplementary Table S4. Analysis of neck fat samples for pig breed classification was performed at the ASSET Technology Centre (under controlled laboratory circumstances), and the off-line analysis (Phase I) of 2055 boar fat samples was performed at LIMET (under controlled laboratory circumstances). The online analysis (Phase II) of 554 boar samples was performed in an in situ air-conditioned laboratory adjacent to the slaughter line, after moving the LIMET REIMS instrument to Exportslachthuis Tielt (Belgium). The instrument was set up in a separate room, connected to the fat probe at the slaughter line through 4 m long tubing. 2.3.2. System calibration, cleaning and quality control The REIMS instrument underwent a detector setup process using a 0.1 ng/µL lockmass solution of leucine enkephalin (Waters, Milford, MA, USA) in 2-propanol (LC-MS grade; Honeywell Riedel-de Haën, Seelze, Germany) at a flow rate of 0.2 mL/min, and calibration using 5 mM sodium formate infusion (20 µL/min) at the start of each analysis day to correct for instrumental drift. The tip of the fat probe was cleaned every five samples by means of a water-soaked swab following the removal of charred remains. 
All REIMS parts (including the source, tubing and Venturi) were cleaned every 100 to 200 samples, whilst the instrument was vented and the StepWave was cleaned approximately every 500 samples. 2.4. Data analysis 2.4.1. Data preprocessing Data preprocessing, including data cleaning (checking for inconsistent data, removing duplicates), baseline correction, peak detection, alignment, normalisation and scaling, was performed in R (version 3.4.3, Vienna, Austria) in line with De Graeve, et al. 30 (breed data) and Verplanken, et al. 14 (boar taint data). Data were processed in a virtual Linux environment (Ubuntu, v16.04 LTS or v20.04 LTS) using Oracle VM VirtualBox (version 6.1, Oracle). Code used for the preprocessing is available at https://github.com/UGent-LIMET/Preprocessing_AIMS . 2.4.2. Chemometric multivariate modelling using (O)PLS-DA Multivariate classification analysis was performed using SIMCA 17.0 (Sartorius Stedim Biotech, Umeå, Sweden). PCA modelling was used to detect potential sample clustering, trends or outliers. The supervised OPLS-DA analysis was used to construct prediction models that predict the Y-variable (classification of samples in COMs, HMPs or L-Ws group) from the X matrix (the feature matrix with intensities of m/z peaks for each sample). SIMCA models were evaluated in terms of the seven-fold cross-validated cumulative modelled variation in the X matrix (R²X(cum) > 0.5), the cumulative modelled variation in the Y-variable, i.e., the goodness of fit to the original data (R²Y(cum) > 0.5), and the cumulative predictive ability of the models (Q²(cum) > 0.5). Permutation testing (100 permutations) was performed to control the risk of spurious findings. Additionally, CV-ANOVA (cross-validated analysis of variance) was performed to validate the models, according to a 1/7 leave-out classification, with a statistical cut-off of p < 0.05. 2.4.3. 
Chemometric machine learning-based modelling using RF and SVM RF and SVM modelling were performed in Python (version 3.7.4, Fredericksburg, VA) using sklearn. More details, including the implemented programming languages and packages, were published previously 30 . For both SVM and RF modelling, 75% of the original dataset was used as a training set after randomisation, followed by hyperparameter optimisation with 5-fold cross-validation. Specifically, the training set was randomly split into five equal subgroups; the models were trained on four of the five subsets and validated on the remaining subset, with computation of accuracy. The remaining 25% of the dataset (further referred to as the test or hold-out set) was used to test the ability of the best model (derived from the training set) to accurately classify the samples into their a priori labelled classes. An overview of the tested set of parameters for hyperparameter optimisation of RF and SVM algorithms is provided in Supplementary Table S5. 2.4.4. Model performance To evaluate the performance and compare the predictive ability of each cross-validated model, the following indicators were calculated on the hold-out set: Accuracy, Precision, Recall, Specificity and F1-score. Accuracy refers to the ratio of correctly classified samples to the total number of classified samples and characterises the performance of the model: Accuracy = (TP + TN) / (TP + TN + FP + FN) with TP, TN, FP, and FN denoting true positives, true negatives, false positives, and false negatives, respectively. Precision indicates the ratio of true positive samples to all samples predicted positive: Precision = TP / (TP + FP). Recall refers to the model’s ability to correctly classify relevant samples and was calculated as the fraction of actual positive samples correctly classified as positive: Recall = TP / (TP + FN). 
Specificity indicates the fraction of actual negative samples correctly classified as negative: Specificity = TN / (TN + FP). F1-score is the harmonic mean of Precision and Recall and thus balances both: F1-score = 2 × (Precision × Recall) / (Precision + Recall). 3. Results REIMS was used to generate metabolomic fingerprints of thousands of pig neck fat samples to assess its potential application for accurate classification of pig breed and boar taint status. To model the data, multivariate data analysis (OPLS-DA) and machine learning algorithms (SVM, RF) were implemented, followed by comparison and evaluation of model performance. 3.1. Breed classification PCA analysis was performed to assess inherent sample clustering or trends among categories in the breed classification dataset but did not reveal breed-related trends, indicating that breed was not the main driver of biological variance. OPLS-DA, SVM and RF were used to generate pairwise comparisons among categories (i.e., COMs vs. HMPs, HMPs vs. L-Ws and COMs vs. L-Ws), and breed classification was also studied in a smaller subset, with reduced (impact of) genetic variability within each breed type. Results of OPLS-DA models indicated good to excellent classification accuracy for all three comparisons (COMs vs. HMPs, HMPs vs. L-Ws and COMs vs. L-Ws), ranging from 91–93% in the overall dataset (Table S6) and from 85–95% in the subset (Table S7). In SVM, the linear kernel function was selected as most appropriate in all pairwise comparisons in the original dataset and the reduced variability subset. The results show that the accuracy of the classification models for all three comparisons (COMs vs. HMPs, HMPs vs. L-Ws and COMs vs. L-Ws) was excellent, ranging from 90–95% in the overall dataset (Table S8) and from 91–96% in the subset (Table S9). The optimised model hyperparameters (Kernel, C, Gamma) for SVM derived models are summarised in Supplementary Table S10. 
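The performance indicators defined in section 2.4.4 all follow from the four confusion-matrix counts. The sketch below is a minimal, library-free illustration; the function name and the example counts are hypothetical and do not come from the study's data.

```python
from typing import Dict


def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Compute the indicators of section 2.4.4 from confusion-matrix counts.

    Ratios with an empty denominator are reported as 0.0 (a convention
    chosen for this sketch, not one stated in the text).
    """
    def ratio(numerator: float, denominator: float) -> float:
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical hold-out counts for one pairwise comparison.
example = classification_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Because the F1-score is the harmonic mean of Precision and Recall, it can equivalently be written as 2TP / (2TP + FP + FN), which makes clear that true negatives do not contribute to it.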
The classification matrix for both the overall dataset and reduced variability subset is provided in Supplementary Tables S8 and S9. RF models showed an acceptable performance, although not to the same extent as the ones using the SVM algorithm. Specifically, the results showed that the classification accuracy of the models for all three comparisons (COMs vs. HMPs, HMPs vs. L-Ws and COMs vs. L-Ws) was good, ranging from 72–80% in the overall dataset (Table S12) and from 71–87% in the subset (Table S13). The optimised parameters (n trees, Min samples leaf, Max leaf nodes, max Depth) for RF derived models are summarised in Supplementary Table S11, whilst classification matrices are provided in Supplementary Tables S12 and S13. Overall, OPLS-DA and SVM outperformed RF modelling and performed comparably to each other, although SVM performed slightly better than OPLS-DA. Both OPLS-DA and SVM enabled the discrimination of the three breeds, independent of the reduced genetic variability of the Commercials pork breeds, with reliable accuracies and an equal ratio of false positives to false negatives (as detailed in the Supplementary Tables S6-S9, S12 and S13, also depicting the Precision, Recall, Specificity and F1-score). Figures 2 and 3 depict the confusion matrices for SVM models of the three pairwise comparisons of the overall breed classification dataset and breed classification subset (with a reduced genetic variability), respectively. 3.2. Boar taint classification OPLS-DA, SVM and RF models for boar taint classification were created and tested in line with the approach used for breed classification. First, PCA analysis was performed, yet did not reveal any boar taint-related trends in the data (not shown). In the off-line Phase I (with fingerprinting of all samples under laboratory circumstances), a first, ‘overall’ boar taint model was built modelling the REIMS-acquired metabolomic fingerprint of 93 tainted vs. 1004 untainted neck fat samples. 
As documented in Fig. 4, boar taint classification accuracies of 94 to 95% could be reached using OPLS-DA, SVM and RF. A closer look at the per-class correct classification rates showed, however, that 99% of untainted samples were classified correctly, whereas only 38, 33 and 42% of tainted samples were classified correctly using OPLS-DA, SVM and RF, respectively. The imbalance between false positives and false negatives is documented in more detail in Supplementary Tables S14-S16 (incl. Precision, Recall, Specificity, F1-score) and is reflected in the F1-scores of tainted samples that can be viewed in Fig. 4A. With the aim of reducing misclassification of tainted samples, a balanced model was constructed, including all 93 tainted samples and 100 untainted samples (with balancing prior to split; selecting samples with a sensory score of 0 and with indole, skatole and androstenone levels < LOD). This allowed classification accuracies from 94 to 96%, with notably improved classification of tainted samples as well. More specifically, correct classification of untainted samples was 100, 96 and 96%, whilst correct classification of tainted samples was 92, 92 and 96% using OPLS-DA, SVM and RF, respectively, resulting in a higher Precision and F1-score for the ‘tainted’ group (Fig. 4B, Supplementary Tables S14-S16). Overall, the performance of OPLS-DA, SVM and RF models was highly comparable. In Phase II, in which the REIMS instrument was set up at the slaughterhouse, a first ‘overall’ boar taint model was built modelling the REIMS-acquired metabolomic fingerprint of 65 tainted vs. 147 untainted neck fat samples. Using OPLS-DA, SVM and RF, accuracies of 70 to 74% were achieved. In line with the overall Phase I model, classification of tainted samples was insufficient as only 35, 24 and 6% of tainted samples were classified correctly, respectively. Correct classification of untainted samples was 89, 97 and 100%. 
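The gap between overall accuracy and per-class correct classification reported above is what class imbalance would be expected to produce. As a hedged illustration, the sketch below computes the accuracy of a trivial classifier that labels every sample as the majority class, using the class counts given in the text, and shows a generic random undersampling step. Both function names are hypothetical. Random undersampling is a stand-in technique: the Phase I balanced set was selected on sensory and chemical criteria rather than at random, and the class ratio of the 25% hold-out set is assumed here to resemble that of the full dataset.

```python
import random
from typing import Hashable, List, Sequence, Tuple


def majority_baseline_accuracy(n_tainted: int, n_untainted: int) -> float:
    """Accuracy obtained by always predicting the larger class."""
    return max(n_tainted, n_untainted) / (n_tainted + n_untainted)


def undersample(samples: Sequence, labels: Sequence[Hashable],
                seed: int = 0) -> List[Tuple[object, Hashable]]:
    """Randomly drop samples from larger classes until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    n_min = min(len(items) for items in by_class.values())
    balanced = [(sample, label)
                for label, items in by_class.items()
                for sample in rng.sample(items, n_min)]
    rng.shuffle(balanced)
    return balanced


# Class counts taken from the text.
print(f"Phase I baseline:  {majority_baseline_accuracy(93, 1004):.3f}")
print(f"Phase II baseline: {majority_baseline_accuracy(65, 147):.3f}")

# Balancing the Phase II class counts with placeholder sample identifiers.
labels = ["tainted"] * 65 + ["untainted"] * 147
balanced = undersample(list(range(len(labels))), labels)
print(len(balanced))
```

With 93 tainted and 1004 untainted samples the majority-class baseline is about 91.5%, and with 65 vs. 147 samples it is about 69.3%, so an accuracy near these values says little about how well tainted carcasses are detected.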
To try to improve correct classification, balanced models were built by omitting samples from the larger ‘untainted’ group. For these models, accuracies between 61 and 70% were obtained. Correct classification of tainted samples could be improved, i.e., a 59, 64 and 65% correct classification of tainted samples was achieved for OPLS-DA, SVM and RF, with a 69, 63 and 64% correct classification of untainted samples, respectively. Taking into account accuracy, as well as the correct classification of both tainted and untainted samples specifically, SVM modelling outperformed RF and OPLS-DA modelling, although there is scope for improvement to further decrease false positives and false negatives while guarding the balance between error types. OPLS-DA, SVM and RF model (hyper)parameters, classification confusion matrices and metrics (Precision, Recall, Specificity and F1-score) can be consulted in Supplementary Tables S14 to S16. 4. Discussion A number of studies have applied MS-based metabolomics to address authenticity, functionality, quality and safety of raw, semi-processed and finished food products 23 . REIMS specifically can play an important role to verify the authenticity and quality of a food product and may be applied in traceability systems that link physical and digital attributes of products across the supply chain. The mass spectrometric fingerprints generated by REIMS require the application of multivariate classification models like e.g. OPLS-DA or ML-based algorithms to categorize meat samples according to production system and/or quality traits (e.g. breed, taint, species or tissue types) 2 . In the current work, firstly, REIMS was used to classify breed categories according to the metabolomic profile of pigs from three different types in the UK market, as well as from a subset of breeds, to test if a lower genetic diversity would positively influence classification. 
Secondly, REIMS was used to classify boar taint status of uncastrated male pigs, both in a laboratory and slaughterhouse setting, to test applicability in real-time and at-line. Three different multivariate classification algorithms were used for this purpose, i.e., OPLS-DA, SVM and RF, which were selected based on previous research, in which the use of OPLS-DA, SVM and RF resulted in high accuracies for e.g. fish speciation and evaluation of beef quality 29 , 30 , 36 . 4.1. Breed classification It is well established that breed or genetic type influences lipid content and fatty acid profile in pigs; and thus, metabolomic differences among breeds may be expected. Results for breed category classification showed that the use of REIMS combined with multivariate data analysis or machine-learning algorithms (RF or SVM) delivered models with high prediction accuracy. Specifically, the conjunction of REIMS with OPLS-DA demonstrated sufficient performance and an adequate capability to separate the categories, either in the complete dataset (1046 samples) or the subset (600 samples). Indeed, in the majority of meat-metabolomic studies, the conjunction of REIMS with multivariate statistics produces reliable models with excellent predictive abilities 14 , 37 – 40 , with few exceptions of low performance 26 , 41 . When evaluating alternative classification algorithms across the two breed datasets, RF exhibited a slightly lower performance compared to the OPLS-DA and SVM models, evident through the reduced number of correctly classified animals, substantiated by a lower accuracy. Menze, et al. 42 found that RF excelled in feature selection, while PLS-DA outperformed RF in classification when analysing spectral data. The authors suggested a combined approach, using RF's Gini index for feature selection and PLS-DA for classification, to leverage the strengths of both methods. However, Zhang, et al. 43 put forward an ML-based approach (e.g. 
decision tree algorithm) for the comparison of lamb breeds among farms, as this would better account for the diversity in farm production systems and would reveal confounding factors (diet, slaughter age, gender) -possibly hindering the discovery of potential breed biomarkers. In this case, non-linear ML techniques, such as RFs, may be more suitable for modelling metabolomics data, especially when dealing with the complex and highly diverse meat metabolome profile. The difference in performance between the two ML algorithms was also noted in this study. Overall, SVM provided better results in predicting the breed line compared to RF, as recorded through the accuracy (%) and F1-scores. However, the outperforming SVMs here were linear ML techniques, as the linear kernel was selected after 5-fold cross-validated hyperparameter optimisation for each breed model, indicating linear separability within this dataset. Our results confirm those of Gredell, et al. 29 , who applied REIMS in conjunction with different ML techniques to successfully classify beef quality attributes (including production background and breed); i.e., that the predictive accuracy of the different ML algorithms varied according to the classification problem, highlighting the need for a unique, fit-for-purpose approach. Depending on the classification problem, LDA, SVM (with a linear or radial kernel) or XGBoost emerged as the best performing algorithm in their results. In line with this, Penning, Snelling and Woodward-Greene 44 , who successfully applied REIMS in combination with 8 different ML algorithms to assess beef carcass quality traits, also noted that the best performing algorithm was dependent on the trait that was being measured, to such degree that each beef trait required a different algorithm to perform the best predictions on measured data. This was also further confirmed by Loomas, et al. 
36 , who found that different ML algorithms (SVM (radial), XGBoost and RF) performed best, depending on the dataset and the meat quality attribute being predicted. Our and previous findings align with the “No Free Lunch Theorem” 45 , according to which no single algorithm can achieve optimal performance on every problem it encounters. Interestingly, using the RF algorithm, the models derived for the breed subset performed slightly better than the models of the complete dataset, as expressed through the accuracy and F1-score values, especially for two comparisons (Durocs vs. HMPs and Durocs vs. L-Ws). This may imply that the decreased level of genetic variability within each category positively influences phenotype expression and metabolomic fingerprinting and thus the accuracy of breed classification. Apart from genetic variability, the heterogeneity of the metabolic profile of fat and muscle tissues may be related to feeding, stress and post-mortem processes 46 . These factors can affect gene expression through processes such as transcription and translation, as well as protein modification, ultimately influencing the animal's phenotype.

4.2. Boar taint classification

In previous work, we demonstrated that the metabolomic fingerprinting of neck fat obtained by means of REIMS analysis can be used for accurate boar taint classification using OPLS-DA modelling in a small-scale laboratory setting 14 . In the current study, modalities for real-time, at-line screening and classification of tainted vs. untainted boar carcasses were assessed, firstly in an off-line, yet large-scale laboratory setting (phase I), followed by testing in an online slaughterhouse setting (phase II). In addition, we aimed to assess the use of SVM and RF (compared to OPLS-DA) modelling for boar taint classification. In phase I, 93 tainted samples were collected, corresponding to 8% of the dataset.
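To illustrate why a tainted fraction of about 8% makes plain accuracy a poor yardstick, the following minimal Python sketch compares accuracy with the F1-score of the tainted class for a trivial classifier that labels every carcass as untainted, and then draws a class-balanced subset by random undersampling of the majority class. The total of 1163 samples is an assumption back-calculated from 93 tainted samples being roughly 8% of the dataset, and the helper names (`accuracy`, `f1_for_class`, `undersample`) are hypothetical; this is not the modelling pipeline used in this study.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_for_class(y_true, y_pred, positive):
    """F1-score (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def undersample(labels, seed=0):
    """Indices of a balanced subset: every class is cut to the minority size."""
    rng = random.Random(seed)
    by_class = {}
    for i, lab in enumerate(labels):
        by_class.setdefault(lab, []).append(i)
    n_min = min(len(ix) for ix in by_class.values())
    keep = []
    for ix in by_class.values():
        keep.extend(rng.sample(ix, n_min))
    return sorted(keep)


# Assumed label vector: 93 tainted out of 1163 samples (about 8%).
labels = ["tainted"] * 93 + ["untainted"] * 1070

# A trivial classifier that never flags boar taint.
always_untainted = ["untainted"] * len(labels)
acc = accuracy(labels, always_untainted)
f1 = f1_for_class(labels, always_untainted, "tainted")
print(round(acc, 3), f1)  # 0.92 0.0

# Balanced training subset: 93 tainted + 93 untainted samples.
balanced_idx = undersample(labels)
print(len(balanced_idx))  # 186
```

In this sketch the trivial classifier reaches about 92% accuracy while detecting no tainted sample at all, which is why a class-specific metric or a balanced training set is the more informative choice under such a class ratio.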
A non-balanced model including all samples resulted in unacceptable misclassification of tainted samples (less than 50% correctly classified, independent of the algorithm used). However, establishing a balanced model improved this substantially, with classification accuracies > 93% and correct classification of both tainted and untainted samples > 92% (accompanying F1-scores of > 94%). Apparently, the loss of statistical power due to the smaller sample size after balancing (i.e., after removing the excess of untainted samples) was negligible compared to the penalty of misclassifying more than 50% of tainted samples in the unbalanced model. Alternatively, to ensure penalisation, the Precision or F1-score of the tainted samples could be applied as the evaluation metric during training (instead of accuracy). In phase II, the same strategy was applied and, again, balanced models outperformed the overall unbalanced models, with improved classification of both the tainted and untainted samples. Accuracies for the phase II models were considerably lower than for the phase I models, however, and we hypothesize that this may be due to mislabelling during model training. In previous work, Verplanken, et al. 14 implemented REIMS-based boar taint classification using the iKnife and reached a classification accuracy of 100% (OPLS-DA). The setup was very similar to the work described here but on a much smaller scale, and all samples were subjected to both chemical and sensory boar taint analysis prior to REIMS analysis and modelling 14 . Due to the very large number of samples in the current study, it was not feasible to subject all samples to chemical analysis, and thus a selection of (potentially) tainted samples was made based on sensory analysis results only.
This implies that all finally selected ‘tainted’ samples demonstrated indole, skatole and/or androstenone levels of more than 100, 200 and 900 ppb, respectively, whereas indole, skatole and androstenone levels were in fact unknown for most ‘untainted’ samples. These labels were nevertheless used as the Y-variable for model building, and thus mislabelling of ‘untainted’ samples may underlie the lower classification accuracies obtained. All samples did undergo sensory analysis by at least two experts, but due to the very large number of samples analysed (more than 2000) in a short timeframe, habituation or fatigue may have occurred, potentially leading to false negative classification and inclusion of low to moderately tainted samples in the ‘untainted’ group. To counteract this problem in the future, i.e., to build a more robust REIMS-fingerprint based boar taint model in a slaughterhouse setting, we therefore advise building a balanced model using the F1-score as evaluation metric, with sample analysis performed at-line, yet with chemical and sensory boar taint analysis of all samples used for training. This must be part of the investment in terms of costs and time, to avoid incorrect inclusion and classification. It is moreover advised to limit the daily number of samples analysed by the sensory experts, to avoid habituation and fatigue and to ensure correct labelling and (pre)selection of samples. False negative classification puts consumers’ liking of pork products at risk. This does not apply to false positive classification, although the latter may have ramifications for producers, i.e., in the form of penalty fees. Therefore, both false negative and false positive test results are to be avoided at all costs 20 , 47 and should be penalised equally during model building. Lastly, it should be mentioned that recent work by Mörlein, et al.
48 revealed that 2-aminoacetophenone (AAP) is a boar taint contributing compound that has been overlooked for years. Although the olfactory importance of AAP is yet to be confirmed, future inclusion of AAP in the chemical boar taint analysis panel is recommended, as it may reduce false negative classification and improve the training and use of REIMS-based boar taint classification models.

4.3. Practical and economic feasibility

Besides REIMS, several candidate methods for pig breed classification have been tested, including e.g. sequencing of single nucleotide variants, MALDI-TOF (Matrix-Assisted Laser Desorption/Ionization - Time of Flight), Nuclear Magnetic Resonance (NMR), Visible and Near Infrared Reflectance (VIS/NIRS) spectroscopy, as well as conventional GC and LC-MS methods 49 – 53 . VIS/NIRS, for example, has the specific advantage of being fast, yet seems to show higher error rates in comparison 54 . The other methods mentioned demonstrate very low error rates, yet require more time and imply a higher cost per sample. For boar taint, too, the at-line applicability and accuracy of several sensory and analytical detection methods have been investigated, but they do not meet the required performance characteristics 20 , 55 . The results of the current study and of similar studies using REIMS are relevant to meat industries seeking methods that are accurate, sensitive, low-cost and time-efficient, i.e., that address prominent analytical challenges. Even more so, as REIMS-based molecular fingerprinting and ML-based classification can be used to elucidate a wide range of biological characteristics, REIMS is expected to become a highly useful tool for meat science and the meat industry as a whole 43 . In the pig industry, REIMS fingerprinting and ML-based classification tools could be rolled out for the identification of the genetic origin of labelled meat products and the at-line detection of tainted meat.
From a more practical point of view, we experienced that the installation and at-line implementation of REIMS, being highly expensive and sensitive lab equipment, in a slaughterhouse is challenging, yet feasible. The iKnife or fat probe is connected to the Xevo G2-XS Q-TOF instrument by means of 4 m long tubing, with the instrument itself placed in a separate room with controlled humidity and temperature next to the slaughter line. Purchase and installation involve a high investment cost for abattoirs, but considering the high number of pigs slaughtered each (work)day, costs per analysis were estimated to remain below 1 Euro per carcass 14 . Moreover, if the REIMS instrument is used for more than one application, costs per analysis will decrease even further. Besides breed and boar taint classification, REIMS-based metabolomic fingerprinting may also be used for the detection of other pre-slaughter factors or meat product characteristics 31 , e.g. screening for the use of growth promoters 2 or adulteration with bulking agents 26 , further contributing to the overall economic feasibility. Nevertheless, as was discussed in more depth by De Graeve, et al. 30 , the implementation of real-time classification using REIMS in an industrial setting in the (near) future is mostly hardware and software dependent. The speed of the analysis itself remains unparalleled, but what would additionally be required is sufficient computing power, automation of the data processing pipeline and integration of several chemometric modelling options into the REIMS-compatible vendor-specific or independent modelling software 30 . As such, REIMS can be embedded in production and traceability systems that link the physical and digital attributes of food products across the supply chain, playing an important role in the verification of food safety, authenticity, and quality.

5. Conclusion

Integration of REIMS-based metabolomics with multivariate data analysis or machine-learning algorithms demonstrated remarkable potential for post-slaughter modelling of pig breed. OPLS-DA, SVM, and RF yielded high prediction accuracies, although OPLS-DA and SVM outperformed RF. Reduction of genetic variability within categories may positively influence classification accuracy, although factors such as feeding, stress, and post-mortem processes also contribute to metabolic heterogeneity. While REIMS analysis also showed promise for accurately distinguishing tainted from untainted boar carcasses using OPLS-DA, SVM and RF, the current work underscores the importance of balanced modelling and of incorporating extensive chemical and sensory boar taint analysis during model training, in order to minimize false negative classification. Despite the foreseeable challenges regarding installation and implementation of REIMS in slaughterhouses, the economic feasibility of REIMS analysis remains promising, especially considering its numerous potential applications beyond breed and boar taint classification. The successful integration of real-time classification using REIMS in an industrial setting depends on the selection of a fit-for-purpose multivariate model, reliable model training, and other factors such as hardware and software compatibility and data processing automation.

Declarations

AUTHOR INFORMATION
These authors contributed equally: Lynn Vanhaecke, Lieselot Y. Hemeryck

Authors and Affiliations
Institute for Global Food Security, School of Biological Sciences National Measurement Laboratory: Centre of Excellence in Agriculture and Food Integrity, Queen’s University Belfast, 19 Chlorine Gardens, Belfast BT9 5DL, UK
V. Gkarane, C. Elliott, N. Birse, L. Vanhaecke
Laboratory of Integrative Metabolomics, Ghent University, Salisburylaan 133, BE-9820 Merelbeke, Belgium
M. De Graeve, A.I. Decloedt, P. Vangeenderhuysen, L.Y. Hemeryck, L.
Vanhaecke
Institute for Biomedicine, EURAC Research, Via A.-Volta 21, I-39100 Bolzano, Italy
M. De Graeve
Cranswick Country Foods, Preston, Hull, HU12 8TB, UK
C. Stephens
Waters Corporation, Milford, MA, USA
J. Balog
The International Joint Research Center on Food Security, 113 Thailand Science Park, Phahonyothin Road, Pathum Thani 12120, Thailand
C. Elliott
Scientific Operations, Waters Corporation, Wilmslow, UK
S.L. Stead

Contributions
VG: Formal Analysis, Investigation, Data Curation, Writing - Original draft, Visualization, Project administration.
MDG: Software, Validation, Formal Analysis, Data Curation, Writing - Original draft, Visualization.
CS: Conceptualization, Resources, Supervision, Project administration, Funding Acquisition.
AID: Conceptualization, Methodology, Validation, Formal Analysis, Investigation, Writing - Review & Editing, Project administration.
PV: Data curation, Writing - Review & Editing, Visualization.
JB: Methodology, Formal Analysis, Resources.
CE: Conceptualization, Writing - Review & Editing, Supervision, Funding Acquisition.
SLS: Methodology, Formal Analysis, Resources.
NB: Conceptualization, Methodology, Software, Validation, Data Curation, Writing - Original draft, Visualization.
LYH: Conceptualization, Methodology, Validation, Formal Analysis, Investigation, Writing - Original Draft, Writing - Review & Editing, Supervision, Project administration.
LV: Conceptualization, Writing - Review & Editing, Supervision, Project administration, Funding Acquisition.
LYH and LV contributed equally to this work; i.e. shared senior authors.
Corresponding author
Correspondence to Lynn Vanhaecke: [email protected]

ACKNOWLEDGEMENTS
The authors would like to thank the staff of the Ballymena and Preston slaughterhouses (Cranswick Country Foods, UK), Sus Campinae Westerlo (Belgium) and Exportslachthuis Tielt (Belgium) for helping accommodate the collection of samples, and Exportslachthuis Tielt (Belgium) more specifically also for enabling the in-situ installation of the REIMS instrument. Thank you also to Dirk Stockx, Beata Pomian, Mieke Naessens, Joke Goedgebuer, Margot De Spiegeleer and Steve Huysman at the Laboratory of Integrative Metabolomics (LIMET) for their technical assistance.

FUNDING DECLARATION
The pig breed study was partially funded by the Knowledge Transfer Partnership (KTP) Agreement (Application ID: 1025930, Cranswick Foods & Queen’s University Belfast), whereas the boar taint work was funded by the Vlaamse Overheid - Departement LNE - Dienst Dierenwelzijn [LNE/STG/DWZ/16/11]. The 3D-printed ‘fat probe’ was an in-kind contribution by the Budapest Waters Research Centre (Hungary). L.Y. Hemeryck is an FWO (Research Foundation - Flanders) postdoctoral fellow [1297623N]. The funders played no role in study design, data collection, analysis and interpretation of the data, or the writing of this manuscript.

ETHICS DECLARATIONS
Competing interests
JB and SLS are employed by Waters. The other authors declare no competing interests.
Ethics approval
All experiments were performed in accordance with relevant guidelines and regulations. No experiments were performed on live animals; all pigs were slaughtered for commercial purposes prior to this study. Ethics approval was thus not required.

References
1. Hassoun, A. et al. Fraud in animal origin food products: Advances in emerging spectroscopic detection methods over the past five years. Foods 9, 1069 (2020).
2. Barlow, R. S. et al.
Rapid evaporative ionization mass spectrometry: a review on its application to the red meat industry with an Australian context. Metabolites 11, 171 (2021).
3. Shackell, G. H. & Dodds, K. G. in Meat Biotechnology (ed. Toldrá F.) 61–88 (Springer New York, 2008).
4. Lee, S. et al. The influence of pork quality traits and muscle fiber characteristics on the eating quality of pork from various breeds. Meat Sci. 90, 284–291 (2012).
5. Wilkinson, S. et al. Development of a genetic tool for product regulation in the diverse British pig breed market. BMC Genomics 13, 1–12 (2012).
6. Fontanesi, L. in Lawrie's Meat Science (Eighth Edition) (ed. Toldrá F.) 585–633 (Woodhead Publishing, 2017).
7. Girish, P. & Barbuddhe, S. in Meat Quality Analysis 153–170 (Elsevier, 2019).
8. Ramos, A. M., Megens, H., Crooijmans, R. P. M. A., Schook, L. B. & Groenen, M. A. M. Identification of high utility SNPs for population assignment and traceability purposes in the pig using high-throughput sequencing. Anim. Genet. 6, 613–620 (2011).
9. Vignal, A., Milan, D., SanCristobal, M. & Eggen, A. A review on SNP and other types of molecular markers and their use in animal genetics. Genet. Sel. Evol. 34, 275–305 (2002).
10. Montowska, M. & Pospiech, E. Is authentication of regional and traditional food made of meat possible? Crit. Rev. Food Sci. Nutr. 52, 475–487 (2012).
11. Arulandhu, A. J. et al. Development and validation of a multi-locus DNA metabarcoding method to identify endangered species in complex samples. GigaScience 6, gix080 (2017).
12. Barcaccia, G., Lucchin, M. & Cassandro, M. DNA barcoding as a molecular tool to track down mislabeling and food piracy. Divers. 8, 2 (2015).
13. Aluwé, M. et al. Exploratory survey on European consumer and stakeholder attitudes towards alternatives for surgical castration of piglets. Animals 10, 1758 (2020).
14. Verplanken, K. et al. Rapid evaporative ionization mass spectrometry for high-throughput screening in food analysis: The case of boar taint. Talanta 169, 30–36 (2017).
15. Verplanken, K.
et al. Rapid method for the simultaneous detection of boar taint compounds by means of solid phase microextraction coupled to gas chromatography/mass spectrometry. J. Chromatogr. A 1462, 124–133 (2016).
16. Sørensen, K. M. & Engelsen, S. B. Measurement of boar taint in porcine fat using a high-throughput gas chromatography–mass spectrometry protocol. J. Agric. Food Chem. 62, 9420–9427 (2014).
17. Sørensen, K. M., Westley, C., Goodacre, R. & Engelsen, S. B. Simultaneous quantification of the boar-taint compounds skatole and androstenone by surface-enhanced Raman scattering (SERS) and multivariate data analysis. Anal. Bioanal. Chem. 407, 7787–7795 (2015).
18. Liu, X., Schmidt, H. & Mörlein, D. Feasibility of boar taint classification using a portable Raman device. Meat Sci. 116, 133–139 (2016).
19. Bekaert, K. et al. Evaluation of different heating methods for the detection of boar taint by means of the human nose. Meat Sci. 94, 125–132 (2013).
20. Haugen, J.-E., Brunius, C. & Zamaratskaia, G. Review of analytical methods to measure boar taint compounds in porcine adipose tissue: The need for harmonised methods. Meat Sci. 90, 9–19 (2012).
21. Böhme, K., Calo-Mata, P., Barros-Velázquez, J. & Ortea, I. Recent applications of omics-based technologies to main topics in food authentication. TrAC-Trends Anal. Chem. 110, 221–232 (2019).
22. Hong, Y. et al. Data fusion and multivariate analysis for food authenticity analysis. Nat. Commun. 14, 3309 (2023).
23. Hu, C. & Xu, G. Mass-spectrometry-based metabolomics analysis for foodomics. TrAC-Trends Anal. Chem. 52, 36–46 (2013).
24. Black, C. et al. A real time metabolomic profiling approach to detecting fish fraud using rapid evaporative ionisation mass spectrometry. Metabolomics 13, 1–13 (2017).
25. Liebal, U. W., Phan, A. N., Sudhakar, M., Raman, K. & Blank, L. M. Machine learning applications for mass spectrometry-based metabolomics. Metabolites 10, 243 (2020).
26. Kosek, V. et al.
Ambient mass spectrometry based on REIMS for the rapid detection of adulteration of minced meats by the use of a range of additives. Food Control 104, 50–56 (2019).
27. Plekhova, V. et al. Rapid ex vivo molecular fingerprinting of biofluids using laser-assisted rapid evaporative ionization mass spectrometry. Nat. Protoc. 16, 4327–4354 (2021).
28. Black, C. et al. Rapid detection and specific identification of offals within minced beef samples utilising ambient mass spectrometry. Sci. Rep. 9, 6295 (2019).
29. Gredell, D. A. et al. Comparison of machine learning algorithms for predictive modeling of beef attributes using rapid evaporative ionization mass spectrometry (REIMS) data. Sci. Rep. 9, 5721 (2019).
30. De Graeve, M. et al. Multivariate versus machine learning-based classification of rapid evaporative ionisation mass spectrometry spectra towards industry based large-scale fish speciation. Food Chem. 404, 134632 (2023).
31. Ross, A. et al. Making complex measurements of meat composition fast: Application of rapid evaporative ionisation mass spectrometry to measuring meat quality and fraud. Meat Sci. 181, 108333 (2021).
32. Balog, J. et al. Identification of the species of origin for meat products by rapid evaporative ionization mass spectrometry. J. Agric. Food Chem. 64, 4793–4800 (2016).
33. Birse, N. et al. Ambient mass spectrometry as a tool to determine poultry production system history: A comparison of rapid evaporative ionisation mass spectrometry (REIMS) and direct analysis in real time (DART) ambient mass spectrometry platforms. Food Control 123, 107740 (2021).
34. Poklukar, K., Čandek-Potokar, M., Batorek Lukač, N., Tomažin, U. & Škrlep, M. Lipid deposition and metabolism in local and modern pig breeds: A review. Animals 10, 424 (2020).
35. Bekaert, K. et al. A validated ultra-high performance liquid chromatography coupled to high resolution mass spectrometry analysis for the simultaneous quantification of the three known boar taint compounds. J. Chromatogr. A 1239, 49–55 (2012).
36. Loomas, K. R. et al. Evaluation of rapid evaporative ionization mass spectrometry (REIMS) for the prediction of slice shear force and quality grades in beef longissimus lumborum steaks. Meat Sci. 222, 109752 (2025).
37. Zhang, H. et al. Discrimination of dried sea cucumber (Apostichopus japonicus) products from different geographical origins by sequential windowed acquisition of all theoretical fragment ion mass spectra (SWATH-MS)-based proteomic analysis and chemometrics. Food Chem. 274, 592–602 (2019).
38. Song, G. et al. In situ and real-time authentication of Thunnus species by iKnife rapid evaporative ionization mass spectrometry based lipidomics without sample pretreatment. Food Chem. 318, 126504 (2020).
39. He, Q. et al. Differentiation between fresh and frozen–thawed meat using rapid evaporative ionization mass spectrometry: the case of beef muscle. J. Agric. Food Chem. 69, 5709–5724 (2021).
40. Wang, J. et al. Liquid chromatography quadrupole time-of-flight mass spectrometry and rapid evaporative ionization mass spectrometry were used to develop a lamb authentication method: A preliminary study. Foods 9, 1723 (2020).
41. Zhang, R., Ross, A. B., Yoo, M. J. & Farouk, M. M. Metabolic fingerprinting of in-bag dry-and wet-aged lamb with rapid evaporative ionisation mass spectroscopy. Food Chem. 347, 128999 (2021).
42. Menze, B. H. et al. A comparison of random forest and its Gini importance with standard chemometric methods for the feature selection and classification of spectral data. BMC Bioinformatics 10, 213 (2009).
43. Zhang, R., Realini, C. E., Middlewood, P., Pavan, E. & Ross, A. B. Metabolic fingerprinting using rapid evaporative ionisation mass spectrometry can discriminate meat quality and composition of lambs from different sexes, breeds and forage systems. Food Chem. 386, 132758 (2022).
44. Penning, B. W., Snelling, W. M. & Woodward-Greene, M. J. Machine learning in the assessment of meat quality. IT Prof. 22, 39–41 (2020).
45. Wolpert, D. H.
The lack of a priori distinctions between learning algorithms. Neural Comput. 8, 1341–1390 (1996).
46. Muroya, S., Ueda, S., Komatsu, T., Miyakawa, T. & Ertbjerg, P. MEATabolomics: Muscle and meat metabolomics in domestic animals. Metabolites 10, 188 (2020).
47. Klont, R. E., Kurt, E., Heres, L. & Urlings, B. Production of entire males-challenges and opportunities. Fleischwirtschaft 90, 107–109 (2010).
48. Mörlein, D. et al. An overlooked compound contributing to boar taint and consumer rejection of meat products: 2-Aminoacetophenone. Meat Sci. 213, 109497 (2024).
49. Muñoz, M. et al. Development of a 64 SNV panel for breed authentication in Iberian pigs and their derived meat products. Meat Sci. 167, 108152 (2020).
50. Wagner, L., Kaufmann, M., Lange, F., Dallmann, A. & Bergmann, M. Differentiation of pork (Sus scrofa domesticus) and wild boar (Sus scrofa) meat using 1H NMR spectroscopy and MALDI-ToF mass spectrometry. Eur. Food Res. Technol. 251, 747–766 (2025).
51. Ramiro, J. L., Neo, A. G., Pérez-Palacios, T., Antequera, T. & Marcos, C. F. Machine learning-enabled fatty acid quantification and classification of pork from autochthonous breeds using low-field 1H NMR spectroscopic data. Food Control 166, 110753 (2024).
52. Ramiro, J. L. et al. Classification of raw cuts from Iberian and Celta pigs based on lipid analysis and chemometrics. J. Food Compos. Anal. 130, 106173 (2024).
53. Liu, H. et al. Metabolomics analysis provides novel insights into the difference in meat quality between different pig breeds. Foods 12, 3476 (2023).
54. del Moral, F. G. et al. Duroc and Iberian pork neural network classification by visible and near infrared reflectance spectroscopy. J. Food Eng. 90, 540–547 (2009).
55. Font-i-Furnols, M. et al. Feasibility of on/at line methods to determine boar taint and boar taint compounds: an overview. Animals 10, 1886 (2020).

Additional Declarations
Competing interest reported.
JB and SLS are employed by Waters.

Supplementary Files
Gkaraneetal.2025SupplMaterial.docx

Version 1 posted
Editorial decision: Revision requested 23 Oct, 2025
Reviews received at journal 23 Oct, 2025
Reviews received at journal 16 Oct, 2025
Reviewers agreed at journal 02 Oct, 2025
Reviewers agreed at journal 02 Oct, 2025
Reviewers invited by journal 02 Oct, 2025
Editor assigned by journal 27 Sep, 2025
Submission checks completed at journal 16 Sep, 2025
First submitted to journal 04 Sep, 2025
18:49:10","extension":"html","order_by":13,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":142318,"visible":true,"origin":"","legend":"","description":"","filename":"earlyproof.html","url":"https://assets-eu.researchsquare.com/files/rs-7537051/v1/de127f0c4625ab9f3e279717.html"},{"id":93713395,"identity":"fb695a0d-e6f6-4dc7-9d7f-205d4374a890","added_by":"auto","created_at":"2025-10-16 18:49:09","extension":"jpeg","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":205450,"visible":true,"origin":"","legend":"\u003cp\u003eSchematic overview of the real-time, at-line application of REIMS.\u003c/p\u003e","description":"","filename":"floatimage1.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-7537051/v1/6ae7105d540fb85e26769ad5.jpeg"},{"id":93713392,"identity":"d32d6853-271b-466d-964e-874e90869bb2","added_by":"auto","created_at":"2025-10-16 18:49:09","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":101088,"visible":true,"origin":"","legend":"\u003cp\u003eConfusion matrices for SVM models of the three pairwise comparisons; a: COMs vs. HMPs, b: HMPs vs. L-Ws, c: COM vs. L-Ws of the overall breed classification dataset (n=1046; COMs=346, HMPs=341, L-Ws=359).\u003c/p\u003e","description":"","filename":"floatimage2.png","url":"https://assets-eu.researchsquare.com/files/rs-7537051/v1/df840715b5db5b2ba04d693a.png"},{"id":93713398,"identity":"bc06cd4d-0ad5-4fe3-bd4c-f1b93ec70860","added_by":"auto","created_at":"2025-10-16 18:49:10","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":96703,"visible":true,"origin":"","legend":"\u003cp\u003eConfusion matrices for SVM models of the three pairwise comparisons; a: Durocs vs. HMPs, b: HMPs vs. L-Ws, c: Durocs vs. 
L-Ws in the breed classification subset with a reduced genetic variability (n=600; Durocs=200, HMPs=200, L-Ws=200).\u003c/p\u003e","description":"","filename":"floatimage3.png","url":"https://assets-eu.researchsquare.com/files/rs-7537051/v1/8f345a38ca0e9ca9cd5da5f5.png"},{"id":93714368,"identity":"efd3d327-164d-4adc-b768-a9219d743e5f","added_by":"auto","created_at":"2025-10-16 19:05:09","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":133913,"visible":true,"origin":"","legend":"\u003cp\u003eTest set accuracy and F1-scores (%) for OPLS-DA, SVM and RF modelling of boar taint; a: Phase I overall, b: Phase I balanced, c: Phase II overall, d: Phase II balanced.\u003c/p\u003e","description":"","filename":"floatimage4.png","url":"https://assets-eu.researchsquare.com/files/rs-7537051/v1/5620d0359aa7c3ea48995abc.png"},{"id":100069427,"identity":"ca5b44d8-99f8-49cc-bfd6-607b25f21459","added_by":"auto","created_at":"2026-01-12 16:13:55","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1519367,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7537051/v1/7162e412-1e86-4832-b93d-d641bfeeeb92.pdf"},{"id":93714109,"identity":"e0583f6b-4e77-42cc-ab6b-ae85da89dadb","added_by":"auto","created_at":"2025-10-16 18:57:09","extension":"docx","order_by":0,"title":"","display":"","copyAsset":false,"role":"supplement","size":232294,"visible":true,"origin":"","legend":"","description":"","filename":"Gkaraneetal.2025SupplMaterial.docx","url":"https://assets-eu.researchsquare.com/files/rs-7537051/v1/55b7d1d2816b78b1d9c22d58.docx"}],"financialInterests":"Competing interest reported. JB and SLS are employed by Waters","formattedTitle":"Towards Real-Time Industry-Proof Pork Breed and Boar Taint Classification using Rapid Evaporative Ionisation Mass Spectrometry (REIMS)","fulltext":[{"header":"1. 
Introduction","content":"\u003cp\u003eConsumers have high expectations regarding the authenticity and organoleptic quality of meat products. Several practices may sabotage traceability and promote food fraud however, including provenance masking, species substitution or adulteration\u003csup\u003e\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e\u003c/sup\u003e. Therefore, to combat food fraud and meet consumers expectations, the meat industry is required to install reliable quality-monitoring and traceability systems, consisting of a labelling system that associates each production animal with the final meat product\u003csup\u003e\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e,\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003cp\u003ePig breed is one of the most important pre-slaughter factors to consider regarding pork meat quality traits (including eating quality) and muscle fibre characteristics (including e.g. intramuscular fat)\u003csup\u003e\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e\u003c/sup\u003e. In this context, UK meat traders have secured premium prices for pork products from traditional British livestock breeds like e.g. Duroc\u003csup\u003e\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e\u003c/sup\u003e. Meat producers are however challenged to prove breed authenticity for retail produce\u003csup\u003e\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e\u003c/sup\u003e, and therefore real-time breed identification would be of great benefit. 
An increasing number of countries are using livestock traceability control systems to ensure the acceptability of their products on the international market\u003csup\u003e\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e,\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e\u003c/sup\u003e. Until now, DNA-based technologies like microsatellite genotyping and single nucleotide polymorphisms (SNPs) have been applied to verify traceability of animal origin. This however requires a genotypic data repository of many different breeds for comparison (in the case of microsatellites)\u003csup\u003e\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e\u003c/sup\u003e, or a large number of loci (in the case of SNPs), resulting in high costs\u003csup\u003e\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e,\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e\u003c/sup\u003e. DNA barcoding, based on analysis of genes responsible for pigmentation undergoing mutation in individual breeds is deemed another promising technology\u003csup\u003e\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e\u003c/sup\u003e, and has been applied for species identification\u003csup\u003e\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e,\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e\u003c/sup\u003e, but appears unsatisfactory for breed identification\u003csup\u003e\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003cp\u003eAnother important pre-slaughter factor for the organoleptic quality of pork is the presence of boar taint. Boar taint is an off-odour that may occur in entire males, and which has long plagued the industry. 
The increased demand to improve the sensory quality of meat from entire males has underlined the already existing need for real-time screening and sorting of carcasses with or without boar taint at the slaughter line\u003csup\u003e\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e,\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e\u003c/sup\u003e. In this context, several analytical methods such as GC-MS and Raman spectroscopy have been proposed, but these lack sensitivity and specificity and are not high-throughput\u003csup\u003e\u003cspan additionalcitationids=\"CR16 CR17\" citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e\u003c/sup\u003e. Sensory methods (in which the smell is assessed by a trained expert) allow fast and holistic boar taint assessment, but are hampered by interindividual variation, habituation and fatigue\u003csup\u003e\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e\u003c/sup\u003e. As such, at present, there is still no method that meets all requirements for real-time at-line boar taint detection, for which the high rate at which pigs are slaughtered remains the biggest challenge\u003csup\u003e\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003cp\u003eThe application of metabolomics for food authentication and quality assessment has grown strongly over the past 15 years, with Mass Spectrometry (MS) being one of the most used platforms\u003csup\u003e\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e,\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e\u003c/sup\u003e.
More specifically, MS-based metabolomics has been applied successfully to address authenticity, functionality, quality and safety of raw, semi-processed and finished food products\u003csup\u003e\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e\u003c/sup\u003e. The advancement of ambient MS-based metabolomics, like e.g. Rapid Evaporative Ionisation MS (REIMS), is of specific interest in this context, due to the possibility of high-throughput, real-time classification, with high sensitivity and selectivity\u003csup\u003e\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e,\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e\u003c/sup\u003e. The technique relies on the desorption and ionisation of molecules by creating an aerosol from a biological sample, using an electrosurgical device, bipolar forceps or laser\u003csup\u003e\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e,\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e,\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e\u003c/sup\u003e. The combination with a Q-ToF mass analyser provides the spectral information required, which, through chemometric analysis (e.g., Principal Component Analysis (PCA) and Orthogonal Partial Least-Squares Discriminant Analysis (OPLS-DA)) creates statistical models for group evaluation and separation\u003csup\u003e\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e\u003c/sup\u003e. Alternatively, Machine Learning (ML) classification algorithms (e.g. Support Vector Machines (SVM), Random Forests (RF)) may be applied to process big heterogeneous datasets and generate predictive classification models\u003csup\u003e\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e,\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e\u003c/sup\u003e. 
In previous work, REIMS has been used in conjunction with multivariate classification models or ML-based algorithms to e.g. assess meat quality characteristics\u003csup\u003e\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e\u003c/sup\u003e, breed in beef\u003csup\u003e\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e\u003c/sup\u003e, production system in chicken meat products\u003csup\u003e\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e\u003c/sup\u003e and fish fraud\u003csup\u003e\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e\u003c/sup\u003e with high accuracy. REIMS has moreover already been applied successfully to model boar taint status in a small-scale laboratory setting\u003csup\u003e\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003cp\u003eThe real-time, at-line sampling and analysis of pork characteristics by means of REIMS offers significant potential for monitoring and decision making (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e). Ideally, assessment of as many as possible parameters in one single analysis is made, moreover preferentially in an economically less relevant part of the carcass. 
For breed and boar taint specifically, sampling of neck fat is a valid option since differences in breed are reflected in adipose composition\u003csup\u003e\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e\u003c/sup\u003e, and boar taint compounds are known to accumulate in adipose tissue\u003csup\u003e\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eIn the current study, we hypothesized that REIMS can be used to generate a metabolomic fingerprint-based classifier for the accurate, real-time prediction of pork breed and boar taint status. The first objective of this study was therefore to test the accuracy of REIMS for pig breed classification based on metabolomic fingerprinting of neck fat samples, using either multivariate data analysis (OPLS-DA) or machine learning algorithms (SVM, RF). This was tested for two datasets: (1) with the aim of differentiating three pork categories: i.e., Commercials (COMs, n\u0026thinsp;=\u0026thinsp;346), Hampshires (HMPs, n\u0026thinsp;=\u0026thinsp;341) and Large-Whites (L-Ws, n\u0026thinsp;=\u0026thinsp;359); and (2) a subset of the first dataset to secure less genetic variability within each category, including Durocs (as subset of the COMs group), Hampshires (HMPs) and Large-Whites (L-Ws). The second objective of this study was to use and demonstrate the potential of the same proposed technology and methodology for the real-time, at-line screening and classification of tainted \u003cem\u003evs\u003c/em\u003e. untainted boar carcasses. The benefits of using either multivariate data analysis (OPLS-DA) or machine learning algorithms (SVM, RF) were tested in an off-line laboratory setting (n\u0026thinsp;=\u0026thinsp;2055) as well as online (n\u0026thinsp;=\u0026thinsp;554), i.e., in a slaughterhouse.\u003c/p\u003e"},{"header":"2. 
Material and methods","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e\u003ch2\u003e2.1. Collection of neck fat samples from different pig breeds\u003c/h2\u003e\u003cp\u003eA total of 1046 neck fat samples from pigs (finishing ages of 22\u0026ndash;24 weeks) were collected over 5 days in November 2020 from a large meat processing plant in the UK. The 1046 samples comprised 3 categories (as detailed in Supplementary Table \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e): 346 Commercials (COMs) crossbred pigs, 341 Hampshire (HMPs) crossbred pigs, and 359 Large-White (L-Ws) crossbred pigs. To minimise the impact of genetic variability within each category, a subset was made (as detailed in Supplementary Table S2), including 600 samples of the original dataset and only one genetic combination per breed/category, i.e., 200 COMs crossbred pigs, 200 HMPs crossbred pigs and 200 L-Ws crossbred pigs.\u003c/p\u003e\u003cp\u003eKilling of the pigs was not carried out by the authors but performed in line with the commercial slaughtering process, including stunning, bleeding, scalding, dehairing, singeing, eviscerating, washing and chilling. Samples were collected after rapid cooling (2\u0026thinsp;\u0026plusmn;\u0026thinsp;2\u0026deg;C), sealed in plastic bags and frozen in an air-blast freezer, before transfer to the ASSET Technology Centre (Institute for Global Food Security, Queen\u0026rsquo;s University Belfast, Northern Ireland). Samples were kept at -20\u0026deg;C until analysis, which took place within 8 days of receipt. Prior to analysis, samples were thawed for approximately one hour in a laminar flow safety cabinet and maintained at 2\u0026thinsp;\u0026plusmn;\u0026thinsp;2\u0026deg;C.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec4\" class=\"Section2\"\u003e\u003ch2\u003e2.2. Collection, sensory and chemical analysis of boar neck fat samples\u003c/h2\u003e\u003cdiv id=\"Sec5\" class=\"Section3\"\u003e\u003ch2\u003e2.2.1.
Phase I: off-line sample selection and analysis\u003c/h2\u003e\u003cdiv id=\"Sec6\" class=\"Section4\"\u003e\u003ch2\u003e2.2.1.1. Collection of samples and boar taint analysis\u003c/h2\u003e\u003cp\u003eA total of 2055 boar fat samples (from different batches, different farms, and sampled on different slaughter days as detailed in Supplementary Table S3) were collected at Exportslachthuis Tielt (Belgium) and Sus Campinae Westerlo (Belgium). The pigs were slaughtered commercially, undergoing carbon dioxide stunning, bleeding, scalding, dehairing, singeing, eviscerating, washing and chilling. The fat samples were cut at-line or in the slaughterhouse\u0026rsquo;s chilling rooms (2\u0026thinsp;\u0026plusmn;\u0026thinsp;2\u0026deg;C), sealed in plastic bags and transported to the LIMET laboratory (Belgium) in a cooler, after which samples were stored at -20\u0026deg;C.\u003c/p\u003e\u003cp\u003eSensory boar taint analysis was done using the soldering iron method\u003csup\u003e\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e\u003c/sup\u003e, performed by two independent experts who assigned a score from 0 to 4, with 0 = \u0026ldquo;no taint\u0026rdquo; and 4 = \u0026ldquo;strong taint\u0026rdquo;. If there was no consensus between the two experts, i.e., a difference in score\u0026thinsp;\u0026ge;\u0026thinsp;2, the sample was scored by a third expert to provide a definite answer as to the sensory \u0026ldquo;boar taint status\u0026rdquo; of that specific sample. Based on the average sensory score (and preliminary \u0026ldquo;boar taint status\u0026rdquo;), 554 samples were selected for chemical analysis, including 153 samples with an average score\u0026thinsp;\u0026lt;\u0026thinsp;1.5, and 208 samples with an average score\u0026thinsp;\u0026ge;\u0026thinsp;1.5. The experts consented to participate in the sensory research.
Ethical permission was not required.\u003c/p\u003e\u003cp\u003eChemical analysis entailed analysis of the three known boar taint compounds indole, skatole and androstenone in the fat using UHPLC-HRMS (Ultra High Performance Liquid Chromatography hyphenated to High Resolution Mass Spectrometry), as described previously\u003csup\u003e\u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec7\" class=\"Section4\"\u003e\u003ch2\u003e2.2.1.2. Selection of samples for boar taint modelling\u003c/h2\u003e\u003cp\u003eAs opposed to certain other traits, such as breed or type, boar taint status is less clear-cut, because cases range from very minor through mild to average to more severe. Therefore, to be able to create a reliable boar taint classification model, results of the sensory and chemical boar taint analyses were examined to make a stricter and unambiguous classification, i.e., samples were considered \u0026ldquo;tainted\u0026rdquo; if fat concentrations of indole, skatole and/or androstenone exceeded the limits of 100, 200 and/or 1000 ppb, respectively, regardless of the average sensory score. Only samples that were assigned an average sensory score of zero (exclusion from 0.25 onwards) and demonstrated indole, skatole and androstenone concentrations below 50, 100 and 500 ppb, respectively, were considered \u0026ldquo;untainted\u0026rdquo;. Samples for which the sensory and chemical analyses yielded contradictory boar taint statuses were excluded. This resulted in the selection of 1097 samples, with 93 tainted samples.\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv id=\"Sec8\" class=\"Section3\"\u003e\u003ch2\u003e2.2.2.
Phase II: online sample selection and analysis\u003c/h2\u003e\u003cp\u003eA total of 554 boar fat samples (detailed in Supplementary Table S3) were collected at Sus Campinae (Westerlo, Belgium), cut, stored, and transported as described previously (section \u003cspan refid=\"Sec6\" class=\"InternalRef\"\u003e2.2.1.1\u003c/span\u003e.). All samples underwent sensory boar taint analysis (cfr. section \u003cspan refid=\"Sec6\" class=\"InternalRef\"\u003e2.2.1.1\u003c/span\u003e.), and based on the mean sensory score, 183 samples were selected for chemical analysis (62 untainted samples\u0026thinsp;\u0026lt;\u0026thinsp;1.5 and 121 tainted samples\u0026thinsp;\u0026ge;\u0026thinsp;1.5). Using the sample selection criteria used in Phase I, the total number of tainted samples was deemed too low for successful modelling, and therefore, in Phase II, the selection of tainted samples was less strict, with the aim of increasing statistical power. Logistical constraints did not allow for more extensive sampling and inclusion of more samples. Tainted samples were selected based on indole, skatole and/or androstenone levels exceeding 100, 200 and/or 900 ppb respectively, regardless of the average sensory score. The untainted samples were selected based on an average sensory score of 0, without exceptions. Based on the results of the sensory and chemical analysis combined, a final selection of 65 tainted and 147 untainted samples was made (212 samples in total). Originally, for phase II, plans were made to sample and analyse boar neck fat samples at Exportslachthuis Tielt (Belgium). However, in the weeks following the installation of the REIMS instrument at Exportslachthuis Tielt (see below), no boars were slaughtered there. Hence, alternatively, samples were collected at Sus Campinae, yet analysed at Exportslachthuis Tielt.\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv id=\"Sec9\" class=\"Section2\"\u003e\u003ch2\u003e2.3. 
REIMS analysis\u003c/h2\u003e\u003cdiv id=\"Sec10\" class=\"Section3\"\u003e\u003ch2\u003e2.3.1. Sample analysis\u003c/h2\u003e\u003cp\u003eREIMS analysis of neck fat samples was performed using a Waters Xevo G2-XS QToF mass spectrometer and Waters REIMS ion-source (Waters, Wilmslow, UK). For both sections of the study, \u0026ldquo;burning\u0026rdquo; of fat samples was performed using a 3D-printed \u0026lsquo;fat probe\u0026rsquo; (Waters Research Centre, Budapest, Hungary) powered by an ERBE VIO 50C diathermy generator (Erbe Elektromedizin, Tubingen, Germany) that was designed for at-line application specifically (as demonstrated and shown in Supplementary Figures \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e and S2). Instrument settings are provided in Supplementary Table S4. Analysis of neck fat samples for pig breed classification was performed at the ASSET Technology Centre (under controlled laboratory circumstances), and the off-line analysis (Phase I) of 2055 boar fat samples was performed at LIMET (under controlled laboratory circumstances). The online analysis (Phase II) of 554 boar samples was performed in an \u003cem\u003ein situ\u003c/em\u003e air-conditioned laboratory adjacent to the slaughter line, after moving the LIMET REIMS instrument to Exportslachthuis Tielt (Belgium). The instrument was set up in a separate room, connected to the fat probe at the slaughter line through 4 m long tubing.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec11\" class=\"Section3\"\u003e\u003ch2\u003e2.3.2. 
System calibration, cleaning and quality control\u003c/h2\u003e\u003cp\u003eThe REIMS instrument underwent a detector setup process using a 0.1 ng/\u0026micro;L lockmass solution of leucine enkephalin (Waters, Milford, MA, USA) in 2-propanol (LC-MS grade; Honeywell Riedel-de Ha\u0026euml;n, Seelze, Germany) at a flow rate of 0.2 mL/min, and calibration using 5 mM sodium formate infusion (20 \u0026micro;L/min) at the start of each analysis day to correct for instrumental drift. The tip of the fat probe was cleaned every five samples by means of a water-soaked swab following the removal of charred remains. All REIMS parts (including the source, tubing and Venturi) were cleaned every 100 to 200 samples, whilst the instrument was vented and the StepWave was cleaned approximately every 500 samples.\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv id=\"Sec12\" class=\"Section2\"\u003e\u003ch2\u003e2.4. Data analysis\u003c/h2\u003e\u003cdiv id=\"Sec13\" class=\"Section3\"\u003e\u003ch2\u003e2.4.1. Data preprocessing\u003c/h2\u003e\u003cp\u003eData pre-processing, comprising data cleaning (checking for inconsistent data, removing duplicates), baseline correction, peak detection, alignment, normalisation and scaling, was performed in R (version 3.4.3, Vienna, Austria) in line with De Graeve et al.\u003csup\u003e30\u003c/sup\u003e (breed data) and Verplanken et al.\u003csup\u003e14\u003c/sup\u003e (boar taint data). Data were processed in a virtual computer environment running Linux (Ubuntu v16.04 LTS or v20.04 LTS) using Oracle VM VirtualBox (version 6.1, Oracle).
Code used for the preprocessing is available at \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://github.com/UGent-LIMET/Preprocessing_AIMS\u003c/span\u003e\u003cspan address=\"https://github.com/UGent-LIMET/Preprocessing_AIMS\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec14\" class=\"Section3\"\u003e\u003ch2\u003e2.4.2. Chemometric multivariate modelling using (O)PLS-DA\u003c/h2\u003e\u003cp\u003eMultivariate classification analysis was performed using SIMCA 17.0 (Sartorius Stedim Biotech, Umea, Sweden). PCA modelling was used to detect potential sample clustering, trends or outliers. The supervised OPLS-DA analysis was used to construct prediction models that predict the Y-variable (classification of samples in COMs, HMPs or L-Ws group) from the X matrix (the feature matrix with intensities of \u003cem\u003em/z\u003c/em\u003e peaks for each sample). SIMCA models were evaluated in terms of the seven-fold cross-validated cumulative modelled variation in the X matrix (R\u003csup\u003e2\u003c/sup\u003eX(cum)\u0026thinsp;\u0026gt;\u0026thinsp;0.5), the cumulative modelled variation in the Y-variable i.e., the goodness of fit to the original data (R\u003csup\u003e2\u003c/sup\u003eY(cum)\u0026thinsp;\u0026gt;\u0026thinsp;0.5), and the cumulative predictive ability of the models (Q\u003csup\u003e2\u003c/sup\u003e(cum)\u0026thinsp;\u0026gt;\u0026thinsp;0.5). Permutation testing (100 permutations) was performed to control the risk of spurious findings. Additionally, CV-ANOVA (cross-validated analysis of variance) was performed to validate the models, according to a 1/7 leave-out classification, with a statistical cut-off of p\u0026thinsp;\u0026lt;\u0026thinsp;0.05.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec15\" class=\"Section3\"\u003e\u003ch2\u003e2.4.3. 
Chemometric machine learning-based modelling using RF and SVM\u003c/h2\u003e\u003cp\u003eRF and SVM modelling were performed in Python (version 3.7.4, Fredericksburg, VA) using sklearn. More details, including the implemented programming languages and packages were published previously\u003csup\u003e\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e\u003c/sup\u003e. For both SVM and RF modelling, 75% of the original dataset was used as a training set after randomisation, followed by hyperparameter-optimisation with 5-fold cross-validation. Specifically, the training set was subsequently randomly split into five equal subgroups and the models were trained using four of the five subsets and validated on the remaining one part of the data, with computation of accuracy. The remaining 25% of the dataset (further referred to as the test or hold-out set) was used to test the ability of the best model (derived from the training set) to accurately classify the samples into their \u003cem\u003ea priori\u003c/em\u003e labelled classes. An overview of the tested set of parameters for hyperparameter optimisation of RF and SVM algorithms is provided in Supplementary Table S5.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec16\" class=\"Section3\"\u003e\u003ch2\u003e2.4.4. Model performance\u003c/h2\u003e\u003cp\u003eTo evaluate the performance and compare the predictive ability of each cross-validated model, the following indicators were calculated on the hold-out set: Accuracy, Precision, Recall, Specificity and F1-score. Accuracy refers to the ratio of correctly classified samples to the total number of classified samples and characterises the performance of the model: Accuracy = (TP\u0026thinsp;+\u0026thinsp;TN) / (TP\u0026thinsp;+\u0026thinsp;TN\u0026thinsp;+\u0026thinsp;FP\u0026thinsp;+\u0026thinsp;FN) with TP, TN, FP, and FN denoting true positives, true negatives, false positives, and false negatives, respectively. 
Precision indicates the ratio of true positives to the total number of predicted positive samples: Precision\u0026thinsp;=\u0026thinsp;TP / (TP\u0026thinsp;+\u0026thinsp;FP). Recall refers to the model\u0026rsquo;s ability to correctly classify relevant samples and was calculated as the fraction of actual positive samples correctly classified as positive: Recall\u0026thinsp;=\u0026thinsp;TP / (TP\u0026thinsp;+\u0026thinsp;FN). Specificity indicates the fraction of actual negative samples correctly classified as negative: Specificity\u0026thinsp;=\u0026thinsp;TN / (TN\u0026thinsp;+\u0026thinsp;FP). F1-score combines Precision and Recall as their harmonic mean: F1-score\u0026thinsp;=\u0026thinsp;2 \u0026times; (Precision \u0026times; Recall) / (Precision\u0026thinsp;+\u0026thinsp;Recall).\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e"},{"header":"3. Results","content":"\u003cp\u003eREIMS was used to generate metabolomic fingerprints of thousands of pig neck fat samples to assess its potential application for accurate classification of pig breed and boar taint status. To model the data, multivariate data analysis (OPLS-DA) and machine learning algorithms (SVM, RF) were implemented, followed by comparison and evaluation of model performance.\u003c/p\u003e\u003cdiv id=\"Sec18\" class=\"Section2\"\u003e\u003ch2\u003e3.1. Breed classification\u003c/h2\u003e\u003cp\u003ePCA was performed to assess inherent sample clustering or trends among categories in the breed classification dataset but did not reveal breed-related trends, indicating that breed was not the main driver of biological variance.
OPLS-DA, SVM and RF were used to generate pairwise comparisons among categories (i.e., COMs \u003cem\u003evs.\u003c/em\u003e HMPs, HMPs \u003cem\u003evs.\u003c/em\u003e L-Ws and COMs \u003cem\u003evs.\u003c/em\u003e L-Ws), and breed classification was also studied in a smaller subset, with reduced (impact of) genetic variability within each breed type.\u003c/p\u003e\u003cp\u003eResults of OPLS-DA models indicated good to excellent classification accuracy for all three comparisons (COMs \u003cem\u003evs\u003c/em\u003e. HMPs, HMPs \u003cem\u003evs\u003c/em\u003e. L-Ws and COMs \u003cem\u003evs\u003c/em\u003e. L-Ws), ranging from 91\u0026ndash;93% in the overall dataset (Table S6) and from 85\u0026ndash;95% in the subset (Table S7). In SVM, the linear kernel function was selected as most appropriate in all pairwise comparisons in the original dataset and the reduced variability subset. The results show that the accuracy of the classification models for all three comparisons (COMs \u003cem\u003evs.\u003c/em\u003e HMPs, HMPs \u003cem\u003evs.\u003c/em\u003e L-Ws and COMs \u003cem\u003evs.\u003c/em\u003e L-Ws) was excellent, ranging from 90\u0026ndash;95% in the overall dataset (Table S8) and from 91\u0026ndash;96% in the subset (Table S9). The optimised model hyperparameters (Kernel, C, Gamma) for SVM derived models are summarised in Supplementary Table S10. The classification matrix for both the overall dataset and reduced variability subset is provided in Supplementary Tables S8 and S9. RF models showed an acceptable performance, although not to the same extent as the ones using the SVM algorithm. Specifically, the results showed that the classification accuracy of the models for all three comparisons (COMs \u003cem\u003evs.\u003c/em\u003e HMPs, HMPs \u003cem\u003evs.\u003c/em\u003e L-Ws and COMs \u003cem\u003evs.\u003c/em\u003e L-Ws) was good, ranging from 72\u0026ndash;80% in the overall dataset (Table S12) and from 71\u0026ndash;87% in the subset (Table S13). 
The optimised parameters (n trees, Min samples leaf, Max leaf nodes, max Depth) for RF derived models are summarised in Supplementary Table S11, whilst classification matrices are provided in Supplementary Tables S12 and S13.\u003c/p\u003e\u003cp\u003eOverall, OPLS-DA and SVM outperformed RF modelling and performed comparably, although SVM performed slightly better than OPLS-DA. Both OPLS-DA and SVM enabled the discrimination of the three breeds, independent of the reduced genetic variability of the Commercials pork breeds, with reliable accuracies and a balanced ratio of false positives to false negatives (as detailed in the Supplementary Tables S6-S9, S12 and S13, also depicting the Precision, Recall, Specificity and F1-score). Figures\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e and \u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e depict the confusion matrices for SVM models of the three pairwise comparisons of the overall breed classification dataset and breed classification subset (with a reduced genetic variability), respectively.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec19\" class=\"Section2\"\u003e\u003ch2\u003e3.2. Boar taint classification\u003c/h2\u003e\u003cp\u003eOPLS-DA, SVM and RF models for boar taint classification were created and tested in line with the approach used for breed classification. First, PCA was performed, yet did not reveal any boar taint-related trends in the data (not shown). In the off-line Phase I (with fingerprinting of all samples under laboratory circumstances), a first, \u0026lsquo;overall\u0026rsquo; boar taint model was built modelling the REIMS-acquired metabolomic fingerprint of 93 tainted \u003cem\u003evs\u003c/em\u003e. 1004 untainted neck fat samples. 
As documented in Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e, boar taint classification accuracies of 94 to 95% could be reached using OPLS-DA, SVM and RF. A closer look at the correct classification rates revealed, however, that 99% of untainted samples were classified correctly, whereas only 38, 33 and 42% of tainted samples were classified correctly using OPLS-DA, SVM and RF, respectively. The imbalance between false positives and false negatives is documented in more detail in Supplementary Tables S14-S16 (incl. Precision, Recall, Specificity, F1-score) and is reflected in the F1-scores of tainted samples that can be viewed in Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eA. With the aim of reducing misclassification of tainted samples, a balanced model was constructed, including all 93 tainted samples and 100 untainted samples (with balancing prior to split; selecting samples with a sensory score of 0 and with indole, skatole and androstenone levels\u0026thinsp;\u0026lt;\u0026thinsp;LOD). This allowed classification accuracies from 94 to 96%, with a notable improvement in the classification of tainted samples as well. More specifically, correct classification of untainted samples was 100, 96 and 96%, whilst correct classification of tainted samples was 92, 92 and 96% using OPLS-DA, SVM and RF, respectively, resulting in a higher Precision and F1-score for the \u0026lsquo;tainted\u0026rsquo; group (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eB, Supplementary Tables S14-S16). 
Overall, the performance of OPLS-DA, SVM and RF models was highly comparable.\u003c/p\u003e\u003cp\u003eIn Phase II, in which the REIMS instrument was set up at the slaughterhouse, a first \u0026lsquo;overall\u0026rsquo; boar taint model was built modelling the REIMS-acquired metabolomic fingerprint of 65 tainted \u003cem\u003evs\u003c/em\u003e. 
147 untainted neck fat samples. Using OPLS-DA, SVM and RF, accuracies of 70 to 74% were achieved. In line with the overall Phase I model, classification of tainted samples was insufficient as only 35, 24 and 6% of tainted samples were classified correctly, respectively. Correct classification of untainted samples was 89, 97 and 100%. To try to improve correct classification, balanced models were built by omitting samples from the larger \u0026lsquo;untainted\u0026rsquo; group. For these models, accuracies between 61 and 70% were obtained. Correct classification of tainted samples could be improved, i.e., a 59, 64 and 65% correct classification of tainted samples was achieved for OPLS-DA, SVM and RF, with a 69, 63 and 64% correct classification of untainted samples, respectively. Taking into account accuracy, as well as the correct classification of both tainted and untainted samples specifically, SVM modelling outperformed RF and OPLS-DA modelling, although there is scope for improvement to further decrease false positives and false negatives while guarding the balance between error types. OPLS-DA, SVM and RF model (hyper)parameters, classification confusion matrices and metrics (Precision, Recall, Specificity and F1-score) can be consulted in Supplementary Tables S14 to S16.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003c/div\u003e"},{"header":"4. Discussion","content":"\u003cp\u003eA number of studies have applied MS-based metabolomics to address authenticity, functionality, quality and safety of raw, semi-processed and finished food products\u003csup\u003e\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e\u003c/sup\u003e. REIMS specifically can play an important role to verify the authenticity and quality of a food product and may be applied in traceability systems that link physical and digital attributes of products across the supply chain. 
The mass spectrometric fingerprints generated by REIMS require the application of multivariate classification models like e.g. OPLS-DA or ML-based algorithms to categorize meat samples according to production system and/or quality traits (e.g. breed, taint, species or tissue types)\u003csup\u003e\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e\u003c/sup\u003e. In the current work, firstly, REIMS was used to classify breed categories according to the metabolomic profile of pigs from three different types in the UK market, as well as from a subset of breeds, to test if a lower genetic diversity would positively influence classification. Secondly, REIMS was used to classify boar taint status of uncastrated male pigs, both in a laboratory and slaughterhouse setting, to test applicability in real-time and at-line. Three different multivariate classification algorithms were used for this purpose, i.e., OPLS-DA, SVM and RF, which were selected based on previous research, in which the use of OPLS-DA, SVM and RF resulted in high accuracies for e.g. fish speciation and evaluation of beef quality\u003csup\u003e\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e,\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e,\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003cdiv id=\"Sec21\" class=\"Section2\"\u003e\u003ch2\u003e4.1. Breed classification\u003c/h2\u003e\u003cp\u003eIt is well established that breed or genetic type influences lipid content and fatty acid profile in pigs; and thus, metabolomic differences among breeds may be expected. Results for breed category classification showed that the use of REIMS combined with multivariate data analysis or machine-learning algorithms (RF or SVM) delivered models with high prediction accuracy. 
Specifically, the conjunction of REIMS with OPLS-DA demonstrated sufficient performance and an adequate capability to separate the categories, either in the complete dataset (1046 samples) or the subset (600 samples). Indeed, in the majority of meat-metabolomic studies, the conjunction of REIMS with multivariate statistics produces reliable models with excellent predictive abilities\u003csup\u003e\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e,\u003cspan additionalcitationids=\"CR38 CR39\" citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e\u003c/sup\u003e, with few exceptions of low performance\u003csup\u003e\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e,\u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003cp\u003eWhen evaluating alternative classification algorithms across the two breed datasets, RF exhibited a slightly lower performance compared to the OPLS-DA and SVM models, evident through the reduced number of correctly classified animals, substantiated by a lower accuracy. Menze, et al.\u003csup\u003e42\u003c/sup\u003e found that RF excelled in feature selection, while PLS-DA outperformed RF in classification when analysing spectral data. The authors suggested a combined approach, using RF's Gini index for feature selection and PLS-DA for classification, to leverage the strengths of both methods. However, Zhang, et al.\u003csup\u003e43\u003c/sup\u003e put forward an ML-based approach (e.g. decision tree algorithm) for the comparison of lamb breeds among farms, as this would better account for the diversity in farm production systems and would reveal confounding factors (diet, slaughter age, gender) -possibly hindering the discovery of potential breed biomarkers. 
In this case, non-linear ML techniques, such as RFs, may be more suitable for modelling metabolomics data, especially when dealing with the complex and highly diverse meat metabolome profile.\u003c/p\u003e\u003cp\u003eThe difference in performance between the two ML algorithms was also noted in this study. Overall, SVM provided better results in predicting the breed line compared to RF, as recorded through the accuracy (%) and F1-scores. However, the outperforming SVMs here were linear ML techniques, as the linear kernel was selected after 5-fold cross-validated hyperparameter optimisation for each breed model, indicating linear separability within this dataset. Our results confirm those of Gredell, et al.\u003csup\u003e29\u003c/sup\u003e, who applied REIMS in conjunction with different ML techniques to successfully classify beef quality attributes (including production background and breed); i.e., that the predictive accuracy of the different ML algorithms varied according to the classification problem, highlighting the need for a unique, fit-for-purpose approach. Depending on the classification problem, LDA, SVM (with a linear or radial kernel) or XGBoost emerged as the best performing algorithm in their results. In line with this, Penning, Snelling and Woodward-Greene\u003csup\u003e\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e\u003c/sup\u003e, who successfully applied REIMS in combination with 8 different ML algorithms to assess beef carcass quality traits, also noted that the best performing algorithm was dependent on the trait that was being measured, to such degree that each beef trait required a different algorithm to perform the best predictions on measured data. This was also further confirmed by Loomas, et al.\u003csup\u003e36\u003c/sup\u003e, who found that different ML algorithms performed better (SVM (radial), XGBoost and RF), depending on the dataset and the meat quality attribute being predicted. 
Our and previous findings align with the \u0026ldquo;No Free Lunch Theorem\u0026rdquo;\u003csup\u003e45\u003c/sup\u003e, according to which there is no algorithm that can achieve optimal performance for each problem it encounters each time.\u003c/p\u003e\u003cp\u003eInterestingly, using the RF algorithms, the models derived for the breed subset performed slightly better than the models of the complete dataset, as expressed through the accuracy and F1-score values, for two comparisons (Durocs \u003cem\u003evs.\u003c/em\u003e HMPs and Durocs \u003cem\u003evs.\u003c/em\u003e L-Ws) especially. This may imply that the decreased level of genetic variability within each category positively influences phenotype expression and metabolomic fingerprinting and thus, the accuracy of breed classification. Apart from the genetic variability, the heterogeneity of the metabolic profile of fat and muscle tissues may be related to feeding, stress and post-mortem processes\u003csup\u003e\u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e46\u003c/span\u003e\u003c/sup\u003e. These factors can affect the way genes are expressed through processes like gene transcription and translation, and how proteins are modified, ultimately influencing the animal's physical traits and characteristics or phenotype.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec22\" class=\"Section2\"\u003e\u003ch2\u003e4.2. Boar taint classification\u003c/h2\u003e\u003cp\u003eIn previous work, we demonstrated that the metabolomic fingerprinting of neck fat obtained by means of REIMS analysis can be used for accurate boar taint classification using OPLS-DA modelling in a small-scale laboratory setting\u003csup\u003e\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e\u003c/sup\u003e. In the current study, modalities for real-time, at-line screening and classification of tainted \u003cem\u003evs\u003c/em\u003e. 
untainted boar carcasses were assessed, i.e., firstly in an off-line, yet large-scale laboratory setting (phase I), followed up by testing in an online slaughterhouse setting (phase II). In addition, we aimed to assess the use of SVM and RF (compared to OPLS-DA) modelling for boar taint classification.\u003c/p\u003e\u003cp\u003eIn phase I, the total number of collected tainted samples was 93, being 8% of the dataset. A non-balanced model including all samples resulted in unacceptable misclassification of tainted samples (less than 50% correctly classified, independent of algorithm used). However, by establishing a balanced model, this could be improved substantially; with classification accuracies\u0026thinsp;\u0026gt;\u0026thinsp;93% and correct classification of both tainted and untainted samples\u0026thinsp;\u0026gt;\u0026thinsp;92% (accompanying F1-scores of \u0026gt;\u0026thinsp;94%). Apparently, the adverse effect of the reduction of statistical power due to the smaller sample size after balancing (i.e., by removing the excess of untainted samples) was negligible compared to the penalty by misclassification of more than 50% of tainted samples in the unbalanced model. Therefore, alternatively, to ensure penalisation, the Precision or F1-score of the tainted samples could be applied as evaluation metric during training (instead of accuracy).\u003c/p\u003e\u003cp\u003eIn phase II, the same strategy was applied and again, balanced models outperformed the overall unbalanced models, with improved classification of both the tainted and untainted samples. Accuracies for the phase II models were considerably lower compared to the phase I models however, and we hypothesize that this may be due to mislabelling during model training. In previous work, Verplanken, et al.\u003csup\u003e14\u003c/sup\u003e implemented REIMS-based boar taint classification using the iKnife and reached a classification accuracy of 100% (OPLS-DA). 
The setup was very similar to the work described here but on a much smaller scale, and all samples were subjected to both chemical and sensory boar taint analysis prior to REIMS analysis and modelling\u003csup\u003e\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e\u003c/sup\u003e. Due to the very large number of samples in the current study, it was not feasible to subject all samples to chemical analysis, and thus a selection of (potentially) tainted samples was made based on sensory analysis results only. This implies that all final selected \u0026lsquo;tainted\u0026rsquo; samples demonstrated indole, skatole and/or androstenone levels of more than 100, 200 and 900 ppb, but indole, skatole and androstenone levels were in fact unknown for most \u0026lsquo;untainted\u0026rsquo; samples. These labels were used as Y-variable for model building however, and thus misclassification of untainted samples may form the basis of the lower classification accuracies hence obtained. All samples did undergo sensory analysis by at least two experts, but due to the very large number of samples analysed (more than 2000) in a short timeframe, habituation or fatigue may have occurred, potentially leading to false negative classification and inclusion of low to moderately tainted samples in the \u0026lsquo;untainted\u0026rsquo; group of samples. To be able to counteract this problem in the future; i.e., to be able to build a more robust REIMS-fingerprint based boar taint model in a slaughterhouse setting, we therefore advise to build a balanced model using the F1-score metric as evaluation parameter, with sample analysis performed at-line, yet with chemical and sensory boar taint analysis of all samples used for training. This must be part of the investment in terms of costs and time, to avoid incorrect inclusion and classification. 
It is moreover advised to limit the daily number of samples analysed by the sensory experts, to avoid habituation and fatigue, and ensure correct labelling and (pre)selection of samples. False negative classification jeopardises consumers\u0026rsquo; liking of pork products. False positive classification does not, but it may have ramifications for producers, e.g. in the form of penalty fees. Therefore, both false negative and positive test results are to be avoided at all costs\u003csup\u003e\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e,\u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e47\u003c/span\u003e\u003c/sup\u003e and should be penalised equally during model building.\u003c/p\u003e\u003cp\u003eLastly, it should be mentioned that recent work by M\u0026ouml;rlein, et al.\u003csup\u003e48\u003c/sup\u003e revealed that 2-aminoacetophenone (AAP) is a boar taint contributing compound that has been overlooked for years. Although the olfactory importance of AAP is yet to be confirmed, future inclusion of AAP in the chemical boar taint analysis panel is recommended, as it may reduce false negative classification and improve training and use of REIMS-based boar taint classification models.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec23\" class=\"Section2\"\u003e\u003ch2\u003e4.3. Practical and economic feasibility\u003c/h2\u003e\u003cp\u003eBesides REIMS, several candidate methods for pig breed classification have been tested, including e.g. 
sequencing of single nucleotide variants, MALDI-TOF (Matrix-Assisted Laser Desorption/Ionization - Time of Flight), Nuclear Magnetic Resonance (NMR), Visible and Near Infrared Reflectance (VIS/NIRS) spectroscopy, as well as conventional GC and LC-MS methods\u003csup\u003e\u003cspan additionalcitationids=\"CR50 CR51 CR52\" citationid=\"CR49\" class=\"CitationRef\"\u003e49\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e53\u003c/span\u003e\u003c/sup\u003e. VIS/NIRS e.g. has the specific advantage of being fast yet seems to demonstrate higher error rates in comparison\u003csup\u003e\u003cspan citationid=\"CR54\" class=\"CitationRef\"\u003e54\u003c/span\u003e\u003c/sup\u003e. The other methods mentioned demonstrate very low error rates yet require more time, and moreover imply a higher cost per sample. For boar taint also, the at-line applicability and accuracy of several sensory and analytical detection methods have been investigated, but they do not meet the required performance characteristics\u003csup\u003e\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e,\u003cspan citationid=\"CR55\" class=\"CitationRef\"\u003e55\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003cp\u003eResults of the current study and similar other studies using REIMS are relevant to meat industries who are willing to apply efficient methods in terms of accuracy, sensitivity, low-cost and time-efficiency, i.e. to address prominent analytical challenges. Even more so, as REIMS-based molecular fingerprinting and ML-based classification can be used for the elucidation of a wide range of biological characteristics, REIMS is expected to indeed become a highly useful tool for meat science and the meat industry as a whole\u003csup\u003e\u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e43\u003c/span\u003e\u003c/sup\u003e. 
In the pig industry, REIMS fingerprinting and ML-based classification tools could be rolled out for the identification of genetic origin of labelled meat products and at-line detection of tainted meat. From a more practical point of view, we experienced that the installation and at-line implementation of REIMS, being highly expensive and sensitive lab equipment, in a slaughterhouse is challenging, yet feasible. The iKnife or fat probe are connected to the Xevo G2-XS Q-TOF instrument by means of 4 m long tubing, with the instrument itself placed in a separate room with controlled humidity and temperature next to the slaughter line. Purchase and installation involve a high investment cost for abattoirs, but considering the high number of pigs being slaughtered each (work)day, costs per analysis were estimated to remain below 1 Euro per carcass\u003csup\u003e\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e\u003c/sup\u003e. Moreover, in case the REIMS instrument is used for more than one application, costs per analysis will decrease even further. Besides breed and boar taint classification, REIMS-based metabolomic fingerprinting may also be used for the detection of other pre-slaughter factors or meat product characteristics\u003csup\u003e\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e\u003c/sup\u003e, like e.g. screening for use of growth promotors\u003csup\u003e\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e\u003c/sup\u003e or adulteration with bulking agents\u003csup\u003e\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e\u003c/sup\u003e, further contributing to the overall economic feasibility. Nevertheless, as was discussed more in depth by De Graeve, et al. \u003csup\u003e30\u003c/sup\u003e, the implementation of real-time classification using REIMS in an industrial setting in the (near) future is mostly hardware and software dependent. 
The speed of the analysis itself remains unparalleled, but what would additionally be required is sufficient computing power, automatization of the data processing pipeline and integration of several chemometric modelling options into the REIMS-compatible vendor-specific or independent modelling software\u003csup\u003e\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e\u003c/sup\u003e. As such, REIMS can be embedded in production and traceability systems that link the physical and digital attributes of food products across the supply chain, indeed playing an important role in the verification of food safety, authenticity, and quality.\u003c/p\u003e\u003c/div\u003e"},{"header":"5. Conclusion","content":"\u003cp\u003eIntegration of REIMS-based metabolomics with multivariate data analysis or machine-learning algorithms demonstrated remarkable potential for modelling pig breed post-slaughter. OPLS-DA, SVM, and RF yielded high prediction accuracies, although OPLS-DA and SVM outperformed RF data modelling. Reduction of genetic variability within categories may positively influence classification accuracy, although factors such as feeding, stress, and post-mortem processes also contribute to metabolic heterogeneity. While REIMS analysis also showed promise to accurately distinguish tainted from untainted boar carcasses using OPLS-DA, SVM and RF, the current work underscores the importance of balanced modelling and incorporation of extensive chemical and sensory boar taint analysis during model training, in order to minimize false negative classification.\u003c/p\u003e\u003cp\u003eDespite the foreseeable challenges regarding installation and implementation of REIMS in slaughterhouses, the economic feasibility of REIMS analysis remains promising, especially considering its numerous potential applications, i.e., beyond breed and boar taint classification. 
The successful integration of real-time classification using REIMS in an industrial setting depends on the selection of a fit-for-purpose multivariate model, reliable model training, other factors such as hardware and software compatibility, as well as data processing automation.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eAUTHOR INFORMATION\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThese authors contributed equally: Lynn Vanhaecke, Lieselot Y. Hemeryck\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthors and Affiliations\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eInstitute for Global Food Security, School of Biological Sciences National Measurement Laboratory: Centre of Excellence in Agriculture and Food Integrity, Queen\u0026rsquo;s University Belfast, 19 Chlorine Gardens, Belfast BT9 5DL, UK\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eV. Gkarane, C. Elliott, N. Birse, L. Vanhaecke\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eLaboratory of Integrative Metabolomics, Ghent University, Salisburylaan 133, BE-9820 Merelbeke, Belgium\u0026nbsp;\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eM. De Graeve, A.I. Decloedt, P. Vangeenderhuysen, L.Y. Hemeryck, L. Vanhaecke\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eInstitute for Biomedicine, EURAC Research, Via A.-Volta 21, I-39100 Bolzano, Italy\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eM. De Graeve\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eCranswick Country Foods, Preston, Hull, HU12 8TB, UK\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eC. Stephens\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eWaters Corporation, Milford, MA, USA\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eJ. Balog\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eThe International Joint Research Center on Food Security, 113 Thailand Science Park, Phahonyothin Road, Pathum Thani 12120, Thailand\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eC. 
Elliott\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eScientific Operations, Waters Corporation, Wilmslow, UK\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eS.L. Stead\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eContributions\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eVG:\u0026nbsp;\u003c/strong\u003eFormal Analysis, Investigation, Data Curation, Writing - Original draft, Visualization, Project administration. \u003cstrong\u003eMDG:\u003c/strong\u003e Software, Validation, Formal Analysis, Data Curation, Writing - Original draft, Visualization. \u003cstrong\u003eCS:\u0026nbsp;\u003c/strong\u003eConceptualization, Resources, Supervision, Project administration, Funding Acquisition. \u003cstrong\u003eAID:\u003c/strong\u003eConceptualization, Methodology, Validation, Formal Analysis, Investigation, Writing - Review \u0026amp; Editing, Project administration. \u003cstrong\u003ePV:\u003c/strong\u003e Data curation, Writing - Review \u0026amp; Editing,\u0026nbsp;Visualization. \u003cstrong\u003eJB:\u003c/strong\u003e Methodology, Formal Analysis, Resources. \u003cstrong\u003eCE:\u003c/strong\u003eConceptualization, Writing - Review \u0026amp; Editing, Supervision,\u0026nbsp;Funding Acquisition. \u003cstrong\u003eSLS:\u003c/strong\u003e Methodology, Formal Analysis, Resources. \u003cstrong\u003eNB:\u003c/strong\u003e Conceptualization, Methodology, Software, Validation, Data Curation, Writing - Original draft, Visualization. \u003cstrong\u003eLYH:\u003c/strong\u003eConceptualization, Methodology, Validation, Formal Analysis, Investigation, Writing - Original Draft, Writing - Review \u0026amp; Editing, Supervision, Project administration. \u003cstrong\u003eLV:\u003c/strong\u003e Conceptualization, Writing - Review \u0026amp; Editing, Supervision,\u0026nbsp;Project administration, Funding Acquisition. \u003cstrong\u003eLYH\u003c/strong\u003e and \u003cstrong\u003eLV\u003c/strong\u003e contributed equally to this work; i.e. 
shared senior authors.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCorresponding author\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eCorrespondence to Lynn Vanhaecke:
[email protected]\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eACKNOWLEDGEMENTS\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors would like to thank the staff of the Ballymena and Preston slaughterhouses (Cranswick Country Foods, UK), Sus Campinae Westerlo (Belgium) and Exportslachthuis Tielt (Belgium) for helping accommodate the collection of samples, and Exportslachthuis Tielt (Belgium) more specifically also for enabling the in-situ installation of the REIMS instrument. Thank you also to Dirk Stockx, Beata Pomian, Mieke Naessens, Joke Goedgebuer, Margot De Spiegeleer and Steve Huysman at the Laboratory of Integrative Metabolomics (LIMET) for their technical assistance.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFUNDING DECLARATION\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe pig breed study was partially funded by the Knowledge Transfer Partnership (KTP) Agreement (Application ID: 1025930, Cranswick Foods \u0026amp; Queen\u0026rsquo;s University Belfast), whereas the boar taint work was funded by the Vlaamse Overheid - Departement LNE - Dienst Dierenwelzijn [LNE/STG/DWZ/16/11]. The 3D-printed \u0026lsquo;fat probe\u0026rsquo; was an in-kind contribution by the Budapest Waters Research Centre (Hungary). L.Y. Hemeryck is an FWO (Research Foundation - Flanders) postdoctoral fellow [1297623N]. The funders played no role in study design, data collection, analysis and interpretation of the data, or the writing of this manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eETHICS DECLARATIONS\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eJB and SLS are employed by Waters. The other authors declare no competing interests.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eEthics approval\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAll experiments were performed in accordance with relevant guidelines and regulations. 
No experiments were performed on live animals; all pigs were slaughtered for commercial purposes prior to this study. Ethics approval was thus not required.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eHassoun, A. et al. Fraud in animal origin food products: Advances in emerging spectroscopic detection methods over the past five years. \u003cem\u003eFoods\u003c/em\u003e 9, 1069 (2020).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBarlow, R. S. et al. Rapid evaporative ionization mass spectrometry: a review on its application to the red meat industry with an Australian context. \u003cem\u003eMetabolites\u003c/em\u003e 11, 171 (2021).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eShackell, G. H. \u0026amp; Dodds, K. G. in \u003cem\u003eMeat Biotechnology\u003c/em\u003e (ed. Toldr\u0026aacute; F.) 61\u0026ndash;88 (Springer New York, 2008).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLee, S. et al. The influence of pork quality traits and muscle fiber characteristics on the eating quality of pork from various breeds. \u003cem\u003eMeat Sci.\u003c/em\u003e 90, 284\u0026ndash;291 (2012).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWilkinson, S. et al. Development of a genetic tool for product regulation in the diverse British pig breed market. \u003cem\u003eBMC Genomics\u003c/em\u003e 13, 1\u0026ndash;12 (2012).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eFontanesi, L. \u003cem\u003eLawrie\u0026acute;s Meat Science (Eighth Edition)\u003c/em\u003e (ed. Toldr\u0026aacute; F.) 585\u0026ndash;633 (Woodhead Publishing, 2017).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGirish, P. \u0026amp; Barbuddhe, S. in \u003cem\u003eMeat Quality Analysis\u003c/em\u003e 153\u0026ndash;170 (Elsevier, 2019).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRamos, A. M., Megens, H., Crooijmans, R. P. M. A., Schook, L. B. 
\u0026amp; Groenen, M. A. M. Identification of high utility SNPs for population assignment and traceability purposes in the pig using high-throughput sequencing. \u003cem\u003eAnim. Genet.\u003c/em\u003e 42, 613\u0026ndash;620 (2011).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eVignal, A., Milan, D., SanCristobal, M. \u0026amp; Eggen, A. A review on SNP and other types of molecular markers and their use in animal genetics. \u003cem\u003eGenet. Sel. Evol.\u003c/em\u003e 34, 275\u0026ndash;305 (2002).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMontowska, M. \u0026amp; Pospiech, E. Is authentication of regional and traditional food made of meat possible? \u003cem\u003eCrit. Rev. Food Sci. Nutr.\u003c/em\u003e 52, 475\u0026ndash;487 (2012).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eArulandhu, A. J. et al. Development and validation of a multi-locus DNA metabarcoding method to identify endangered species in complex samples. \u003cem\u003eGigaScience\u003c/em\u003e 6, gix080 (2017).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBarcaccia, G., Lucchin, M. \u0026amp; Cassandro, M. DNA barcoding as a molecular tool to track down mislabeling and food piracy. \u003cem\u003eDivers.\u003c/em\u003e 8, 2 (2015).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eAluw\u0026eacute;, M. et al. Exploratory survey on European consumer and stakeholder attitudes towards alternatives for surgical castration of piglets. \u003cem\u003eAnimals\u003c/em\u003e 10, 1758 (2020).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eVerplanken, K. et al. Rapid evaporative ionization mass spectrometry for high-throughput screening in food analysis: The case of boar taint. \u003cem\u003eTalanta\u003c/em\u003e 169, 30\u0026ndash;36 (2017).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eVerplanken, K. et al. 
Rapid method for the simultaneous detection of boar taint compounds by means of solid phase microextraction coupled to gas chromatography/mass spectrometry. \u003cem\u003eJ. Chromatogr. A\u003c/em\u003e 1462, 124\u0026ndash;133 (2016).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eS\u0026oslash;rensen, K. M. \u0026amp; Engelsen, S. B. Measurement of boar taint in porcine fat using a high-throughput gas chromatography\u0026ndash;mass spectrometry protocol. \u003cem\u003eJ. Agric. Food Chem.\u003c/em\u003e 62, 9420\u0026ndash;9427 (2014).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eS\u0026oslash;rensen, K. M., Westley, C., Goodacre, R. \u0026amp; Engelsen, S. B. Simultaneous quantification of the boar-taint compounds skatole and androstenone by surface-enhanced Raman scattering (SERS) and multivariate data analysis. \u003cem\u003eAnal. Bioanal. Chem.\u003c/em\u003e 407, 7787\u0026ndash;7795 (2015).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLiu, X., Schmidt, H. \u0026amp; M\u0026ouml;rlein, D. Feasibility of boar taint classification using a portable Raman device. \u003cem\u003eMeat Sci.\u003c/em\u003e 116, 133\u0026ndash;139 (2016).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBekaert, K. et al. Evaluation of different heating methods for the detection of boar taint by means of the human nose. \u003cem\u003eMeat Sci.\u003c/em\u003e 94, 125\u0026ndash;132 (2013).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHaugen, J.-E., Brunius, C. \u0026amp; Zamaratskaia, G. Review of analytical methods to measure boar taint compounds in porcine adipose tissue: The need for harmonised methods. \u003cem\u003eMeat Sci.\u003c/em\u003e 90, 9\u0026ndash;19 (2012).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eB\u0026ouml;hme, K., Calo-Mata, P., Barros-Vel\u0026aacute;zquez, J. \u0026amp; Ortea, I. Recent applications of omics-based technologies to main topics in food authentication. 
\u003cem\u003eTrAC-Trends Anal. Chem.\u003c/em\u003e 110, 221\u0026ndash;232 (2019).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHong, Y. et al. Data fusion and multivariate analysis for food authenticity analysis. \u003cem\u003eNat. Commun.\u003c/em\u003e 14, 3309 (2023).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHu, C. \u0026amp; Xu, G. Mass-spectrometry-based metabolomics analysis for foodomics. \u003cem\u003eTrAC-Trends Anal. Chem.\u003c/em\u003e 52, 36\u0026ndash;46 (2013).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBlack, C. et al. A real time metabolomic profiling approach to detecting fish fraud using rapid evaporative ionisation mass spectrometry. \u003cem\u003eMetabolomics\u003c/em\u003e 13, 1\u0026ndash;13 (2017).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLiebal, U. W., Phan, A. N., Sudhakar, M., Raman, K. \u0026amp; Blank, L. M. Machine learning applications for mass spectrometry-based metabolomics. \u003cem\u003eMetabolites\u003c/em\u003e 10, 243 (2020).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKosek, V. et al. Ambient mass spectrometry based on REIMS for the rapid detection of adulteration of minced meats by the use of a range of additives. \u003cem\u003eFood Control\u003c/em\u003e 104, 50\u0026ndash;56 (2019).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePlekhova, V. et al. Rapid ex vivo molecular fingerprinting of biofluids using laser-assisted rapid evaporative ionization mass spectrometry. \u003cem\u003eNat. Protoc.\u003c/em\u003e 16, 4327\u0026ndash;4354 (2021).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBlack, C. et al. Rapid detection and specific identification of offals within minced beef samples utilising ambient mass spectrometry. \u003cem\u003eSci. Rep.\u003c/em\u003e 9, 6295 (2019).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGredell, D. A. et al. 
Comparison of machine learning algorithms for predictive modeling of beef attributes using rapid evaporative ionization mass spectrometry (REIMS) data. \u003cem\u003eSci. Rep.\u003c/em\u003e 9, 5721 (2019).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDe Graeve, M. et al. Multivariate versus machine learning-based classification of rapid evaporative Ionisation mass spectrometry spectra towards industry based large-scale fish speciation. \u003cem\u003eFood Chem.\u003c/em\u003e 404, 134632 (2023).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRoss, A. et al. Making complex measurements of meat composition fast: Application of rapid evaporative ionisation mass spectrometry to measuring meat quality and fraud. \u003cem\u003eMeat Sci.\u003c/em\u003e 181, 108333 (2021).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBalog, J. et al. Identification of the species of origin for meat products by rapid evaporative ionization mass spectrometry. \u003cem\u003eJ. Agric. Food Chem.\u003c/em\u003e 64, 4793\u0026ndash;4800 (2016).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBirse, N. et al. Ambient mass spectrometry as a tool to determine poultry production system history: A comparison of rapid evaporative ionisation mass spectrometry (REIMS) and direct analysis in real time (DART) ambient mass spectrometry platforms. \u003cem\u003eFood Control\u003c/em\u003e 123, 107740 (2021).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePoklukar, K., Čandek-Potokar, M., Batorek Lukač, N., Tomažin, U. \u0026amp; Škrlep, M. Lipid deposition and metabolism in local and modern pig breeds: A review. \u003cem\u003eAnimals\u003c/em\u003e 10, 424 (2020).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBekaert, K. et al. A validated ultra-high performance liquid chromatography coupled to high resolution mass spectrometry analysis for the simultaneous quantification of the three known boar taint compounds. 
\u003cem\u003eJ. Chromatogr. A\u003c/em\u003e 1239, 49\u0026ndash;55 (2012).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLoomas, K. R. et al. Evaluation of rapid evaporative ionization mass spectrometry (REIMS) for the prediction of slice shear force and quality grades in beef longissimus lumborum steaks. \u003cem\u003eMeat Sci.\u003c/em\u003e 222, 109752 (2025).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eZhang, H. et al. Discrimination of dried sea cucumber (Apostichopus japonicus) products from different geographical origins by sequential windowed acquisition of all theoretical fragment ion mass spectra (SWATH-MS)-based proteomic analysis and chemometrics. \u003cem\u003eFood Chem.\u003c/em\u003e 274, 592\u0026ndash;602 (2019).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSong, G. et al. In situ and real-time authentication of Thunnus species by iKnife rapid evaporative ionization mass spectrometry based lipidomics without sample pretreatment. \u003cem\u003eFood Chem.\u003c/em\u003e 318, 126504 (2020).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHe, Q. et al. Differentiation between fresh and frozen\u0026ndash;thawed meat using rapid evaporative ionization mass spectrometry: the case of beef muscle. \u003cem\u003eJ. Agric. Food Chem.\u003c/em\u003e 69, 5709\u0026ndash;5724 (2021).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWang, J. et al. Liquid chromatography quadrupole time-of-flight mass spectrometry and rapid evaporative ionization mass spectrometry were used to develop a lamb authentication method: A preliminary study. \u003cem\u003eFoods\u003c/em\u003e 9, 1723 (2020).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eZhang, R., Ross, A. B., Yoo, M. J. \u0026amp; Farouk, M. M. Metabolic fingerprinting of in-bag dry- and wet-aged lamb with rapid evaporative ionisation mass spectroscopy. 
\u003cem\u003eFood Chem.\u003c/em\u003e 347, 128999 (2021).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMenze, B. H. et al. A comparison of random forest and its Gini importance with standard chemometric methods for the feature selection and classification of spectral data. \u003cem\u003eBMC Bioinformatics\u003c/em\u003e 10, 213 (2009).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eZhang, R., Realini, C. E., Middlewood, P., Pavan, E. \u0026amp; Ross, A. B. Metabolic fingerprinting using Rapid evaporative ionisation mass spectrometry can discriminate meat quality and composition of lambs from different sexes, breeds and forage systems. \u003cem\u003eFood Chem.\u003c/em\u003e 386, 132758 (2022).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePenning, B. W., Snelling, W. M. \u0026amp; Woodward-Greene, M. J. Machine learning in the assessment of meat quality. \u003cem\u003eIT Prof.\u003c/em\u003e 22, 39\u0026ndash;41 (2020).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWolpert, D. H. The lack of a priori distinctions between learning algorithms. \u003cem\u003eNeural Comput.\u003c/em\u003e 8, 1341\u0026ndash;1390 (1996).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMuroya, S., Ueda, S., Komatsu, T., Miyakawa, T. \u0026amp; Ertbjerg, P. MEATabolomics: Muscle and meat metabolomics in domestic animals. \u003cem\u003eMetabolites\u003c/em\u003e 10, 188 (2020).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKlont, R. E., Kurt, E., Heres, L. \u0026amp; Urlings, B. Production of entire males-challenges and opportunities. \u003cem\u003eFleischwirtschaft\u003c/em\u003e 90, 107\u0026ndash;109 (2010).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eM\u0026ouml;rlein, D. et al. An overlooked compound contributing to boar taint and consumer rejection of meat products: 2-Aminoacetophenone. 
\u003cem\u003eMeat Sci.\u003c/em\u003e 213, 109497 (2024).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMu\u0026ntilde;oz, M. et al. Development of a 64 SNV panel for breed authentication in Iberian pigs and their derived meat products. \u003cem\u003eMeat Sci.\u003c/em\u003e 167, 108152 (2020).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWagner, L., Kaufmann, M., Lange, F., Dallmann, A. \u0026amp; Bergmann, M. Differentiation of pork (Sus scrofa domesticus) and wild boar (Sus scrofa) meat using 1H NMR spectroscopy and MALDI-ToF mass spectrometry. \u003cem\u003eEur. Food Res. Technol.\u003c/em\u003e 251, 747\u0026ndash;766 (2025).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRamiro, J. L., Neo, A. G., P\u0026eacute;rez-Palacios, T., Antequera, T. \u0026amp; Marcos, C. F. Machine learning-enabled fatty acid quantification and classification of pork from autochthonous breeds using low-field 1H NMR spectroscopic data. \u003cem\u003eFood Control\u003c/em\u003e 166, 110753 (2024).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRamiro, J. L. et al. Classification of raw cuts from Iberian and Celta pigs based on lipid analysis and chemometrics. \u003cem\u003eJ. Food Compos. Anal.\u003c/em\u003e 130, 106173 (2024).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLiu, H. et al. Metabolomics analysis provides novel insights into the difference in meat quality between different pig breeds. \u003cem\u003eFoods\u003c/em\u003e 12, 3476 (2023).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003edel Moral, F. G. et al. Duroc and Iberian pork neural network classification by visible and near infrared reflectance spectroscopy. \u003cem\u003eJ. Food Eng.\u003c/em\u003e 90, 540\u0026ndash;547 (2009).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eFont-i-Furnols, M. et al. Feasibility of on/at line methods to determine boar taint and boar taint compounds: an overview. 
\u003cem\u003eAnimals\u003c/em\u003e 10, 1886 (2020).\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"npj-science-of-food","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"npjscifood","sideBox":"Learn more about [npj Science of Food](http://www.nature.com/npjscifood/)","snPcode":"41538","submissionUrl":"https://submission.springernature.com/new-submission/41538/3","title":"npj Science of Food","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"NPJ","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"Ambient Ionisation Mass Spectrometry, Meat Authenticity, Meat Quality, Metabolomic Fingerprinting, Pork Production","lastPublishedDoi":"10.21203/rs.3.rs-7537051/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-7537051/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eTo help counteract food fraud and meet consumer expectations, the pork industry requires reliable quality-monitoring and traceability systems. In this context, Rapid Evaporative Ionization Mass Spectrometry (REIMS) could be rolled out as a real-time, accurate metabolic fingerprint-based classifier of pork meat characteristics and quality issues like e.g. genetic origin and taint. Here, fingerprinting of \u0026gt;\u0026thinsp;3000 pig neck fat samples enabled highly accurate pig breed classification (pairwise comparison of Commercials (Pietrain x Hampshires x Durocs, Large-Whites, Durocs), Hampshires and Large-Whites, where data modelling using Support Vector Machine (SVM, all pairwise comparisons\u0026thinsp;\u0026gt;\u0026thinsp;89%) and Orthogonal Partial Least Squares - Discriminant Analysis (OPLS-DA, \u0026gt;90%) outperformed Random Forest (RF, 72.0\u0026ndash;79.5%). 
Boar taint classification showed comparable results between OPLS-DA, RF, and SVM (93.5\u0026ndash;96.0%), but strategies to avoid false negatives and positives, including the construction of balanced models (tainted \u003cem\u003evs.\u003c/em\u003e non-tainted), proved imperative.\u003c/p\u003e","manuscriptTitle":"Towards Real-Time Industry-Proof Pork Breed and Boar Taint Classification using Rapid Evaporative Ionisation Mass Spectrometry (REIMS)","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-10-16 18:49:05","doi":"10.21203/rs.3.rs-7537051/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"Revision requested","date":"2025-10-23T07:10:01+00:00","index":"","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-10-23T06:57:01+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-10-16T17:37:02+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"121815002893704711465588384565815357075","date":"2025-10-02T19:57:55+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"136935544837908302654592346608922229898","date":"2025-10-02T17:37:02+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2025-10-02T17:21:21+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2025-09-27T14:50:07+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2025-09-16T18:11:10+00:00","index":"","fulltext":""},{"type":"submitted","content":"npj Science of Food","date":"2025-09-04T14:09:15+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"npj-science-of-food","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"npjscifood","sideBox":"Learn more about [npj Science of Food](http://www.nature.com/npjscifood/)","snPcode":"41538","submissionUrl":"https://submission.springernature.com/new-submission/41538/3","title":"npj Science of Food","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"NPJ","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"b7495487-2f7a-4acb-bfde-c55e58f37371","owner":[],"postedDate":"October 16th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"published-in-journal","subjectAreas":[{"id":56158686,"name":"Biological sciences/Biological techniques"},{"id":56158687,"name":"Biological sciences/Computational biology and bioinformatics"}],"tags":[],"updatedAt":"2026-01-12T16:06:06+00:00","versionOfRecord":{"articleIdentity":"rs-7537051","link":"https://doi.org/10.1038/s41538-025-00685-4","journal":{"identity":"npj-science-of-food","isVorOnly":false,"title":"npj Science of Food"},"publishedOn":"2026-01-08 15:59:10","publishedOnDateReadable":"January 8th, 2026"},"versionCreatedAt":"2025-10-16 18:49:05","video":"","vorDoi":"10.1038/s41538-025-00685-4","vorDoiUrl":"https://doi.org/10.1038/s41538-025-00685-4","workflowStages":[]},"version":"v1","identity":"rs-7537051","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-7537051","identity":"rs-7537051","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}