Large-Scale Genome-Wide Association Analysis Reveals Candidate Genes in Yak Body Size Traits | Research Square

Research Article

Jiahong Zhao, Zemin Li, Xinrui Liu, Yaxin Liu, Binglin Yue, Hui Wang, and 9 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-6282023/v1
This work is licensed under a CC BY 4.0 License
Status: Published. Journal publication published 26 Sep, 2025; read the published version in BMC Genomics.
Version 1

Abstract

Background

The yak is a unique livestock species bred on the Qinghai-Tibet Plateau. We utilized genotypic data obtained from the yak sequencing chip "Qingxin-1" and phenotypic data measured from image photographs using conversion between pixel and distance. The primary objective of this study was to conduct genome-wide association studies (GWAS) using five models to analyze seven body size traits.
Specifically, the goals were to (1) characterize the genetic structure of three major yak breeds: Maiwa, Yushu, and Huanhu; (2) identify candidate genes that significantly influence yak body size traits; and (3) compare the prediction accuracy of single-trait and multi-trait genomic selection (GS).

Results

A total of 94 markers were significantly (P < 1e-05) associated with yak body size traits. GWAS results revealed that PRKAA2 and SNX9 were important candidate genes affecting the body size traits of yaks. The GS results indicated that combining marker-assisted selection and best linear unbiased prediction significantly improved the accuracy of predicting body size traits, and the average accuracy of multi-trait GS was higher than that of single-trait GS.

Conclusions

Our findings provide valuable insights into the genetic architecture of yaks, with implications for the development and selection of yak body size traits. The identification of key genes such as PRKAA2 and SNX9 offers promising targets for breeding programs aimed at optimizing body size traits, thereby supporting genetic improvements in yak populations.

Background

The yak is a versatile livestock animal bred on the Qinghai-Tibet Plateau. It provides meat, milk, and other necessities of life for pastoralists and serves as a primary mode of transportation in the highlands [1]. Yaks are essential both for the people inhabiting the Qinghai-Tibet Plateau and for its ecological environment. The Maiwa, Huanhu, and Yushu yaks are the three primary breeds, distinguished by their body size characteristics and genetic backgrounds. The Maiwa yak is primarily distributed in the Maiwa region of the Aba Tibetan and Qiang Autonomous Prefecture in Sichuan Province.
It is readily distinguished by its large body size and substantial meat production, and it adapts remarkably well to the harsh high-altitude, low-oxygen environment. The Huanhu yak mainly occurs in the areas around Qinghai Lake and shows distinct regional features owing to its living environment. Although smaller in size, the Huanhu yak is known for its pronounced endurance and ability to thrive on barren pastures. The Yushu yak is predominantly found in the Yushu Tibetan Autonomous Prefecture in Qinghai Province, the source region of the Yellow River. Renowned for its cold resistance and high-altitude adaptability, the Yushu yak is one of the main livestock breeds for local herders. It also has excellent disease resistance, and its body size traits, such as weight and length, significantly affect its environmental adaptability.

In recent years, with the advancement of genetic research, body size traits in yaks, such as height and length, have been shown to be complex traits controlled by multiple genes. Exploring the genetic basis of these traits is important for breeding and production. Yak body size is closely related to productivity, adaptability, and labor capabilities. First, yak body size is a significant economic trait: larger yaks may have higher potential for meat and milk production, thereby increasing economic returns for farmers. Second, larger yaks may be better adapted to harsh environmental conditions, such as cold climates and high altitudes, which enhances their survival. Third, a larger body may indicate a stronger physical structure, which helps in heavier labor tasks such as plowing or transportation.
Therefore, in some cases, selectively breeding yaks with excellent body size can enhance their productivity, adaptability, and labor capabilities, resulting in greater profits and benefits for the livestock industry.

Genome-wide association studies (GWAS) test the association between genome-wide single-nucleotide polymorphisms (SNPs), identified through high-throughput sequencing, and specific traits. This approach uses different models to estimate the effects of the detected loci and then statistically tests the estimated effects [2–4]. In essence, the method exploits linkage disequilibrium (LD) across the whole genome to identify genes influencing complex traits or phenotypes [5, 6]. For yak body size traits, GWAS provides a precise and efficient tool for genetic improvement and livestock production, with broad application prospects in breeding selection. Such analyses can help breeders identify key genes related to yak body size traits, better understand the genetic basis of these traits, and improve herd productivity and economic benefits through genomic selection.

This study used the yak gene chip "Qingxin-1" to genotype the main breeds of the Qinghai-Tibet Plateau: Maiwa, Yushu, and Huanhu yaks. Yak body size information was obtained rapidly with the aid of image recognition technology. By combining phenotypes and genotypes with single-locus and multi-locus GWAS algorithms, we identified candidate markers and functional genes associated with body size traits in these three breeds. Based on the genes shared between the trait of interest and a related trait, a multiple-trait GS model was used to predict individuals' genomic estimated breeding values.
In addition, understanding these functional genes and assessing the breeding potential of individuals will help accelerate genetic progress. This study provides a scientific and theoretical basis for improving the breeding of yak varieties on the Qinghai-Tibet Plateau and for developing the yak industry.

Methods

Samples and phenotypic collection

The animal samples comprised 826 yaks from the three most widely distributed breeds: 307 Maiwa yaks from the Longri Livestock Farm in Sichuan Province and the Qinghai-Tibet Plateau Base of Southwest Minzu University, 336 Yushu yaks from the Yellow River Source Yak Breeding Farm, and 183 Huanhu yaks from the Gonghe County Yak Cooperative. A total of 706 blood and tissue samples (276 Maiwa, 277 Yushu, and 153 Huanhu yaks) were collected for genotyping. Because age and gender records were missing for some animals, only 526 records (131 Maiwa, 242 Yushu, and 153 Huanhu yaks) were retained. Hence, 526 yaks with complete phenotypic, genotypic, age, and gender data remained for the GWAS and GS steps.

Growth traits were obtained using computer vision recognition technology [7]. Images of each yak were captured from three angles: one frontal view and two profiles. An image recognition model identified the positions of key skeletal landmarks, and the distance between landmarks was taken as the body size measurement. The traits comprised body length (BL; cm), body height (BH; cm), mouth length (ML; cm), body canted length (BCL; cm), chest circumference (CC; cm), and circumference of cannon bone one and two (CCB1 and CCB2, respectively; cm). Anomalies in the data were processed after aggregation.

The measurement criteria for each trait are defined with reference to anatomical landmarks such as the scapula, caudal vertebrae, and olecranon. BL refers to the distance from the tip of the nose to the base of the caudal vertebrae, excluding the tail.
BH was measured vertically from the top of the scapula to the ground. ML was the distance from the tip of the nose to the olecranon of the forelimb. BCL was measured diagonally from the top of the scapula to the base of the caudal vertebrae. CC was measured around the yak's thorax, passing through the widest point behind the scapula. The CCB measurements refer not to a circumference but to the diameter of the forelimb below the olecranon, with two values recorded from photographs taken from the front and the side. The final phenotypic value of each trait was the average of six independently repeated image recognition measurements.

Genotype and Sequencing

Blood and tissue samples, including ear tissue and hair, were collected from the three yak populations. "Qingxin-1", a targeted-capture breeding chip [8], was used to genotype 706 yaks. The chip includes 30K SNPs, with an average detection rate of over 99.93% for the chip sites in the tested samples. Quality control removed SNPs with a minor allele frequency (MAF) < 5%, leaving 29,233 SNPs for the GWAS.

Population Structure and Genome-wide Association Analyses

The Genome Association and Prediction Integrated Tool (GAPIT) [9] was used to analyze LD between markers and the heterozygosity of individuals. Principal component analysis (PCA), neighbor-joining (NJ) trees, and kinship heat maps were calculated and plotted using the GAPIT package in R. GWAS was performed using GAPIT (version 3.0) with two single-locus and three multi-locus models, in which the principal components, the kinship matrix, and yak age and gender were included in the model. The P-values of each GWAS model were corrected using Bonferroni correction, and a cutoff of 1e-05 was used to filter significant signals [10].
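The MAF filter and the significance cutoff described above can be illustrated with a short sketch. It assumes genotypes coded as 0/1/2 allele counts in a NumPy array, with missing calls stored as NaN; the function name `maf_filter` and the toy matrix are hypothetical and are not part of the authors' pipeline, which used GAPIT in R. The last lines compare the 1e-05 cutoff with a strict Bonferroni bound at α = 0.05 for the 29,233 retained SNPs.

```python
import numpy as np


def maf_filter(genotypes, min_maf=0.05):
    """Return a boolean mask of SNP columns with minor allele frequency >= min_maf.

    genotypes: (n_individuals, n_snps) array of allele counts coded 0/1/2,
    with np.nan for missing calls (an assumed coding, for illustration only).
    """
    # Frequency of the counted allele per SNP, ignoring missing calls.
    allele_freq = np.nanmean(genotypes, axis=0) / 2.0
    # The minor allele is whichever of the two alleles is rarer.
    maf = np.minimum(allele_freq, 1.0 - allele_freq)
    return maf >= min_maf


# Toy genotype matrix: 4 individuals x 3 SNPs (hypothetical data).
toy = np.array(
    [
        [0, 0, 2],
        [1, 0, 2],
        [2, 0, 2],
        [1, 0, np.nan],
    ],
    dtype=float,
)
keep = maf_filter(toy)  # only the first SNP is polymorphic

# Threshold arithmetic for the 29,233 SNPs retained in the study.
n_snps = 29233
bonferroni_cutoff = 0.05 / n_snps          # strict Bonferroni bound at alpha = 0.05
expected_false_positives = 1e-5 * n_snps   # expected null hits at the 1e-05 cutoff
```

Under these assumptions, a strict Bonferroni bound would be about 1.7e-06, so the 1e-05 cutoff is somewhat more permissive: if the 29,233 tests were all null, it would correspond to roughly 0.29 expected false positives per trait and model.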
Multiple-loci mixed models (MLMM) can effectively detect candidate genes related to traits such as body weight, BH, BL, and CC [11]. The general expression of the MLMM is as follows:

$$Y = \mu + \text{PCA} + \text{pseudoQTN} + \text{SNP}_i + K + e \qquad (1)$$

where Y is the collected phenotype data; µ is the matrix of yak age and gender; PCA accounts for population stratification; pseudoQTN contains the significant markers from previous cycles and is empty in the first cycle; $\text{SNP}_i$ denotes the marker tested in each cycle; K represents the kinship matrix among individuals; and e is the random residual vector, with $e \sim N(0, \sigma_e^2)$.

The FarmCPU model alternates between two models, a fixed-effect model and a random-effect model, which are run iteratively [12]. The equations are as follows:

$$Y = \mu + \text{PCA} + \text{pseudoQTN} + \text{SNP}_i + e \qquad (2)$$

$$Y = \mu + \text{PCA} + \text{pseudoQTN} + \text{SNP}_i + K + e \qquad (3)$$

The parameters in these equations are identical to those in the MLMM. Eq. (2) was used to estimate the effects and P-values of all SNPs. Eq. (3) was used to select the pseudoQTN.

The BLINK method [13] employs two fixed-effect models: one for selecting pseudoQTN and the other for computing marker effects and P-values. It replaces the random-effect model in FarmCPU with a Bayesian information criterion-based procedure for selecting and evaluating pseudoQTN.

In the association analysis, we first used general linear models (GLM) and mixed linear models (MLM) to detect significant SNP loci associated with the yak traits [14].
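To make the fixed-effect form of these models concrete, the following is a minimal sketch of a single-marker scan in the spirit of Eq. (2) with an empty pseudoQTN set, which is what a GLM-type scan amounts to. It is not GAPIT's implementation: the function name `glm_scan`, the simulated genotypes, and the simulated age and gender covariates are all assumptions made for illustration, and no kinship term is included. Principal components would simply be appended as further covariate columns.

```python
import numpy as np
from scipy import stats


def glm_scan(y, covariates, genotypes):
    """Test each SNP in turn as a fixed effect: y = [1, covariates, snp] @ beta + e.

    Returns one two-sided p-value per SNP from the t-test on the SNP coefficient.
    """
    n, m = genotypes.shape
    base = np.column_stack([np.ones(n), covariates])
    pvals = np.empty(m)
    for j in range(m):
        X = np.column_stack([base, genotypes[:, j]])
        beta, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        dof = n - rank
        sigma2 = resid @ resid / dof
        # Sampling variance of the SNP coefficient (last column of X).
        cov_beta = sigma2 * np.linalg.pinv(X.T @ X)
        t_stat = beta[-1] / np.sqrt(cov_beta[-1, -1])
        pvals[j] = 2.0 * stats.t.sf(abs(t_stat), dof)
    return pvals


# Simulated data (hypothetical): 200 animals, 50 SNPs, one causal SNP at index 7.
rng = np.random.default_rng(0)
n, m = 200, 50
geno = rng.binomial(2, 0.3, size=(n, m)).astype(float)
age = rng.integers(2, 10, n).astype(float)     # ages 2 to 9 years
gender = rng.integers(0, 2, n).astype(float)   # 0/1 coding, assumed
covs = np.column_stack([age, gender])
y = 2.0 * geno[:, 7] + 0.5 * age + rng.normal(0.0, 1.0, n)

pvals = glm_scan(y, covs, geno)
hits = np.flatnonzero(pvals < 1e-5)  # markers passing the study's cutoff
```

In the multi-locus models, markers selected in earlier iterations would be added to `covariates` before the next scan, which is the role pseudoQTN plays in Eqs. (1) to (3).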
Multiple-loci models, such as MLMM, FarmCPU, and BLINK, have advantages over single-locus models in both reducing false positives and increasing statistical power. In multiple-loci models, markers determined to be associated with a trait during single-model GWAS (SM-GWAS) are iteratively incorporated as covariates; the other markers are then analyzed using standard GWAS procedures. This allows the model to assess the relationship between the other markers and the trait more accurately, thereby increasing the statistical power and precision of the association analysis. Therefore, we also used multi-locus models to detect the significant SNP loci associated with the yak traits. This first part of the analysis used each of the five models separately to detect significant SNP loci associated with the seven traits.

The second part of the genome-wide association analysis was the multiple-model GWAS (MM-GWAS), which builds on the first part. In the MM-GWAS, we applied the five models to each trait in turn and presented the results as a single Manhattan plot containing all five models for that trait. This was repeated until all seven traits had been analyzed. We then identified and recorded SNP loci that were significant for two traits and located at exactly the same position on the same chromosome. For these loci, we calculated and transformed the phenotypic variance explained (PVE) values. The transformation divided every PVE value by the maximum PVE value, so that each value is expressed as a percentage of that maximum. We created a radar plot to compare visually the impact of the different SNP loci on the two traits.

Multi-trait GWAS (MT-GWAS) was the final and most important part.
First, we filtered and sorted the genotype files based on the individual numbering in the first column of the age and sex files. Because phenotypic correlation can reflect genetic correlation to some extent (the higher the phenotypic correlation between traits, the higher the genetic correlation tends to be), we selected the pair of traits with the highest correlation: CCB1 and CC. From the SM-GWAS results, we identified the significant loci associated with one of the traits in each of the three multi-locus models. The genotypes of these significant loci were extracted into three separate files, which were then merged with the age and sex files. These files were then incorporated into the analysis of the other trait, ensuring that the model used for MT-GWAS was the same as the model used to identify the significant loci.

Genomic Selection

GS with significant markers was used to validate the accuracy of the GWAS results. This analysis included two parts: single-trait GS and multiple-trait GS. Common genes were used to interpret the genetic relationships between the trait of interest and related traits.

For the single-trait GS, the phenotypic values of each of the two most highly correlated traits were randomly divided into five equally sized groups. One group was selected as the testing population, and the remaining groups formed the training population. The observed phenotypic values of the testing population were set to "NA" [15]. The BLINK model was used to conduct GWAS in GAPIT on the phenotype data of the training population, and a linear model was then used to predict the phenotypic values of the testing population. Prediction accuracy was evaluated by five-fold cross-validation. The above steps were repeated for the other traits. Subsequently, we performed multiple-trait GS.
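As a rough illustration of the cross-validation scheme shared by the single-trait and multiple-trait GS, the sketch below masks one fold at a time, fits a marker model on the rest, and records the Pearson correlation in the masked fold. It is a stand-in rather than the authors' GAPIT workflow: the predictor is ridge-regression BLUP obtained from Henderson's mixed-model equations with a fixed, assumed shrinkage ratio `lam` (GAPIT estimates variance components instead), and the function names and simulated data are hypothetical. Markers treated as fixed effects, as in marker-assisted selection, are passed in `X`; all other markers go in `Z`.

```python
import numpy as np


def fit_predict(X_tr, Z_tr, y_tr, X_te, Z_te, lam):
    """Solve Henderson's mixed-model equations with a fixed shrinkage ratio `lam`.

    X holds fixed effects (intercept, covariates, optionally selected markers);
    Z holds the remaining markers, treated as random effects.
    """
    p, q = X_tr.shape[1], Z_tr.shape[1]
    lhs = np.block(
        [
            [X_tr.T @ X_tr, X_tr.T @ Z_tr],
            [Z_tr.T @ X_tr, Z_tr.T @ Z_tr + lam * np.eye(q)],
        ]
    )
    rhs = np.concatenate([X_tr.T @ y_tr, Z_tr.T @ y_tr])
    sol = np.linalg.lstsq(lhs, rhs, rcond=None)[0]
    b, u = sol[:p], sol[p:]
    return X_te @ b + Z_te @ u


def cross_validate(y, X, Z, n_folds=5, n_repeats=20, lam=50.0, seed=0):
    """Return one Pearson correlation per fold and repeat (n_folds * n_repeats values)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    accuracies = []
    for _ in range(n_repeats):
        folds = np.array_split(rng.permutation(n), n_folds)
        for test_idx in folds:
            train = np.ones(n, dtype=bool)
            train[test_idx] = False  # phenotypes of the testing fold are masked
            pred = fit_predict(X[train], Z[train], y[train], X[test_idx], Z[test_idx], lam)
            accuracies.append(np.corrcoef(y[test_idx], pred)[0, 1])
    return np.array(accuracies)


# Simulated data (hypothetical): 150 animals, 300 markers, 10 causal markers.
rng = np.random.default_rng(1)
n, m = 150, 300
markers = rng.binomial(2, 0.3, size=(n, m)).astype(float)
effects = np.zeros(m)
effects[:10] = 1.0
y = markers @ effects + rng.normal(0.0, 1.0, n)
intercept = np.ones((n, 1))

# Single-trait style: all markers are random effects.
acc_single = cross_validate(y, intercept, markers)

# Multiple-trait style: markers flagged by a related trait (here, arbitrarily,
# the first three) are moved to the fixed-effect part of the model.
X_mas = np.column_stack([intercept, markers[:, :3]])
acc_mas = cross_validate(y, X_mas, markers[:, 3:])
```

With five folds and 20 repeats, each call returns 100 correlation values, and their mean plays the role of the final accuracy. In the study itself the fixed-effect markers came from a BLINK GWAS rather than from an arbitrary choice.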
A standard GWAS using the BLINK model was conducted to identify significant loci associated with both the trait of interest and the related traits. When predicting the trait of interest, significant loci from the related traits were treated as fixed effects in the GWAS and GS. The correlations between the observed and predicted phenotypic values for the trait of interest were computed. Four different correlations were generated to compare the results of the single-trait and multi-trait GS. The terms "trait of interest" and "related trait" here primarily refer to BL and BCL, or BCL and BL. Cross-validation was iterated until every group had been used as the testing population. The entire process was repeated 20 times, and the Pearson correlation coefficient (r) between the phenotypic values (y) and the predicted genomic estimated breeding values was used as the measure of prediction accuracy. The final accuracy was defined as the average of the 100 resulting correlation values (20 repetitions × 5 folds).

Results

Phenotypic Distribution

We analyzed seven body size traits in 826 yaks and summarized the descriptive statistics (mean and standard deviation) and heritability of the traits [see Additional file: Tables S1 and S2]. Most BL records ranged from 34 cm to 191 cm. This wide range arose because the sample included both bulls and cows aged from 2 to 9 years: a BL of 34 cm is likely that of a young cow, whereas a BL of 191 cm would be that of an adult bull. The mean values were 77.1 cm for BL, 86.4 cm for BH, 18.6 cm for ML, 84.1 cm for BCL, 25.9 cm for CC, and 4.5 cm and 4.7 cm for CCB1 and CCB2, respectively. Except for CCB1 and CCB2, all traits followed a normal distribution.
Strong positive correlations were observed between CC and CCB1, whereas CCB2 had moderate negative correlations with CC and CCB1 (Fig. 1).

SNP Calling and Population Structure

A total of 30,000 SNP markers were detected using the yak chip "Qingxin-1", of which 29,233 remained after filtering. The MAF ranged from 0.05 to 0.5; SNPs with an MAF between 0.05 and 0.1 were the most numerous, and the number of SNPs gradually decreased as the MAF increased. The vast majority of SNPs had an MAF < 0.3. Additionally, the heterozygosity of most individuals and SNP markers was low (see Additional file: Fig. S1). We used r² as the LD metric because it directly reflects the correlation between SNPs, which makes it suitable for GWAS. The average LD distance across the whole genome was 1 kb (see Additional file: Fig. S2). The distributions of heterozygosity, MAF, and r² across genome-wide marker loci are provided (see Additional file: Fig. S3).

The sample exhibited a distinct population structure. To analyze the population structure of the 706 genotyped yaks, we employed both PCA and NJ tree analysis, which are complementary methods. The NJ tree divided the yak population into three groups. Individuals within each breed showed high genetic similarity, whereas there were significant genetic differences among the three breeds. The NJ tree showed approximately the same population structure as the PCA (Fig. 2A). In general, the kinship coefficients for the majority of individuals ranged from 0.1 to 0.25, indicating that kinship within the yak population was relatively distant. However, the kinship heatmap indicated a close relationship among the Yushu yaks (Fig. 2B). In the three-dimensional population structure, individuals and populations were clearly separated.
The population is mainly composed of three groups, with individuals within each group clustered together. Notably, the Huanhu yak population clustered tightly, indicating high genetic similarity among these individuals (Fig. 2C and 2D). The first three principal components explained 2.46%, 1.45%, and 0.84% of the genetic variance, respectively.

GWAS and Candidate Genes

Multiple-locus models provide more powerful detection than single-locus models [16]. To identify SNP markers associated with body size, the results of five models (GLM, MLM, MLMM, FarmCPU, and BLINK) were compared after analyzing the genotypic and phenotypic data of 520 yaks. A P-value of 1.0 × 10⁻⁵ was used as the significance threshold. In total, 94 SNPs exceeded the threshold and were associated with at least one of the seven traits: 13 for BL, 9 for BH, 7 for BCL, 34 for CCB1, 19 for CC, and 12 for CCB2. No significant SNPs were identified for ML. The most significant SNPs for BL, BH, BCL, CCB1, and CCB2 were as follows: SX_61170346 (chrX: 61,170,346 bp), S14_43229824 (chr14: 43,229,824 bp), S7_114182165 (chr7: 114,182,165 bp), S3_82995824 (chr3: 82,995,824 bp), and S11_15246495 (chr11: 15,246,495 bp). Three loci, S3_82995824 (chr3: 82,995,824 bp), S11_92286727 (chr11: 92,286,727 bp), and SY_1597301 (chrY: 1,597,301 bp), were the most significant SNPs associated with CC. No significant SNPs were found on 13 chromosomes (chr2, chr9, chr12, chr16, chr17, chr18, chr19, chr21, chr22, chr23, chr24, chr26, and chr28) for any of the traits evaluated. All five models detected four markers, each at the same position on the same chromosome and with a significant effect on the same trait: S3_82995824 (chr3: 82,995,824 bp), S7_114182165 (chr7: 114,182,165 bp), S11_15246495 (chr11: 15,246,495 bp), and SX_61110559 (chrX: 61,110,559 bp) (Fig. 3A and 3C).
The markers detected by four models in the GWAS were S1_32191242 (chr1: 32,191,242 bp), S6_28171330 (chr6: 28,171,330 bp), and S20_34316377 (chr20: 34,316,377 bp). Four markers were jointly detected by three models, and nine markers were detected by two models. The remaining markers were identified by a single GWAS model. QQ plots for the markers associated with BL and CC were obtained by comparing the P-values from the five GWAS models against the expected P-values. The QQ plots indicated that the observed and expected values for BL and CC closely matched across the five models (Fig. 3B and 3D). Figure 3 presents the Manhattan and QQ plots for BL and CC; those for the other traits are shown in the Additional file (Fig. S4).

Significant SNPs were annotated to candidate genes within the 1 kb LD region upstream and downstream through BLAST, resulting in a total of 23 annotated genes; detailed information on these genes can be found in the Additional file (Table S3). Among these, the protein kinase AMP-activated catalytic subunit alpha 2 (PRKAA2) and SNX9 genes have previously been reported to be associated with body size traits [17, 18].

Five SNPs detected through GLM, MLM, MLMM, FarmCPU, and BLINK were associated with more than one trait (CCB1 and CC) (Fig. 4A): S3_82995824 (chr3: 82,995,824 bp), S11_92286727 (chr11: 92,286,727 bp), S20_34180196 (chr20: 34,180,196 bp), SY_1597301 (chrY: 1,597,301 bp), and SY_25036649 (chrY: 25,036,649 bp). We calculated the PVE values for these five SNPs and produced a radar chart showing the proportion of phenotypic variance explained by each SNP for CCB1 and CC. Most SNPs explained a higher proportion of phenotypic variance for CC than for CCB1. SNP S20_34180196 explained the highest proportion of the phenotypic variance for both traits, reaching 100% after rescaling to the maximum PVE (Fig. 4B).
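The 100% figure follows from the PVE transformation described in the Methods, in which each PVE value is divided by the maximum PVE. A minimal sketch of that rescaling is given below. The function name `rescale_pve` and the input values are hypothetical (the values are not taken from the study), and the sketch assumes that the rescaling is done separately within each trait, which the text does not state explicitly.

```python
def rescale_pve(pve_by_snp):
    """Express each PVE as a percentage of the largest PVE in the same set."""
    max_pve = max(pve_by_snp.values())
    return {snp: 100.0 * value / max_pve for snp, value in pve_by_snp.items()}


# Hypothetical raw PVE values for one trait (percent of phenotypic variance).
raw_pve = {"snp_a": 1.5, "snp_b": 3.0, "snp_c": 0.75}
scaled = rescale_pve(raw_pve)
# scaled == {"snp_a": 50.0, "snp_b": 100.0, "snp_c": 25.0}
```

On this reading, a value of 100% on the radar chart marks the SNP with the largest PVE in the comparison; it does not mean that the SNP explains all of the phenotypic variance.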
Genomic Selection

Phenotypic analysis revealed that the second highest correlation, between CC and CCB1, was 0.76. In the first GWAS, the average correlations between the observed and predicted values for CC and CCB1 were 0.743 and 0.512, respectively. In the second GWAS, the average correlation coefficients were 0.781 and 0.596, respectively. The average correlation for each trait in the second GWAS was greater than that in the first (Fig. 5). We also conducted genomic selection for the two traits most strongly correlated phenotypically (BL and BCL), and the results were consistent with those for CC and CCB1 (Fig. S5). These results demonstrate that a GS method combining marker-assisted selection with best linear unbiased prediction improves the prediction accuracy for body size traits. Such methods are widely applied in breeding, aiding the prediction of complex traits and accelerating genetic improvement.

Discussion

We identified candidate genes associated with body size traits in yaks using GWAS, providing insights into the growth mechanisms of yaks in markedly different ecological environments. Yushu yaks are mainly distributed in the high-altitude regions of the Yushu Tibetan Autonomous Prefecture in Qinghai Province; Maiwa yaks inhabit the cold grasslands near Maiwa Township in Hongyuan County, in the Aba Tibetan and Qiang Autonomous Prefecture of Sichuan Province; and Huanhu yaks are found around the Qinghai Lake region in Qinghai Province. These yaks thus inhabit various high-altitude grassland ecosystems in Qinghai and Sichuan, which in turn results in notable differences in body size traits such as BL and BH.

Notably, measuring yak body size traits is challenging owing to their relatively volatile temperament and defensiveness. The large amount of accurate phenotypic data obtained through computer vision recognition technology provides a reliable foundation for GWAS.
Previous research has reported that the accuracy of body size estimates obtained using image recognition technology ranges from a minimum of 70% to as high as 94%-95% [19, 20]. This contact-free measurement approach helps prevent handling stress in the animals and improves repeatability, which is typically difficult to achieve with conventional measurement methods. In addition, computer vision recognition technology can help standardize the measurement of body size traits, thereby reducing the errors associated with manual measurement and improving the efficiency and precision of large-scale data collection.

After obtaining the phenotypic data, we recorded the mean and standard deviation of the yak body size traits by age group in the Additional file (Table 1). The yaks selected for this study were grazing yaks with various birth dates. As there were few individuals aged several months, 1 year, or 2 years, these were grouped into one age category, which resulted in a slightly larger standard deviation for this age group. Table S2 shows that the overall age of the yaks has a stronger impact on these traits than their specific age in months, as evidenced by the decrease in standard deviation for most traits as the yaks grow older. However, traits such as ML had a weaker relationship with age; therefore, their standard deviations did not decrease with age.

Table 1 Significant SNP and candidate gene information.

SNP No.*        Chr   Position (bp)   Gene ID               Gene Name
S3_97322275     3     97322275        ENSBGRG00000000593    PRKAA2
S10_106316722   10    106316722       ENSBGRG00000026688    SNX9

* SNP No. indicates the sequence number in the entire tag list. Chromosomes and locations refer to physical location information in genomic data. The gene names are annotated from the GTF file of the Bosgru_v3.0 reference genome.
In the GWAS, we used five models to jointly detect significant SNP loci, with the aim of enhancing the robustness and reliability of the results through a multi-model approach. If a locus is identified as significant by multiple models, we can be more confident that it is associated with the target trait. Nevertheless, multi-model results cannot serve as absolute validation, because each model has its own false discovery rate. The purpose of using multiple models is to integrate their different perspectives and increase the likelihood that detected loci are truly associated, not to exclude loci identified by a single model, which may also be genuinely associated with the target trait.

The GWAS based on five models identified 94 SNP loci significantly associated with seven traits. Using the annotation of the yak reference genome within 1 kb upstream and downstream of each locus, we obtained relevant annotation information for only two SNPs: S3_97322275 (chr3: 97,322,275 bp) was associated with PRKAA2, and S10_106316722 (chr10: 106,316,722 bp) with SNX9 (Sorting Nexin 9). PRKAA2 is a member of the adenosine monophosphate-activated protein kinase (AMPK) family. AMPKs are heterotrimeric proteins that mainly sense the energy status of mammalian cells, regulate the de novo biosynthesis of fatty acids and cholesterol, and play a key role in cellular energy metabolism [21, 22]. Bovine PRKAA2 was mapped to chromosome 3 using a somatic cell hybrid panel [23]. A study of PRKAA2 in Pakistani Nili-Ravi and Kundi buffaloes revealed 17 SNP loci that may be associated with energy metabolism and production traits [24]. In a Xiangsu hybrid pig population, three SNPs correlated with body size traits were identified in PRKAA2 [25].
In Yorkshire pigs, BTA9, an olfactory receptor SNP in SNX9, is associated with growth traits [18], and in Simmental beef cattle, SNX9 regulates body size at three growth stages [26].

SNP loci significantly associated with related traits were used to predict the trait of interest in the GS. When BL was the trait of interest, no SNP loci significantly associated with BCL were identified. As a result, the multi-trait GS could not be implemented, and its predictive accuracy was the same as that of the single-trait GS. We therefore randomly masked 20% of the BCL data and repeated the GWAS; significant SNP loci were then identified, which improved the predictive accuracy of the multi-trait GS (see Additional file: Fig. S5).

The results demonstrated that the predictive accuracy of the multi-trait GS was higher than that of the single-trait GS. Phenotypic correlations are computed from phenotypic data, but the underlying reason for them is the presence of genetic correlations. These genetic correlations arise from common genes that may have different effects on different traits. For instance, a gene may have a significant effect on one trait and a non-significant effect on another, yet still play a role in both. Thus, even if the effect of such genes on a given trait is not significant, including them in the analysis can still contribute predictive information.

Conclusions

Phenotypic values obtained through image recognition measurements can be treated as traits and used in GWAS and GS. The "Qingxin-1" chip is particularly useful for key gene discovery and breeding applications in yaks. Multiple-trait GS involving common genes from related traits can enhance the prediction accuracy of important traits.
These findings provide foundational information for the localization of quantitative trait genes and candidate genes associated with the mechanisms underlying yak body size trait formation.
Declarations
Acknowledgements
We are grateful to Longri Breeding and Storage Farm in Hongyuan city, Huangheyuan Farm in Qumalai city, and Yak Farm in Gangcha city for providing the yaks. We also extend our appreciation to Editage (www.editage.cn) for their assistance with English language editing.
Authors' contributions
JZ: data curation, writing of the first draft of the manuscript, visualization, testing, methodology, supervision, and validation. ZL, XL, YL, BY, HW, MZ, WP, SS, JZ, YK, XY, and GW: sample collection. DL and JW: software, conceptualization, and manuscript revision. All authors read and approved the final manuscript.
Funding
This work was supported by the National Key Research and Development Project of China (2022YFD1601601); the Qinghai Science and Technology Program, China (2022-NK-110); the Heilongjiang Province Key Research and Development Project, China (2022ZX02B09); the Fundamental Research Funds for the Central Universities, China (Southwest Minzu University, YCZD2024006); and the Program of Chinese National Beef Cattle and Yak Industrial Technology System, China (CARS-37).
Competing interests
The authors declare no competing interests.
Ethics Approval and Consent to Participate
All animal procedures were approved by the Animal Ethics and Welfare Association of Southwest Minzu University (Approval No. 16053). All experimental protocols adhered to relevant institutional and national guidelines, and reporting complies with the ARRIVE guidelines.
Euthanasia and Sample Collection Procedures
Sample collection (blood, ear tissue, and hair) was conducted using standard, non-lethal, and minimally invasive techniques to ensure animal welfare. Yaks were restrained using conventional livestock handling methods without anesthesia.
Blood was drawn via jugular venipuncture using sterile equipment, while ear tissue and hair were collected through ear notching and hair plucking methods routinely employed in livestock genetic research. All procedures were performed by trained personnel under veterinary oversight. No animals were euthanized; all were returned to their herds after sampling.
Data Availability
The datasets generated and/or analyzed during the current study are publicly available at: https://github.com/jiahongZ-l.
References
1. Qiu Q, Zhang G, Ma T, et al. The yak genome and adaptation to life at high altitude. Nat Genet. 2012;44(8):946–9.
2. Hayes B. Genome-Wide Association Studies and Genomic Prediction. New York: Springer Science + Business Media; 2013. pp. 149–69.
3. Dimensionality M, Pan Q, Hu T, Moore JH. Genome-Wide Association Studies and Genomic Prediction. Totowa, NJ: Springer Science + Business Media; 2013. p. 1019.
4. Wang H, Li H, Li J, Zhang Z, Zhao S. Genome-wide association study of growth and meat quality traits in yaks (Bos grunniens). BMC Genomics. 2020;21(1):574.
5. de Los Campos G, Vazquez AI, Fernando R, Klimentidis YC, Sorensen D. Prediction of complex human traits using the genomic best linear unbiased predictor. PLoS Genet. 2013;9(7):e1003608. doi:10.1371/journal.pgen.1003608.
6. Gusev A, et al. Quantifying missing heritability at known GWAS loci. PLoS Genet. 2013;9(12):e1003993. doi:10.1371/journal.pgen.1003993.
7. Zhang LN, Wu BP, Jiang XH, et al. Development and validation of a visual image analysis for monitoring the body size of sheep. J Appl Anim Res. 2018;46(1):1004–15.
8. Sousa Junior LPB, et al. Genome-wide association and functional genomic analyses for various hoof health traits in North American Holstein cattle. J Dairy Sci. 2023;107(4):2207–30.
9. Wang JB, Zhang ZW. GAPIT Version 3: Boosting power and accuracy for genomic association and prediction. Genomics Proteom Bioinf. 2021;19:629–40.
10. Liu XR, Wang MX, et al. Identification of candidate genes associated with yak body size using a genome-wide association study and multiple populations of information. J Anim Sci. 2023;13:1470.
11. Jia C, Li C, Fu D, et al. Identification of genetic loci associated with growth traits at weaning in yak through a genome-wide association study. Anim Genet. 2020;51(2):300–5.
12. Liu XL, Huang M, Fan B, et al. Iterative usage of fixed and random effect models for powerful and efficient genome-wide association studies. PLoS Genet. 2016;12(2):e1005767. doi:10.1371/journal.pgen.1005767.
13. Huang M, Liu XL, Zhou Y, et al. BLINK: a package for the next level of genome-wide association studies with both individuals and markers in the millions. Gigascience. 2019;8(2):154.
14. Zhang ZW, Ersoz E, Lai CQ, et al. Mixed linear model approach adapted for genome-wide association studies. Nat Genet. 2010;42:355–60.
15. Li LZ, Zheng XF, Wang J, et al. Joint analysis of phenotype-effect-generation identifies loci associated with grain quality traits in rice hybrids. Nat Commun. 2023;14:3930.
16. Segura V, Vilhjálmsson BJ, Platt A, et al. An efficient multi-locus mixed-model approach for genome-wide association studies in structured populations. Nat Genet. 2012;44:825–30.
17. Hardie DG, Carling D. The AMP-activated protein kinase–fuel gauge of the mammalian cell? Eur J Biochem. 1997;246:259–73.
18. Meng Q, Wang K, Liu X, et al. Identification of growth trait-related genes in a Yorkshire purebred pig population by genome-wide association studies. Asian Australas J Anim Sci. 2017;30:462–9.
19. Qin Q, Dai DL, Zhang CY, et al. Identification of body size characteristic points based on the Mask R-CNN and correlation with body weight in Ujumqin sheep. Front Vet Sci. 2022;9:995724.
20. Fernandes AFA, Dórea JRR, Fitzgerald R, et al. A novel automated system to acquire biometric and morphological measurements and predict body weight of pigs via 3D computer vision. J Anim Sci. 2019;97(1):496–508.
21. Lee JH, Koh H, Kim M, et al. Energy-dependent regulation of cell structure by AMP-activated protein kinase. Nature. 2007;447:1017–21.
22. Hwang SL, Chang HW. Natural vanadium-containing Jeju ground water stimulates glucose uptake through the activation of AMP-activated protein kinase in L6 myotubes. Mol Cell Biochem. 2012;360:401–9.
23. McKay SD, White SN, Kata SR, et al. The bovine 5' AMPK gene family: Mapping and single nucleotide polymorphism detection. Mamm Genome. 2003;14:853–8.
24. Khan WA, Hussain T, Babar ME, et al. Polymorphic status of PRKAA2 gene in Pakistani buffaloes. Int J Agric Biol. 2015;18:903–5.
25. Xu J, Ruan Y, Sun J, et al. Association analysis of PRKAA2 and MSMB polymorphisms and growth traits of Xiangsu hybrid pigs. Genes. 2023;14(1):113.
26. An B, Xu L, Xia J, et al. Multiple association analysis of loci and candidate genes that regulate body size at three growth stages in Simmental beef cattle. BMC Genet. 2020;21:32.
Additional Declarations
No competing interests reported.
Supplementary Files
Additionalfile.docx
Reviewers agreed at journal: 08 May, 2025
Reviewers invited by journal: 05 May, 2025
Editor assigned by journal: 28 Apr, 2025
Editor invited by journal: 23 Apr, 2025
Submission checks completed at journal: 22 Apr, 2025
First submitted to journal: 22 Apr, 2025
Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-6282023","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":452664926,"identity":"b2ae83c8-f30f-4089-8c56-bf79716dae4d","order_by":0,"name":"Jiahong Zhao","email":"","orcid":"","institution":"Ministry of Education and Sichuan Province, Southwest Minzu University","correspondingAuthor":false,"prefix":"","firstName":"Jiahong","middleName":"","lastName":"Zhao","suffix":""},{"id":452664927,"identity":"18a949d0-1e97-48dc-aa61-bd3944cc725a","order_by":1,"name":"Zemin Li","email":"","orcid":"","institution":"Ministry of Education and Sichuan Province, Southwest Minzu University","correspondingAuthor":false,"prefix":"","firstName":"Zemin","middleName":"","lastName":"Li","suffix":""},{"id":452664928,"identity":"0ba1f326-d616-4db8-910c-1d0b3838ef59","order_by":2,"name":"Xinrui Liu","email":"","orcid":"","institution":"Ministry of Education and Sichuan Province, Southwest Minzu University","correspondingAuthor":false,"prefix":"","firstName":"Xinrui","middleName":"","lastName":"Liu","suffix":""},{"id":452664929,"identity":"5dfcbb54-3d67-406b-9b6b-46732f979b07","order_by":3,"name":"Yaxin Liu","email":"","orcid":"","institution":"Ministry of Education and Sichuan Province, Southwest Minzu University","correspondingAuthor":false,"prefix":"","firstName":"Yaxin","middleName":"","lastName":"Liu","suffix":""},{"id":452664930,"identity":"230b75c6-1148-4347-aac8-95bd1329f0ea","order_by":4,"name":"Binglin Yue","email":"","orcid":"","institution":"Ministry of Education and Sichuan Province, Southwest Minzu 
University","correspondingAuthor":false,"prefix":"","firstName":"Binglin","middleName":"","lastName":"Yue","suffix":""},{"id":452664931,"identity":"85968ed4-61ce-437d-9a3b-6f60a3900b45","order_by":5,"name":"Hui Wang","email":"","orcid":"","institution":"Ministry of Education and Sichuan Province, Southwest Minzu University","correspondingAuthor":false,"prefix":"","firstName":"Hui","middleName":"","lastName":"Wang","suffix":""},{"id":452664932,"identity":"8eb5d9d3-a086-45db-9964-13c24ad008b7","order_by":6,"name":"Ming Zhang","email":"","orcid":"","institution":"Ministry of Education and Sichuan Province, Southwest Minzu University","correspondingAuthor":false,"prefix":"","firstName":"Ming","middleName":"","lastName":"Zhang","suffix":""},{"id":452664933,"identity":"ff7ee097-7709-4fb7-813d-5e4299aa2183","order_by":7,"name":"Wei Peng","email":"","orcid":"","institution":"Qinghai University","correspondingAuthor":false,"prefix":"","firstName":"Wei","middleName":"","lastName":"Peng","suffix":""},{"id":452664934,"identity":"877e965a-05e4-47c1-a2c7-6cb74843b918","order_by":8,"name":"Shi Shu","email":"","orcid":"","institution":"Qinghai University","correspondingAuthor":false,"prefix":"","firstName":"Shi","middleName":"","lastName":"Shu","suffix":""},{"id":452664935,"identity":"364ed2e9-5ae8-4214-b91a-c0aa0f7e6d98","order_by":9,"name":"Guowen Wang","email":"","orcid":"","institution":"Qinghai University","correspondingAuthor":false,"prefix":"","firstName":"Guowen","middleName":"","lastName":"Wang","suffix":""},{"id":452664936,"identity":"410abfb5-a40d-4998-aa7e-2dc71e97d370","order_by":10,"name":"Jincheng Zhong","email":"","orcid":"","institution":"Ministry of Education and Sichuan Province, Southwest Minzu University","correspondingAuthor":false,"prefix":"","firstName":"Jincheng","middleName":"","lastName":"Zhong","suffix":""},{"id":452664937,"identity":"d6f5af04-eb02-4bb1-88b2-72c5350a67a0","order_by":11,"name":"Yixi Kangzhu","email":"","orcid":"","institution":"Ministry 
of Education and Sichuan Province, Southwest Minzu University","correspondingAuthor":false,"prefix":"","firstName":"Yixi","middleName":"","lastName":"Kangzhu","suffix":""},{"id":452664938,"identity":"82eefb70-add3-4338-a886-8878110dbb03","order_by":12,"name":"Xinjia Yan","email":"","orcid":"","institution":"Ministry of Education and Sichuan Province, Southwest Minzu University","correspondingAuthor":false,"prefix":"","firstName":"Xinjia","middleName":"","lastName":"Yan","suffix":""},{"id":452664939,"identity":"9de58bde-a54a-4a33-943c-065e45bd54c6","order_by":13,"name":"Daoliang Lan","email":"","orcid":"","institution":"Ministry of Education and Sichuan Province, Southwest Minzu University","correspondingAuthor":false,"prefix":"","firstName":"Daoliang","middleName":"","lastName":"Lan","suffix":""},{"id":452664940,"identity":"3c9663b7-0089-4dda-bf79-1e64eb598bbc","order_by":14,"name":"Jiabo Wang","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAABDUlEQVRIiWNgGAWjYLACCQOGBAYGxoYPQLYcG3v7AaK1NM4Aso35eM4kEGURSBUjSEviPAkHA7xKDY6fPfzCosAuj18iubHh447a9DYJoP4fFdtwazmTl2YhYZBcLDkjsbFx5pnjuW3SjQcYe87cxqnF7ECOmYGEAXPihhuJ7Y95247ltskcSGBmbMOj5fwbkJZ6kJbG5r9tx9LZJBIM8Gu5kWP8QMLgMEQLY1tNAkEt9jfemAED+XixZM/DxsbetgOGbcBAPojPL5L9OcafJf5U5/Gzpz9s+NlWJy/f3n7wwY8K3FqAgE1aAkQJJIDIw2ChA/jUAwHzR1BCYeAHq6sjoHgUjIJRMApGIgAAQuNhDokIagsAAAAASUVORK5CYII=","orcid":"","institution":"Ministry of Education and Sichuan Province, Southwest Minzu University","correspondingAuthor":true,"prefix":"","firstName":"Jiabo","middleName":"","lastName":"Wang","suffix":""}],"badges":[],"createdAt":"2025-03-22 
07:23:12","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-6282023/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-6282023/v1","draftVersion":[],"editorialEvents":[{"content":"https://doi.org/10.1186/s12864-025-12017-7","type":"published","date":"2025-09-26T15:58:01+00:00"}],"editorialNote":"","failedWorkflow":false,"files":[{"id":82359291,"identity":"c5ab806b-fc33-4473-8876-3f4ead7977fd","added_by":"auto","created_at":"2025-05-09 11:29:36","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":275927,"visible":true,"origin":"","legend":"\u003cp\u003eThe phenotypic distribution and correlation of bodysize traits in three main yak breeds.\u003c/p\u003e\n\u003cp\u003eThe diagonal line from top left to bottom right is a distribution map of the 7 traits, represented as a density map, with the vertical axis showing the frequency density of the observed values and the horizontal axis showing the observed values. The correlation between two different traits is represented at the top of the diagonal line, and the observed values of each yak individual in two different traits are represented by scatter plots at the bottom of the diagonal line.\u003c/p\u003e","description":"","filename":"image1.png","url":"https://assets-eu.researchsquare.com/files/rs-6282023/v1/ec35b9cdeaaf343c1e4ac424.png"},{"id":82357208,"identity":"85ae7af6-dcec-4278-b399-679d8af90c47","added_by":"auto","created_at":"2025-05-09 11:21:36","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":124193,"visible":true,"origin":"","legend":"\u003cp\u003eThe population stratification and individual relationship.\u003c/p\u003e\n\u003cp\u003eAll SNP markers were used to generate NJ-tree of 706 yaks using GAPIT software (A).A heat map of the kinship matrix is created to indicate the relationship between individuals (B).Population structure explained by 3D plots of principal component (PC) using all SNP 
markers (C-D).\u003c/p\u003e","description":"","filename":"image2.png","url":"https://assets-eu.researchsquare.com/files/rs-6282023/v1/93f246f63640ee16f38bfbbf.png"},{"id":82359292,"identity":"49573cc0-f972-4415-9b35-c3c5f4a7b573","added_by":"auto","created_at":"2025-05-09 11:29:36","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":184500,"visible":true,"origin":"","legend":"\u003cp\u003eManhattan and quantile-quantile plots of the p-values for the genome-wide association study of BL and CC of yaks based on five methods.\u003c/p\u003e\n\u003cp\u003eCircular Manhattan (A), and quantile–quantile plots (B) of five models in detecting body length traits in yaks. From the inner ring to the outer ring are successively general linear models(GLM)、mixed linear models(MLM)、Multiple Loci Mixed Model (MLMM)、the Fixed And Random Model Circulating Probability Unification (FarmCPU) and BLINK, the outermost ring indicates the labeling density of this chromosome. Circular Manhattan (C), and quantile–quantile plots (D) of five models in detecting chest circumference traits in yaks.\u003c/p\u003e","description":"","filename":"image3.png","url":"https://assets-eu.researchsquare.com/files/rs-6282023/v1/f4a4301ce3e880f4a9b789bd.png"},{"id":82360119,"identity":"4d2b71ec-4298-45e4-9a30-672db23d0ba7","added_by":"auto","created_at":"2025-05-09 11:37:36","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":177820,"visible":true,"origin":"","legend":"\u003cp\u003eA radar chart of the common significant locus in both CC and CCB1 traits.\u003c/p\u003e\n\u003cp\u003eThe long solid line indicated the common significant markers were detected by general linear models(GLM)、mixed linear models(MLM)、Multiple Loci Mixed Model (MLMM)、the Fixed And Random Model Circulating Probability Unification (FarmCPU) and BLINK in chest circumference and circumference of the cannon bone one. 
The short solid line indicated the common significant markers were detected by more than two models in a trait(A).Proportion of genetic variation in chest circumference and circumference of the cannon bone one explained by different significance SNP loci(B).\u003c/p\u003e","description":"","filename":"image4.png","url":"https://assets-eu.researchsquare.com/files/rs-6282023/v1/d14c2274d9eed360af9a2b5f.png"},{"id":82357214,"identity":"1550936a-8230-4e67-8689-81a158699d4a","added_by":"auto","created_at":"2025-05-09 11:21:36","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":47909,"visible":true,"origin":"","legend":"\u003cp\u003eSingle and Multiple Trait Model Accuracy for Body Length and Body Canted Length in Yaks.\u003c/p\u003e\n\u003cp\u003eThe green bars on the left represent the correlations between observed and predicted observations calculated after performing GWAS and GS for the body length trait using the BLINK and linear models. The red bars represent the correlations obtained from the second GWAS and GS, using significant genetic markers identified from the initial GWAS as fixed effects(A). 
There is a strong correlation between body length and body canted length, The two bars on the right represent the correlations obtained from the two separate GWAS and GS for the body canted length trait(B)\u003cstrong\u003e.\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"image5.png","url":"https://assets-eu.researchsquare.com/files/rs-6282023/v1/f6bb76bd9eb1fc5810021cde.png"},{"id":92430641,"identity":"44a37328-96e5-485c-b03a-e4dc96397059","added_by":"auto","created_at":"2025-09-29 16:07:13","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1484846,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-6282023/v1/685438e7-59ef-47a3-823e-727088247789.pdf"},{"id":82357211,"identity":"ed9eb88f-744c-4669-95a0-bd7459e2d17e","added_by":"auto","created_at":"2025-05-09 11:21:36","extension":"docx","order_by":0,"title":"","display":"","copyAsset":false,"role":"supplement","size":2215281,"visible":true,"origin":"","legend":"","description":"","filename":"Additionalfile.docx","url":"https://assets-eu.researchsquare.com/files/rs-6282023/v1/8b5d8bb2957d1960d00d983f.docx"}],"financialInterests":"No competing interests reported.","formattedTitle":"Large-Scale Genome-Wide Association Analysis Reveals Candidate Genes in Yak Body Size Traits","fulltext":[{"header":"Background","content":"\u003cp\u003eThe yak is a versatile livestock animal that is bred on the Qinghai-Tibet Plateau. It provides essential meat, milk, and other necessities of life for pastoralists, serving as a primary mode of transportation in the highlands [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e]. Yaks are essential for people inhabiting the Qinghai-Tibet Plateau and for ecological environment. 
The Maiwa, Huanhu, and Yushu yaks are the three primary breeds, distinguished by their unique body size characteristics and genetic backgrounds. The Maiwa yak is primarily distributed in the Maiwa region of the Aba Tibetan and Qiang Autonomous Prefectures in Sichuan Province. It can be easily distinguished by large body size and substantial meat production, exhibits remarkable adaptability, thriving under the harsh conditions of high-altitude and low-oxygen environments. The Huanhu yak mainly occurs in the areas around Qinghai Lake and is characterized by distinct regional features owing to its unique living environment. Although smaller in size, the Huanhu yak is known for its pronounced endurance and ability to thrive on barren pastures. The Yushu yak is predominantly found in the Yushu Tibetan Autonomous Prefecture in Qinghai province, where is the birthplace source of the Yellow River. Renowned for its cold resistance and high-altitude adaptability, Yushu yak is one of the main livestock breeds for local herders. Further, the Yushu yak has excellent disease resistance and adaptability, and its body size traits, such as weight and length, significantly affect its environmental adaptability. In recent years, with the advancement of genetic research, body size traits in yaks, such as height, and length, have been shown to be complex traits controlled by multiple genes. Exploring the genetic basis of these traits is important for breeding and production.\u003c/p\u003e \u003cp\u003eYak body size is closely related to productivity, adaptability, and labor capabilities. First, yak body size is a significant economic trait. Yaks with larger body sizes may possess higher potential for meat and milk production, thereby increasing economic returns for farmers. Second, yaks with larger body sizes may be better adapted to harsh environmental conditions, such as cold climates and high-altitude areas, thereby enhancing their survival capabilities. 
Moreover, larger body sizes may indicate stronger physical structures, aiding them in undertaking heavier labor tasks, such as plowing or transportation. Therefore, in some cases, by selectively breeding and cultivating yaks with excellent body sizes, their productivity, adaptability, and labor capabilities can be enhanced, resulting in greater profits and benefits for the livestock industry.\u003c/p\u003e \u003cp\u003eGenome-wide association studies (GWAS) test correlations of whole-genome single-nucleotide polymorphisms (SNPs) identified through high-throughput sequencing with specific traits. This approach involves using different models to estimate the effects of the detected loci and then statistically testing the estimated effects [\u003cspan additionalcitationids=\"CR3\" citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e]. In essence, this method utilizes the linkage disequilibrium (LD) across the whole genome to identify genes influencing complex traits or phenotypes [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e, \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. With regard to yak body size traits, GWAS provides a precise and efficient method for genetic improvement and livestock production, with broad application prospects in breeding selection. Such analyses can help breeders identify key genes related to yak body size traits, better understand the genetic basis of yak body size traits, and improve herd productivity and economic benefits through genomic selection.\u003c/p\u003e \u003cp\u003eThis study utilized the yak gene chip \"Qingxin-1\" to scan the main breeds of the Qinghai-Tibet Plateau, including Maiwa, Yushu, and Huanhu yaks. Using the aid of image recognition technology yak body size information was rapidly obtained. 
Through combining phenotype and genotype with single and multiple GWAS algorithms, we identified the candidate markers and functional genes associated with body size traits in these three breeds. Based on the common gene between interesting trait and relative trait, Multiple Trait GS model with common genes were used to predict individuals\u0026rsquo; genome Estimated Breeding Values. In addition, understanding of these functional genes and individuals assessment of breeding potential ability will help improve genetic progress This study provides a scientific and theoretical basis for improving the breeding of yak varieties on the Qinghai-Tibet Plateau and the development of the yak industry.\u003c/p\u003e"},{"header":"Methods","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003eSamples and phenotypic collection\u003c/h2\u003e \u003cp\u003eThe animal samples were comprised of 826 yaks from the three most widely distributed breeds, including 307 Maiwa yaks from the Longri Livestock Farm in Sichuan Province and the Qinghai-Tibet Plateau Base of Southwest Minzu University, 336 Yushu yaks from the Yellow River Source Yak Breeding Farm, and 183 Huanhu yaks from the Gonghe Country Yak Cooperative. A total of 706 blood and tissue samples(276 Maiwa, 277 Yushu and 153 Huanhu yaks) from all individuals were collected to sequence genotype. Due to missing records of age and gender, only 526 records(131 Maiwa, 242 Yushu and 153 Huanhu yaks) were retained. Hence, 526 yaks with complete phenotypic, genotypic, age and gender data were remained for GWAS and GS steps.\u003c/p\u003e \u003cp\u003eGrowth traits were obtained using computer vision recognition technology [\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e]. Images of yaks were captured by three angles, including a front and two profiles. Using image recognition model, the key bone position information was identified. 
The distance between key bones was defined as the body size length. The traits were primarily included the phenotypic values of yak body length (BL; cm), body height (BH; cm), mouth length (ML; cm), body canted length (BCL; cm), chest circumference (CC; cm), and circumference of cannon bone one and two (CCB1and CCB2, respectively; cm). Anomalies in the data were processed after aggregation. To precisely define the measurement criteria for each yak body trait, terms such as scapula, caudal vertebrae, and olecranon can be used. BL refers to the distance from the tip of the nose to the base of the caudal vertebrae, excluding the tail. BH was measured vertically from the top of the scapula to the ground. ML was the distance from the tip of the nose to the olecranon of the forelimb. The BCL was measured diagonally from the top of the scapula to the base of the caudal vertebrae. The CC was measured around the yak\u0026rsquo;s thorax, passing through the widest point behind the scapula. CCB measurements do not refer to the circumference but rather to the diameter of the forelimb below the olecranon, with two values recorded by photography from both the front and side perspectives. The final phenotypic values were defined through the average of the six independently repeated image recognition.\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eGenotype and Sequencing\u003c/h3\u003e\n\u003cp\u003eFirst, blood and tissue samples including ear tissue and hair, were collected from three yak populations. \"Qingxin-1\",a targeted captured breeding chip [\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e], was used to scan the genotypes of 706 yaks. The chip included 30K SNPs, with an average detection rate of over 99.93% for the chip sites in the tested samples. Moreover, quality control was performed to filter SNPs with Minor Allele Frequency (MAF)\u0026thinsp;\u0026lt;\u0026thinsp;0.05 and missing rate\u0026thinsp;\u0026gt;\u0026thinsp;5%. 
The number of SNPs retained in the GWAS was 29,233.\u003c/p\u003e\n\u003ch3\u003ePopulation Structure and Genome-wide Association Analyses\u003c/h3\u003e\n\u003cp\u003eThe integrated analysis capabilities and detailed output results of the Genome-Associated Prediction Integrated Tool (GAPIT) [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e] were used to perform LD analysis and heterozygosity analysis of individuals between markers. Principal component analysis (PCA), neighbor-joining (NJ) trees, and kinship heat maps were calculated and plotted using the GAPIT package in R software.\u003c/p\u003e \u003cp\u003eGWAS was performed using GAPIT software (version 3.0) with two single-loci and three multi-locus models, where PCA, kinship matrix, and yak age and gender data were added as covariates. The P-values of each GWAS model were corrected using Bonferroni correction, and a cutoff of 1e-05 was used to filter significant signals [\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eMultiple-loci mixed models (MLMM) can effectively detect candidate genes related to traits such as body weight, BH, BL, and CC [\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e]. 
The general expressions of MLMM were consistent.\u003cdiv id=\"Equ1\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ1\" name=\"EquationSource\"\u003e\n$$\\:\\text{Y}\\:=\\:{\\mu\\:}\\:+\\:\\text{P}\\text{C}\\text{A}\\:+\\:\\text{p}\\text{s}\\text{e}\\text{u}\\text{d}\\text{o}\\text{Q}\\text{T}\\text{N}\\:+\\:\\text{S}\\text{N}\\text{P}\\text{i}\\:+\\:\\text{K}\\:+\\:\\text{e}\\:\\:\\:$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e1\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003ewhere Y is the collected phenotype data; \u0026micro; is the matrix of yak age and gender; PCA is used to account for population stratification; \u0026#119901;\u0026#119904;\u0026#119890;\u0026#119906;\u0026#119889;\u0026#119900;\u0026#119876;\u0026#119879;\u0026#119873; includes significant markers from previous cycles, initially empty in the first cycle; \u0026#119878;\u0026#119873;\u0026#119875;\u0026#119894; denotes the markers in each test cycle; K represents the kinship matrix among individuals; and e is the random residual vector and obeys \u0026#119890;~\u0026#119873; (0, \u0026#120590;\u0026#119890;\u003csup\u003e2\u003c/sup\u003e)。\u003c/p\u003e \u003cp\u003eThe FarmCPU model can be considered as fixed and random two models. all program was performed with multiple iterates with these two models [\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e]. 
The equations are as follows:\u003cdiv id=\"Equ2\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ2\" name=\"EquationSource\"\u003e\n$$\\:\\text{Y}\\:=\\:{\\mu\\:}\\:+\\:\\text{P}\\text{C}\\text{A}\\:+\\:\\text{p}\\text{s}\\text{e}\\text{u}\\text{d}\\text{o}\\text{Q}\\text{T}\\text{N}\\:+\\:\\text{S}\\text{N}\\text{P}\\text{i}\\:+\\:\\text{e}\\:\\:$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e2\u003c/div\u003e\u003c/div\u003e\u003cdiv id=\"Equ3\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ3\" name=\"EquationSource\"\u003e\n$$\\:\\:\\text{Y}\\:=\\:{\\mu\\:}\\:+\\:\\text{P}\\text{C}\\text{A}\\:+\\:\\text{p}\\text{s}\\text{e}\\text{u}\\text{d}\\text{o}\\text{Q}\\text{T}\\text{N}\\:+\\:\\text{S}\\text{N}\\text{P}\\text{i}\\:+\\:\\text{K}\\:+\\:\\text{e}\\:\\:$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e3\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003eThe parameters in these equations are identical to those in the MLMM. Eq.\u0026nbsp;(\u003cspan refid=\"Equ2\" class=\"InternalRef\"\u003e2\u003c/span\u003e) was used to estimate effect values and \u003cem\u003eP\u003c/em\u003e-values of all SNPs. Eq.\u0026nbsp;(\u003cspan refid=\"Equ3\" class=\"InternalRef\"\u003e3\u003c/span\u003e) was used to select \u0026#119901;\u0026#119904;\u0026#119890;\u0026#119906;\u0026#119889;\u0026#119900;\u0026#119876;\u0026#119879;\u0026#119873;.\u003c/p\u003e \u003cp\u003eThe BLINK method [\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e] employs two separate fixed models: one for estimating \u0026#119901;\u0026#119904;\u0026#119890;\u0026#119906;\u0026#119889;\u0026#119900;\u0026#119876;\u0026#119879;\u0026#119873; and the other for computing marker effects and P-values. 
It replaces the random effects model in FarmCPU with Bayesian methods for instantaneous selection and evaluation of \u0026#119901;\u0026#119904;\u0026#119890;\u0026#119906;\u0026#119889;\u0026#119900;\u0026#119876;\u0026#119879;\u0026#119873;.\u003c/p\u003e \u003cp\u003eIn the association analysis, we first used general linear models and mixed linear models to detect significant SNP loci associated with various yak traits [\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e]. Multi-locus models, such as MLMM, FarmCPU, and BLINK, have advantages over single-locus models in terms of both reducing false positives and increasing statistical power. In multi-locus models, markers are iteratively incorporated as covariates if they are determined to be associated with a trait during single-model GWAS (SM-GWAS); other markers are then analyzed using standard procedures for GWAS. This approach enables the multi-locus model to assess the relationship between the other markers and the trait more accurately, thereby increasing the statistical power and precision of the association analysis. Therefore, we also used multi-locus models to detect the significant SNP loci associated with various yak traits. This completed the first part of the analysis, in which the five models were applied separately to detect significant SNP loci associated with the seven yak traits.\u003c/p\u003e \u003cp\u003eThe second part of the genome-wide association analysis was the multiple-model GWAS (MM-GWAS), which builds upon the first part. During the MM-GWAS, we applied the five models to each trait in sequence, and the results are presented as a Manhattan plot containing the five models for each trait. This process was repeated until all seven traits had been analyzed in the same manner. We then identified and recorded SNP loci that were significant for two traits and located at exactly the same position on the same chromosome. 
Using these loci as key points, we calculated and transformed the phenotypic variance explained (PVE) values. The transformation involved dividing each PVE value by the maximum PVE value and expressing the result as a percentage of that maximum. We created a radar plot to visually compare the performance and impact of different SNP loci on these two traits.\u003c/p\u003e \u003cp\u003eMulti-trait GWAS (MT-GWAS) was the final and most important step. First, we filtered and sorted the genotype files based on the numbering in the first column of the age and sex files. Because phenotypic correlation can reflect genetic correlation to some extent (the higher the phenotypic correlation between traits, the higher the genetic correlation tends to be), we selected the pair of traits with the highest correlation: CCB1 and CC. From the results generated by SM-GWAS, we identified significant loci associated with one of the traits from the three multi-locus models. The genotypes of these significant loci were extracted into three separate files, which were then merged with the age and sex files. These files were then incorporated into the analysis of other traits, ensuring that the model used for MT-GWAS remained consistent with the model used to identify the significant loci.\u003c/p\u003e \u003cdiv id=\"Sec6\" class=\"Section2\"\u003e \u003ch2\u003eGenome Selection\u003c/h2\u003e \u003cp\u003eGS with significant markers was used to validate the accuracy of the GWAS results. This analysis included two parts: single-trait GS and multiple-trait GS. Common genes were used to interpret the genetic relationships between the trait of interest and related traits.\u003c/p\u003e \u003cp\u003eFor the single-trait GS, the phenotypic values for the two yak traits with the highest correlation were separately and randomly divided into five equally sized groups. 
One group for a single trait was randomly selected as the testing population and the remaining groups were designated as the training population. The observed phenotypic values of the testing population were set to \"NA\" [\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e]. The BLINK model was employed to conduct GWAS in GAPIT to train the model using the phenotype data of the training population. The prediction accuracy of the model was evaluated using five-fold cross-validation, followed by a linear model to predict the phenotypic values for the testing population. The above steps were repeated for the other traits.\u003c/p\u003e \u003cp\u003eSubsequently, we performed multiple-trait GS. A standard GWAS using the BLINK model was conducted to identify significant loci associated with both the trait of interest and the related traits. When predicting the trait of interest, significant loci from the related traits were treated as fixed effects in the GWAS and GS. The correlations between the observed and predicted phenotypic values for the trait of interest were computed. Four different correlations were generated to compare the results of the single-trait and multi-trait GS. The terms \"trait of interest\" and \"related trait\" here primarily refer to BL and BCL, respectively, or vice versa. Cross-validation was iterated until all groups of the trait of interest and the related trait had been used as the testing population. The entire process was repeated 20 times, and the Pearson correlation coefficient (r) between the phenotypic values (y) and the predicted genomic estimated breeding values was used as the evaluation metric for prediction accuracy. 
The final accuracy was defined as the average of the 100 resulting correlation values (five folds \u0026times; 20 repetitions).\u003c/p\u003e \u003c/div\u003e"},{"header":"Results","content":"\u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003ePhenotypic Distribution\u003c/h2\u003e \u003cp\u003eWe analyzed seven body size traits in 826 yaks and summarized the descriptive statistics (mean and standard deviation) and heritability for different body size traits [See Additional file: Tables S1 and S2]. Most records of BL ranged from 34 cm to 191 cm. This large range arose because the sample population included both bulls and cows with ages ranging from 2 to 9 years. A BL of 34 cm is likely that of a young cow, whereas a BL of 191 cm would be that of an adult bull. The mean values were 77.1 cm for BL, 86.4 cm for BH, 18.6 cm for ML, 84.1 cm for BCL, 25.9 cm for CC, and 4.5 cm and 4.7 cm for the circumferences of cannon bones one and two, respectively. Except for CCB1 and CCB2, all traits followed a normal distribution. Strong positive correlations were observed between CC and CCB1, whereas CCB2 had moderate negative correlations with CC and CCB1 (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e1\u003c/span\u003e).\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec9\" class=\"Section2\"\u003e \u003ch2\u003eSNP Calling and Population Structure\u003c/h2\u003e \u003cp\u003eA total of 30,000 SNP markers were detected using the yak chip \"Qingxin-1\". Overall, 29,233 SNPs remained after filtering. The MAF ranged from 0.05 to 0.5, and the number of SNPs was highest for MAF values between 0.05 and 0.1. As the MAF increased, the frequency of SNPs gradually decreased. The vast majority of SNPs had an MAF\u0026thinsp;\u0026lt;\u0026thinsp;0.3. Additionally, the heterozygosity of most individuals and SNP markers was low (see Additional file: Fig. 
\u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e). We used r\u0026sup2; as the LD metric because it directly reflects the correlation between SNPs, which makes it suitable for GWAS. The average LD decay distance across the whole genome was 1 kb (see Additional file: Fig. S2). The distribution of heterozygosity, MAF, and r\u0026sup2; across genome-wide marker loci is provided (see Additional file: Fig. S3).\u003c/p\u003e \u003cp\u003eThe sample exhibited a distinct population structure. To thoroughly analyze the population structure of the 706 genotyped yaks, we employed both PCA and NJ tree analysis, which are considered highly effective complementary methods. The NJ tree clustering results indicated that the yak population was divided into three groups. Individuals within each breed showed high genetic similarity, whereas there were significant genetic differences among the three breeds. The NJ tree clustering results presented approximately the same population structure as the PCA (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e2\u003c/span\u003eA). In general, the kinship coefficients for the majority of individuals ranged from 0.1 to 0.25, indicating that the kinship within the yak population was relatively distant. However, the kinship heatmap indicated a close relationship within the Yushu yak population (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e2\u003c/span\u003eB). In the three-dimensional population structure, individuals and populations were clearly separated into three main groups, with individuals within each group clustered together. Notably, the Huanhu yak population clustered tightly, indicating high genetic similarity among these individuals (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e2\u003c/span\u003eC and \u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e2\u003c/span\u003eD). 
The proportions of genetic variance explained by the first three principal components were 2.46%, 1.45%, and 0.84%, respectively.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec10\" class=\"Section2\"\u003e \u003ch2\u003eGWAS and Candidate Genes\u003c/h2\u003e \u003cp\u003eMultiple-locus testing models provide more powerful detection capabilities than single-locus models [\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e]. To identify SNP markers associated with body size, the results of five models (GLM, MLM, MLMM, FarmCPU, and BLINK) were compared after analyzing the genotypic and phenotypic data of 520 yaks. A \u003cem\u003eP\u003c/em\u003e-value of 1.0 \u0026times; 10\u003csup\u003e\u0026minus;\u0026thinsp;5\u003c/sup\u003e was used as the significance threshold. In total, 94 SNPs exceeded this threshold and were associated with at least one of the seven traits. The number of significant SNPs was 13 for BL, 9 for BH, 7 for BCL, 34 for CCB1, 19 for CC, and 12 for CCB2. No significant SNPs were identified for the ML trait. The most significant SNPs for BL, BH, BCL, CCB1, and CCB2 were, respectively, SX_61170346 (chrX: 61,170,346 bp), S14_43229824 (chr14: 43,229,824 bp), S7_114182165 (chr7: 114,182,165 bp), S3_82995824 (chr3: 82,995,824 bp), and S11_15246495 (chr11: 15,246,495 bp). Three loci, S3_82995824 (chr3: 82,995,824 bp), S11_92286727 (chr11: 92,286,727 bp), and SY_1597301 (chrY: 1,597,301 bp), were the most significant SNPs associated with CC. 
No significant SNPs were found on 13 chromosomes (chr2, chr9, chr12, chr16, chr17, chr18, chr19, chr21, chr22, chr23, chr24, chr26, and chr28) for any of the traits evaluated.\u003c/p\u003e \u003cp\u003eAll five methods detected four markers that were located at the same position on the same chromosome and exerted significant effects on the same trait: S3_82995824 (chr3: 82,995,824 bp), S7_114182165 (chr7: 114,182,165 bp), S11_15246495 (chr11: 15,246,495 bp), and SX_61110559 (chrX: 61,110,559 bp) (Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e3\u003c/span\u003eA and \u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e3\u003c/span\u003eC). The markers detected by four methods in the GWAS were S1_32191242 (chr1: 32,191,242 bp), S6_28171330 (chr6: 28,171,330 bp), and S20_34316377 (chr20: 34,316,377 bp). Four markers were jointly detected by three models, and nine markers were detected by two models. The remaining markers were identified using a single GWAS model. The QQ plots for the BL and CC traits were obtained by comparing the observed \u003cem\u003eP\u003c/em\u003e-values from the five GWAS models against the expected \u003cem\u003eP\u003c/em\u003e-values. The QQ plots indicated that the observed and expected values for BL and CC closely matched across the five models (Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e3\u003c/span\u003eB and \u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e3\u003c/span\u003eD). Figure\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e3\u003c/span\u003e presents the Manhattan and QQ plots for BL and CC, whereas those for the other traits are shown in the Additional file (Fig. S4). Significant SNPs were annotated to candidate genes within the 1 kb LD regions upstream and downstream through BLAST, resulting in a total of 23 annotated genes; detailed information on these genes can be found in the Additional file (Table S3). 
Among these, the protein kinase AMP-activated catalytic subunit alpha 2 (\u003cem\u003ePRKAA2\u003c/em\u003e) and \u003cem\u003eSNX9\u003c/em\u003e genes have previously been reported to be associated with body size traits [\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e, \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eFive SNPs detected through GLM, MLM, MLMM, FarmCPU, and BLINK were associated with more than one trait (CCB1 and CC) (Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e4\u003c/span\u003eA): S3_82995824 (chr3: 82,995,824 bp), S11_92286727 (chr11: 92,286,727 bp), S20_34180196 (chr20: 34,180,196 bp), SY_1597301 (chrY: 1,597,301 bp), and SY_25036649 (chrY: 25,036,649 bp). We calculated the PVE values for these five SNPs and produced a radar chart to reflect the proportion of phenotypic variance explained by each SNP for the CCB1 and CC traits. The results indicated that most SNPs explained a higher proportion of phenotypic variance for CC than for CCB1. Among these, SNP S20_34180196 explained the highest proportion of the phenotypic variance for both traits, reaching 100% on the normalized scale (PVE relative to the maximum PVE) (Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e4\u003c/span\u003eB).\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003eGenome Selection\u003c/h2\u003e \u003cp\u003ePhenotypic analysis revealed that the second highest correlation, between CC and CCB1, was 0.76. In the first GWAS, the average correlations between the observed and predicted values for CC and CCB1 were 0.743 and 0.512, respectively. In the second GWAS, the average correlation coefficients were 0.781 and 0.596, respectively. 
The average correlation for each trait in the second GWAS was greater than that in the first (Fig.\u0026nbsp;\u003cspan refid=\"Fig10\" class=\"InternalRef\"\u003e5\u003c/span\u003e). We also conducted genomic selection for the two most highly correlated traits (BL and BCL), and the results were consistent with those for CC and CCB1 (Fig. S5). These results demonstrate that the GS method combining marker-assisted selection with best linear unbiased prediction improves the prediction accuracy for body size traits. This approach is widely applied in breeding, aiding the prediction of complex traits and accelerating the genetic improvement process.\u003c/p\u003e \u003c/div\u003e"},{"header":"Discussion","content":"\u003cp\u003eWe identified candidate genes associated with body size traits in yaks using a GWAS, providing crucial insights into the growth mechanisms of yaks in significantly different ecological environments. Yushu yaks are mainly distributed in the high-altitude regions of the Yushu Tibetan Autonomous Prefecture in Qinghai province, whereas Maiwa yaks inhabit the cold grasslands near Maiwa Township in Hongyuan County, which is located in the Aba Tibetan and Qiang Autonomous Prefecture of Sichuan province. Huanhu yaks are found around the Qinghai Lake region in Qinghai province. These three breeds inhabit various high-altitude and grassland ecosystems in the Qinghai and Sichuan provinces, which in turn results in notable differences in body size traits, such as BL and BH. Notably, the measurement of yak body size traits presents certain challenges owing to their relatively volatile temperament and defensiveness.\u003c/p\u003e \u003cp\u003eThe large amount of accurate phenotypic data obtained through computer vision recognition technology provides a reliable foundation for GWAS. 
Previous research has reported that the accuracy of body size estimates obtained using image recognition technology ranges from a minimum of 70% to as high as 94\u0026ndash;95% [\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e, \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e]. This contact-free measurement approach helps prevent handling stress in the animals and improves repeatability, which is typically difficult to achieve using conventional measurement methods. In addition, computer vision recognition technology can help standardize the measurement of body size traits, thereby reducing errors associated with manual measurements and improving the efficiency and precision of large-scale data collection. After obtaining the phenotypic data, we recorded the mean and standard deviation of the yak body size traits by age group in the Additional file (Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e). The yaks selected for this study were grazing yaks and had various birth dates. As there were few individuals aged several months, 1 year, or 2 years, these were grouped together in one age category, which resulted in a slightly larger standard deviation for this age group. Table S2 shows that the overall age of the yaks had a stronger impact on these traits than their specific age in months, as evidenced by the decrease in standard deviation values for most traits as the yaks grew older. 
However, traits such as ML had a weaker relationship with age; therefore, their standard deviation values did not show a decreasing trend with age.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eSignificant SNP and candidate gene information.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"5\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSNP No. 
*\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eChr\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003ePosition (bp)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eGene ID\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eGene Name\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eS3_97322275\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e97322275\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eENSBGRG00000000593\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003ePRKAA2\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eS10_106316722\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e10\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e106316722\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eENSBGRG00000026688\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eSNX9\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003ctfoot\u003e \u003ctr\u003e\u003ctd colspan=\"5\"\u003e* SNP No. indicates the sequence number in the entire tag list. Chromosomes and locations refer to physical location information in genomic data. 
The gene names are annotated from the GTF file of the Bosgru_v3.0 reference genome.\u003c/td\u003e\u003c/tr\u003e \u003c/tfoot\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eIn the GWAS, we used five models to jointly detect significant SNP loci associated with traits, with the aim of enhancing the robustness and reliability of the detection results through a multi-model approach. If a particular locus is identified as significant by multiple models, we can be more confident that it is associated with the target trait. Nevertheless, multi-model results cannot serve as an absolute validation method because each model has its own false discovery rate. The purpose of using multiple models is to integrate the perspectives of different models to increase the likelihood that detected loci are truly associated, rather than to exclude entirely the significant loci identified by a single model, which may also be genuinely associated with the target trait. The GWAS results based on five models identified 94 SNP loci that were significantly associated with the body size traits. Based on the annotation of the 1 kb upstream and downstream regions in the yak reference genome, we obtained relevant annotation information for only two SNPs. S3_97322275 (chr3: 97,322,275 bp) was associated with \u003cem\u003ePRKAA2\u003c/em\u003e, and S10_106316722 (chr10: 106,316,722 bp) was associated with \u003cem\u003eSNX9\u003c/em\u003e (sorting nexin 9). Of these genes, \u003cem\u003ePRKAA2\u003c/em\u003e is a member of the adenosine monophosphate-activated protein kinase (\u003cem\u003eAMPK\u003c/em\u003e) family. AMPKs are heterotrimeric proteins that mainly sense the energy state of mammalian cells, regulate the de novo biosynthesis of fatty acids and cholesterol, and play a key role in cell energy metabolism [\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e, \u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e]. 
Bovine \u003cem\u003ePRKAA2\u003c/em\u003e is located on chromosome 3, as shown using a somatic cell hybrid panel [\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e]. A study on \u003cem\u003ePRKAA2\u003c/em\u003e in Pakistani Nili-Ravi and Kundi buffaloes revealed 17 SNP loci, which may be associated with energy metabolism and production traits [\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e]. In a Xiangsu hybrid pig population, three SNPs correlated with body size traits were identified in \u003cem\u003ePRKAA2\u003c/em\u003e [\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e]. In Yorkshire pigs, \u003cem\u003eBTA9\u003c/em\u003e, an olfactory receptor SNP in \u003cem\u003eSNX9\u003c/em\u003e is associated with growth traits [\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e], and in Simmental beef cattle, \u003cem\u003eSNX9\u003c/em\u003e regulates body size at three growth stages [\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eSNP loci significantly associated with related traits were utilized to predict the traits of interest in the GS. When BL was the trait of interest, no SNP loci significantly associated with BCL were identified. As a result, the multi-trait GS could not be implemented, making its predictive accuracy the same as that of the single-trait GS. In light of these results, we randomly set 20% of the BCL data to missing and repeated the GWAS; significant SNP loci were then identified, leading to an improvement in the predictive accuracy of multi-trait GS (see Additional file: Fig. S5). The results demonstrated that the predictive accuracy of the multi-trait GS was higher than that of the single-trait GS. Phenotypic correlations are observed in the phenotypic data, but the underlying reason for this improvement is the presence of genetic correlations. 
These genetic correlations arise from the influence of common genes that may have varying effects on different traits. For instance, certain genes may have a significant impact on one trait, whereas their effect on another trait may not be significant; however, they still play a role. Thus, even if the influence of these genes is not significant for certain traits, including them in the analysis can still contribute predictive capability.\u003c/p\u003e"},{"header":"Conclusions","content":"\u003cp\u003ePhenotype values obtained through image recognition measurements can be considered as traits and used in GWAS and GS. The \"Qingxin-1\" chip is particularly useful for key gene discovery and breeding applications in yaks. Multiple-trait GS involving common genes from related traits can enhance the prediction accuracy of important traits. These findings provide foundational information for the localization of quantitative trait genes and candidate genes associated with yak body size trait formation mechanisms.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eAcknowledgements\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe are grateful to Longri Breeding and Storage Farm in Hongyuan city, Huangheyuan Farm in Qumalai city, and Yak Farm in Gangcha city for providing the yaks. We also extend our appreciation to Editage (www.editage.cn) for their assistance with English language editing.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthor\u003c/strong\u003e\u003cstrong\u003e\u0026rsquo;\u003c/strong\u003e\u003cstrong\u003e\u0026nbsp;contributions\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eJZ: data curation, writing of the first draft of the manuscript, visualization, testing, methodology, supervision, and validation. ZL, XL, YL, BY, HW, MZ, WP, SS, JZ, YK, XY, and GW: sample collection. DL and JW: software, conceptualization, and manuscript revision. 
All authors read and approved the final manuscript.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis work was supported by the National Key Research and Development Project of China (2022YFD1601601); the Qinghai Science and Technology Program, China (2022-NK-110); the Heilongjiang Province Key Research and Development Project, China (2022ZX02B09); the Fundamental Research Funds for the Central Universities, China (Southwest Minzu University, YCZD2024006); and the Program of Chinese National Beef Cattle and Yak Industrial Technology System, China (CARS-37).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare\u0026nbsp;no competing\u0026nbsp;interests.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eEthics Approval and Consent to Participate\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAll animal procedures were approved by the Animal Ethics and Welfare Association of Southwest Minzu University (Approval No. 16053). All experimental protocols adhered to relevant institutional and national guidelines, and reporting complies with the ARRIVE guidelines.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eEuthanasia and Sample Collection Procedures\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eSample collection (blood, ear tissue, and hair) was conducted using standard, non-lethal, and minimally invasive techniques to ensure animal welfare. Yaks were restrained using conventional livestock handling methods without anesthesia. Blood was drawn via jugular venipuncture using sterile equipment, while ear tissue and hair were collected through ear notching and hair plucking methods routinely employed in livestock genetic research. All procedures were performed by trained personnel under veterinary oversight. 
No animals were euthanized; all were returned to their herds after sampling.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eData Availability\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe datasets generated and/or analyzed during the current study are publicly available at: https://github.com/jiahongZ-l.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eQiu Q, Zhang G, Ma T, et al. The yak genome and adaptation to life at high altitude. Nat Genet. 2012;44(8):946\u0026ndash;9.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHayes B. Genome-Wide Association Studies and Genomic Prediction. New York: Springer Science\u0026thinsp;+\u0026thinsp;Business Media; 2013. pp. 149\u0026ndash;69.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDimensionality M, Pan Q, Hu T, Moore JH. Genome-Wide Association Studies and Genomic Prediction. Totowa, NJ: Springer Science\u0026thinsp;+\u0026thinsp;Business Media; 2013. p. 1019.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWang H, Li H, Li J, Zhang Z, Zhao S. Genome-wide association study of growth and meat quality traits in yaks (Bos grunniens). BMC Genomics. 2020;21(1):574.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ede Los Campos G, Vazquez AI, Fernando R, Klimentidis YC, Sorensen D. Prediction of complex human traits using the genomic best linear unbiased predictor. PLoS Genet. 2013;9(7):e1003608. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1371/journal.pgen.1003608\u003c/span\u003e\u003cspan address=\"10.1371/journal.pgen.1003608\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGusev A, et al. Quantifying missing heritability at known GWAS loci. PLoS Genet. 2013;9(12):e1003993. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1371/journal.pgen.1003993\u003c/span\u003e\u003cspan address=\"10.1371/journal.pgen.1003993\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang LN, Wu BP, Jiang XH, et al. Development and validation of a visual image analysis for monitoring the body size of sheep. J Appl Anim Res. 2018;46(1):1004\u0026ndash;15.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSousa Junior LPB, et al. Genome-wide association and functional genomic analyses for various hoof health traits in North American Holstein cattle. J Dairy Sci. 2023;107(4):2207\u0026ndash;30.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWang JB, Zhang ZW. GAPIT Version 3: Boosting power and accuracy for genomic association and prediction. Genomics Proteom Bioinf. 2021;19:629\u0026ndash;40.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLiu XR, Wang MX, et al. Identification of candidate genes associated with yak body size using a genome-wide association study and multiple populations of information. J Anim Sci. 2023;13:1470.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eJia C, Li C, Fu D, et al. Identification of genetic loci associated with growth traits at weaning in yak through a genome-wide association study. Anim Genet. 2020;51(2):300\u0026ndash;5.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLiu XL, Huang M, Fan B, et al. Iterative usage of fixed and random effect models for powerful and efficient genome-wide association studies. PLoS Genet. 2016;12(2):e1005767. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1371/journal.pgen.1005767\u003c/span\u003e\u003cspan address=\"10.1371/journal.pgen.1005767\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHuang M, Liu XL, Zhou Y, et al. BLINK: a package for the next level of genome-wide association studies with both individuals and markers in the millions. Gigascience. 2019;8(2):154.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang ZW, Ersoz E, Lai CQ, et al. Mixed linear model approach adapted for genome-wide association studies. Nat Genet. 2010;42:355\u0026ndash;60.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLi LZ, Zheng XF, Wang J, et al. Joint analysis of phenotype-effect-generation identifies loci associated with grain quality traits in rice hybrids. Nat Commun. 2023;14:3930.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSegura V, Vilhj\u0026aacute;lmsson BJ, Platt A, et al. An efficient multi-locus mixed-model approach for genome-wide association studies in structured populations. Nat Genet. 2012;44:825\u0026ndash;30.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHardie DG, Carling D. The AMP-activated protein kinase\u0026ndash;fuel gauge of the mammalian cell? Eur J Biochem. 1997;246:259\u0026ndash;73.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMeng Q, Wang K, Liu X, et al. Identification of growth trait-related genes in a Yorkshire purebred pig population by genome-wide association studies. Asian Australas J Anim Sci. 2017;30:462\u0026ndash;9.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eQin Q, Dai DL, Zhang CY, et al. Identification of body size characteristic points based on the Mask R-CNN and correlation with body weight in Ujumqin sheep. Front Vet Sci. 
2022;9:995724.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFernandes AFA, D\u0026oacute;rea JRR, Fitzgerald R, et al. A novel automated system to acquire biometric and morphological measurements and predict body weight of pigs via 3D computer vision. J Anim Sci. 2019;97(1):496\u0026ndash;508.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLee JH, Koh H, Kim M, et al. Energy-dependent regulation of cell structure by AMP-activated protein kinase. Nature. 2007;447:1017\u0026ndash;21.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHwang SL, Chang HW. Natural vanadium-containing Jeju ground water stimulates glucose uptake through the activation of AMP-activated protein kinase in L6 myotubes. Mol Cell Biochem. 2012;360:401\u0026ndash;9.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMcKay SD, White SN, Kata SR, et al. The bovine 5\u0026rsquo; AMPK gene family: Mapping and single nucleotide polymorphism detection. Mamm Genome. 2003;14:853\u0026ndash;8.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKhan WA, Hussain T, Babar ME, et al. Polymorphic status of PRKAA2 gene in Pakistani buffaloes. Int J Agric Biol. 2015;18:903\u0026ndash;5.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eXu J, Ruan Y, Sun J, et al. Association analysis of PRKAA2 and MSMB polymorphisms and growth traits of Xiangsu hybrid pigs. Genes. 2023;14(1):113.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAn B, Xu L, Xia J, et al. Multiple association analysis of loci and candidate genes that regulate body size at three growth stages in Simmental beef cattle. BMC Genet. 
2020;21:32.\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"bmc-genomics","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"gics","sideBox":"Learn more about [BMC Genomics](http://bmcgenomics.biomedcentral.com/)","snPcode":"","submissionUrl":"https://www.editorialmanager.com/gics","title":"BMC Genomics","twitterHandle":"#BMCGenomics","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"em","reportingPortfolio":"BMC Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"","lastPublishedDoi":"10.21203/rs.3.rs-6282023/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-6282023/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003ch2\u003eBackground\u003c/h2\u003e \u003cp\u003eThe yak is a unique livestock species bred on the Qinghai-Tibet Plateau. We used genotypic data obtained from the yak sequencing chip \"Qingxin-1\" and phenotypic data measured from photographs by converting pixel counts to distances. The primary objective of this study was to conduct genome-wide association studies (GWAS) using five models to analyze seven body size traits. Specifically, the goals were to (1) characterize the genetic structure of three major yak breeds: Maiwa, Yushu, and Huanhu; (2) identify candidate genes that significantly influence yak body size traits; and (3) compare the prediction accuracy of single-trait and multi-trait genomic selection (GS).\u003c/p\u003e\u003ch2\u003eResults\u003c/h2\u003e \u003cp\u003eA total of 94 markers were significantly (\u003cem\u003eP\u003c/em\u003e\u0026thinsp;\u0026lt;\u0026thinsp;1e-05) associated with yak body size traits. GWAS results revealed that \u003cem\u003ePRKAA2\u003c/em\u003e and \u003cem\u003eSNX9\u003c/em\u003e were important candidate genes affecting the body size traits of yaks. 
The GS results indicated that combining marker-assisted selection and best linear unbiased prediction significantly improved the accuracy of predicting body size traits, and the average accuracy in multi-trait GS was higher than that in single-trait GS.\u003c/p\u003e\u003ch2\u003eConclusions\u003c/h2\u003e \u003cp\u003eOur findings provide valuable insights into the genetic architecture underlying yak body size, with implications for the development and selection of yak body size traits. The identification of key genes such as \u003cem\u003ePRKAA2\u003c/em\u003e and \u003cem\u003eSNX9\u003c/em\u003e offers promising targets for breeding programs aimed at optimizing body size traits, thereby supporting genetic improvements in yak populations.\u003c/p\u003e","manuscriptTitle":"Large-Scale Genome-Wide Association Analysis Reveals Candidate Genes in Yak Body Size Traits","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-05-09 11:21:32","doi":"10.21203/rs.3.rs-6282023/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"reviewerAgreed","content":"141835077120731738875197479722189730866","date":"2025-05-08T05:23:33+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2025-05-06T00:47:04+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2025-04-28T13:22:04+00:00","index":"","fulltext":""},{"type":"editorInvited","content":"","date":"2025-04-23T17:41:31+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2025-04-22T04:50:01+00:00","index":"","fulltext":""},{"type":"submitted","content":"BMC Genomics","date":"2025-04-22T04:48:57+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"bmc-genomics","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"gics","sideBox":"Learn more about [BMC Genomics](http://bmcgenomics.biomedcentral.com/)","snPcode":"","submissionUrl":"https://www.editorialmanager.com/gics","title":"BMC Genomics","twitterHandle":"#BMCGenomics","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"em","reportingPortfolio":"BMC Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"afb2a1cb-404d-4385-96a1-906586758997","owner":[],"postedDate":"May 9th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"published-in-journal","subjectAreas":[],"tags":[],"updatedAt":"2025-09-29T16:04:16+00:00","versionOfRecord":{"articleIdentity":"rs-6282023","link":"https://doi.org/10.1186/s12864-025-12017-7","journal":{"identity":"bmc-genomics","isVorOnly":false,"title":"BMC Genomics"},"publishedOn":"2025-09-26 15:58:01","publishedOnDateReadable":"September 26th, 2025"},"versionCreatedAt":"2025-05-09 11:21:32","video":"","vorDoi":"10.1186/s12864-025-12017-7","vorDoiUrl":"https://doi.org/10.1186/s12864-025-12017-7","workflowStages":[]},"version":"v1","identity":"rs-6282023","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-6282023","identity":"rs-6282023","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}