Bridging GWAS to genes: an integrative multi-omics approach using cattle data

Research Article

Mohammad Ghoreishifar, Iona M. Macleod, Tuan Nguyen, Thomas J. Lopdell, and 5 more

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-7693421/v1 This work is licensed under a CC BY 4.0 License. Status: Published. Journal Publication published 14 Jan, 2026 (published version in BMC Genomics). Version 1 posted 10

Abstract

Genome-wide association studies (GWASs) have identified thousands of loci for complex traits, but pinpointing causal variants and linking them to target genes remains challenging. Several strategies have been proposed to address these challenges, e.g., learning across the genome, using larger and multi-breed datasets, multi-trait analyses, leveraging multi-omics data, etc.
We used a multi-breed dataset of over 81,000 cows from Australia, including Holstein, Jersey, and Australian Red, with phenotypes for milk lactose percentage (LP) and imputed sequence genotypes. LD pruning excluded SNPs with r² > 0.95. We used BayesR to estimate SNP effects for LP (~1.1 million SNPs remained after LD pruning); these SNP effects were used to predict local genomic breeding values (GEBVs) for ~400 mammary RNA-sequenced cows from New Zealand. Then, genetic score omics regression (GSOR) was applied to test associations between observed gene expression and local GEBVs, identifying 711 significant genes (FDR ≤ 0.1) out of ~12,000 genes expressed in the mammary gland. We developed a window-based test to investigate the significance of colocalization between GSOR results and GWAS summary statistics obtained from an independent study. We found 30 windows containing both GWAS signals and GSOR-significant genes (34 genes in total), an overlap that was significantly higher than expected by chance (Fisher's exact test P = 2.96×10⁻⁹). Among the 34 genes analyzed, 20 contributed to the significantly enriched gene ontology term ‘transmembrane transport’ and its child terms (FDR < 0.05). These terms are relevant to the physiology of lactose production in the mammary gland. We hypothesized that the 20 genes are the most likely causal genes for the trait because mammary expression of these genes was associated with GEBV for the trait, they were significantly colocalized with GWAS signals, and they were enriched in gene ontology terms relevant to the physiology of the trait. Our approach provides strong evidence for causal genes supported by multiple lines of evidence (GWAS, GSOR, and functional enrichment) and demonstrates the power of multi-trait and multi-omics data integration.
Introduction

Genome-wide association studies (GWASs) have identified numerous genetic variants (such as single nucleotide polymorphisms or SNPs) associated with complex traits, yet the biological mechanisms connecting these variants to phenotypes remain elusive. Determining which variants are truly causal and which genes they affect is complicated by the extensive linkage disequilibrium (LD) surrounding lead SNPs [ 1 ]. In addition, because the majority of GWAS signals fall within noncoding regions of the genome [ 2 ], directly linking these variants to their target genes remains a significant challenge. Furthermore, the intricate regulatory mechanisms of genes, combined with the possibility that multiple genes within a single locus contribute to the trait, make identifying the true causal genes even more challenging [ 3 ]. To address these complexities, integrative approaches such as multi-trait multi-omics fine-mapping can help identify the genes through which SNPs influence quantitative traits.

Many SNPs exhibit pleiotropic effects, influencing various biologically significant complex trait phenotypes [ 4 , 5 ]. Consequently, employing multi-trait fine-mapping techniques that analyse several traits at once can enhance the power of fine-mapping. When examining two traits, one being a complex trait of interest and the other a molecular trait, like the expression of a particular gene [ 6 ], multi-trait fine-mapping resembles colocalization analysis [ 7 ]. This analysis evaluates the genetic connection between the traits by investigating whether they share the same causal variants at a specific locus [ 7 ]. An illustrative case of pleiotropy arises when a SNP associated with a complex trait through GWAS also influences gene expression, thereby acting as both a trait QTL and an expression QTL (eQTL) [ 6 ]. Such SNPs highlight the genetic link between gene expression and phenotypic variation [ 6 ].
In particular, genes through which QTL act can be inferred by integrating multi-omics (or multi-trait) data analyses: (i) trait-associated QTL can be identified from genomic data, allowing the assignment of nearby candidate genes to them; and (ii) genes located near trait-associated QTL whose expression is also correlated with trait variation (i.e., cis-regulated by nearby variants) can be inferred from combined transcriptomic and genomic data [ 6 ].

The functional knowledge of genes serves as another source of evidence for mapping GWAS loci (QTL) to the target genes through which they operate [ 6 , 8 ]. This information, organized in databases like the Gene Ontology (GO) resources [ 9 ], aids in pinpointing genes associated with specific biological functions. The functional enrichment of a subset of identified candidate genes within trait-relevant categories serves as additional evidence that the identified genes are the most likely causal genes [ 6 ].

In this study, we use milk lactose percentage (LP) as a model trait. LP is a useful model for dissecting the genetic architecture of complex traits because lactose synthesis is governed by a relatively simple and well-characterized pathway, mainly involving lactose synthase activity in the mammary gland [ 10 , 11 ]. This feature makes lactose an ideal trait for testing integrative genomic approaches to identify causal genes.

Each of the three types of evidence (proximity of genes to trait-associated SNPs, correlation of gene expression with the phenotype of interest, and functional enrichment of genes in a biological pathway) has limitations and can result in false positives or negatives [ 6 ]. Nonetheless, when multiple methods identify the same genes, our confidence in their biological significance increases.
Expanding on our earlier integrative approach [ 6 ], we now apply this methodology to a different trait using a much larger multi-breed dataset to enhance statistical power and improve the robustness and generalizability of our findings. The objectives of this study are (i) to perform gene-based association tests to identify significant gene expression–trait associations using genetic score omics regression (GSOR), introduced by Xiang et al. [ 12 ], (ii) to test the significance of window-based co-localization between GSOR-identified genes and GWAS loci reported by Lopdell et al. [ 13 ], and (iii) to obtain a list of candidate genes from the co-localization of GSOR-identified genes with GWAS loci for functional enrichment analyses. Overall, we aim to demonstrate the utility of combining GSOR with GWAS and functional enrichment analyses to map candidate genes to QTL, with milk LP as a model trait.

Results

Figure 1 shows a Manhattan (Miami) plot of the mammary GSOR-identified genes from this study alongside the trait-associated SNPs identified through GWAS for milk lactose percentage (LP) reported by Lopdell et al. [ 13 ]. A total of 12,237 genes were expressed in the mammary RNA-seq dataset [ 13 – 15 ], of which 711 were significantly associated with local GEBVs for milk LP in the GSOR analysis (FDR ≤ 0.10) (Additional file 1). Among the 12,237 expressed genes, 242 genes were located within 100 kb windows that contained at least one GWAS locus. Of those 242, 34 genes were also significant based on the GSOR analysis (Additional file 1).

We also analysed a white blood cell (WBC) RNA-seq dataset [ 16 – 18 ]. Descriptive statistics for the GSOR analyses in both the mammary and WBC datasets are presented in Table 1. A total of 12,536 genes were expressed in the WBC RNA-seq dataset, of which 986 were significantly associated with local GEBVs for milk LP in the GSOR analysis.
Among the 12,536 expressed genes in WBC, 242 genes were located within 100 kb windows that contained at least one GWAS locus. Of the 242 WBC genes, 33 genes were also significant based on the GSOR analysis (Table 1 and Additional file 2). These genes do not completely overlap with those identified using mammary RNA-seq.

Table 1 Descriptive statistics of mammary and WBC GSOR genes and their co-localization with GWAS loci based on 100 Kb windows

| RNA-seq data | Total expressed genes | N significant GSOR genes ¹ | N expressed genes in windows including GWAS loci ² | N significant GSOR genes in windows including GWAS loci ³ |
|---|---|---|---|---|
| Mammary | 12,237 | 711 | 242 | 34 |
| WBC | 12,536 | 986 | 242 | 33 |

¹ Number of GSOR-identified genes at FDR ≤ 0.1; ² Number of expressed genes located within non-overlapping windows of 100 Kb length including at least one GWAS locus. For GWAS, summary statistics for milk LP from an independent study using 12,000 samples were used [ 13 ] (see Methods); ³ Number of GSOR-identified genes located within 100 Kb windows including at least one GWAS locus.

Table 1 shows the window-based co-localization between GSOR-identified genes and trait-associated SNPs (GWAS loci) obtained using 100 Kb non-overlapping windows. We tested the statistical significance of this co-localization, and the results are presented in Fig. 2. Based on Fisher's exact test, co-localization between GSOR-identified genes and GWAS loci within 100 Kb and 500 Kb windows was significantly greater than expected by chance in both the mammary and WBC datasets. For example, using 100 Kb windows with mammary RNA-seq data, we found 30 windows where GSOR-identified genes shared windows with GWAS loci, 291 windows that contained only GWAS loci, 631 that included only GSOR-identified genes, and 23,909 windows that had neither GWAS loci nor GSOR-identified genes (Fisher's exact test P = 2.96×10⁻⁹; odds ratio = 3.9).
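The arithmetic behind this example can be re-checked from the four window counts alone. The following is a minimal sketch rather than the authors' code: the function names are hypothetical, and the one-sided ("greater") alternative is an assumption, because the text does not state which alternative hypothesis was used, so the P value obtained here need not equal the reported one exactly.

```python
from math import exp, lgamma


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def odds_ratio(a, b, c, d):
    """Sample odds ratio of the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact P value, P(X >= a), for [[a, b], [c, d]].

    X is hypergeometric: the number of windows carrying both features
    when the row and column totals are held fixed.
    """
    row1 = a + b          # windows with a GWAS locus
    col1 = a + c          # windows with a GSOR-identified gene
    total = a + b + c + d
    log_denom = log_comb(total, col1)
    p = 0.0
    for x in range(a, min(row1, col1) + 1):
        p += exp(log_comb(row1, x) + log_comb(total - row1, col1 - x) - log_denom)
    return p


# Window counts from the text (100 Kb windows, mammary RNA-seq):
# both, GWAS only, GSOR only, neither
both, gwas_only, gsor_only, neither = 30, 291, 631, 23909
print(round(odds_ratio(both, gwas_only, gsor_only, neither), 1))  # 3.9
print(fisher_exact_greater(both, gwas_only, gsor_only, neither))
```

The odds ratio reproduces the reported value of 3.9. The expected number of shared windows under independence is roughly 321 × 661 / 24,861 ≈ 8.5, against 30 observed, which is why the test is so strongly significant.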
We found that the smaller window size (100 kb) revealed stronger co-localization, as evidenced by more significant P values and higher odds ratios (Fig. 2). Furthermore, across the RNA-seq datasets, mammary RNA-seq demonstrated stronger co-localization with GWAS loci than WBC RNA-seq, supported by more significant P values and higher odds ratios (Fig. 2).

Table 2 shows descriptive statistics for the 34 mammary GSOR-identified genes co-localized with GWAS loci. Of the 34 co-localised GSOR-GWAS genes presented in Table 2, 31 were successfully converted to human orthologs and used in functional enrichment analyses, along with the 11,728 background genes that were successfully converted (out of 12,237 expressed genes). These data are presented in Additional file 3. The successfully converted genes were used to conduct enrichment analyses using the gprofiler2 R package [ 19 ].

Table 2 GSOR-identified genes co-localized in 30 windows (100 Kb) with GWAS loci. P (GWAS) represents the most significant P value within the corresponding window

| Chr | Win Start | Win End | SNP | P (GWAS) | GSOR-gene | FDR (GSOR) |
|---|---|---|---|---|---|---|
| 1 | 150530998 | 150630997 | 1:150591449 | 8.12E-09 | KCNJ15 | 0.008 |
| 1 | 151130998 | 151230997 | 1:151180934 | 2.39E-11 | ETS2 | 0.087 |
| 1 | 152230998 | 152330997 | 1:152324264 | 2.31E-09 | SH3BP5 | 9.36E-49 |
| 3 | 15344539 | 15444538 | 3:15368547 | 3.75E-11 | THBS3, GBA | 0.057, 0.011 |
| 3 | 15444539 | 15544538 | 3:15464742 | 1.67E-14 | SLC50A1 | 7.62E-06 |
| 3 | 53544539 | 53644538 | 3:53597089 | 3.28E-15 | LRRC8C | 0.002 |
| 3 | 53644539 | 53744538 | 3:53674128 | 7.86E-16 | LRRC8B | 1.01E-17 |
| 3 | 54044539 | 54144538 | 3:54109337 | 9.76E-14 | GBP5 | 1.35E-09 |
| 3 | 54844539 | 54944538 | 3:54943906 | 1.04E-09 | KYAT3 | 0.020 |
| 6 | 22391294 | 22491293 | 6:22469293 | 4.95E-11 | SLC39A8 | 0.0003 |
| 6 | 45091294 | 45191293 | 6:45129256 | 3.64E-09 | SLC34A2 | 0.004 |
| 6 | 85991294 | 86091293 | 6:86055882 | 1.06E-11 | ENAM | 0.001 |
| 10 | 2166988 | 2266987 | 10:2190824 | 2.94E-13 | NREP | 0.0002 |
| 13 | 54439729 | 54539728 | 13:54476595 | 5.13E-09 | SLC17A9 | 3.07E-08 |
| 15 | 27470773 | 27570772 | 15:27543169 | 8.40E-11 | APOA1 | 0.015 |
| 16 | 66207162 | 66307161 | 16:66279444 | 1.11E-12 | IVNS1ABP | 2.31E-17 |
| 17 | 51725017 | 51825016 | 17:51739903 | 1.22E-09 | DNAH10 | 3.07E-08 |
| 17 | 53925017 | 54025016 | 17:53934607 | 2.89E-10 | P2RX4 | 0.011 |
| 17 | 72325017 | 72425016 | 17:72377251 | 1.26E-10 | LRRC74B | 0.057 |
| 19 | 32957926 | 33057925 | 19:33056069 | 4.58E-14 | TVP23B | 0.072 |
| 19 | 41857926 | 41957925 | 19:41909633 | 2.90E-11 | HAP1 | 0.003 |
| 19 | 42157926 | 42257925 | 19:42236007 | 1.53E-13 | NKIRAS2, RAB5C, DHX58 | 0.0032, 0.035, 9.8E-08 |
| 19 | 42257926 | 42357925 | 19:42349652 | 9.38E-14 | GHDC, STAT5B | 0.035, 0.087 |
| 19 | 42357926 | 42457925 | 19:42358091 | 9.38E-14 | STAT3 | 0.072 |
| 19 | 58057926 | 58157925 | 19:58114277 | 3.08E-10 | C19H17orf80 | 0.087 |
| 19 | 60557926 | 60657925 | 19:60560812 | 1.23E-24 | KCNJ2 | 3.04E-23 |
| 19 | 61357926 | 61457925 | 19:61408199 | 2.44E-09 | ABCA10 | 7.62E-06 |
| 19 | 61457926 | 61557925 | 19:61512690 | 1.66E-09 | ABCA9 | 0.072 |
| 27 | 36520984 | 36620983 | 27:36523102 | 2.84E-22 | GPAT4 | 0.0007 |
| 29 | 9447532 | 9547531 | 29:9545883 | 4.03E-35 | PICALM | 2.63E-45 |

We identified 22 significantly enriched gene ontology biological process (GO:BP) terms and Reactome pathways, among which “transport”, “transmembrane transport” and their child terms dominated the list (Table 3).
Table 3 Gene list enrichment analyses using mammary GSOR-identified genes co-localized with GWAS loci in 100 Kb windows

| Source | Term ID | Term Name | FDR | Genes |
|---|---|---|---|---|
| GO:BP | GO:0001408 | guanine nucleotide transport | 0.001 | LRRC8C, LRRC8B, SLC17A9 |
| GO:BP | GO:1903790 | guanine nucleotide transmembrane transport | 0.001 | LRRC8C, LRRC8B, SLC17A9 |
| GO:BP | GO:0055085 | transmembrane transport | 0.001 | KCNJ15, SLC50A1, LRRC8C, LRRC8B, SLC39A8, SLC34A2, SLC17A9, P2RX4, HAP1, KCNJ2, ABCA10, ABCA9 |
| GO:BP | GO:0098739 | import across plasma membrane | 0.005 | KCNJ15, LRRC8C, LRRC8B, SLC39A8, KCNJ2 |
| GO:BP | GO:1901679 | nucleotide transmembrane transport | 0.007 | LRRC8C, LRRC8B, SLC17A9 |
| GO:BP | GO:1901264 | carbohydrate derivative transport | 0.009 | SLC50A1, LRRC8C, LRRC8B, SLC17A9 |
| GO:BP | GO:0015868 | purine ribonucleotide transport | 0.010 | LRRC8C, LRRC8B, SLC17A9 |
| GO:BP | GO:0051503 | adenine nucleotide transport | 0.011 | LRRC8C, LRRC8B, SLC17A9 |
| GO:BP | GO:0072530 | purine-containing compound transmembrane transport | 0.011 | LRRC8C, LRRC8B, SLC17A9 |
| GO:BP | GO:0015865 | purine nucleotide transport | 0.011 | LRRC8C, LRRC8B, SLC17A9 |
| GO:BP | GO:0006862 | nucleotide transport | 0.016 | LRRC8C, LRRC8B, SLC17A9 |
| GO:BP | GO:0034220 | monoatomic ion transmembrane transport | 0.016 | KCNJ15, LRRC8C, LRRC8B, SLC39A8, SLC34A2, P2RX4, HAP1, KCNJ2 |
| GO:BP | GO:0040014 | regulation of multicellular organism growth | 0.024 | STAT5B, STAT3, GPAT4 |
| GO:BP | GO:0042592 | homeostatic process | 0.042 | GBA1, SLC39A8, SLC34A2, P2RX4, HAP1, NKIRAS2, STAT5B, STAT3, KCNJ2, PICALM |
| GO:BP | GO:0098659 | inorganic cation import across plasma membrane | 0.042 | KCNJ15, SLC39A8, KCNJ2 |
| GO:BP | GO:0099587 | inorganic ion import across plasma membrane | 0.042 | KCNJ15, SLC39A8, KCNJ2 |
| GO:BP | GO:0006811 | monoatomic ion transport | 0.042 | KCNJ15, LRRC8C, LRRC8B, SLC39A8, SLC34A2, P2RX4, HAP1, KCNJ2 |
| GO:BP | GO:0006810 | transport | 0.042 | KCNJ15, SLC50A1, LRRC8C, LRRC8B, SLC39A8, SLC34A2, SLC17A9, P2RX4, HAP1, RAB5C, STAT5B, STAT3, KCNJ2, ABCA10, ABCA9, GPAT4, PICALM |
| GO:BP | GO:0048871 | multicellular organismal-level homeostasis | 0.044 | GBA1, SLC39A8, P2RX4, NKIRAS2, STAT5B, STAT3, PICALM |
| REAC | REAC:R-HSA-382551 | Transport of small molecules | 0.008 | SLC50A1, LRRC8C, LRRC8B, SLC39A8, SLC34A2, ABCA10, ABCA9 |
| REAC | REAC:R-HSA-186797 | Signaling by PDGF | 0.008 | THBS3, STAT5B, STAT3 |
| REAC | REAC:R-HSA-425407 | SLC-mediated transmembrane transport | 0.035 | SLC50A1, SLC39A8, SLC34A2 |

Note: GSOR-identified genes were converted to human orthologs before gene list enrichment analyses.

We also performed gene list enrichment analysis using WBC GSOR-identified genes co-localized with GWAS loci in the 100 Kb windows. This analysis revealed five significant GO terms, of which only the GO term “response to growth hormone” (GO:0060416) could be indirectly related to the physiology of milk LP. These results are presented in Additional file 4. Only two genes, STAT5B and HAP1, were shared between the mammary and WBC GSOR-GWAS gene sets (Table 3 and Additional file 4). According to Wainberg et al. [ 20 ], transcriptome-wide association study or TWAS [ 21 ] is particularly susceptible to false-positive gene associations when expression data come from irrelevant tissues or cell types. The same may be true for GSOR [ 12 ], as the GSOR-identified genes from mammary gland expression data revealed functionally enriched terms that aligned more closely with milk lactose physiology, whereas the pathways detected from WBC data tended to be broader and less specific.

To assess whether a broader genomic window could improve the detection of signals relevant to LP, we increased the window size, allowing mammary GSOR-identified genes to fall within 500 Kb of GWAS loci. A total of 22 significant terms were identified (Additional file 5), of which 15 were shared with the terms identified when using 100 Kb windows (Fig. 3).

Discussion

In our previous study [ 6 ], we showed that integrating GSOR, GWAS, and functional enrichment can help prioritize candidate genes for milk composition traits, though inference was limited by small sample sizes.
Here, we leveraged a large, multi-breed reference population (> 81,000 cows) to train genomic prediction models, increasing both GEBV accuracy and the power to detect expression–trait associations. This enabled us to identify 20 genes likely to mediate lactose QTL, supported by convergence between GSOR, GWAS, and enrichment results. Because long-range LD, complex regulation, and small QTL effects complicate causal gene discovery, our framework, which integrates local GEBV with tissue-relevant expression data, GWAS co-localization, and functional enrichment, may help overcome these challenges and offer a scalable, biologically grounded strategy to dissect the molecular basis of complex traits in livestock and other species.

When the most significant GWAS variant is located within a coding region, the causal gene may be directly implicated [ 6 ]. However, the majority of trait QTL reside in non-coding regions [ 6 , 20 , 22 , 23 ], and thus their functions are unknown. Non-coding QTL can impact phenotypes by regulating gene expression through cis or trans regulatory mechanisms. Even when causal variants are known, it is challenging to determine their functional impact and link them to specific target genes. This is because regulatory elements can act over long genomic distances and their effects can be highly cell-type specific [ 24 ]. TWASs were proposed to fill this gap by linking predicted tissue-specific gene expression to observed phenotypes [ 20 , 21 , 25 ]. However, one limitation of TWAS is that it usually relies on a limited set of individuals with both assayed expression data and genotypes, which can limit the statistical power and decrease the accuracy of the models used for predicting gene expression [ 12 ].

In this study, we used GSOR to identify statistically significant gene-trait associations. Xiang et al.
[ 12 ] found that GSOR is more powerful than TWAS because it uses the contribution to phenotype of the variants close to the gene whose expression is being investigated rather than the complete phenotype. However, the problem of spurious associations due to LD still exists [ 6 ]. Gene list enrichment analyses may help overcome LD issues by aggregating signals across functionally related genes, highlighting biologically coherent associations rather than single-gene signals confounded by LD.

As mentioned, GSOR does not provide direct evidence of causality, as correlations between gene expression and the trait can be induced by LD, among other factors. For instance, if two genes have cis-eQTLs that are in LD with each other but only one eQTL is causal, both genes may appear statistically associated with the trait [ 6 ]. Therefore, an additional source of evidence is required to prioritize candidate causal genes. We evaluated the significance of the overlap between GSOR-identified genes and GWAS loci by dividing the genome into non-overlapping windows of 100 Kb. We showed that GSOR-identified genes were significantly enriched within windows containing GWAS loci. This window-based co-localization enrichment is consistent with our previous findings using a different milk composition trait [ 6 ], suggesting that the expression of these genes may mediate the effects of GWAS loci on complex traits through a cis-regulatory mechanism. However, LD can still confound these signals, so functional enrichment analyses are particularly important for highlighting the most relevant GWAS-GSOR genes.

Milk LP is a promising target phenotype to test this hypothesis because its underlying biology is relatively well understood, with established roles for specific pathways such as lactose synthesis, ion transport, and hormonal signaling [ 10 , 11 , 13 ], allowing for meaningful interpretation of enrichment results.
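The window-based co-localization described above amounts to mapping each GWAS locus and each GSOR-identified gene to a fixed-width window and then counting the four window classes. The sketch below illustrates that bookkeeping under stated assumptions: the function names are hypothetical, each gene is represented by a single coordinate (the text does not say which coordinate was used), the gene positions and the window total are toy values, and the per-chromosome offset of 30,998 bp is chosen only so that the example reproduces the chromosome 1 window boundaries listed in Table 2.

```python
WINDOW_SIZE = 100_000  # 100 Kb, as in the text


def window_of(pos, offset, size=WINDOW_SIZE):
    """Return (start, end) of the non-overlapping window containing pos.

    offset is the assumed start coordinate of the first window on the
    chromosome (hypothetical; inferred here from Table 2 coordinates).
    """
    index = (pos - offset) // size
    start = offset + index * size
    return start, start + size - 1


def count_window_classes(gwas_hits, gsor_genes, offsets, n_windows):
    """Count windows with both features, GWAS only, GSOR only, or neither.

    gwas_hits and gsor_genes are iterables of (chromosome, position).
    """
    gwas = {(c, window_of(p, offsets[c])[0]) for c, p in gwas_hits}
    gsor = {(c, window_of(p, offsets[c])[0]) for c, p in gsor_genes}
    both = len(gwas & gsor)
    only_gwas = len(gwas - gsor)
    only_gsor = len(gsor - gwas)
    neither = n_windows - both - only_gwas - only_gsor
    return both, only_gwas, only_gsor, neither


offsets = {"1": 30_998}                              # assumed offset for chromosome 1
gwas_hits = [("1", 150_591_449), ("1", 151_180_934)]  # lead SNPs from Table 2
gsor_genes = [("1", 150_600_000), ("1", 149_000_000)]  # toy gene coordinates

print(window_of(150_591_449, offsets["1"]))  # (150530998, 150630997)
print(count_window_classes(gwas_hits, gsor_genes, offsets, n_windows=10))  # (1, 1, 1, 7)
```

The four counts returned by `count_window_classes` are exactly the cells of the 2×2 table that a Fisher's exact test would take as input.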
Of the 31 genes used in functional enrichment analyses, 55% are annotated to the term “transport” (GO:0006810; FDR = 0.042) and 39% to its descendant term “transmembrane transport” (GO:0055085; FDR = 0.001). Lactose is the major osmotic solute in milk, responsible for drawing water into the alveolar lumen and driving milk volume [ 10 , 26 ]. As a result, LP shows limited variation. However, lactose is not the only regulator of osmolarity, and an increase in other osmolytes can change LP. Indeed, studies show that increases in sodium, potassium or chloride in milk are inversely correlated with lactose concentration, highlighting that any process affecting ionic balance or secretion can shift lactose levels via osmotic compensation [ 27 , 28 ].

We identified monoatomic ion transmembrane transport (GO:0034220; FDR = 0.016) as enriched, with the genes KCNJ15, LRRC8C, LRRC8B, SLC39A8, SLC34A2, P2RX4, HAP1, and KCNJ2. KCNJ2 (Kir2.1) and KCNJ15 (Kir4.2) both encode inwardly rectifying potassium channels (Kir) that favor K⁺ influx under hyperpolarizing conditions [ 29 ]. Kamikawa and Ishikawa [ 30 ] identified Kir2.1-like channels (encoded by KCNJ2) functionally expressed in secretory mammary epithelial cells of lactating mice. These authors concluded that different types of K⁺ channels might play a role in producing species-specific milk, while a particular type of K⁺ channel might be generally expressed and participate in milk production in mammalian milk secretory cells [ 30 ]. An early study on ionic concentrations in milk found a significant association between different ions, such as K⁺, and lactose levels in milk [ 31 ]. In cattle, a QTL on BTA19 including KCNJ2 and KCNJ16 has been associated with lactose concentration [ 13 , 32 ], protein concentration [ 33 ] and milk yield [ 34 ]. Therefore, an eQTL that affects the abundance of KCNJ2 or KCNJ15 proteins (and K⁺ ion transport) might lead to osmotic compensation influencing milk LP.
Maintaining cell volume is essential for mammary epithelial cells to remain functional during the osmotic shifts involved in milk secretion. Volume-regulated anion channels (VRACs) are one mechanism by which they can do this. VRACs, formed by LRRC8 proteins A to E, help regulate cell volume by exporting Cl⁻ ions and small organic anions [ 35 , 36 ]. P2RX4 is an ATP-gated cation channel permeable to Ca²⁺; ATP-triggered P2RX4 activation modulates VRAC in rat liver cells [ 37 ]. While not confirmed in mammary cells, this mechanism may apply during milk secretion. Together, these “ion transporters” help set the ionic gradients that draw water into the alveoli along with lactose.

Lactose, the primary sugar in milk, is synthesized in Golgi vesicles and secreted along with ions and water. In fact, lactose accumulation in secretory vesicles draws water into milk by osmosis; milk volume is almost perfectly correlated (r ≈ 0.99) with lactose production [ 38 ]. This osmotic role of lactose means that many membrane transporters (for solutes and ions) must be active during lactation to supply substrates and maintain ion gradients. Consistent with this, a recent study found that genes involved in membrane transport were enriched among loci affecting milk lactose content [ 13 ]. Thus, our finding of enriched GO terms for “transmembrane transport” and “ion transmembrane transport” fits well with known lactation physiology and highlights that genes in these categories can modulate lactose synthesis and milk osmolarity [ 13 ].

Some genes may affect LP by directly affecting lactose production. SLC50A1 (SWEET1) is a Golgi-localized sugar transporter that exports glucose (the direct precursor for lactose synthesis) into the Golgi lumen [ 39 ]. Experimental studies support a role for SLC50A1 in providing glucose for lactose production in mammary cells [ 39 , 40 ]. Thus, variation in SLC50A1 could directly affect lactose synthesis by altering sugar supply.
More broadly, enrichment of “SLC-mediated transmembrane transport” (Reactome R-HSA-425407) in our data, which includes SLC50A1, SLC39A8, and SLC34A2, points to solute carrier proteins as key players. Other solute carriers in our gene list fit this picture. SLC39A8 (ZIP8) is a zinc/manganese importer; it moves Mn²⁺ and Zn²⁺ into cells [ 41 ]. Manganese is an essential cofactor for many glycosyltransferases, including the β4-galactosyltransferase I subunit of lactose synthase, and is required for the function of various Golgi enzymes [ 42 ]. SLC34A2 (also known as NPT2b) is a sodium-dependent phosphate transporter highly expressed in lactating mammary epithelium [ 43 ]. While its precise role in milk synthesis has not been directly tested, its known function in phosphate uptake suggests a potential role in supporting ATP and nucleotide sugar synthesis during lactose production. These transporters illustrate how the enriched GO terms reflect lactation biology: they supply essential substrates (glucose, phosphate) and cofactors (Mn, Zn, Ca) for lactose synthesis, while maintaining ion gradients and water balance critical for milk secretion.

The Reactome pathway “Signaling by PDGF (Platelet-derived growth factor)” involves STAT5B, STAT3, and THBS3. STAT5 (both STAT5A and STAT5B) is a key transcription factor for lactation: activated by prolactin via Janus kinases (JAK2), it drives the expression of milk proteins and enzymes, and epithelial cell differentiation during pregnancy [ 44 ]. Indeed, genetic knockout of STAT5A in mice causes failure of alveolar differentiation and lactogenesis [ 44 ], and experimental work describes STAT5 as “a primary transcription factor for milk production” [ 45 ]. PDGF promotes stromal/epithelial proliferation and survival. Thus, enrichment of this pathway suggests that growth factor-STAT signaling networks influence mammary development, which may in turn affect the milk LP phenotype.
For example, greater STAT5 signaling would boost lactose synthesis (via more alveolar cells and lactose enzymes), whereas STAT3 activity would reduce lactose output. In sum, the “Signaling by PDGF” term likely flags a set of regulatory genes (STAT5B/STAT3) that govern cell differentiation, survival, and secretory activity in the udder. Changes in these signals would alter mammary function (and indirectly lactose percentage) even though they are not lactose-specific per se.

In conclusion, we demonstrated the utility of integrating multi-omics data with functional enrichment to identify genes likely regulated by QTL associated with milk LP. We used LP as a test case to assess whether combining gene expression, GWAS, and functional enrichment could pinpoint causal genes for complex traits. To mitigate the confounding effects of LD, we leveraged functional enrichment analyses, which highlighted biologically coherent associations. Our results show that this strategy successfully identified 20 genes with clear mechanistic links to LP, acting primarily through indirect regulatory pathways. Future experimental validation will be needed to confirm these findings.

Methods

Overview

This study used four independent datasets:
(1) ~400 New Zealand (NZ) cows with mammary RNA-seq data [ 13 – 15 ]. These data were used for the gene-based association test, specifically genetic score omics regression (GSOR), introduced by Xiang et al. [ 12 ].
(2) ~400 Australian (AU) cows with white blood cell (WBC) RNA-seq data [ 16 – 18 ]. These data were also used for the gene-based association test (GSOR).
(3) ~81,000 AU multibreed cows with lactose percentage (LP) phenotypes and SNP genotypes (referred to as the GSOR reference population). These data were used to estimate local GEBVs for LP in the RNA-sequenced cows of NZ and AU.
(4) GWAS summary statistics for milk LP, based on 12,000 NZ cows and reported by Lopdell et al. [ 13 ]. These were used to test for co-localization between significant GSOR genes and GWAS loci.
Phenotypic data for milk lactose percentage

Phenotypic data for milk test-day LP were provided by DataGene (an independent industry-owned organisation that provides the national genetic evaluations for dairy cattle in Australia). Animals born between 2012 and 2024 that had Holstein, Jersey or Australian Red breed codes were retained. Outliers deviating by more than ± 3 SD from the mean phenotypic value of LP were excluded. Test-day records were included if a cow’s age at calving was between 18 and 25 months and days in milk (DIM) was between 5 and 315 days. Additionally, herds and test dates with fewer than five observations were excluded from the analysis. The final data contained 4,995,316 test-day records belonging to 477,822 cows. ASReml [ 46 ] was used to adjust phenotypes for fixed effects and average them for each cow (i.e. the effect of Cow) following the model proposed by [ 47 ]:

$$y_{ijklm} = \mu + \mathrm{H}_{i}\mathrm{TD}_{j} + \mathrm{M}_{k} + \mathrm{pol}(\mathrm{DIM}, 8) + \mathrm{pol}(\mathrm{Age}, 2) + \mathrm{Cow}_{l} + e_{ijklm}$$

where \(y_{ijklm}\) is the test-day record for LP (N = 4,995,316); \(\mu\) is the overall mean; \(\mathrm{H}_{i}\mathrm{TD}_{j}\) is the effect of the \(i^{th}\) herd and \(j^{th}\) test-day (N = 82,058); \(\mathrm{M}_{k}\) is the effect of the \(k^{th}\) calving month (N = 12); \(\mathrm{pol}(\mathrm{DIM}, 8)\) and \(\mathrm{pol}(\mathrm{Age}, 2)\) are the regression coefficients of Legendre polynomials of order 1–8 for DIM and of order 1–2 for age at calving in months; \(\mathrm{Cow}_{l}\) and \(e_{ijklm}\) are the random effect of the \(l^{th}\) cow (N = 477,822) and the random residual term, respectively.

The following sections explain our integrative multi-omics approach in detail. A schematic overview of the method is illustrated in Fig. 4.

Genotypic data for the GSOR reference population

Of the cows with phenotypic data described above, SNP panel genotypes were available for 81,658 cows, comprising 79% Holstein, 16% Jersey and 5% Australian Red. All SNP markers were mapped to the ARS-UCD1.2 reference genome [ 48 ], and included autosomal markers as well as those on the non-pseudoautosomal region of the X chromosome. Any raw genotypes with a GenCall score of < 0.6 were set to missing and any marker or animal with 10% or more missing genotypes was discarded. The remaining sporadic missing genotypes were imputed with FImpute v.3 software [ 49 ]. The genotypes were from a range of SNP panels (≥ 6,000 markers) and were imputed with FImpute v.3 to a custom 74K SNP genotype panel that is used by DataGene for national genetic evaluations [ 50 ]. The imputation reference population for the 74K SNP panel included over 28,000 animals (Holstein, Jersey and Australian Red breeds). Next, the 74K SNP genotypes were imputed to the Illumina High Density (HD) Bovine SNP panel, which included 714,451 SNP, using an imputation reference population of 2,910 animals (breeds as for the 74K panel). Prior to HD imputation, approximately 20,000 SNP in the custom 74K set that did not overlap the HD set were removed; they were then added back in before the final imputation to whole genome sequences (WGS). The sequenced imputation reference population included 5,036 Bos taurus cattle from Run9 of the 1000 Bull Genomes project [ 51 ]. Following Nguyen et al.
[52], sequence variants were pre-filtered (49,114,602 variants remaining) and phased with Eagle v2 [53] before using Beagle v5.2.1 [54] to impute all animals to WGS. Post-imputation, sequence variants with a Beagle DR2 (estimated imputation accuracy) < 0.9 were excluded, as were those with minor allele frequency (MAF) < 0.01 or genotype frequencies deviating from Hardy-Weinberg equilibrium (P ≤ 1×10⁻⁸). LD pruning was performed using PLINK v1.9 [55] with the parameters --indep-pairwise 5000 500 0.95 to exclude variants in high LD (r² > 0.95). These procedures retained 1,181,628 variants for subsequent analyses.

We used these data to train a BayesR model [56] with the BayesR3 software [57] to estimate prediction equations (SNP effects) for LP. The model was \(\mathbf{y} = \mathbf{X}\mathbf{u} + \mathbf{V}\mathbf{g} + \mathbf{e}\), where \(\mathbf{y}\) is an \(n \times 1\) vector of phenotypic records and \(n\) is the number of animals in the reference population (N = 81,658); \(\mathbf{X}\) is an \(n \times m\) incidence matrix; \(\mathbf{u}\) is an \(m \times 1\) vector of fixed effects, which include a breed effect with three levels; \(\mathbf{V}\) is the matrix of coded genotypes observed for each individual; \(\mathbf{g}\) is a vector of SNP effects; and \(\mathbf{e}\) is the vector of residuals. BayesR3 was run for 50,000 MCMC iterations with a burn-in of 25,000. In the BayesR3 model, SNP effects follow a mixture of four zero-mean normal distributions with variances of 0, 0.0001, 0.001, and 0.01 times the additive genetic variance. Starting values for the proportions of SNPs in the four distributions were 0.994, 0.0055, 0.00049, and 0.00001, respectively. The prediction equations were then applied to the RNA-sequenced animals to calculate their local GEBVs for LP (described below).
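To make the mixture of SNP effect distributions concrete, the following minimal Python sketch draws SNP effects from a four-component zero-mean normal mixture using the variance multipliers and starting proportions stated above. It is an illustration only and not the BayesR3 implementation: BayesR3 estimates the proportions and effects by MCMC, whereas this sketch simply samples at the starting values. The function name `sample_snp_effects`, the genetic variance of 1.0, the seed, and the number of simulated SNPs are assumptions made for the example.

```python
import math
import random

# Variance multipliers and starting proportions as stated in the text.
VARIANCE_MULTIPLIERS = (0.0, 0.0001, 0.001, 0.01)
START_PROPORTIONS = (0.994, 0.0055, 0.00049, 0.00001)


def sample_snp_effects(n_snps, genetic_variance, seed=1):
    """Draw (component, effect) pairs from the four-component normal mixture.

    Hypothetical helper for illustration; it samples from the mixture at the
    starting proportions and does not perform any MCMC estimation.
    """
    rng = random.Random(seed)
    components = rng.choices(range(4), weights=START_PROPORTIONS, k=n_snps)
    draws = []
    for c in components:
        sd = math.sqrt(VARIANCE_MULTIPLIERS[c] * genetic_variance)
        # Component 0 has zero variance, so its effect is exactly zero.
        effect = rng.gauss(0.0, sd) if sd > 0 else 0.0
        draws.append((c, effect))
    return draws


# Expected number of variants per component among the 1,181,628 retained variants
# if the starting proportions held exactly.
N_VARIANTS = 1_181_628
expected_counts = [p * N_VARIANTS for p in START_PROPORTIONS]

# Assumed genetic variance of 1.0 and 100,000 simulated SNPs (illustrative values).
draws = sample_snp_effects(n_snps=100_000, genetic_variance=1.0)
zero_fraction = sum(1 for c, _ in draws if c == 0) / len(draws)
```

Under these starting values almost all variants fall in the zero-effect component, and only about a dozen of the retained variants would be expected in the largest-variance component.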
RNA-seq data and gene-based association test (GSOR)

We analysed two distinct sets of RNA sequencing data, from WBC and from mammary tissue. The WBC gene expression data were obtained from 313 lactating cows of multiple breeds from the Agriculture Victoria research farm (Ellinbank Smart Farm). Details of sample processing, RNA extraction, library preparation, and sequencing are provided in [16–18]. For the WBC RNA-seq animals, genotypes were imputed to WGS using Run9 of the 1000 Bull Genomes project [51], as described above for the GSOR reference population. The mammary gene expression data were from 386 lactating NZ cows, comprising Holstein, Jersey and their crosses. The cows in this dataset were previously imputed to WGS using 1,298 imputation reference animals (306 Holstein-Friesian, 219 Jersey, 717 Holstein-Friesian x Jersey crossbreds and 56 animals of other breeds), as described in [58]. For both the WBC and mammary RNA-sequenced animals, we retained the same 1,181,628 variants that were retained in the GSOR reference population. Prior to fitting per-gene GSOR models, we calculated local GEBVs for the RNA-sequenced animals using the estimated effects of variants located within a ± 1 Mb window centred on the transcription start site of the gene being tested. These local GEBVs served as response variables in the GSOR analysis, in which gene expression levels were tested as predictors to identify genes whose expression is associated with genetically driven, cis-regulatory variation in the trait (Fig. 4).
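The local GEBV calculation described above amounts to summing genotype dosage times estimated SNP effect over the variants inside the window around a gene. The sketch below shows this on toy data. It assumes genotypes coded as raw 0/1/2 allele dosages, that all positions are on the same chromosome as the gene, and that the window boundaries are inclusive; none of these details are stated in the text, and the function name `local_gebv` and the toy values are hypothetical.

```python
def local_gebv(genotypes, positions, effects, tss, flank=1_000_000):
    """Return one local GEBV per animal from SNPs within +/- flank bp of the TSS.

    genotypes: list of per-animal lists of allele dosages (assumed 0/1/2 coding)
    positions: base-pair position of each SNP (assumed same chromosome as the gene)
    effects:   estimated SNP effects, in the same order as positions
    """
    in_window = [j for j, pos in enumerate(positions) if abs(pos - tss) <= flank]
    return [sum(row[j] * effects[j] for j in in_window) for row in genotypes]


# Toy example: three animals, four SNPs, gene TSS at 1.5 Mb.
positions = [500_000, 1_500_000, 2_400_000, 4_000_000]
effects = [0.10, -0.20, 0.05, 1.00]
genotypes = [
    [0, 1, 2, 2],
    [2, 0, 1, 0],
    [1, 1, 1, 1],
]

# The SNP at 4.0 Mb lies outside the +/- 1 Mb window and is ignored.
gebv = local_gebv(genotypes, positions, effects, tss=1_500_000)
```

Because the window is defined per gene, the same animal receives a different local GEBV for each gene tested.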
The following per-gene GSOR model was applied to each RNA-seq dataset [6]:

$$\widehat{\mathbf{GEBV}}_{\mathrm{local}} = b_{1}\boldsymbol{\Omega} + \mathbf{b}_{2}\mathbf{x} + \mathbf{g} + \mathbf{e}$$

where \(\widehat{\mathbf{GEBV}}_{\mathrm{local}}\) is an \(m \times 1\) vector of local GEBVs predicted in the RNA-sequenced cows (\(m\) being the number of RNA-sequenced cows) using the SNP effects from a ± 1 Mb window around the gene being tested; \(\boldsymbol{\Omega}\) is an \(m \times 1\) vector of tissue-specific expression of the gene across the corresponding RNA-sequenced cows; \(b_{1}\) is the regression coefficient of \(\widehat{\mathbf{GEBV}}_{\mathrm{local}}\) on \(\boldsymbol{\Omega}\); \(\mathbf{x}\) is a design matrix for fixed effects (see next paragraph) and \(\mathbf{b}_{2}\) is the vector of fixed effects for the corresponding RNA-sequenced animals; \(\mathbf{g}\) is a vector of random polygenic effects across the same RNA-sequenced cows, assumed to follow a normal distribution \(\mathbf{g} \sim \mathrm{N}(0, \mathbf{G}\sigma_{g}^{2})\), where \(\mathbf{G}\) is the genomic relationship matrix [59] and \(\sigma_{g}^{2}\) is the additive genetic variance explained by the whole-genome SNPs; and \(\mathbf{e}\) is the vector of residuals, assumed to follow a normal distribution \(\mathbf{e} \sim \mathrm{N}(0, \mathbf{I}\sigma_{e}^{2})\), where \(\mathbf{I}\) is an identity matrix and \(\sigma_{e}^{2}\) is the residual variance.

Models for the WBC RNA-seq dataset included experiment as a categorical fixed effect with five levels corresponding to sampling times, and days in milk (DIM) as a quantitative fixed effect (mean ± SD: 86 ± 36 days).
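As a rough illustration of what the per-gene test estimates, the sketch below regresses local GEBV on expression of a single gene by ordinary least squares. This is a deliberate simplification and not the GSOR model itself: it drops the random polygenic term g and the fixed effects b2, and it uses a normal approximation for the P value rather than the test used in the mixed model. The function name `gsor_simple` and the simulated data (386 animals, true slope 0.5) are assumptions for the example.

```python
import math
import random
from statistics import NormalDist, mean


def gsor_simple(gebv_local, expression):
    """OLS regression of local GEBV on expression (intercept plus slope b1).

    Simplified stand-in for the per-gene model: no polygenic term, no fixed
    effects. Returns the slope, its standard error, and a two-sided P value
    from a normal approximation.
    """
    n = len(gebv_local)
    x_bar, y_bar = mean(expression), mean(gebv_local)
    sxx = sum((x - x_bar) ** 2 for x in expression)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(expression, gebv_local))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    rss = sum((y - b0 - b1 * x) ** 2 for x, y in zip(expression, gebv_local))
    se = math.sqrt(rss / (n - 2) / sxx)
    z = b1 / se
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return b1, se, p_value


# Simulated data for one hypothetical gene across 386 animals.
rng = random.Random(7)
expression = [rng.gauss(0.0, 1.0) for _ in range(386)]
gebv_local = [0.5 * x + rng.gauss(0.0, 0.1) for x in expression]

b1, se, p_value = gsor_simple(gebv_local, expression)
```

In the actual analysis the polygenic term accounts for relatedness among the RNA-sequenced cows, so the standard errors from this simplified version would not be expected to match those of the mixed model.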
For the mammary RNA-seq dataset, no fixed effects were necessary [6]. After applying the per-gene GSOR model to all genes within each RNA-seq dataset, P values were corrected for multiple testing. Within each dataset, genes with \(\mathrm{FDR}_{\mathrm{GSOR}} \le 0.1\) were regarded as significant (GSOR-identified genes).

GWAS summary statistics: The original SNP coordinates were based on the UMD3.1 bovine reference genome. To ensure consistency with our datasets, we converted these coordinates to the ARS-UCD1.2 [48] reference genome using the UCSC LiftOver tool [60] (https://genome.ucsc.edu/cgi-bin/hgLiftOver), leaving 1,088,337 SNPs for downstream analyses. We regarded SNPs with P ≤ 1×10⁻⁸ as genome-wide significant (GWAS loci).

Window-based co-localization between GSOR-identified genes and GWAS loci: We partitioned the genome into non-overlapping windows of 100 kb and, in a separate analysis, 500 kb, to test whether GSOR-identified genes and GWAS loci co-occur in the same genomic regions more often than expected by chance. For each window size, we classified windows by the presence or absence of GSOR-identified genes and GWAS loci into four categories: windows containing (1) both GSOR-identified genes and GWAS loci; (2) only GSOR-identified gene(s); (3) only GWAS loci; and (4) neither. A GSOR-identified gene was assigned to a window if its transcription start site fell within that window. We then applied Fisher’s exact test to assess the statistical significance of the co-localization, considering \(P_{\mathrm{Fisher}} \le 0.05\) as significant.

Gene list enrichment analysis: Candidate genes were defined as GSOR-identified gene(s) located within non-overlapping genomic windows that contained at least one GWAS locus. We used the R (v4.4.3) package gprofiler2 (v0.2.3) [19] to convert bovine genes to their human (Homo sapiens) orthologs and to perform gene list enrichment analyses.
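The window-based co-localization test described in this section can be sketched as follows: assign gene transcription start sites and GWAS SNPs to fixed-size windows, count the four window categories, and compute Fisher's exact P value from the resulting 2×2 table. The text does not say whether the test was one- or two-sided; this sketch assumes a one-sided test for enrichment. The function names, the `genome_windows` argument (total number of windows in the genome), and the toy coordinates are hypothetical.

```python
from math import comb


def colocalization_table(gene_tss, gwas_snps, genome_windows, window_size):
    """Count windows in the four categories: both, gene only, GWAS only, neither.

    gene_tss and gwas_snps are lists of (chromosome, position) tuples;
    genome_windows is the total number of windows (an input assumed known).
    """
    gene_windows = {(c, p // window_size) for c, p in gene_tss}
    gwas_windows = {(c, p // window_size) for c, p in gwas_snps}
    both = len(gene_windows & gwas_windows)
    gene_only = len(gene_windows - gwas_windows)
    gwas_only = len(gwas_windows - gene_windows)
    neither = genome_windows - both - gene_only - gwas_only
    return both, gene_only, gwas_only, neither


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact P value, P(X >= a), under the hypergeometric null."""
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return tail / comb(n, col1)


# Toy example with 100 kb windows and 20 windows in total.
gene_tss = [("1", 50_000), ("1", 250_000), ("2", 120_000)]
gwas_snps = [("1", 60_000), ("1", 90_000), ("2", 150_000), ("3", 10_000)]

table = colocalization_table(gene_tss, gwas_snps, genome_windows=20, window_size=100_000)
p_fisher = fisher_exact_greater(*table)
```

Counting windows rather than genes or SNPs means that several significant SNPs in the same window contribute only once, which limits the influence of LD between neighbouring GWAS hits within a window.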
We examined overrepresented Gene Ontology (GO) Biological Process (GO:BP) terms and Reactome pathways among the candidate genes, and terms with \(\mathrm{FDR}_{\mathrm{term}} \le 0.05\) were regarded as significant. All genes in the corresponding RNA-seq data were used as background genes after being converted to their human (Homo sapiens) orthologs. We hypothesized that the strongest candidate causal genes are those that (i) show tissue-specific expression correlated with local GEBVs (GSOR-identified genes), (ii) are enriched in window-based co-localization with GWAS loci, and (iii) are enriched for trait-relevant physiological functions.

References

1. Schaid DJ, Chen W, Larson NB. From genome-wide associations to candidate causal variants by statistical fine-mapping. Nat Rev Genet. 2018;19(8):491–504.
2. Maurano MT, Humbert R, Rynes E, Thurman RE, Haugen E, Wang H, Reynolds AP, Sandstrom R, Qu H, Brody J, et al. Systematic Localization of Common Disease-Associated Variation in Regulatory DNA. Science. 2012;337(6099):1190–5.
3. Qi T, Song L, Guo Y, Chen C, Yang J. From genetic associations to genes: methods, applications, and challenges. Trends Genet. 2024;40(8):642–67.
4. Bush WS, Oetjens MT, Crawford DC. Unravelling the human genome–phenome relationship using phenome-wide association studies. Nat Rev Genet. 2016;17(3):129–45.
5. van Rheenen W, Peyrot WJ, Schork AJ, Lee SH, Wray NR. Genetic correlations of polygenic disease traits: from theory to practice. Nat Rev Genet. 2019;20(10):567–81.
6. Ghoreishifar M, Macleod IM, Chamberlain AJ, Liu Z, Lopdell TJ, Littlejohn MD, Xiang R, Pryce JE, Goddard ME. An integrative approach to prioritize candidate causal genes for complex traits in cattle. PLoS Genet. 2025;21(5):e1011492.
7. Li Z, Zhou X. Towards improved fine-mapping of candidate causal variants. Nat Rev Genet 2025.
8. Brodie A, Azaria JR, Ofran Y. How far from the SNP may the causative genes be? Nucleic Acids Res. 2016;44(13):6046–54.
9. The Gene Ontology Consortium. The Gene Ontology resource: enriching a GOld mine. Nucleic Acids Res. 2020;49(D1):D325–34.
10. Sadovnikova A, Garcia SC, Hovey RC. A Comparative Review of the Cell Biology, Biochemistry, and Genetics of Lactose Synthesis. J Mammary Gland Biol Neoplasia. 2021;26(2):181–96.
11. Strucken EM, Laurenson YC, Brockmann GA. Go with the flow-biology and genetics of the lactation cycle. Front Genet. 2015;6:118.
12. Xiang R, Fang L, Liu S, Liu GE, Tenesa A, Gao Y, Mason BA, Chamberlain AJ, Goddard ME, Consortium C. Genetic score omics regression and multi-trait meta-analysis detect widespread cis-regulatory effects shaping bovine complex traits. PNAS Nexus 2025.
13. Lopdell TJ, Tiplady K, Struchalin M, Johnson TJJ, Keehan M, Sherlock R, Couldrey C, Davis SR, Snell RG, Spelman RJ, et al. DNA and RNA-sequence based GWAS highlights membrane-transport genes as key modulators of milk lactose content. BMC Genomics. 2017;18(1):968.
14. Littlejohn MD, Tiplady K, Fink TA, Lehnert K, Lopdell T, Johnson T, Couldrey C, Keehan M, Sherlock RG, Harland C, et al. Sequence-based Association Analysis Reveals an MGST1 eQTL with Pleiotropic Effects on Bovine Milk Composition. Sci Rep. 2016;6(1):25376.
15. Prowse-Wilkins CP, Lopdell TJ, Xiang R, Vander Jagt CJ, Littlejohn MD, Chamberlain AJ, Goddard ME. Genetic variation in histone modifications and gene expression identifies regulatory variants in the mammary gland of cattle. BMC Genomics. 2022;23(1):815.
16. Chamberlain A, Hayes B, Xiang R, Vander Jagt C, Reich C, Macleod I, Prowse-Wilkins C, Mason B, Daetwyler H, Goddard M. Identification of regulatory variation in dairy cattle with RNA sequence data. In: Proceedings of the 11th World Congress on Genetics Applied to Livestock Production; 2018: 11–16.
17. Xiang R, Fang L, Liu S, Macleod IM, Liu Z, Breen EJ, Gao Y, Liu GE, Tenesa A, Mason BA, et al. Gene expression and RNA splicing explain large proportions of the heritability for complex traits in cattle. Cell Genomics 2023, 3(10).
18. Xiang R, Hayes BJ, Vander Jagt CJ, MacLeod IM, Khansefid M, Bowman PJ, Yuan Z, Prowse-Wilkins CP, Reich CM, Mason BA, et al. Genome variants associated with RNA splicing variations in bovine are extensively shared between tissues. BMC Genomics. 2018;19(1):521.
19. Kolberg L, Raudvere U, Kuzmin I, Vilo J, Peterson H. gprofiler2–an R package for gene list functional enrichment analysis and namespace conversion toolset g:Profiler. F1000Research 2020, 9:ELIXIR-709.
20. Wainberg M, Sinnott-Armstrong N, Mancuso N, Barbeira AN, Knowles DA, Golan D, Ermel R, Ruusalepp A, Quertermous T, Hao K. Opportunities and challenges for transcriptome-wide association studies. Nat Genet. 2019;51(4):592–9.
21. Gamazon ER, Wheeler HE, Shah KP, Mozaffari SV, Aquino-Michaels K, Carroll RJ, Eyler AE, Denny JC, Nicolae DL, Cox NJ, et al. A gene-based association method for mapping traits using reference transcriptome data. Nat Genet. 2015;47(9):1091–8.
22. Ghoreishifar M, Chamberlain AJ, Xiang R, Prowse-Wilkins CP, Lopdell TJ, Littlejohn MD, Pryce JE, Goddard ME. Allele-specific binding variants causing ChIP-seq peak height of histone modification are not enriched in expression QTL annotations. Genet Selection Evol. 2024;56(1):50.
23. Ward LD, Kellis M. Interpreting noncoding genetic variation in complex traits and human disease. Nat Biotechnol. 2012;30(11):1095–106.
24. Spitz F. Gene regulation at a distance: From remote enhancers to 3D regulatory ensembles. Semin Cell Dev Biol. 2016;57:57–67.
25. Mancuso N, Shi H, Goddard P, Kichaev G, Gusev A, Pasaniuc B. Integrating Gene Expression with Summary Association Statistics to Identify Genes Associated with 30 Complex Traits. Am J Hum Genet. 2017;100(3):473–87.
26. Costa A, Egger-Danner C, Mészáros G, Fuerst C, Penasa M, Sölkner J, Fuerst-Waltl B. Genetic associations of lactose and its ratios to other milk solids with health traits in Austrian Fleckvieh cows. J Dairy Sci. 2019;102(5):4238–48.
27. Holt C.
Interrelationships of the concentrations of some ionic constituents of human milk and comparison with cow and goat milks. Comp Biochem Physiol Part A: Physiol. 1993;104(1):35–41.
28. Wack RP, Lien EL, Taft D, Roscelli JD. Electrolyte composition of human breast milk beyond the early postpartum period. Nutrition. 1997;13(9):774–7.
29. Kubo Y, Adelman JP, Clapham DE, Jan LY, Karschin A, Kurachi Y, Lazdunski M, Nichols CG, Seino S, Vandenberg CA. International Union of Pharmacology. LIV. Nomenclature and molecular relationships of inwardly rectifying potassium channels. Pharmacol Rev. 2005;57(4):509–26.
30. Kamikawa A, Ishikawa T. Functional expression of a Kir2.1-like inwardly rectifying potassium channel in mouse mammary secretory cells. Am J Physiology-Cell Physiol. 2014;306(3):C230–40.
31. Barry J, Rowland S. Variations in the ionic and lactose concentrations of milk. Biochem J. 1953;54(4):575.
32. Tiplady KM, Lopdell TJ, Reynolds E, Sherlock RG, Keehan M, Johnson TJJ, Pryce JE, Davis SR, Spelman RJ, Harris BL, et al. Sequence-based genome-wide association study of individual milk mid-infrared wavenumbers in mixed-breed dairy cattle. Genet Selection Evol. 2021;53(1):62.
33. Pedrosa VB, Schenkel FS, Chen S-Y, Oliveira HR, Casey TM, Melka MG, Brito LF. Genomewide Association Analyses of Lactation Persistency and Milk Production Traits in Holstein Cattle Based on Imputed Whole-Genome Sequence Data. Genes. 2021;12(11):1830.
34. Pausch H, Emmerling R, Gredler-Grandl B, Fries R, Daetwyler HD, Goddard ME. Meta-analysis of sequence-based association studies across three cattle breeds reveals 25 QTL for fat and protein percentages in milk at nucleotide resolution. BMC Genomics. 2017;18(1):853.
35. Voss FK, Ullrich F, Münch J, Lazarow K, Lutter D, Mah N, Andrade-Navarro MA, von Kries JP, Stauber T, Jentsch TJ. Identification of LRRC8 Heteromers as an Essential Component of the Volume-Regulated Anion Channel VRAC. Science. 2014;344(6184):634–8.
36. Syeda R, Qiu Z, Dubin AE, Murthy SE, Florendo MN, Mason DE, Mathur J, Cahalan SM, Peters EC, Montal M, et al. LRRC8 Proteins Form Volume-Regulated Anion Channels that Sense Ionic Strength. Cell. 2016;164(3):499–511.
37. Varela D, Penna A, Simon F, Eguiguren AL, Leiva-Salcedo E, Cerda O, Sala F, Stutzin A. P2X4 Activation Modulates Volume-sensitive Outwardly Rectifying Chloride Channels in Rat Hepatoma Cells. J Biol Chem. 2010;285(10):7566–74.
38. Sneddon N, Lopez-Villalobos N, Davis S, Hickson R, Shalloo L. Genetic parameters for milk components including lactose from test day records in the New Zealand dairy herd. New Z J Agricultural Res. 2015;58(2):97–107.
39. Wang G, Jin W, Zhang L, Dong M, Zhang X, Zhou Z, Wang X. SLC50A1 inhibits the doxorubicin sensitivity in hepatocellular carcinoma cells through regulating the tumor glycolysis. Cell Death Discovery. 2024;10(1):495.
40. Chen L-Q, Hou B-H, Lalonde S, Takanaga H, Hartung ML, Qu X-Q, Guo W-J, Kim J-G, Underwood W, Chaudhuri B, et al. Sugar transporters for intercellular exchange and nutrition of pathogens. Nature. 2010;468(7323):527–32.
41. Nebert DW, Liu Z. SLC39A8 gene encoding a metal ion transporter: discovery and bench to bedside. Hum Genomics. 2019;13(Suppl 1):51.
42. Guo M. 2 - Chemical composition of human milk. In: Human Milk Biochemistry and Infant Formula Manufacturing Technology. Edited by Guo M: Woodhead Publishing; 2014: 19–32.
43. Wang X, Zhang B, Dong W, Zhao Y, Zhao X, Zhang Y, Zhang Q. SLC34A2 Targets in Calcium/Phosphorus Homeostasis of Mammary Gland and Involvement in Development of Clinical Mastitis in Dairy Cows. Anim (Basel) 2024, 14(9).
44. Liu X, Robinson GW, Wagner KU, Garrett L, Wynshaw-Boris A, Hennighausen L. Stat5a is mandatory for adult mammary gland development and lactogenesis. Genes Dev. 1997;11(2):179–86.
45. Kobayashi K, Wakasa H, Han L, Koyama T, Tsugami Y, Nishimura T.
Lactose on the basolateral side of mammary epithelial cells inhibits milk production concomitantly with signal transducer and activator of transcription 5 inactivation. Cell Tissue Res. 2022;389(3):501–15.
46. Gilmour A, Gogel B, Cullis B, Welham S, Thompson R. ASReml user guide release 4.2 structural specification. Hemel Hempstead, UK: VSN International; 2022.
47. Khansefid M, Pryce JE, Shahinfar S, Axford M, Goddard ME, Haile-Mariam M. Improving accuracy and stability of genetic predictions for dairy cow survival. Anim Prod Sci. 2023;63(11):1031–42.
48. Rosen BD, Bickhart DM, Schnabel RD, Koren S, Elsik CG, Tseng E, Rowan TN, Low WY, Zimin A, Couldrey C, et al. De novo assembly of the cattle reference genome with single-molecule sequencing. GigaScience 2020, 9(3).
49. Sargolzaei M, Chesnais JP, Schenkel FS. A new approach for efficient genotype imputation using information from relatives. BMC Genomics. 2014;15(1):478.
50. van den Berg I, Nguyen TV, Nguyen TTT, Pryce JE, Nieuwhof GJ, MacLeod IM. Imputation accuracy and carrier frequency of deleterious recessive defects in Australian dairy cattle. J Dairy Sci. 2024;107(11):9591–601.
51. Daetwyler HD, Capitan A, Pausch H, Stothard P, van Binsbergen R, Brøndum RF, Liao X, Djari A, Rodriguez SC, Grohs C, et al. Whole-genome sequencing of 234 bulls facilitates mapping of monogenic and complex traits in cattle. Nat Genet. 2014;46(8):858–65.
52. Nguyen TV, Bolormaa S, Reich CM, Chamberlain AJ, Vander Jagt CJ, Daetwyler HD, MacLeod IM. Empirical versus estimated accuracy of imputation: optimising filtering thresholds for sequence imputation. Genet Selection Evol. 2024;56(1):72.
53. Loh PR, Danecek P, Palamara PF, Fuchsberger C, Reshef YA, Finucane HK, Schoenherr S, Forer L, McCarthy S, Abecasis GR, et al. Reference-based phasing using the Haplotype Reference Consortium panel. Nat Genet. 2016;48(11):1443–8.
54. Browning BL, Browning SR. A unified approach to genotype imputation and haplotype-phase inference for large data sets of trios and unrelated individuals.
Am J Hum Genet. 2009;84(2):210–23.
55. Chang CC, Chow CC, Tellier LC, Vattikuti S, Purcell SM, Lee JJ. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience. 2015;4:7.
56. Erbe M, Hayes BJ, Matukumalli LK, Goswami S, Bowman PJ, Reich CM, Mason BA, Goddard ME. Improving accuracy of genomic predictions within and between dairy cattle breeds with imputed high-density single nucleotide polymorphism panels. J Dairy Sci. 2012;95(7):4114–29.
57. Breen EJ, MacLeod IM, Ho PN, Haile-Mariam M, Pryce JE, Thomas CD, Daetwyler HD, Goddard ME. BayesR3 enables fast MCMC blocked processing for largescale multi-trait genomic prediction and QTN mapping analysis. Commun Biology. 2022;5(1):661.
58. Trebes H, Wang Y, Reynolds E, Tiplady K, Harland C, Lopdell T, Johnson T, Davis S, Harris B, Spelman R, et al. Identification of candidate novel production variants on the Bos taurus chromosome X. J Dairy Sci. 2023;106(11):7799–815.
59. Yang J, Benyamin B, McEvoy BP, Gordon S, Henders AK, Nyholt DR, Madden PA, Heath AC, Martin NG, Montgomery GW, et al. Common SNPs explain a large proportion of the heritability for human height. Nat Genet. 2010;42(7):565–9.
60. Hinrichs AS, Karolchik D, Baertsch R, Barber GP, Bejerano G, Clawson H, Diekhans M, Furey TS, Harte RA, Hsu F, et al. The UCSC Genome Browser Database: update 2006. Nucleic Acids Res. 2006;34(suppl 1):D590–8.

Declarations

Ethics approval and consent to participate

All animal experiments were conducted in strict accordance with the rules and guidelines outlined in the New Zealand Animal Welfare Act 1999. Most data were generated as part of a mammary tissue biopsy experiment, with all samples obtained in accordance with protocols approved by the Ruakura Animal Ethics Committee, Hamilton, New Zealand (approval AEC 12845). No animals were sacrificed for this study. The study is reported in accordance with ARRIVE guidelines.

Consent for publication

Not applicable.
Availability of data and materials

DataGene Australia (http://www.datagene.com.au/) is the custodian of the raw phenotype and genotype data of Australian farm animals. Access to these data for research requires permission from DataGene under a Data Use Agreement. Other supporting data are shown in the Supplementary Materials of the manuscript. Code and tutorials for GSOR are available at https://github.com/rxiangr/GSOR-and-MTAO. All gene expression data were taken from previously published studies, as detailed in the Methods section.

Competing interests

The authors declare no competing interests.

Funding

This study was undertaken as part of the DairyBio program, which is jointly funded by Dairy Australia (Melbourne, Australia), Agriculture Victoria (Melbourne, Australia), and The Gardiner Foundation (Melbourne, Australia). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Authors' contributions

Conceptualization: M. Goddard, M. Ghoreishifar. Methodology & Formal Analyses: M. Ghoreishifar. Writing (Original Draft) & Visualization: M. Ghoreishifar. Writing (Reviewing & Editing): M. Ghoreishifar, I. Macleod, T. Nguyen, T. Lopdell, M. Littlejohn, R. Xiang, A. Chamberlain, J. Pryce, M. Goddard. Supervision: M. Goddard, J. Pryce, A. Chamberlain. Funding Acquisition: J. Pryce.

Additional Declarations

No competing interests reported.

Supplementary Files

Additionalfile1.xlsx Additional file 1: GSOR analyses using mammary RNA-seq data (i.e., correlation between mammary gene expression and local GEBV for milk lactose percentage).
Additionalfile2.xlsx Additional file 2: GSOR analyses using WBC RNA-seq data (i.e., correlation between WBC gene expression and local GEBV for milk lactose percentage).
Additionalfile3.xlsx Additional file 3: Genes successfully converted to human orthologs and used in gene list enrichment analyses, including 31 genes and 11,728 background genes.
Additionalfile4.xlsx Additional file 4: Gene ontology (GO) terms enriched in the list of genes that were both significant in GSOR (WBC data) and co-localized with GWAS loci in 100 kb windows.
Additionalfile5.xlsx Additional file 5: Gene ontology (GO) terms and Reactome pathways enriched in the list of genes that were both significant in GSOR (mammary data) and co-localized with GWAS loci in 500 kb windows.

Editorial decision: Revision requested 01 Dec, 2025; Reviews received at journal 26 Nov, 2025; Reviewers agreed at journal 14 Nov, 2025; Reviews received at journal 31 Oct, 2025; Reviewers agreed at journal 28 Oct, 2025; Reviewers agreed at journal 06 Oct, 2025; Reviewers invited by journal 06 Oct, 2025; Editor assigned by journal 25 Sep, 2025; Submission checks completed at journal 25 Sep, 2025; First submitted to journal 23 Sep, 2025.
20:49:53","extension":"png","order_by":19,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":78938,"visible":true,"origin":"","legend":"","description":"","filename":"OnlineFigure1B.png","url":"https://assets-eu.researchsquare.com/files/rs-7693421/v1/bfb1d7ea06551223e78627ac.png"},{"id":93529275,"identity":"ddb9a5f4-6cf9-43b0-8821-7e93f8fd6e64","added_by":"auto","created_at":"2025-10-14 20:49:53","extension":"png","order_by":20,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":123608,"visible":true,"origin":"","legend":"","description":"","filename":"OnlineFigure2.png","url":"https://assets-eu.researchsquare.com/files/rs-7693421/v1/04919c26fd5ad09cf2553ade.png"},{"id":93529271,"identity":"078854ad-5503-47ac-baab-99449e6ddf1d","added_by":"auto","created_at":"2025-10-14 20:49:53","extension":"png","order_by":21,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":17380,"visible":true,"origin":"","legend":"","description":"","filename":"OnlineFigure3.png","url":"https://assets-eu.researchsquare.com/files/rs-7693421/v1/4829c97c9aac8bdde7667ab5.png"},{"id":93530565,"identity":"d637a3b1-1c9e-42d5-b733-ed094b541ef1","added_by":"auto","created_at":"2025-10-14 20:57:53","extension":"png","order_by":22,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":69629,"visible":true,"origin":"","legend":"","description":"","filename":"OnlineFigure4.png","url":"https://assets-eu.researchsquare.com/files/rs-7693421/v1/dd00a8a04c7c8b913549d055.png"},{"id":93530780,"identity":"ee0aae42-f7e9-4c9f-9dd1-2ffccc6b7634","added_by":"auto","created_at":"2025-10-14 
21:05:53","extension":"xml","order_by":23,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":192661,"visible":true,"origin":"","legend":"","description":"","filename":"1d2a8766ea824b60991b0d9389f9400f1structuring.xml","url":"https://assets-eu.researchsquare.com/files/rs-7693421/v1/c6b3c62dc63dcc35f527f83e.xml"},{"id":93529277,"identity":"6729a1fb-b938-4406-8224-cec9c31c6b23","added_by":"auto","created_at":"2025-10-14 20:49:53","extension":"html","order_by":24,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":205246,"visible":true,"origin":"","legend":"","description":"","filename":"earlyproof.html","url":"https://assets-eu.researchsquare.com/files/rs-7693421/v1/64899781893ce20fce50787b.html"},{"id":93529249,"identity":"d5ee20e0-7a29-4eae-96f4-639ef207021b","added_by":"auto","created_at":"2025-10-14 20:49:52","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":1394399,"visible":true,"origin":"","legend":"\u003cp\u003eMiami plot showing GSOR-identified genes (in the mammary RNA-seq data) and GWAS loci. The dashed line in the GSOR panel (upper plot) represents FDR = 0.01 (P-value = 0.005), and the dashed line in the GWAS panel (lower plot) represents \u003cem\u003eP\u003c/em\u003e-value = 1×10\u003csup\u003e-8\u003c/sup\u003e. Mammary GSOR genes that overlap GWAS loci, defined using 100 kb windows, are highlighted in red. For visualization purposes, the highlighted GWAS regions are displayed larger than 100 kb to ensure they are visible. 
In addition, SNPs or genes with \u003cem\u003ep\u003c/em\u003e \u0026lt; 1×10⁻¹⁶ were set to \u003cem\u003ep\u003c/em\u003e = 1×10⁻¹⁶ for graphical representation.\u003c/p\u003e","description":"","filename":"1.png","url":"https://assets-eu.researchsquare.com/files/rs-7693421/v1/fdf54a1c10be9d6adfdf294f.png"},{"id":93530555,"identity":"955b6845-b8f7-4f10-946a-933db9274759","added_by":"auto","created_at":"2025-10-14 20:57:52","extension":"tif","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":361325,"visible":true,"origin":"","legend":"\u003cp\u003eVenn diagram for co-localization between GWAS loci and GSOR-identified genes based on 100 Kb and 500 Kb windows. Values within circles show the number of 100 Kb or 500 Kb windows that contain (1) only GSOR genes, (2) only GWAS loci, (3) both, or (4) neither. SNPs with \u003cem\u003eP\u003c/em\u003e-value ≤ 1×10⁻⁸ were regarded as GWAS loci. Genes with FDR ≤ 0.1 in GSOR were regarded as significant (i.e., GSOR genes).\u003c/p\u003e","description":"","filename":"Figure2.tif","url":"https://assets-eu.researchsquare.com/files/rs-7693421/v1/69662428e093552fcdcf5b48.tif"},{"id":93530556,"identity":"5eb246a7-9eaa-4105-9706-e8f2112c27c7","added_by":"auto","created_at":"2025-10-14 20:57:52","extension":"tif","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":40606,"visible":true,"origin":"","legend":"\u003cp\u003eVenn diagram showing the number of significant GO and KEGG terms identified using GSOR-GWAS gene lists derived from 100 kb versus 500 kb non-overlapping windows in mammary tissue.\u003c/p\u003e","description":"","filename":"Figure3.tif","url":"https://assets-eu.researchsquare.com/files/rs-7693421/v1/9ae572a79cec5dd335281b85.tif"},{"id":93530775,"identity":"0bbc4b96-4365-4f75-a5d9-dc177b7599bc","added_by":"auto","created_at":"2025-10-14 21:05:53","extension":"tif","order_by":4,"title":"Figure 
4","display":"","copyAsset":false,"role":"figure","size":193107,"visible":true,"origin":"","legend":"\u003cp\u003eA schematic overview of the method used in this study. A) A reference population of animals with both SNP genotypes and LP phenotypes was used to estimate SNP effects. B) The estimated SNP effects were used to calculate local GEBVs in RNA-sequenced animals; local GEBVs and gene expression data were then combined in GSOR analyses. C) GSOR results and an independent GWAS were combined to investigate the significance of co-localization between GSOR-identified genes and GWAS loci in 100 Kb windows. D) Genes were selected if they were both significant in GSOR and co-localized with GWAS loci in 100 Kb windows and E) used in gene list enrichment analyses.\u003c/p\u003e","description":"","filename":"Figure4.tif","url":"https://assets-eu.researchsquare.com/files/rs-7693421/v1/848a8201353813a0a83255ac.tif"},{"id":100614600,"identity":"bd8d4a62-a7ca-4d73-902b-5330231cb7b5","added_by":"auto","created_at":"2026-01-19 17:22:13","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":2686673,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7693421/v1/13880690-ccce-4ab7-8df5-732134e78ae5.pdf"},{"id":93530557,"identity":"be5401a3-b17a-47d4-967f-82eff0a0b451","added_by":"auto","created_at":"2025-10-14 20:57:53","extension":"xlsx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":1114402,"visible":true,"origin":"","legend":"\u003cp\u003eAdditional file 1: GSOR analyses using mammary RNA-seq data (i.e., correlation between mammary gene expression and local GEBV for milk lactose 
percentage).\u003c/p\u003e","description":"","filename":"Additionalfile1.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-7693421/v1/7593eac254ecc795a4dbe901.xlsx"},{"id":93531021,"identity":"56a44c91-4f48-4298-929a-f7fa332f7af3","added_by":"auto","created_at":"2025-10-14 21:13:53","extension":"xlsx","order_by":2,"title":"","display":"","copyAsset":false,"role":"supplement","size":1135560,"visible":true,"origin":"","legend":"\u003cp\u003eAdditional file 2: GSOR analyses using WBC RNA-seq data (i.e., correlation between WBC gene expression and local GEBV for milk lactose percentage).\u003c/p\u003e","description":"","filename":"Additionalfile2.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-7693421/v1/1c62a301e9f44bd31b9dde0a.xlsx"},{"id":93529263,"identity":"1d0cb80d-f44c-48d8-9dd9-a8f1e407cd9c","added_by":"auto","created_at":"2025-10-14 20:49:53","extension":"xlsx","order_by":3,"title":"","display":"","copyAsset":false,"role":"supplement","size":866785,"visible":true,"origin":"","legend":"\u003cp\u003eAdditional file 3: Genes successfully converted to human orthologs and used in gene list enrichment analyses, comprising 31 genes and 11,728 background genes.\u003c/p\u003e","description":"","filename":"Additionalfile3.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-7693421/v1/1712e51baa7f6315c90c3dc6.xlsx"},{"id":93530777,"identity":"08116382-ea1f-4aaa-a02e-5cf6bdaf11a8","added_by":"auto","created_at":"2025-10-14 21:05:53","extension":"xlsx","order_by":4,"title":"","display":"","copyAsset":false,"role":"supplement","size":12141,"visible":true,"origin":"","legend":"\u003cp\u003eAdditional file 4: Gene ontology (GO) terms enriched in the list of genes that were both significant in GSOR (i.e., WBC data) and co-localized with GWAS loci in 100 Kb 
windows.\u003c/p\u003e","description":"","filename":"Additionalfile4.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-7693421/v1/970de39b3607405d159676b0.xlsx"},{"id":93529259,"identity":"e2d7c4cb-7639-4729-9add-d5ebe2c80bcd","added_by":"auto","created_at":"2025-10-14 20:49:53","extension":"xlsx","order_by":5,"title":"","display":"","copyAsset":false,"role":"supplement","size":13629,"visible":true,"origin":"","legend":"\u003cp\u003eAdditional file 5: Gene ontology (GO) terms and Reactome pathways enriched in the list of genes that were both significant in GSOR (i.e., mammary data) and co-localized with GWAS loci in 500 Kb windows.\u003c/p\u003e","description":"","filename":"Additionalfile5.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-7693421/v1/813e6ae8c92417b4326480f4.xlsx"}],"financialInterests":"No competing interests reported.","formattedTitle":"Bridging GWAS to genes: an integrative multi-omics approach using cattle data","fulltext":[{"header":"Introduction","content":"\u003cp\u003eGenome-wide association studies (GWASs) have identified numerous genetic variants (such as single nucleotide polymorphisms or SNPs) associated with complex traits, yet the biological mechanisms connecting these variants to phenotypes remain elusive. Determining which variants are truly causal and which genes they affect is complicated by the extensive linkage disequilibrium (LD) surrounding lead SNPs [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e]. In addition, because the majority of GWAS signals fall within noncoding regions of the genome [\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e], directly linking these variants to their target genes remains a significant challenge. 
Furthermore, the intricate regulatory mechanisms of genes, combined with the possibility that multiple genes within a single locus contribute to the trait, make identifying the true causal genes even more challenging [\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e].\u003c/p\u003e\u003cp\u003eTo address these complexities, integrative approaches such as multi-trait multi-omics fine-mapping can help identify the genes through which SNPs influence quantitative traits. Many SNPs exhibit pleiotropic effects, influencing various biologically significant complex trait phenotypes [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e, \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e]. Consequently, employing multi-trait fine-mapping techniques that analyse several traits at once can enhance the power of fine-mapping. When examining two traits, one being a complex trait of interest and the other a molecular trait, like the expression of a particular gene [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e], multi-trait fine-mapping resembles colocalization analysis [\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e]. This analysis evaluates the genetic connection between the traits by investigating whether they share the same causal variants at a specific locus [\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e]. An illustrative case of pleiotropy arises when a SNP associated with a complex trait through GWAS also influences gene expression, thereby acting as both a trait QTL and an expression QTL (eQTL) [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. Such SNPs highlight the genetic link between gene expression and phenotypic variation [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. 
In particular, genes through which QTL act can be inferred by integrating multi-omics (or multi-trait) data analyses: (i) trait-associated QTL can be identified from genomic data, allowing the assignment of nearby candidate genes to them; and (ii) genes located near trait-associated QTL that also show correlated expression with trait variation (i.e., \u003cem\u003ecis\u003c/em\u003e-regulated by nearby variants) can be inferred from combined transcriptomic and genomic data [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e].\u003c/p\u003e\u003cp\u003eThe functional knowledge of genes serves as another source of evidence for mapping GWAS loci (QTL) to their target genes through which they operate [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e, \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e]. This information, organized in databases like the Gene Ontology (GO) resources [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e], aids in pinpointing genes associated with specific biological functions. The functional enrichment of a subset of identified candidate genes within trait-relevant categories serves as additional evidence that the identified genes are the most likely causal genes [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. In this study, we use milk lactose percentage (LP) as a model trait. LP is a useful model for dissecting the genetic architecture of complex traits because its synthesis is governed by a relatively simple and well-characterized pathway, mainly involving lactose synthase activity in the mammary gland [\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e, \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e]. 
This feature makes LP an ideal trait for testing integrative genomic approaches to identify causal genes.\u003c/p\u003e\u003cp\u003eEach of the three types of evidence (proximity of genes to trait-associated SNPs, correlation of gene expression with the phenotype of interest, and functional enrichment of genes in a biological pathway) has limitations and can result in false positives or negatives [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. Nonetheless, when multiple methods identify the same genes, our confidence in their biological significance increases. Expanding on our earlier integrative approach [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e], we now apply this methodology to a different trait using a much larger multi-breed dataset to enhance statistical power and improve the robustness and generalizability of our findings.\u003c/p\u003e\u003cp\u003eThe objectives of this study are (i) to perform gene-based association tests to identify significant gene expression\u0026ndash;trait associations using genetic score omics regression (GSOR), introduced by Xiang et al. [\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e], (ii) to test the significance of window-based co-localization between GSOR-identified genes and GWAS loci reported by Lopdell et al. [\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e], and (iii) to obtain a list of candidate genes from the co-localization of GSOR-identified genes with GWAS loci for functional enrichment analyses. 
Overall, we aim to demonstrate the utility of combining GSOR with GWAS and functional enrichment analyses to map candidate genes to QTL, with milk LP as a model trait.\u003c/p\u003e"},{"header":"Results","content":"\u003cp\u003eFigure \u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e shows a Miami plot of the mammary GSOR-identified genes from this study together with the trait-associated SNPs identified through the GWAS for milk lactose percentage (LP) reported by Lopdell et al. [\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e]. A total of 12,237 genes were expressed in the mammary RNA-seq dataset [\u003cspan additionalcitationids=\"CR14\" citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e], of which 711 were significantly associated with local GEBVs for milk LP in the GSOR analysis (FDR\u0026thinsp;\u0026le;\u0026thinsp;0.10) (Additional file 1). Among the 12,237 expressed genes, 242 genes were located within 100 kb windows that contained at least one GWAS locus. Of those 242, 34 genes were also significant based on the GSOR analysis (Additional file 1). We also used a white blood cell (WBC) RNA-seq dataset [\u003cspan additionalcitationids=\"CR17\" citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e]. Descriptive statistics for the GSOR analyses in both mammary and WBC datasets are presented in Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e. A total of 12,536 genes were expressed in the WBC RNA-seq dataset, of which 986 were significantly associated with local GEBVs for milk LP in the GSOR analysis. Among the 12,536 expressed genes in WBC, 242 genes were located within 100 kb windows that contained at least one GWAS locus. 
Of the 242 WBC genes, 33 genes were also significant based on the GSOR analysis (Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e \u0026amp; Additional file 2). These genes do not completely overlap with those identified using mammary RNA-seq.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eDescriptive statistics of mammary and WBC GSOR genes and their co-localization with GWAS loci based on 100 Kb windows\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"5\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eRNA-seq data\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eTotal expressed Genes\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003eN significant GSOR genes\u003csup\u003e1\u003c/sup\u003e\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003eN expressed genes in windows including GWAS loci\u003csup\u003e2\u003c/sup\u003e\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003eN significant GSOR genes in windows including GWAS 
loci\u003csup\u003e3\u003c/sup\u003e\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eMammary\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e12,237\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e711\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e242\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e34\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eWBC\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e12,536\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e986\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e242\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e33\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colspan=\"5\" nameend=\"c5\" namest=\"c1\"\u003e\u003cp\u003e\u003csup\u003e1\u003c/sup\u003eNumber of GSOR-identified genes at FDR\u0026thinsp;\u0026le;\u0026thinsp;0.1; \u003csup\u003e2\u003c/sup\u003eNumber of expressed genes located within non-overlapping 100 Kb windows that include at least one GWAS locus; GWAS summary statistics for milk LP from an independent study using 12,000 samples were used [\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e] (see Methods); \u003csup\u003e3\u003c/sup\u003eNumber of GSOR-identified genes located within 100 Kb windows that include at least one GWAS locus.\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003eTable\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e shows 
window-based co-localization between GSOR-identified genes and trait-associated SNPs (GWAS loci) obtained using 100 Kb non-overlapping windows. We tested the statistical significance of this co-localization, and the results are presented in Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e. Based on Fisher\u0026rsquo;s exact test, co-localization between GSOR-identified genes and GWAS loci within 100 Kb and 500 Kb windows was significantly greater than expected by chance in both mammary and WBC datasets. For example, using 100 Kb windows with mammary RNA-seq data, we found 30 windows containing both GSOR-identified genes and GWAS loci, 291 windows containing only GWAS loci, 631 windows containing only GSOR-identified genes, and 23,909 windows containing neither (\u003cem\u003eP\u003c/em\u003e\u003csub\u003eFisher\u003c/sub\u003e = 2.96\u0026times;10\u003csup\u003e\u0026minus;9\u003c/sup\u003e; odds ratio\u0026thinsp;=\u0026thinsp;3.9). The smaller window size (100 kb) revealed stronger co-localization, as evidenced by more significant \u003cem\u003eP\u003c/em\u003e values and higher odds ratios (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e). Furthermore, across the RNA-seq datasets, mammary RNA-seq demonstrated stronger co-localization with GWAS loci than WBC RNA-seq, supported by smaller \u003cem\u003eP\u003c/em\u003e values and higher odds ratios (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e).\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eTable\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e shows descriptive statistics about the 34 mammary GSOR-identified genes co-localized with GWAS loci. 
Of the 34 co-localized GSOR-GWAS genes presented in Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e, 31 were successfully converted to human orthologs and used in functional enrichment analyses, along with the 11,728 background genes that were successfully converted (out of 12,237 expressed genes). These data are presented in Additional file 3. The successfully converted genes were used to conduct enrichment analyses using the gprofiler2 R package [\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e].\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eGSOR-identified genes co-localized in 30 windows (100 Kb) with GWAS loci. P (GWAS) represents the smallest P value within the corresponding window\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"7\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eChr\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eWin Start\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003eWin 
End\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003eSNP\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003e\u003cem\u003eP\u003c/em\u003e (GWAS)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c6\"\u003e\u003cp\u003eGSOR-gene\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c7\"\u003e\u003cp\u003eFDR (GSOR)\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e150530998\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e150630997\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e1:150591449\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e8.12E-09\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eKCNJ15\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e0.008\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e151130998\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e151230997\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e1:151180934\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e2.39E-11\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eETS2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e0.087\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c2\"\u003e\u003cp\u003e152230998\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e152330997\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e1:152324264\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e2.31E-09\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eSH3BP5\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e9.36E-49\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e15344539\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e15444538\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e3:15368547\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e3.75E-11\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eTHBS3, GBA\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e0.057, 0.011\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e15444539\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e15544538\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e3:15464742\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e1.67E-14\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eSLC50A1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e7.62E-06\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" 
colname=\"c1\"\u003e\u003cp\u003e3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e53544539\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e53644538\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e3:53597089\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e3.28E-15\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eLRRC8C\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e0.002\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e53644539\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e53744538\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e3:53674128\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e7.86E-16\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eLRRC8B\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e1.01E-17\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e54044539\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e54144538\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e3:54109337\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e9.76E-14\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eGBP5\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c7\"\u003e\u003cp\u003e1.35E-09\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e54844539\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e54944538\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e3:54943906\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e1.04E-09\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eKYAT3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e0.020\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e6\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e22391294\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e22491293\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e6:22469293\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e4.95E-11\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eSLC39A8\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e0.0003\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e6\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e45091294\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e45191293\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e6:45129256\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e3.64E-09\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c6\"\u003e\u003cp\u003eSLC34A2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e0.004\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e6\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e85991294\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e86091293\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e6:86055882\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e1.06E-11\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eENAM\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e0.001\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e10\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e2166988\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e2266987\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e10:2190824\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e2.94E-13\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eNREP\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e0.0002\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e13\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e54439729\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e54539728\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e13:54476595\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c5\"\u003e\u003cp\u003e5.13E-09\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eSLC17A9\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e3.07E-08\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e15\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e27470773\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e27570772\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e15:27543169\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e8.40E-11\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eAPOA1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e0.015\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e16\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e66207162\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e66307161\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e16:66279444\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e1.11E-12\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eIVNS1ABP\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e2.31E-17\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e17\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e51725017\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e51825016\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c4\"\u003e\u003cp\u003e17:51739903\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e1.22E-09\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eDNAH10\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e3.07E-08\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e17\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e53925017\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e54025016\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e17:53934607\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e2.89E-10\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eP2RX4\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e0.011\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e17\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e72325017\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e72425016\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e17:72377251\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e1.26E-10\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eLRRC74B\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e0.057\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e19\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e32957926\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c3\"\u003e\u003cp\u003e33057925\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e19:33056069\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e4.58E-14\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eTVP23B\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e0.072\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e19\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e41857926\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e41957925\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e19:41909633\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e2.90E-11\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eHAP1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e0.003\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e19\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e42157926\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e42257925\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e19:42236007\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e1.53E-13\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eNKIRAS2, RAB5C, DHX58\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e0.0032, 0.035, 9.8E-08\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e19\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c2\"\u003e\u003cp\u003e42257926\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e42357925\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e19:42349652\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e9.38E-14\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eGHDC, STAT5B\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e0.035, 0.087\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e19\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e42357926\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e42457925\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e19:42358091\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e9.38E-14\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eSTAT3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e0.072\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e19\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e58057926\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e58157925\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e19:58114277\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e3.08E-10\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eC19H17orf80\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e0.087\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" 
colname=\"c1\"\u003e\u003cp\u003e19\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e60557926\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e60657925\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e19:60560812\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e1.23E-24\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eKCNJ2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e3.04E-23\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e19\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e61357926\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e61457925\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e19:61408199\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e2.44E-09\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eABCA10\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e7.62E-06\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e19\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e61457926\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e61557925\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e19:61512690\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e1.66E-09\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eABCA9\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c7\"\u003e\u003cp\u003e0.072\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e27\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e36520984\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e36620983\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e27:36523102\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e2.84E-22\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eGPAT4\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e0.0007\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e29\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e9447532\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e9547531\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e29:9545883\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e4.03E-35\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003ePICALM\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e2.63E-45\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003eWe identified 22 significantly enriched gene ontology biological process (GO:BP) and Reactome pathways, where terms \u0026ldquo;transport\u0026rdquo;, \u0026ldquo;transmembrane transport\u0026rdquo; and their child terms dominated the list (Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e).\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab3\" 
border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 3\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eGene list enrichment analyses using mammary GSOR-identified genes co-localized with GWAS loci in 100 Kb windows\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"5\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eSource\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eTerm ID\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003eTerm Name\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003eFDR\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003eGenes\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eGO:BP\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eGO:0001408\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eguanine nucleotide transport\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.001\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eLRRC8C, LRRC8B, SLC17A9\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" 
colname=\"c1\"\u003e\u003cp\u003eGO:BP\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eGO:1903790\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eguanine nucleotide transmembrane transport\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.001\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eLRRC8C, LRRC8B, SLC17A9\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eGO:BP\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eGO:0055085\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003etransmembrane transport\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.001\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eKCNJ15, SLC50A1, LRRC8C, LRRC8B, SLC39A8, SLC34A2, SLC17A9, P2RX4, HAP1, KCNJ2, ABCA10, ABCA9\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eGO:BP\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eGO:0098739\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eimport across plasma membrane\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.005\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eKCNJ15, LRRC8C, LRRC8B, SLC39A8, KCNJ2\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eGO:BP\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eGO:1901679\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003enucleotide transmembrane transport\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c4\"\u003e\u003cp\u003e0.007\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eLRRC8C, LRRC8B, SLC17A9\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eGO:BP\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eGO:1901264\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003ecarbohydrate derivative transport\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.009\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eSLC50A1, LRRC8C, LRRC8B, SLC17A9\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eGO:BP\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eGO:0015868\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003epurine ribonucleotide transport\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.010\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eLRRC8C, LRRC8B, SLC17A9\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eGO:BP\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eGO:0051503\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eadenine nucleotide transport\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.011\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eLRRC8C, LRRC8B, SLC17A9\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eGO:BP\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c2\"\u003e\u003cp\u003eGO:0072530\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003epurine-containing compound transmembrane transport\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.011\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eLRRC8C, LRRC8B, SLC17A9\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eGO:BP\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eGO:0015865\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003epurine nucleotide transport\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.011\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eLRRC8C, LRRC8B, SLC17A9\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eGO:BP\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eGO:0006862\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003enucleotide transport\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.016\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eLRRC8C, LRRC8B, SLC17A9\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eGO:BP\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eGO:0034220\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003emonoatomic ion transmembrane transport\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.016\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eKCNJ15, LRRC8C, LRRC8B, SLC39A8, SLC34A2, P2RX4, 
HAP1, KCNJ2\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eGO:BP\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eGO:0040014\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eregulation of multicellular organism growth\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.024\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eSTAT5B, STAT3, GPAT4\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eGO:BP\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eGO:0042592\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003ehomeostatic process\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.042\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eGBA1, SLC39A8, SLC34A2, P2RX4, HAP1, NKIRAS2, STAT5B, STAT3, KCNJ2, PICALM\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eGO:BP\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eGO:0098659\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003einorganic cation import across plasma membrane\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.042\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eKCNJ15, SLC39A8, KCNJ2\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eGO:BP\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eGO:0099587\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003einorganic ion import 
across plasma membrane\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.042\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eKCNJ15, SLC39A8, KCNJ2\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eGO:BP\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eGO:0006811\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003emonoatomic ion transport\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.042\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eKCNJ15, LRRC8C, LRRC8B, SLC39A8, SLC34A2, P2RX4, HAP1, KCNJ2\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eGO:BP\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eGO:0006810\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003etransport\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.042\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eKCNJ15, SLC50A1, LRRC8C, LRRC8B, SLC39A8, SLC34A2, SLC17A9, P2RX4, HAP1, RAB5C, STAT5B, STAT3, KCNJ2, ABCA10, ABCA9, GPAT4, PICALM\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eGO:BP\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eGO:0048871\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003emulticellular organismal-level homeostasis\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.044\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eGBA1, SLC39A8, P2RX4, NKIRAS2, STAT5B, STAT3, 
PICALM\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eREAC\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eREAC:R-HSA-382551\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eTransport of small molecules\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.008\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eSLC50A1, LRRC8C, LRRC8B, SLC39A8, SLC34A2, ABCA10, ABCA9\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eREAC\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eREAC:R-HSA-186797\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eSignaling by PDGF\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.008\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eTHBS3, STAT5B, STAT3\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eREAC\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eREAC:R-HSA-425407\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eSLC-mediated transmembrane transport\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.035\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eSLC50A1, SLC39A8, SLC34A2\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colspan=\"5\" nameend=\"c5\" namest=\"c1\"\u003e\u003cp\u003e1. 
GSOR-identified genes were converted to human orthologs before gene list enrichment analyses.\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003eWe also performed gene list enrichment analysis using WBC GSOR-identified genes co-localized with GWAS loci in the 100 Kb windows. This analysis revealed five significant GO terms, of which only \u0026ldquo;response to growth hormone\u0026rdquo; (GO:0060416) could be indirectly related to the physiology of milk LP. These results are presented in Additional file 4. Only two genes, \u003cem\u003eSTAT5B\u003c/em\u003e and \u003cem\u003eHAP1\u003c/em\u003e, were shared between the mammary and WBC GSOR-GWAS gene sets (Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e \u0026amp; Additional file 4). According to Wainberg et al. [\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e], transcriptome-wide association studies (TWAS) [\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e] are particularly susceptible to false-positive gene associations when expression data come from irrelevant tissues or cell types. The same may be true for GSOR [\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e]: the GSOR-identified genes from mammary gland expression data revealed functionally enriched terms that aligned more closely with milk lactose physiology, whereas the pathways detected from WBC data tended to be broader and less specific.\u003c/p\u003e\u003cp\u003eTo assess whether a broader genomic window could improve the detection of signals relevant to LP, we increased the window size, allowing mammary GSOR-identified genes to fall within 500 Kb of GWAS loci. 
A total of 22 significant terms were identified (Additional file 5), of which 15 were shared with the terms identified when using 100 Kb windows (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e).\u003c/p\u003e\u003cp\u003e\u003c/p\u003e"},{"header":"Discussion","content":"\u003cp\u003eIn our previous study [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e], we showed that integrating GSOR, GWAS, and functional enrichment can help prioritize candidate genes for milk composition traits, though inference was limited by small sample sizes. Here, we leveraged a large, multi-breed reference population (\u0026gt;\u0026thinsp;81,000 cows) to train genomic prediction models, increasing both GEBV accuracy and the power to detect expression\u0026ndash;trait associations. This enabled us to identify 20 genes likely to mediate lactose QTL, supported by convergence between GSOR, GWAS, and enrichment results. Because long-range LD, complex regulation, and small QTL effects complicate causal gene discovery, our framework, integrating local GEBV with tissue-relevant expression data, GWAS co-localization, and functional enrichment, may help overcome these challenges and offer a scalable, biologically grounded strategy to dissect the molecular basis of complex traits in livestock and other species.\u003c/p\u003e\u003cp\u003eWhen the most significant GWAS variant is located within a coding region, the causal gene may be directly implicated [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. However, the majority of trait QTL reside in non-coding regions [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e, \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e, \u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e, \u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e], and thus their functions are often unknown. 
Non-coding QTL can impact phenotypes by regulating gene expression through \u003cem\u003ecis\u003c/em\u003e or \u003cem\u003etrans\u003c/em\u003e regulatory mechanisms. Even when causal variants are known, it is challenging to determine their functional impact and link them to specific target genes. This is because regulatory elements can act over long genomic distances and their effects can be highly cell-type specific [\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e]. TWASs were proposed to fill this gap by linking predicted tissue-specific gene expression to observed phenotypes [\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e, \u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e, \u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e]. However, one limitation of TWAS is that they usually rely on a limited set of individuals with assayed expression data and genotypes, which can limit the statistical power and decrease the accuracy of models used for predicting gene expression [\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e]. In this study, we used GSOR to identify statistically significant gene-trait associations. Xiang et al. [\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e] found that GSOR is more powerful than TWAS because it uses the contribution to phenotype of the variants close to the gene whose expression is being investigated rather than the complete phenotype. However, the problem of spurious associations due to LD still exists [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. 
Gene list enrichment analyses may help overcome LD issues by aggregating signals across functionally related genes, highlighting biologically coherent associations rather than single-gene signals confounded by LD.\u003c/p\u003e\u003cp\u003eAs mentioned, GSOR does not provide direct evidence of causality, as correlations between gene expression and the trait can be induced by LD, among other factors. For instance, if two genes share \u003cem\u003ecis\u003c/em\u003e-eQTLs in LD but only one eQTL is causal, both genes may appear statistically associated with the trait [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. Therefore, an additional source of evidence is required to prioritize candidate causal genes.\u003c/p\u003e\u003cp\u003eWe evaluated the significance of the overlap between GSOR-identified genes and GWAS loci by dividing the genome into non-overlapping windows of 100 Kb. We showed that GSOR-identified genes were significantly enriched within windows containing GWAS loci. This window-based co-localization enrichment is consistent with our previous findings using a different milk composition trait [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e], suggesting that the expression of these genes may mediate the effects of GWAS loci on complex traits through a \u003cem\u003ecis\u003c/em\u003e-regulatory mechanism. However, LD can still confound these signals, so functional enrichment analyses are particularly important for highlighting the most relevant GWAS-GSOR genes. 
Milk LP is a promising target phenotype to test this hypothesis because its underlying biology is relatively well understood, with established roles for specific pathways such as lactose synthesis, ion transport, and hormonal signaling [\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e, \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e, \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e], allowing for meaningful interpretation of enrichment results. Of the 31 genes used in functional enrichment analyses, 55% share the term transport (GO:0006810 ; \u003cem\u003eP\u003c/em\u003e\u0026thinsp;=\u0026thinsp;0.042) and 39% share its descendant term \u0026ldquo;transmembrane transport\u0026rdquo; (GO:0055085; \u003cem\u003eP\u003c/em\u003e\u0026thinsp;=\u0026thinsp;0.001).\u003c/p\u003e\u003cp\u003eLactose is the major osmotic solute in milk, responsible for drawing water into the alveolar lumen and driving milk volume [\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e, \u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e]. As a result, LP shows limited variation. However, it is not the only osmolarity regulator and an increase in other osmolytes can change LP. Indeed, studies show that increases in sodium, potassium or chloride in milk are inversely correlated with lactose concentration\u0026mdash;highlighting that any process affecting ionic balance or secretion can shift lactose levels via osmotic compensation [\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e, \u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e]. 
We identified monoatomic ion transmembrane transport (GO:0034220; FDR\u0026thinsp;=\u0026thinsp;0.016) enriched with the \u003cem\u003eKCNJ15\u003c/em\u003e, \u003cem\u003eLRRC8C\u003c/em\u003e, \u003cem\u003eLRRC8B\u003c/em\u003e, \u003cem\u003eSLC39A8\u003c/em\u003e, \u003cem\u003eSLC34A2\u003c/em\u003e, \u003cem\u003eP2RX4\u003c/em\u003e, \u003cem\u003eHAP1\u003c/em\u003e, and \u003cem\u003eKCNJ2\u003c/em\u003e genes. \u003cem\u003eKCNJ2\u003c/em\u003e (Kir2.1) and \u003cem\u003eKCNJ15\u003c/em\u003e (Kir4.2) both encode inwardly rectifying potassium channels (Kir) that favor K\u003csup\u003e+\u003c/sup\u003e influx under hyperpolarizing conditions [\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e]. Kamikawa and Ishikawa [\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e] identified Kir2.1-like channels (encoded by \u003cem\u003eKCNJ2\u003c/em\u003e) functionally expressed in secretory mammary epithelial cells of lactating mice. These authors concluded that different types of K\u003csup\u003e+\u003c/sup\u003e channels might play a role in producing species-specific milk, while particular types of K\u003csup\u003e+\u003c/sup\u003e channels might be generally expressed and participate in milk production in mammalian milk secretory cells [\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e]. An early study on ionic concentrations in milk found a significant association between different ions, such as K\u003csup\u003e+\u003c/sup\u003e, and lactose levels in milk [\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e]. 
In cattle, a QTL on BTA19 including \u003cem\u003eKCNJ2\u003c/em\u003e and \u003cem\u003eKCNJ16\u003c/em\u003e has been associated with lactose concentration [\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e, \u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e], protein concentration [\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e] and milk yield [\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e]. Therefore, an eQTL that affects the abundance of \u003cem\u003eKCNJ2\u003c/em\u003e or \u003cem\u003eKCNJ15\u003c/em\u003e proteins (and K\u003csup\u003e+\u003c/sup\u003e ion transport) might lead to osmotic compensation influencing milk LP.\u003c/p\u003e\u003cp\u003eMaintaining cell volume is essential for mammary epithelial cells to remain functional during the osmotic shifts involved in milk secretion. Volume-regulated anion channels (VRACs) are one mechanism by which they can do this. VRACs, formed by \u003cem\u003eLRRC8\u003c/em\u003e proteins A to E, help regulate cell volume by exporting Cl\u003csup\u003e\u0026minus;\u003c/sup\u003e ions and small organic anions [\u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e, \u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e]. \u003cem\u003eP2RX4\u003c/em\u003e is an ATP-gated cation channel permeable to Ca\u003csup\u003e2+\u003c/sup\u003e; ATP-triggered \u003cem\u003eP2RX4\u003c/em\u003e activation modulates VRAC in rat liver cells [\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e]. While not confirmed in mammary cells, this mechanism may apply during milk secretion. Together, these \u0026ldquo;ion transporters\u0026rdquo; help set the ionic gradients that draw water into the alveoli along with lactose.\u003c/p\u003e\u003cp\u003eLactose, the primary sugar in milk, is synthesized in Golgi vesicles and secreted along with ions and water. 
In fact, lactose accumulation in secretory vesicles draws water into milk by osmosis \u0026ndash; milk volume is almost perfectly correlated (r\u0026thinsp;\u0026asymp;\u0026thinsp;0.99) with lactose production [\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e]. This osmotic role of lactose means that many membrane transporters (for solutes and ions) must be active during lactation to supply substrates and maintain ion gradients. Consistent with this, a recent study found that genes involved in membrane transport were enriched among loci affecting milk lactose content [\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e]. Thus, our finding of enriched GO terms for \u0026ldquo;transmembrane transport\u0026rdquo; and \u0026ldquo;ion transmembrane transport\u0026rdquo; fits well with known lactation physiology and highlights that genes in these categories can modulate lactose synthesis and milk osmolarity [\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e].\u003c/p\u003e\u003cp\u003eSome genes may affect LP by directly affecting lactose production. \u003cem\u003eSLC50A1\u003c/em\u003e (SWEET1) is a Golgi-localized sugar transporter that exports glucose (direct precursor for lactose synthesis) into the Golgi lumen [\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e]. Experimental studies support \u003cem\u003eSLC50A1\u003c/em\u003e providing glucose for lactose production in mammary cells [\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e, \u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e]. Thus, variation in \u003cem\u003eSLC50A1\u003c/em\u003e could directly affect lactose synthesis by altering sugar supply. 
More broadly, enrichment of \u0026ldquo;SLC-mediated transmembrane transport\u0026rdquo; (Reactome R-HSA-425407) in our data \u0026ndash; which includes \u003cem\u003eSLC50A1\u003c/em\u003e, \u003cem\u003eSLC39A8\u003c/em\u003e, and \u003cem\u003eSLC34A2\u003c/em\u003e \u0026ndash; points to solute carrier proteins as key players. Other solute carriers in our gene list fit this picture. \u003cem\u003eSLC39A8\u003c/em\u003e (ZIP8) is a zinc/manganese importer; it moves Mn\u003csup\u003e2+\u003c/sup\u003e and Zn\u003csup\u003e2+\u003c/sup\u003e into cells [\u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e]. Manganese is an essential cofactor for many glycosyltransferases, including the β4-galactosyltransferase I subunit of the lactose synthase, and is required for the function of various Golgi enzymes [\u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e42\u003c/span\u003e]. \u003cem\u003eSLC34A2\u003c/em\u003e (also known as \u003cem\u003eNPT2b\u003c/em\u003e) is a sodium-dependent phosphate transporter highly expressed in lactating mammary epithelium [\u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e43\u003c/span\u003e]. While its precise role in milk synthesis was not directly tested, its known function in phosphate uptake suggests a potential role in supporting ATP and nucleotide sugar synthesis during lactose production. These transporters illustrate how the enriched GO terms reflect lactation biology: they supply essential substrates (glucose, phosphate) and cofactors (Mn, Zn, Ca) for lactose synthesis, while maintaining ion gradients and water balance critical for milk secretion.\u003c/p\u003e\u003cp\u003eThe Reactome pathway \u0026ldquo;Signaling by PDGF (Platelet-derived growth factor)\u0026rdquo; involves \u003cem\u003eSTAT5B\u003c/em\u003e, \u003cem\u003eSTAT3\u003c/em\u003e, and \u003cem\u003eTHBS3\u003c/em\u003e. 
\u003cem\u003eSTAT5\u003c/em\u003e (both \u003cem\u003eSTAT5A\u003c/em\u003e and \u003cem\u003eSTAT5B\u003c/em\u003e) is a key transcription factor for lactation: activated by prolactin via Janus kinase 2 (JAK2), it drives the expression of milk proteins and enzymes, and epithelial cell differentiation during pregnancy [\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e]. Indeed, genetic knockout of \u003cem\u003eSTAT5A\u003c/em\u003e in mice causes failure of alveolar differentiation and lactogenesis [\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e] and experiments note \u003cem\u003eSTAT5\u003c/em\u003e as \u0026ldquo;a primary transcription factor for milk production\u0026rdquo; [\u003cspan citationid=\"CR45\" class=\"CitationRef\"\u003e45\u003c/span\u003e]. PDGF promotes stromal/epithelial proliferation and survival. Thus, enrichment of this pathway suggests that growth factor‐STAT signaling networks influence mammary development, which could in turn affect the milk LP phenotype. For example, greater \u003cem\u003eSTAT5\u003c/em\u003e signaling would boost lactose synthesis (via more alveolar cells and lactose enzymes), whereas \u003cem\u003eSTAT3\u003c/em\u003e activity would reduce lactose output. In sum, the \u0026ldquo;Signaling by PDGF\u0026rdquo; term likely flags a set of regulatory genes (STAT5B/STAT3) that govern cell differentiation, survival, and secretory activity in the udder. Changes in these signals would alter mammary function (and indirectly lactose percentage) even though they are not lactose‐specific per se.\u003c/p\u003e\u003cp\u003eIn conclusion, we demonstrated the utility of integrating multi-omics data with functional enrichment to identify genes likely regulated by QTL associated with milk LP. We used LP as a test case to assess whether combining gene expression, GWAS, and functional enrichment could pinpoint causal genes for complex traits. 
To mitigate the confounding effects of LD, we leveraged functional enrichment analyses, which highlighted biologically coherent associations. Our results show that this strategy successfully identified 20 genes with clear mechanistic links to LP, acting primarily through indirect regulatory pathways. Future experimental validations will be needed to confirm these findings.\u003c/p\u003e"},{"header":"Methods","content":"\u003cp\u003eOverview\u003c/p\u003e\u003cp\u003eThis study used four independent datasets:\u003c/p\u003e\u003cp\u003e(1) ~\u0026thinsp;400 New Zealand (NZ) cows with mammary RNA-seq data [\u003cspan additionalcitationids=\"CR14\" citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e]. These data were used for gene-based association tests, specifically genetic score omics regression (GSOR), introduced by Xiang et al. [\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e].\u003c/p\u003e\u003cp\u003e(2) ~\u0026thinsp;400 Australian (AU) cows with white blood cell (WBC) RNA-seq data [\u003cspan additionalcitationids=\"CR17\" citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e]. These data were also used for gene-based association tests (GSOR).\u003c/p\u003e\u003cp\u003e(3) ~\u0026thinsp;81,000 AU multibreed cows with lactose percentage (LP) phenotypes and SNP genotypes (referred to as the GSOR reference population). These data were used to estimate local GEBVs for LP in the RNA-sequenced cows of NZ and AU.\u003c/p\u003e\u003cp\u003e(4) GWAS summary statistics for milk LP, based on 12,000 NZ cows and reported by Lopdell et al. 
[\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e] were used to test for co-localization between significant GSOR genes and GWAS loci.\u003c/p\u003e\u003cp\u003e\u003cstrong\u003ePhenotypic data for milk lactose percentage\u003c/strong\u003e\u003cp\u003ePhenotypic data for milk test-day LP were provided by DataGene (an independent industry-owned organisation that provides the national genetic evaluations for dairy cattle in Australia). Animals born between 2012 and 2024 that had Holstein, Jersey or Australian Red breed codes were retained. Outliers deviating by more than\u0026thinsp;\u0026plusmn;\u0026thinsp;3 SD from the mean phenotypic value of LP were excluded. Test-day records were included if a cow\u0026rsquo;s age at calving was between 18 and 25 months and days in milk (DIM) were between 5 and 315 days. Additionally, herds and test dates with fewer than five observations were excluded from the analysis. The final data contained 4,995,316 test-day records belonging to 477,822 cows. ASReml [\u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e46\u003c/span\u003e] was used to adjust phenotypes for fixed effects and average them for each cow (i.e. 
effect of Cow) following the model proposed by [\u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e47\u003c/span\u003e]\u003c/p\u003e\u003c/p\u003e\u003cp\u003e\u003cdiv id=\"Equa\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equa\" name=\"EquationSource\"\u003e\n$$\\:{\\text{y}}_{\\text{i}\\text{j}\\text{k}\\text{l}\\text{m}}={\\mu\\:}+{\\mathbf{H}}_{\\mathbf{i}}{\\mathbf{T}\\mathbf{D}}_{\\mathbf{j}}+\\:{\\mathbf{M}}_{\\mathbf{k}}+\\mathbf{p}\\mathbf{o}\\mathbf{l}\\left(\\mathbf{D}\\mathbf{I}\\mathbf{M},\\:8\\right)+\\mathbf{p}\\mathbf{o}\\mathbf{l}\\left(\\mathbf{A}\\mathbf{g}\\mathbf{e},\\:2\\right)+{\\mathbf{C}\\mathbf{o}\\mathbf{w}}_{\\mathbf{l}}+\\:{\\mathbf{e}}_{\\mathbf{i}\\mathbf{j}\\mathbf{k}\\mathbf{l}\\mathbf{m}}$$\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003ewhere \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{\\text{y}}_{\\text{i}\\text{j}\\text{k}\\text{l}\\text{m}}\\)\u003c/span\u003e\u003c/span\u003e is the test-day record for LP (N\u0026thinsp;=\u0026thinsp;4,995,316), \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{\\mu\\:}\\)\u003c/span\u003e\u003c/span\u003e is the effect of overall mean, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{\\text{H}}_{\\text{i}}{\\text{T}\\text{D}}_{\\text{j}}\\)\u003c/span\u003e\u003c/span\u003e is the effect of the \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{i}^{th}\\)\u003c/span\u003e\u003c/span\u003e herd and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{j}^{th}\\)\u003c/span\u003e\u003c/span\u003e test-day (N\u0026thinsp;=\u0026thinsp;82,058); \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{\\text{M}}_{\\text{k}}\\)\u003c/span\u003e\u003c/span\u003e is the effect of the \u003cspan class=\"InlineEquation\"\u003e\u003cspan 
class=\"mathinline\"\u003e\\(\\:{k}^{th}\\)\u003c/span\u003e\u003c/span\u003e calving month (N\u0026thinsp;=\u0026thinsp;12); \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{p}\\text{o}\\text{l}\\left(\\text{D}\\text{I}\\text{M},\\:8\\right)\\:\\)\u003c/span\u003e\u003c/span\u003eand \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{p}\\text{o}\\text{l}\\left(\\text{A}\\text{g}\\text{e},\\:2\\right)\\)\u003c/span\u003e\u003c/span\u003e are the regression coefficients of Legendre polynomials of order 1\u0026ndash;8 for DIM and of order 1\u0026ndash;2 for age at calving in months; \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{\\text{C}\\text{o}\\text{w}}_{\\text{l}}\\)\u003c/span\u003e\u003c/span\u003e and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{\\text{e}}_{\\text{i}\\text{j}\\text{k}\\text{l}\\text{m}}\\)\u003c/span\u003e\u003c/span\u003e are the random effects of the \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{l}^{th}\\)\u003c/span\u003e\u003c/span\u003e cow (N\u0026thinsp;=\u0026thinsp;477,822) and the random residual term, respectively.\u003c/p\u003e\u003cp\u003eThe following sections will explain our integrative multi-omics approach in detail. A schematic overview of the method is illustrated in Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003e\u003cstrong\u003eGenotypic data for the GSOR reference population\u003c/strong\u003e\u003cp\u003eOf the cows with phenotypic data described above, SNP panel genotypes were available for 81,658 cows, comprising 79% Holstein, 16% Jersey and 5% Australian Red. 
All SNP markers were mapped to the ARS-UCD1.2 reference genome [\u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e48\u003c/span\u003e], and included autosomal markers as well as those on the non-pseudo autosomal region of the X chromosome. Any raw genotypes with a GenCall score of \u0026lt;\u0026thinsp;0.6 were set to missing and any marker or animal with 10% or more missing genotypes was discarded. The remaining sporadic missing genotypes were imputed with FImpute v.3 software [\u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e49\u003c/span\u003e]. The genotypes were from a range of SNP panels (\u0026ge;\u0026thinsp;6,000 markers) and were imputed with FImpute v.3 to a custom 74K SNP genotype panel that is used by DataGene for national genetic evaluations [\u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e50\u003c/span\u003e]. The imputation reference population for the 74K SNP panel included over 28,000 animals (Holstein, Jersey and Australian Red breeds). Next, the 74K SNP genotypes were imputed to the Illumina High Density (HD) Bovine SNP panel that included 714,451 SNP in an imputation reference population of 2,910 animals (breeds as for the 74K panel). Prior to HD imputation, approximately 20,000 SNP in the custom 74K set that did not overlap the HD set were removed and then added back in before the final imputation to whole genome sequences (WGS). The sequenced imputation reference population included 5,036 \u003cem\u003eBos taurus\u003c/em\u003e cattle from Run9 of the 1000 Bull Genomes project [\u003cspan citationid=\"CR51\" class=\"CitationRef\"\u003e51\u003c/span\u003e]. Following Nguyen et al. 
[\u003cspan citationid=\"CR52\" class=\"CitationRef\"\u003e52\u003c/span\u003e], sequence variants were pre-filtered (49,114,602 variants remaining) and phased with Eagle v2 [\u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e53\u003c/span\u003e] before using Beagle v5.2.1 [\u003cspan citationid=\"CR54\" class=\"CitationRef\"\u003e54\u003c/span\u003e] to impute all animals to WGS.\u003c/p\u003e\u003c/p\u003e\u003cp\u003ePost-imputation, sequence variants with a Beagle DR2 (estimated imputation accuracy)\u0026thinsp;\u0026lt;\u0026thinsp;0.9 were excluded, as well as those with minor allele frequency (MAF)\u0026thinsp;\u0026lt;\u0026thinsp;0.01 and genotype frequencies deviating from Hardy-Weinberg equilibrium (\u003cem\u003eP\u003c/em\u003e\u0026thinsp;\u0026le;\u0026thinsp;1\u0026times;10\u003csup\u003e\u0026minus;\u0026thinsp;8\u003c/sup\u003e). LD pruning was performed using PLINK v1.9 [\u003cspan citationid=\"CR55\" class=\"CitationRef\"\u003e55\u003c/span\u003e] with parameters --indep-pairwise 5000 500 0.95 to exclude variants that were in high LD (r2\u0026thinsp;\u0026gt;\u0026thinsp;0.95). These procedures retained 1,181,628 variants for subsequent analyses. We used this data to train a BayesR model [\u003cspan citationid=\"CR56\" class=\"CitationRef\"\u003e56\u003c/span\u003e] using BayesR3 software [\u003cspan citationid=\"CR57\" class=\"CitationRef\"\u003e57\u003c/span\u003e] to estimate prediction equations (SNP effects) for LP. 
The model was as follows: \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\mathbf{y}=\\mathbf{X}\\mathbf{u}+\\mathbf{V}\\mathbf{g}+\\mathbf{e}\\)\u003c/span\u003e\u003c/span\u003e, where \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\mathbf{y}\\)\u003c/span\u003e\u003c/span\u003e is an \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{n}\\times\\:1\\)\u003c/span\u003e\u003c/span\u003e vector of phenotypic records, in which \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{n}\\)\u003c/span\u003e\u003c/span\u003e is the number of animals in the reference population (N=81,658); \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\mathbf{X}\\)\u003c/span\u003e\u003c/span\u003e is an \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{n}\\times\\:\\text{m}\\)\u003c/span\u003e\u003c/span\u003e incidence matrix, \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\mathbf{u}\\)\u003c/span\u003e\u003c/span\u003e is \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{m}\\)\u003c/span\u003e\u003c/span\u003e \u0026times; 1 vector of fixed effects and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:m\\)\u003c/span\u003e\u003c/span\u003e corresponds to fixed effects including breed effect with three levels; \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\mathbf{V}\\)\u003c/span\u003e\u003c/span\u003e is the coded genotype, representing the observed genotypes of each individual; \u003cb\u003eg\u003c/b\u003e is a vector of SNP effects; and \u003cb\u003ee\u003c/b\u003e is the residual term. BayesR3 was run with 50,000 MCMC iterations and 25,000 burn-in. 
In the BayesR3 model, the SNP effects follow a mixture of four normal distributions with zero mean and additive genetic variances of zero, 0.0001, 0.001, and 0.01 times the genetic variance. Starting values for proportions of the four SNP effect distributions were defined as 0.994, 0.0055, 0.00049, and 0.00001, respectively. Prediction equations were applied to the RNA-sequenced animals to calculate their local GEBVs for LP (described below).\u003c/p\u003e\u003cp\u003e\u003cstrong\u003eRNA-seq data and gene-based associations test (GSOR)\u003c/strong\u003e\u003cp\u003eWe analysed two distinct sets of RNA sequencing data from WBC and mammary tissue.\u003c/p\u003e\u003c/p\u003e\u003cp\u003eThe WBC gene expression data were obtained from 313 lactating cows of multiple breeds from the Agriculture Victoria research farm (Ellinbank Smart Farm). Details of sample processing, RNA extraction, library preparation, and sequencing are provided in [\u003cspan additionalcitationids=\"CR17\" citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e]. For the WBC RNA-seq animals, genotypes were imputed to WGS using Run9 of the 1000 Bull Genomes project [\u003cspan citationid=\"CR51\" class=\"CitationRef\"\u003e51\u003c/span\u003e] as described above for the GSOR reference population.\u003c/p\u003e\u003cp\u003eThe mammary gene expression data comprise 386 lactating NZ cows, including Holstein, Jersey and their crosses. The cows in this dataset were previously imputed to WGS using 1,298 imputation reference animals, including 306 Holstein-Friesian, 219 Jersey, 717 crossbreds (Holstein-Friesian x Jersey) and 56 other breeds as described in [\u003cspan citationid=\"CR58\" class=\"CitationRef\"\u003e58\u003c/span\u003e]. 
For the WBC and mammary RNA-sequenced animals, we retained the same 1,181,628 variants that were retained in the GSOR reference population.\u003c/p\u003e\u003cp\u003e Prior to fitting per-gene GSOR models, we calculated local GEBVs for RNA-sequenced animals using the estimated effects for variants located within a\u0026thinsp;\u0026plusmn;\u0026thinsp;1 Mb window centred on the transcription start site of the gene being tested. These local GEBVs served as response variables in the GSOR analysis, in which gene expression levels were tested as predictors to identify genes whose expression is associated with genetically driven, \u003cem\u003ecis\u003c/em\u003e-regulatory variation in the trait (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e). The following per-gene GSOR model was applied for each RNA-seq dataset [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]:\u003cdiv id=\"Equb\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equb\" name=\"EquationSource\"\u003e\n$$\\:{\\widehat{\\mathbf{G}\\mathbf{E}\\mathbf{B}\\mathbf{V}}}_{\\mathbf{l}\\mathbf{o}\\mathbf{c}\\mathbf{a}\\mathbf{l}}={\\mathbf{b}}_{1}\\varvec{\\Omega\\:}+{\\mathbf{b}}_{2}\\mathbf{x}+\\mathbf{g}+\\:\\mathbf{e}$$\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003ewhere \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{\\widehat{\\mathbf{G}\\mathbf{E}\\mathbf{B}\\mathbf{V}}}_{\\mathbf{l}\\mathbf{o}\\mathbf{c}\\mathbf{a}\\mathbf{l}}\\)\u003c/span\u003e\u003c/span\u003e is an \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{m}\\times\\:1\\)\u003c/span\u003e\u003c/span\u003e vector of local GEBVs predicted (in the RNA-sequenced cows) using the SNP effects from a\u0026thinsp;\u0026plusmn;\u0026thinsp;1 Mb window around the gene being tested; \u003cspan class=\"InlineEquation\"\u003e\u003cspan 
class=\"mathinline\"\u003e\\(\\:\\varvec{\\Omega\\:}\\)\u003c/span\u003e\u003c/span\u003e is a \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\text{m}\\times\\:1\\)\u003c/span\u003e\u003c/span\u003e vector of tissue-specific expression of the gene across the corresponding RNA-sequenced cows; \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{\\mathbf{b}}_{1}\\)\u003c/span\u003e\u003c/span\u003e is the regression coefficient of the \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{\\widehat{\\mathbf{G}\\mathbf{E}\\mathbf{B}\\mathbf{V}}}_{\\mathbf{l}\\mathbf{o}\\mathbf{c}\\mathbf{a}\\mathbf{l}}\\)\u003c/span\u003e\u003c/span\u003e on \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\varvec{\\Omega\\:}\\)\u003c/span\u003e\u003c/span\u003e; x represents a design matrix for fixed effects (see next paragraph), and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{\\text{b}}_{2}\\)\u003c/span\u003e\u003c/span\u003e is the vector of fixed effects for the corresponding RNA-sequenced animals; \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\mathbf{g}\\)\u003c/span\u003e\u003c/span\u003e is a vector of random polygenic effects across the same RNA-sequenced cows, assumed to follow a normal distribution \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\mathbf{g}\\sim\\text{N}(0,\\:\\mathbf{G}{{\\sigma\\:}}_{\\text{g}}^{2})\\)\u003c/span\u003e\u003c/span\u003e, where \u003cb\u003eG\u003c/b\u003e is the genomic relationship matrix [\u003cspan citationid=\"CR59\" class=\"CitationRef\"\u003e59\u003c/span\u003e], and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{{\\sigma\\:}}_{\\text{g}}^{2}\\)\u003c/span\u003e\u003c/span\u003e is the additive genetic variance explained by the whole genome SNPs; 
\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\mathbf{e}\\)\u003c/span\u003e\u003c/span\u003e is the vector of residuals, assumed to follow a normal distribution \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:\\mathbf{e}\\sim\\text{N}\\left(0,\\:\\mathbf{I}{{\\sigma\\:}}_{\\text{e}}^{2}\\right)\\)\u003c/span\u003e\u003c/span\u003e, where \u003cb\u003eI\u003c/b\u003e is an identity matrix, and \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{{\\sigma\\:}}_{\\text{e}}^{2}\\)\u003c/span\u003e\u003c/span\u003e is the residual variance.\u003c/p\u003e\u003cp\u003eModels for the WBC RNA-seq dataset included the experiment as a categorical fixed effect with five levels corresponding to sampling times, and days in milk (DIM; mean\u0026thinsp;\u0026plusmn;\u0026thinsp;SD of 86\u0026thinsp;\u0026plusmn;\u0026thinsp;36 days) as a quantitative fixed effect. In contrast, no fixed effects were necessary for the mammary RNA-seq dataset [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. After applying the per-gene GSOR model to all genes within each RNA-seq dataset, the \u003cem\u003eP\u003c/em\u003e values were adjusted for multiple testing. Within each dataset, genes with an FDR\u003csub\u003eGSOR\u003c/sub\u003e \u0026le; 0.1 were regarded as significant (GSOR-identified genes).\u003c/p\u003e\u003cp\u003e\u003cb\u003eGWAS summary statistics\u003c/b\u003e: The original SNP coordinates were based on the UMD3.1 bovine reference genome. 
To ensure consistency with our datasets, we converted these coordinates to the ARS-UCD1.2 [\u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e48\u003c/span\u003e] reference genome using the UCSC LiftOver tool [\u003cspan citationid=\"CR60\" class=\"CitationRef\"\u003e60\u003c/span\u003e] (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://genome.ucsc.edu/cgi-bin/hgLiftOver\u003c/span\u003e\u003cspan address=\"https://genome.ucsc.edu/cgi-bin/hgLiftOver\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e), leaving 1,088,337 SNPs for downstream analyses. We regarded SNPs with \u003cem\u003eP\u003c/em\u003e\u0026thinsp;\u0026le;\u0026thinsp;1\u0026times;10\u003csup\u003e\u0026minus;\u0026thinsp;8\u003c/sup\u003e as genome-wide significant (GWAS loci).\u003c/p\u003e\u003cp\u003e\u003cb\u003eWindow-based co-localization between GSOR-identified genes and GWAS loci\u003c/b\u003e: We partitioned the genome into non-overlapping windows of 100 kb and 500 kb in separate analyses, to test whether GSOR-identified genes and GWAS loci co-occur in the same genomic regions more often than expected by chance. For each window size, we classified windows based on the presence or absence of GSOR-identified genes and GWAS loci into four categories, including windows containing: (1) both GSOR-identified genes and GWAS loci; (2) only GSOR-identified gene(s); (3) only GWAS loci; and (4) neither. A GSOR-identified gene was assigned to a window if its transcription start site fell within that window. 
We then applied Fisher\u0026rsquo;s exact test to assess the statistical significance of the co-localization, considering \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{P}_{\\text{F}\\text{i}\\text{s}\\text{h}\\text{e}\\text{r}}\\le\\:\\:0.05\\:\\)\u003c/span\u003e\u003c/span\u003e as significant.\u003c/p\u003e\u003cp\u003e\u003cb\u003eGene list enrichment analysis\u003c/b\u003e: Candidate genes were defined as GSOR-identified gene(s) located within non-overlapping genomic windows that contained at least one GWAS locus. We used the R (v4.4.3) package gprofiler2 (v0.2.3) [\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e] to convert bovine genes to their human (\u003cem\u003eHomo sapiens\u003c/em\u003e) orthologs and to perform gene list enrichment analyses. We examined overrepresented Gene Ontology (GO) Biological Process (GO:BP) terms and Reactome pathways among the candidate genes, and terms with \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:{\\text{F}\\text{D}\\text{R}}_{\\text{t}\\text{e}\\text{r}\\text{m}}\\le\\:0.05\\)\u003c/span\u003e\u003c/span\u003e were regarded as significant. All genes in the corresponding RNA-seq dataset, converted to their human (\u003cem\u003eHomo sapiens\u003c/em\u003e) orthologs, served as the background set.\u003c/p\u003e\u003cp\u003eWe hypothesized that the strongest candidate causal genes are those that (i) show tissue-specific expression correlated with local GEBVs (GSOR-identified genes), (ii) co-localize with GWAS loci in the window-based test, and (iii) are enriched for trait-relevant physiological functions.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eSchaid DJ, Chen W, Larson NB. From genome-wide associations to candidate causal variants by statistical fine-mapping. Nat Rev Genet. 
2018;19(8):491\u0026ndash;504.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMaurano MT, Humbert R, Rynes E, Thurman RE, Haugen E, Wang H, Reynolds AP, Sandstrom R, Qu H, Brody J, et al. Systematic Localization of Common Disease-Associated Variation in Regulatory DNA. Science. 2012;337(6099):1190\u0026ndash;5.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eQi T, Song L, Guo Y, Chen C, Yang J. From genetic associations to genes: methods, applications, and challenges. Trends Genet. 2024;40(8):642\u0026ndash;67.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBush WS, Oetjens MT, Crawford DC. Unravelling the human genome\u0026ndash;phenome relationship using phenome-wide association studies. Nat Rev Genet. 2016;17(3):129\u0026ndash;45.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003evan Rheenen W, Peyrot WJ, Schork AJ, Lee SH, Wray NR. Genetic correlations of polygenic disease traits: from theory to practice. Nat Rev Genet. 2019;20(10):567\u0026ndash;81.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGhoreishifar M, Macleod IM, Chamberlain AJ, Liu Z, Lopdell TJ, Littlejohn MD, Xiang R, Pryce JE, Goddard ME. An integrative approach to prioritize candidate causal genes for complex traits in cattle. PLoS Genet. 2025;21(5):e1011492.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLi Z, Zhou X. Towards improved fine-mapping of candidate causal variants. Nat Rev Genet 2025.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBrodie A, Azaria JR, Ofran Y. How far from the SNP may the causative genes be? Nucleic Acids Res. 2016;44(13):6046\u0026ndash;54.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eConsortium TGO. The Gene Ontology resource: enriching a GOld mine. Nucleic Acids Res. 2020;49(D1):D325\u0026ndash;34.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSadovnikova A, Garcia SC, Hovey RC. 
A Comparative Review of the Cell Biology, Biochemistry, and Genetics of Lactose Synthesis. J Mammary Gland Biol Neoplasia. 2021;26(2):181\u0026ndash;96.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eStrucken EM, Laurenson YC, Brockmann GA. Go with the flow-biology and genetics of the lactation cycle. Front Genet. 2015;6:118.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eXiang R, Fang L, Liu S, Liu GE, Tenesa A, Gao Y, Mason BA, Chamberlain AJ, Goddard ME, Consortium C. Genetic score omics regression and multi-trait meta-analysis detect widespread cis-regulatory effects shaping bovine complex traits. PNAS Nexus 2025.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLopdell TJ, Tiplady K, Struchalin M, Johnson TJJ, Keehan M, Sherlock R, Couldrey C, Davis SR, Snell RG, Spelman RJ, et al. DNA and RNA-sequence based GWAS highlights membrane-transport genes as key modulators of milk lactose content. BMC Genomics. 2017;18(1):968.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLittlejohn MD, Tiplady K, Fink TA, Lehnert K, Lopdell T, Johnson T, Couldrey C, Keehan M, Sherlock RG, Harland C, et al. Sequence-based Association Analysis Reveals an MGST1 eQTL with Pleiotropic Effects on Bovine Milk Composition. Sci Rep. 2016;6(1):25376.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eProwse-Wilkins CP, Lopdell TJ, Xiang R, Vander Jagt CJ, Littlejohn MD, Chamberlain AJ, Goddard ME. Genetic variation in histone modifications and gene expression identifies regulatory variants in the mammary gland of cattle. BMC Genomics. 2022;23(1):815.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eChamberlain A, Hayes B, Xiang R, Vander Jagt C, Reich C, Macleod I, Prowse-Wilkins C, Mason B, Daetwyler H, Goddard M. Identification of regulatory variation in dairy cattle with RNA sequence data. 
In: \u003cem\u003eProceedings of the 11th World Congress on Genetics Applied to Livestock Production\u003c/em\u003e: 2018; 2018: 11\u0026ndash;16.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eXiang R, Fang L, Liu S, Macleod IM, Liu Z, Breen EJ, Gao Y, Liu GE, Tenesa A, Mason BA et al. Gene expression and RNA splicing explain large proportions of the heritability for complex traits in cattle. Cell Genomics 2023, 3(10).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eXiang R, Hayes BJ, Vander Jagt CJ, MacLeod IM, Khansefid M, Bowman PJ, Yuan Z, Prowse-Wilkins CP, Reich CM, Mason BA, et al. Genome variants associated with RNA splicing variations in bovine are extensively shared between tissues. BMC Genomics. 2018;19(1):521.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKolberg L, Raudvere U, Kuzmin I, Vilo J, Peterson H. gprofiler2\u0026ndash;an R package for gene list functional enrichment analysis and namespace conversion toolset g: Profiler. \u003cem\u003eF1000Research\u003c/em\u003e 2020, 9:ELIXIR-709.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWainberg M, Sinnott-Armstrong N, Mancuso N, Barbeira AN, Knowles DA, Golan D, Ermel R, Ruusalepp A, Quertermous T, Hao K. Opportunities and challenges for transcriptome-wide association studies. Nat Genet. 2019;51(4):592\u0026ndash;9.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGamazon ER, Wheeler HE, Shah KP, Mozaffari SV, Aquino-Michaels K, Carroll RJ, Eyler AE, Denny JC, Nicolae DL, Cox NJ, et al. A gene-based association method for mapping traits using reference transcriptome data. Nat Genet. 2015;47(9):1091\u0026ndash;8.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGhoreishifar M, Chamberlain AJ, Xiang R, Prowse-Wilkins CP, Lopdell TJ, Littlejohn MD, Pryce JE, Goddard ME. Allele-specific binding variants causing ChIP-seq peak height of histone modification are not enriched in expression QTL annotations. Genet Selection Evol. 
2024;56(1):50.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWard LD, Kellis M. Interpreting noncoding genetic variation in complex traits and human disease. Nat Biotechnol. 2012;30(11):1095\u0026ndash;106.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSpitz F. Gene regulation at a distance: From remote enhancers to 3D regulatory ensembles. Semin Cell Dev Biol. 2016;57:57\u0026ndash;67.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMancuso N, Shi H, Goddard P, Kichaev G, Gusev A, Pasaniuc B. Integrating Gene Expression with Summary Association Statistics to Identify Genes Associated with 30 Complex Traits. Am J Hum Genet. 2017;100(3):473\u0026ndash;87.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eCosta A, Egger-Danner C, M\u0026eacute;sz\u0026aacute;ros G, Fuerst C, Penasa M, S\u0026ouml;lkner J, Fuerst-Waltl B. Genetic associations of lactose and its ratios to other milk solids with health traits in Austrian Fleckvieh cows. J Dairy Sci. 2019;102(5):4238\u0026ndash;48.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHolt C. Interrelationships of the concentrations of some ionic constituents of human milk and comparison with cow and goat milks. Comp Biochem Physiol Part A: Physiol. 1993;104(1):35\u0026ndash;41.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWack RP, Lien EL, Taft D, Roscelli JD. Electrolyte composition of human breast milk beyond the early postpartum period. Nutrition. 1997;13(9):774\u0026ndash;7.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKubo Y, Adelman JP, Clapham DE, Jan LY, Karschin A, Kurachi Y, Lazdunski M, Nichols CG, Seino S, Vandenberg CA. International Union of Pharmacology. LIV. Nomenclature and molecular relationships of inwardly rectifying potassium channels. Pharmacol Rev. 2005;57(4):509\u0026ndash;26.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKamikawa A, Ishikawa T. Functional expression of a Kir2. 
1-like inwardly rectifying potassium channel in mouse mammary secretory cells. Am J Physiology-Cell Physiol. 2014;306(3):C230\u0026ndash;40.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBarry J, Rowland S. Variations in the ionic and lactose concentrations of milk. Biochem J. 1953;54(4):575.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eTiplady KM, Lopdell TJ, Reynolds E, Sherlock RG, Keehan M, Johnson TJJ, Pryce JE, Davis SR, Spelman RJ, Harris BL, et al. Sequence-based genome-wide association study of individual milk mid-infrared wavenumbers in mixed-breed dairy cattle. Genet Selection Evol. 2021;53(1):62.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePedrosa VB, Schenkel FS, Chen S-Y, Oliveira HR, Casey TM, Melka MG, Brito LF. Genomewide Association Analyses of Lactation Persistency and Milk Production Traits in Holstein Cattle Based on Imputed Whole-Genome Sequence Data. Genes. 2021;12(11):1830.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePausch H, Emmerling R, Gredler-Grandl B, Fries R, Daetwyler HD, Goddard ME. Meta-analysis of sequence-based association studies across three cattle breeds reveals 25 QTL for fat and protein percentages in milk at nucleotide resolution. BMC Genomics. 2017;18(1):853.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eVoss FK, Ullrich F, M\u0026uuml;nch J, Lazarow K, Lutter D, Mah N, Andrade-Navarro MA, von Kries JP, Stauber T, Jentsch TJ. Identification of LRRC8 Heteromers as an Essential Component of the Volume-Regulated Anion Channel VRAC. Science. 2014;344(6184):634\u0026ndash;8.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSyeda R, Qiu Z, Dubin AE, Murthy SE, Florendo MN, Mason DE, Mathur J, Cahalan SM, Peters EC, Montal M, et al. LRRC8 Proteins Form Volume-Regulated Anion Channels that Sense Ionic Strength. Cell. 
2016;164(3):499\u0026ndash;511.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eVarela D, Penna A, Simon F, Eguiguren AL, Leiva-Salcedo E, Cerda O, Sala F, Stutzin A. P2X4 Activation Modulates Volume-sensitive Outwardly Rectifying Chloride Channels in Rat Hepatoma Cells *. J Biol Chem. 2010;285(10):7566\u0026ndash;74.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSneddon N, Lopez-Villalobos N, Davis S, Hickson R, Shalloo L. Genetic parameters for milk components including lactose from test day records in the New Zealand dairy herd. New Z J Agricultural Res. 2015;58(2):97\u0026ndash;107.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWang G, Jin W, Zhang L, Dong M, Zhang X, Zhou Z, Wang X. SLC50A1 inhibits the doxorubicin sensitivity in hepatocellular carcinoma cells through regulating the tumor glycolysis. Cell Death Discovery. 2024;10(1):495.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eChen L-Q, Hou B-H, Lalonde S, Takanaga H, Hartung ML, Qu X-Q, Guo W-J, Kim J-G, Underwood W, Chaudhuri B, et al. Sugar transporters for intercellular exchange and nutrition of pathogens. Nature. 2010;468(7323):527\u0026ndash;32.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eNebert DW, Liu Z. SLC39A8 gene encoding a metal ion transporter: discovery and bench to bedside. Hum Genomics. 2019;13(Suppl 1):51.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGuo M. 2 - Chemical composition of human milk. In: \u003cem\u003eHuman Milk Biochemistry and Infant Formula Manufacturing Technology.\u003c/em\u003e Edited by Guo M: Woodhead Publishing; 2014: 19\u0026ndash;32.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWang X, Zhang B, Dong W, Zhao Y, Zhao X, Zhang Y, Zhang Q. SLC34A2 Targets in Calcium/Phosphorus Homeostasis of Mammary Gland and Involvement in Development of Clinical Mastitis in Dairy Cows. 
Anim (Basel) 2024, 14(9).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLiu X, Robinson GW, Wagner KU, Garrett L, Wynshaw-Boris A, Hennighausen L. Stat5a is mandatory for adult mammary gland development and lactogenesis. Genes Dev. 1997;11(2):179\u0026ndash;86.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKobayashi K, Wakasa H, Han L, Koyama T, Tsugami Y, Nishimura T. Lactose on the basolateral side of mammary epithelial cells inhibits milk production concomitantly with signal transducer and activator of transcription 5 inactivation. Cell Tissue Res. 2022;389(3):501\u0026ndash;15.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGilmour A, Gogel B, Cullis B, Welham S, R T. ASReml user guide release 4.2 structural specification. Hemel Hempstead, UK. In.;: VSN International; 2022.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKhansefid M, Pryce JE, Shahinfar S, Axford M, Goddard ME, Haile-Mariam M. Improving accuracy and stability of genetic predictions for dairy cow survival. Anim Prod Sci. 2023;63(11):1031\u0026ndash;42.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRosen BD, Bickhart DM, Schnabel RD, Koren S, Elsik CG, Tseng E, Rowan TN, Low WY, Zimin A, Couldrey C et al. De novo assembly of the cattle reference genome with single-molecule sequencing. GigaScience 2020, 9(3).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSargolzaei M, Chesnais JP, Schenkel FS. A new approach for efficient genotype imputation using information from relatives. BMC Genomics. 2014;15(1):478.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003evan den Berg I, Nguyen TV, Nguyen TTT, Pryce JE, Nieuwhof GJ, MacLeod IM. Imputation accuracy and carrier frequency of deleterious recessive defects in Australian dairy cattle. J Dairy Sci. 
2024;107(11):9591\u0026ndash;601.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDaetwyler HD, Capitan A, Pausch H, Stothard P, van Binsbergen R, Br\u0026oslash;ndum RF, Liao X, Djari A, Rodriguez SC, Grohs C, et al. Whole-genome sequencing of 234 bulls facilitates mapping of monogenic and complex traits in cattle. Nat Genet. 2014;46(8):858\u0026ndash;65.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eNguyen TV, Bolormaa S, Reich CM, Chamberlain AJ, Vander Jagt CJ, Daetwyler HD, MacLeod IM. Empirical versus estimated accuracy of imputation: optimising filtering thresholds for sequence imputation. Genet Selection Evol. 2024;56(1):72.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLoh PR, Danecek P, Palamara PF, Fuchsberger C, Reshef YA, Finucane HK, Schoenherr S, Forer L, McCarthy S, Abecasis GR, et al. Reference-based phasing using the Haplotype Reference Consortium panel. Nat Genet. 2016;48(11):1443\u0026ndash;8.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBrowning BL, Browning SR. A unified approach to genotype imputation and haplotype-phase inference for large data sets of trios and unrelated individuals. Am J Hum Genet. 2009;84(2):210\u0026ndash;23.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eChang CC, Chow CC, Tellier LC, Vattikuti S, Purcell SM, Lee JJ. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience. 2015;4:7.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eErbe M, Hayes BJ, Matukumalli LK, Goswami S, Bowman PJ, Reich CM, Mason BA, Goddard ME. Improving accuracy of genomic predictions within and between dairy cattle breeds with imputed high-density single nucleotide polymorphism panels. J Dairy Sci. 2012;95(7):4114\u0026ndash;29.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBreen EJ, MacLeod IM, Ho PN, Haile-Mariam M, Pryce JE, Thomas CD, Daetwyler HD, Goddard ME. 
BayesR3 enables fast MCMC blocked processing for largescale multi-trait genomic prediction and QTN mapping analysis. Commun Biology. 2022;5(1):661.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eTrebes H, Wang Y, Reynolds E, Tiplady K, Harland C, Lopdell T, Johnson T, Davis S, Harris B, Spelman R, et al. Identification of candidate novel production variants on the Bos taurus chromosome X. J Dairy Sci. 2023;106(11):7799\u0026ndash;815.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eYang J, Benyamin B, McEvoy BP, Gordon S, Henders AK, Nyholt DR, Madden PA, Heath AC, Martin NG, Montgomery GW, et al. Common SNPs explain a large proportion of the heritability for human height. Nat Genet. 2010;42(7):565\u0026ndash;9.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHinrichs AS, Karolchik D, Baertsch R, Barber GP, Bejerano G, Clawson H, Diekhans M, Furey TS, Harte RA, Hsu F, et al. The UCSC Genome Browser Database: update 2006. Nucleic Acids Res. 2006;34(suppl1):D590\u0026ndash;8.\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eEthics approval and consent to participate\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAll animal experiments were conducted in strict accordance with the rules and guidelines outlined in the New Zealand Animal Welfare Act 1999. Most data were generated as part of a mammary tissue biopsy experiment, with all samples obtained in accordance with protocols approved by the Ruakura Animal Ethics Committee, Hamilton, New Zealand (approval AEC 12845). No animals were sacrificed for this study. 
The study is reported in accordance with ARRIVE guidelines.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConsent for publication\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAvailability of data and materials\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eDataGene Australia (\u003ca href=\"http://www.datagene.com.au/\" target=\"_blank\"\u003ehttp://www.datagene.com.au/\u003c/a\u003e) is the custodian of the raw phenotype and genotype data of Australian farm animals. Access to these data for research requires permission from DataGene under a Data Use Agreement. Other supporting data are provided in the Supplementary Materials of the manuscript. Code and tutorials for GSOR are available at \u003ca href=\"https://github.com/rxiangr/GSOR-and-MTAO\"\u003ehttps://github.com/rxiangr/GSOR-and-MTAO\u003c/a\u003e. All gene expression data were taken from previously published studies as detailed in the Methods section.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare no competing interests.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis study was undertaken as part of the DairyBio program, which is jointly funded by Dairy Australia (Melbourne, Australia), Agriculture Victoria (Melbourne, Australia), and The Gardiner Foundation (Melbourne, Australia). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthors\u0026apos; contributions\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eConceptualization: M. Goddard, M. Ghoreishifar\u003c/p\u003e\n\u003cp\u003eMethodology \u0026amp; Formal Analyses: M. Ghoreishifar\u003c/p\u003e\n\u003cp\u003eWriting Original Draft \u0026amp; Visualization: M. 
Ghoreishifar\u003c/p\u003e\n\u003cp\u003eWriting Reviewing Editing:\u0026nbsp;M. Ghoreishifar, I. Macleod, T. Nguyen, T. Lopdell, M. Littlejohn, R. Xiang, A. Chamberlain, J. Pryce, M. Goddard\u003c/p\u003e\n\u003cp\u003eSupervision: M. Goddard, J. Pryce, A. Chamberlain\u003c/p\u003e\n\u003cp\u003eFunding Acquisition: J. Pryce\u003c/p\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"bmc-genomics","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"gics","sideBox":"Learn more about [BMC Genomics](http://bmcgenomics.biomedcentral.com/)","snPcode":"","submissionUrl":"https://www.editorialmanager.com/gics","title":"BMC Genomics","twitterHandle":"#BMCGenomics","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"em","reportingPortfolio":"BMC Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"","lastPublishedDoi":"10.21203/rs.3.rs-7693421/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-7693421/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eGenome-wide association studies (GWASs) have identified thousands of loci for complex traits, but pinpointing causal variants and linking them to target genes remains challenging. Several strategies have been proposed to address these challenges, e.g., learning across the genome, using larger and multi-breed datasets, multi-trait analyses, leveraging multi-omics data, etc.\u003c/p\u003e\u003cp\u003eWe used a multi-breed dataset of over 81,000 cows from Australia, including Holstein, Jersey, and Australian Red, with phenotypes for milk lactose percentage (LP) and imputed sequence genotypes. LD pruning excluded SNPs with r2\u0026thinsp;\u0026gt;\u0026thinsp;0.95. We used BayesR to estimate SNP effects for LP (~\u0026thinsp;1.1\u0026nbsp;million SNPs remained after LD pruning); These SNP effects were used to predict local genomic breeding values (GEBVs) for ~\u0026thinsp;400 mammary RNA-sequenced cows from New Zealand. Then, genetic score omics regression (GSOR) was applied to test associations between observed gene expression and local GEBVs, identifying 711 significant genes (FDR\u0026thinsp;\u0026le;\u0026thinsp;0.1) out of 12,000 genes expressed in the mammary gland. 
We developed a window-based test to investigate the significance of colocalization between GSOR results and GWAS summary statistics obtained from an independent study. We found 30 windows containing both GWAS signals and GSOR-significant genes (i.e., 34 genes), the overlap which was significantly higher than chance expectation (\u003cem\u003eP\u003c/em\u003eFisher\u0026thinsp;=\u0026thinsp;2.96\u0026times;10⁻⁹). Among the 34 genes analyzed, 20 contributed to the significantly enriched gene ontology term \u0026lsquo;transmembrane transport\u0026rsquo; and its child terms (FDR\u0026thinsp;\u0026lt;\u0026thinsp;0.05). These terms are relevant to the physiology of lactose production in the mammary gland.\u003c/p\u003e\u003cp\u003eWe hypothesized that the 20 genes are the most likely causal genes for the trait because: mammary expression of these genes was associated with GEBV for the trait, they were significantly colocalized with GWAS signals, and they were enriched in gene ontology terms relevant to physiology of the trait. 
Our approach provides strong evidence for causal genes supported by multiple lines of evidence (GWAS, GSOR, and functional enrichment) and demonstrates the power of multi-trait \u0026amp; multi-omics data integration.\u003c/p\u003e","manuscriptTitle":"Bridging GWAS to genes: an integrative multi-omics approach using cattle data","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-10-14 20:49:48","doi":"10.21203/rs.3.rs-7693421/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"Revision requested","date":"2025-12-01T16:08:27+00:00","index":"","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-11-26T09:36:20+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"70622189246667559509946175268755342364","date":"2025-11-14T07:01:04+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-10-31T13:58:11+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"243308438807636258142825894698184171531","date":"2025-10-28T09:41:48+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"147216990986708004137532618513309311919","date":"2025-10-06T14:26:13+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2025-10-06T09:42:51+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2025-09-26T02:23:50+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2025-09-26T02:23:40+00:00","index":"","fulltext":""},{"type":"submitted","content":"BMC Genomics","date":"2025-09-23T10:46:09+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"bmc-genomics","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"gics","sideBox":"Learn more about [BMC Genomics](http://bmcgenomics.biomedcentral.com/)","snPcode":"","submissionUrl":"https://www.editorialmanager.com/gics","title":"BMC Genomics","twitterHandle":"#BMCGenomics","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"em","reportingPortfolio":"BMC Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"00fee536-42a1-44a7-961a-90e007b3e5c6","owner":[],"postedDate":"October 14th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"published-in-journal","subjectAreas":[],"tags":[],"updatedAt":"2026-01-19T16:45:54+00:00","versionOfRecord":{"articleIdentity":"rs-7693421","link":"https://doi.org/10.1186/s12864-026-12525-0","journal":{"identity":"bmc-genomics","isVorOnly":false,"title":"BMC Genomics"},"publishedOn":"2026-01-14 16:29:25","publishedOnDateReadable":"January 14th, 2026"},"versionCreatedAt":"2025-10-14 20:49:48","video":"","vorDoi":"10.1186/s12864-026-12525-0","vorDoiUrl":"https://doi.org/10.1186/s12864-026-12525-0","workflowStages":[]},"version":"v1","identity":"rs-7693421","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-7693421","identity":"rs-7693421","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
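The window-based co-localization test described in the Methods (windows classified by the presence or absence of GSOR-identified genes and GWAS loci, followed by Fisher's exact test) might be sketched as below. This is a minimal illustration, not the authors' code: the function names, the toy coordinates, and the choice of a one-sided (enrichment) test are all assumptions, since the text does not state the sidedness or implementation used.

```python
from math import comb


def classify_windows(chrom_lengths, gene_tss, gwas_snps, window_bp):
    """Count windows in the four categories used by the test.

    chrom_lengths: dict chromosome -> length in bp (hypothetical input format)
    gene_tss:      iterable of (chromosome, TSS position) for GSOR-identified genes
    gwas_snps:     iterable of (chromosome, position) for genome-wide significant SNPs
    Returns (both, gene_only, gwas_only, neither).
    """
    # A gene is assigned to the window that contains its transcription start site.
    gene_windows = {(c, pos // window_bp) for c, pos in gene_tss}
    gwas_windows = {(c, pos // window_bp) for c, pos in gwas_snps}
    # Non-overlapping windows per chromosome (ceiling division).
    total = sum(-(-length // window_bp) for length in chrom_lengths.values())
    both = len(gene_windows & gwas_windows)
    gene_only = len(gene_windows) - both
    gwas_only = len(gwas_windows) - both
    neither = total - both - gene_only - gwas_only
    return both, gene_only, gwas_only, neither


def fisher_exact_greater(both, gene_only, gwas_only, neither):
    """One-sided Fisher's exact P value for over-representation of 'both'.

    Assumption: a one-sided test is used here for illustration only.
    """
    n = both + gene_only + gwas_only + neither
    n_gene = both + gene_only   # windows containing a GSOR-identified gene
    n_gwas = both + gwas_only   # windows containing a GWAS locus
    denom = comb(n, n_gwas)
    upper = min(n_gene, n_gwas)
    tail = sum(
        comb(n_gene, k) * comb(n - n_gene, n_gwas - k)
        for k in range(both, upper + 1)
    )
    return tail / denom


# Toy example: one 1 Mb chromosome split into 100 kb windows.
chrom_lengths = {"1": 1_000_000}
gene_tss = [("1", 50_000), ("1", 250_000), ("1", 910_000)]
gwas_snps = [("1", 60_000), ("1", 70_000), ("1", 260_000), ("1", 550_000)]

counts = classify_windows(chrom_lengths, gene_tss, gwas_snps, 100_000)
p_value = fisher_exact_greater(*counts)
print(counts, p_value)
```

Counting windows rather than SNPs means that several significant SNPs in one window (for example, SNPs in strong linkage disequilibrium) contribute only once to the 2×2 table. Repeating the call with `window_bp=500_000` would mirror the second window size used in the paper.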