Integrative GWAS, transcriptomics, and genomic prediction identify loci and candidate genes for flowering time in alfalfa

Authorea preprint, 3 February 2026, V1. This is a preprint and has not been peer reviewed. Data may be preliminary.

Authors: Kai Zhu, Ming Xu, Tiejun Zhang, Xueqian Jiang, Yanchao Xu, Junmei Kang, Qingchuan Yang, Ruicai Long, and Fei He

https://doi.org/10.22541/au.177014924.43030651/v1

Abstract

Flowering time in alfalfa is a key agronomic trait that influences harvest scheduling and breeding strategies, with direct consequences for forage yield and quality. Despite its importance, the genetic basis underlying flowering time in alfalfa remains poorly understood. In this study, we evaluated flowering time under field conditions over three years using a diverse panel of 176 accessions and investigated its genetic architecture. Phenotypic analysis revealed substantial variation across years.
Genome resequencing of the panel yielded 2,043,025 high-quality single-nucleotide polymorphisms (SNPs), which were used for genome-wide association studies (GWAS). GWAS identified 42 significant SNPs and implicated 210 candidate genes in close proximity to these loci. Integration of GWAS with RNA-seq data prioritized 10 candidate genes, and their involvement in flowering regulation was further supported by RT‒qPCR. Haplotype analysis indicated that the number of haplotypes associated with delayed flowering was positively correlated with later flowering time. Genomic prediction (GP) using the top 5,000 SNPs from GWAS achieved accuracies ranging from 0.72 to 0.91 across traits. These findings provide testable targets for elucidating the regulatory mechanisms of flowering time in alfalfa and establish a foundation for molecular design breeding to optimize this trait.

Kai Zhu a,†, Ming Xu a,†, Tiejun Zhang a,†, Xueqian Jiang a, Yanchao Xu a, Junmei Kang a, Qingchuan Yang a, Ruicai Long a,*, Fei He a,*
a Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100193, China
* Correspondence: [email protected] (Fei He); [email protected] (Ruicai Long)
† These authors contributed equally to this work

Key words: Alfalfa; GWAS; Flowering time; GP

Introduction

Alfalfa (Medicago sativa L.) is one of the most important forage crops worldwide and is often referred to as the “Queen of Forages” because of its excellent palatability, digestibility, and nutritional value, typically containing more than 20% crude protein (He et al., 2019, Li & Brummer, 2012, Russelle, 2001). As an irreplaceable feed source for dairy cows, beef cattle, and other livestock, alfalfa provides essential support for animal husbandry and the dairy industry (Shen et al., 2020). In addition, alfalfa establishes symbiotic interactions with rhizobia, enabling biological nitrogen fixation and thereby contributing to sustainable agricultural systems (Wolabu et al., 2020). Given its agronomic and ecological importance, understanding the genetic control of key developmental traits in alfalfa is of both scientific and practical relevance. The alfalfa life cycle includes vegetative and reproductive phases, and flowering represents a key phenological transition between them (Park et al., 2014).
Flowering time directly affects biomass accumulation and harvest management: early flowering can divert assimilates toward reproductive development, reducing vegetative growth and lowering biomass yield, whereas delayed flowering may increase yield in a single harvest by extending vegetative growth but can reduce total annual yield due to fewer possible harvests within a season (Rajendran et al. , 2021). Because alfalfa regrows rapidly both before and after clipping, it is well suited to multiple harvests each year (Du et al. , 2021, Singer et al. , 2018), and harvesting at an appropriate developmental stage is critical for balancing yield and forage quality (Adhikari et al. , 2019, Lacefield, 2004). In practice, alfalfa is often harvested before seeds fully mature to achieve higher total digestible nutrients (TDN) (He et al. , 2022). Therefore, reliable information on flowering time can help producers optimize harvest timing and maximize economic returns (Arzani et al. , 2004). However, flowering time is a complex quantitative trait jointly shaped by genetic and environmental factors, and the molecular mechanisms controlling flowering time in alfalfa remain insufficiently understood (Koornneef et al. , 1998). The genetic basis of flowering time has been extensively investigated in model and crop species, providing a framework for dissecting this trait in alfalfa. In A. thaliana , flowering is regulated by multiple pathways, including photoperiod, autonomous, and gibberellin (GA)-mediated pathways (Reeves & Coupland, 2001). For example, the GA biosynthesis mutant ga1 shows little change in flowering time under long-day conditions but fails to flower under short days; exogenous GA can restore flowering, supporting a positive role for GA in flowering regulation through photoperiod-related signaling (Wilson et al. , 1992). 
Consistently, mutations in GA receptor genes (GIBBERELLIN-INSENSITIVE DWARF1, GID1) lead to delayed flowering, further implicating GA in floral transition (Willige et al., 2007). In rice (Oryza sativa L.), CONSTANS/Heading date genes and the florigen Hd3a constitute core components of the photoperiod pathway, highlighting both conserved and divergent photoperiodic mechanisms across species (Luccioni et al., 2019, Tamaki et al., 2007). Photoperiod and temperature are also major environmental cues influencing flowering in alfalfa; as a long-day plant, alfalfa typically requires a minimum day length for flowering induction, suggesting that genetic variation in photoperiod responsiveness is an important target for trait manipulation. Within the genus Medicago, several FLOWERING LOCUS T (FT) family members (e.g., FTa, FTb, and FTc) have been validated as key regulators of flowering time in Medicago truncatula (Hecht et al., 2005). Functional studies have shown that MtFT genes act as florigens and can rescue late-flowering mutants by promoting early flowering (Laurie et al., 2011). MtFTa1 appears to be a major florigen, as mtfta1 mutants display delayed flowering across tested conditions; notably, MsFTa1 is the closest homolog of MtFTa1 in alfalfa based on exon–intron structure and sequence similarity (Cheng et al., 2021), and several studies support MsFTa1 as an FT homolog involved in alfalfa flowering regulation (Kang et al., 2019, Lorenzo et al., 2020). In addition to FT homologs, other alfalfa genes such as MsLFY, SPL13, MsLEA-D34, and MsZFN have been cloned and functionally characterized through reverse genetic approaches, demonstrating their roles in flowering-time regulation (Chao et al., 2014, Gao et al., 2018, Lv et al., 2021, Zhang et al., 2013).
Together, these findings suggest that alfalfa flowering is governed by a complex regulatory network, but key loci and causal genes controlling natural variation in diverse germplasm remain to be elucidated. Genome-wide association studies (GWAS) enabled by high-density marker datasets have become a cost-effective and efficient approach to map quantitative trait loci (QTLs) and prioritize candidate genes for complex agronomic traits in crops, including alfalfa (Sakiroglu & Brummer, 2017, Salami et al. , 2024, Salami et al. , 2024). Using GWAS, numerous loci associated with traits such as salt tolerance, drought tolerance, disease resistance, plant architecture, and forage quality have been identified (He et al. , 2022, Jiang et al. , 2024, Lin et al. , 2020, Zhang et al. , 2015). For example, the soybean ( Glycine max L.) locus harboring Dt1 is strongly associated with days to flowering (Zhang et al. , 2015); FATB is associated with palmitic acid composition and lipid metabolism in maize ( Zea mays L.) kernels (Li et al. , 2013); and ERF-family transcription factors have been linked to resistance to northern leaf blight in maize (Poland et al. , 2011). A GWAS of 278 soybean accessions identified dozens of significant SNP peaks associated with flowering time and related developmental stages (Li et al. , 2019), and multiple SNP markers for flowering time have also been reported in canola ( Brassica napus L.) across diverse environments (Raman et al. , 2016). These studies collectively demonstrate the power of GWAS to dissect the genetic architecture of flowering time and motivate similar efforts in alfalfa. Despite increasing GWAS applications in alfalfa, most studies have focused on drought resistance (Medina et al. , 2025), salt tolerance (He et al. , 2023), and forage quality (Xu et al. , 2025a), whereas relatively few have examined the genetic basis and regulatory mechanisms underlying flowering time. 
Recent advances in genome assembly have produced high-quality reference genomes for several alfalfa varieties, including Zhongmu No.1, Xinjiangdaye, and Zhongmu-4 (Chen et al., 2020, Long et al., 2021, Shen et al., 2020), providing valuable resources for gene discovery and molecular breeding. In this study, we evaluated flowering time in 176 diverse alfalfa accessions representing different geographic origins and improvement statuses over three consecutive years. We then performed GWAS to identify SNPs associated with flowering time–related traits and integrated GWAS signals with RNA-seq analyses to prioritize candidate genes. Our findings provide key loci and candidate regulators of flowering time and offer molecular markers and insights to facilitate the genetic improvement of alfalfa.

Materials and methods

Plant materials and growth conditions

Detailed information on the origin and collection of the 176 germplasm accessions has been described in our previous study (He et al., 2025). Flowering-time phenotypes of the 176 accessions were recorded in Yinchuan City, Ningxia Hui Autonomous Region (Yongning County: 38.21° N, 106.22° E). The experimental field has a temperate continental climate. The accessions were established through asexual propagation to ensure genetic uniformity of the germplasm. The field trial followed a randomized block design with three replicates. Within each plot, five individual plants of the same accession were spaced 60 cm apart, while rows and columns between different accessions were spaced 100 cm apart. Weeding was performed manually once per month during June, July, and August to maintain plant health. No additional field management practices, such as fertilization or irrigation, were applied.
Phenotypic evaluation and statistical analysis

Flowering time (the date on which the first flower appeared) before the first harvest in 2023, 2024, and 2025 was recorded and converted to photothermal units (PTUs) following Grabowski et al. (2017):

\begin{equation}
\mathrm{PTU}=\sum_{i=1}^{n}\left(\frac{\mathrm{dayL}_{i}}{24}\times\left(\frac{\mathrm{minT}_{i}+\mathrm{maxT}_{i}}{2}-10\right)\right)\nonumber
\end{equation}

where minT_i and maxT_i are the minimum and maximum temperatures (°C) on day i, and dayL_i is the number of hours between sunrise and sunset on that day. The sum runs over the n days from the start of a period of 5 consecutive days during which the daily average temperature exceeded 10°C to the date of first flower appearance. Daily temperature and day-length data were obtained from the weather station of Yi Kang Nong (Beijing, China). The random-effects model followed Zhang et al. (2020). The frequency distribution, coefficient of variation (CV), broad-sense heritability (H^2), and analysis of variance (ANOVA) of the flowering-time traits across accessions were calculated using SPSS v17.0 following Saxena et al. (2014). Data from individual years were recorded as 2023 PTU, 2024 PTU, and 2025 PTU. Best linear unbiased predictions (BLUPs) were computed with the R (v4.4.3) package lme4, and broad-sense heritability was estimated as:

\begin{equation}
H^{2}=\frac{V_{g}}{V_{g}+\frac{V_{gy}}{Y}+\frac{V_{e}}{Y\times R}}\nonumber
\end{equation}

where V_g is the genetic variance, V_gy is the genotype × year interaction variance, and V_e is the residual variance, with R = 3 (number of replicates) and Y = 3 (number of years). Phenotypic data were statistically analyzed using R software (v4.4.3) to calculate means, standard deviations, and other descriptive statistics.
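The PTU and heritability formulas above can be checked numerically with a short Python sketch. It is an illustration only: the function names and the example numbers are hypothetical, and each daily term is applied exactly as written in the formula (no truncation of days whose mean temperature falls below 10°C), which is an assumption about how the authors handled such days.

```python
from typing import Sequence


def photothermal_units(
    day_length_h: Sequence[float],
    min_temp_c: Sequence[float],
    max_temp_c: Sequence[float],
    base_temp_c: float = 10.0,
) -> float:
    """Sum (dayL / 24) * ((minT + maxT) / 2 - base) over the counted days."""
    if not (len(day_length_h) == len(min_temp_c) == len(max_temp_c)):
        raise ValueError("all daily series must have the same length")
    total = 0.0
    for day_l, t_min, t_max in zip(day_length_h, min_temp_c, max_temp_c):
        # Daily mean temperature above the base, weighted by the day-length fraction.
        total += (day_l / 24.0) * ((t_min + t_max) / 2.0 - base_temp_c)
    return total


def broad_sense_heritability(
    v_g: float, v_gy: float, v_e: float, n_years: int = 3, n_reps: int = 3
) -> float:
    """H^2 = V_g / (V_g + V_gy / Y + V_e / (Y * R))."""
    return v_g / (v_g + v_gy / n_years + v_e / (n_years * n_reps))


# Hypothetical two-day example: 0.5 * (15 - 10) + 0.5 * (20 - 10) = 7.5
ptu = photothermal_units([12.0, 12.0], [10.0, 14.0], [20.0, 26.0])

# Hypothetical variance components: 6 / (6 + 3/3 + 9/9) = 0.75
h2 = broad_sense_heritability(v_g=6.0, v_gy=3.0, v_e=9.0)
```

With Y = 3 and R = 3, the interaction variance is divided by 3 and the residual variance by 9, so heritability on an entry-mean basis rises as more years and replicates are averaged.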
Differences in phenotypic traits among accessions from different geographical origins were evaluated using Student’s t-tests. Phenotypic correlations and cluster analyses were conducted and visualized using the online platform Chiplot (https://www.chiplot.online/, accessed on 16 November 2025). Correlations of flowering time between years were estimated as Pearson’s correlation coefficients and visualized with Origin (v2024).

Genotyping and genome-wide association study

Total genomic DNA was extracted using the CWBIO Plant Genomic DNA Kit (Beijing ComWin Biotech Co., Ltd., Beijing, China) according to the manufacturer’s instructions. Sequencing was carried out on the DNBSEQ platform at BGI-Shenzhen (Shenzhen, China), generating approximately 36 Gb of raw data per genotype. Paired-end reads were aligned to the ZM4 reference genome using BWA-MEM (version 0.7.17) (Li & Durbin, 2009). Approximately 29.6 million SNPs were identified using the SAMtools VarScan pipeline (Li et al., 2009). SNPs were filtered with vcftools (version 0.1.16) (Danecek et al., 2011) using the following criteria: missing rate ≤ 10%, mean read depth ≥ 5, and minor allele frequency (MAF) > 0.05. This yielded a final dataset of 2,043,025 high-quality SNPs. GWAS was performed with the FarmCPU model implemented in the R (v4.4.3) package rMVP (https://github.com/xiaolei-lab/rMVP). SNPs with a logarithm of odds (LOD) score ≥ 5.5 were considered significantly associated. Manhattan plots of the association results were drawn with the R package CMplot.

Analysis of favorable haplotypes by GWAS

Haplotype analysis was performed on key SNPs and their surrounding regions to evaluate their potential application in alfalfa breeding programs. Favorable haplotypes were identified with t-tests comparing phenotypic BLUP values among haplotypes.
In this study, favorable haplotypes were defined as those associated with higher BLUP values. The analytical procedure was carried out as follows: First, R (version 4.4.3) was used to extract detailed genotypic information for all significant SNPs, each of which exhibited at least two distinct haplotypes. Subsequently, independent samples t -tests were performed using the R packages broom , magrittr , and dplyr to compare the phenotypic BLUP values corresponding to each haplotype of the significant SNPs and determine statistically significant differences between groups. Finally, boxplots were generated using the ggplot2 package to visualize the distribution of BLUP values, and favorable haplotypes were identified based on the observed phenotypic superiority. The resulting visualizations were further refined using the online platform Chiplot. RNA-Seq and transcriptomic analysis Flower buds were collected from five early-flowering and five late-flowering varieties at the onset of bud emergence, with sampling restricted to one plant per accession. Three biological replicates were included for each genotype. Total RNA was extracted using TRIzol reagent (Invitrogen, CA, USA). cDNA library preparation and sequencing were conducted by Novogene (Beijing, China) on the Illumina HiSeq 2500 platform, with three biological replicates per sample for library construction. Clean reads were aligned to the ZM4 reference genome using TopHat2 software (Trapnell et al. , 2012). Gene expression levels were quantified as fragments per kilobase of transcript per million mapped reads (FPKM). Differentially expressed genes (DEGs) were identified using DESeq, with the false discovery rate (FDR) controlled by the Benjamini and Hochberg procedure (Anders & Huber, 2010, Benjamini & Hochberg, 1995). Significant DEGs were defined as those with an adjusted p -value ( P adj ) < 0.05 and an absolute log2 fold change (|log 2 FC|) ≥ 1. 
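The DEG criteria stated above (Benjamini and Hochberg adjusted P < 0.05 and |log2FC| ≥ 1) can be illustrated with a small Python sketch. This is not the DESeq workflow itself, which runs in R; it only shows the adjustment and thresholding step. The gene identifiers, the `GeneResult` type, and the convention that a positive log2 fold change means "up" are assumptions made for the example.

```python
from typing import List, NamedTuple, Sequence, Tuple


class GeneResult(NamedTuple):
    gene_id: str
    log2_fold_change: float
    padj: float


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def classify_degs(
    results: Sequence[GeneResult], padj_cutoff: float = 0.05, lfc_cutoff: float = 1.0
) -> Tuple[List[str], List[str]]:
    """Split genes passing both thresholds into up- and downregulated lists."""
    up, down = [], []
    for r in results:
        if r.padj < padj_cutoff and abs(r.log2_fold_change) >= lfc_cutoff:
            (up if r.log2_fold_change > 0 else down).append(r.gene_id)
    return up, down


# Hypothetical raw p-values and fold changes for four placeholder genes.
raw_p = [0.001, 0.004, 0.0005, 0.20]
lfc = [1.5, -2.0, 0.8, 3.0]
padj = benjamini_hochberg(raw_p)
results = [GeneResult(f"gene_{i}", fc, q) for i, (fc, q) in enumerate(zip(lfc, padj))]
up_genes, down_genes = classify_degs(results)
```

In this toy input, `gene_2` is significant but fails the fold-change cutoff, and `gene_3` has a large fold change but is not significant, so only `gene_0` and `gene_1` are retained.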
For functional characterization, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were performed on DEGs consistently identified across multiple time points using TBtools software (Chen et al., 2020).

RT‒qPCR analysis of candidate genes

Seven accessions with contrasting flowering-time phenotypes were selected. Flower tissue was sampled as soon as flowers appeared. Total RNA was isolated using the Eastep® Super Total RNA Extraction Kit (Promega, Beijing, China) following the manufacturer’s protocol. RNA was reverse-transcribed into cDNA with HiScript III All-in-one RT SuperMix Perfect for qPCR (Vazyme), and the resulting cDNA was diluted tenfold for downstream analyses. Gene-specific primers were designed using Primer Premier 5 (Table S4). RT‒qPCR was performed with SYBR Premix Ex Taq (Takara, Tokyo, Japan) on an Applied Biosystems 7500 Real-Time PCR System. Each sample was analyzed in three technical replicates. Gene expression was normalized to the alfalfa ACTIN reference gene, and relative expression levels were calculated using the 2^−ΔΔCT method.

Genomic prediction for flowering time-related traits in alfalfa

The machine-learning models comprised ElasticNet (elastic net regression), ElasticNetCV (elastic net regression with cross-validation), Lasso (least absolute shrinkage and selection operator), PLSRegression (partial least squares regression), and support vector regression with a polynomial kernel (SVR_poly). A repeated five-fold cross-validation framework was used: in each run, the data were partitioned into five folds, with 80% of accessions used for training and 20% for testing in turn, and the entire procedure was repeated 100 times to enhance statistical robustness.
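A minimal Python sketch of a repeated five-fold scheme in which marker ranking is redone inside each training fold, so the held-out fold never influences marker selection. It is an illustration under stated assumptions, not the pipeline used in the study: ranking by absolute Pearson correlation stands in for a GWAS, an averaged marginal-regression predictor stands in for the five models listed above, and all function names are hypothetical.

```python
import math
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input has zero variance."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def rank_markers(genotypes, phenotypes, rows, top_k):
    """Rank markers by |r| with the phenotype using ONLY the given rows."""
    y = [phenotypes[i] for i in rows]
    scores = []
    for m in range(len(genotypes[0])):
        x = [genotypes[i][m] for i in rows]
        scores.append((abs(pearson(x, y)), m))
    scores.sort(key=lambda s: -s[0])
    return [m for _, m in scores[:top_k]]


def fit_marginal_model(genotypes, phenotypes, rows, markers):
    """Stand-in predictor: average of per-marker simple regressions."""
    y = [phenotypes[i] for i in rows]
    y_mean = mean(y)
    effects = []
    for m in markers:
        x = [genotypes[i][m] for i in rows]
        x_mean = mean(x)
        sxx = sum((a - x_mean) ** 2 for a in x)
        sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
        effects.append((m, x_mean, 0.0 if sxx == 0 else sxy / sxx))

    def predict(row):
        if not effects:
            return y_mean
        return y_mean + mean(s * (row[m] - xm) for m, xm, s in effects)

    return predict


def repeated_kfold_accuracy(genotypes, phenotypes, top_k, k=5, repeats=100, seed=0):
    """Return one test-fold Pearson accuracy per fold and repeat."""
    rng = random.Random(seed)
    n = len(phenotypes)
    accuracies = []
    for _ in range(repeats):
        order = list(range(n))
        rng.shuffle(order)
        for fold in range(k):
            test_rows = order[fold::k]
            held_out = set(test_rows)
            train_rows = [i for i in order if i not in held_out]
            # Marker selection and model fitting see training rows only.
            markers = rank_markers(genotypes, phenotypes, train_rows, top_k)
            predict = fit_marginal_model(genotypes, phenotypes, train_rows, markers)
            predicted = [predict(genotypes[i]) for i in test_rows]
            observed = [phenotypes[i] for i in test_rows]
            accuracies.append(pearson(predicted, observed))
    return accuracies
```

Calling `repeated_kfold_accuracy(genotypes, phenotypes, top_k=100)` returns one accuracy per fold and repeat; averaging them gives a mean accuracy for that panel size. Ranking markers on the full dataset before splitting would let test-set phenotypes shape the panel and tends to inflate the reported accuracy.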
To prevent circularity and information leakage, the GWAS-based SNP ranking used to assemble the top-100, 500, 1,000, 2,000, 3,000, 4,000, and 5,000 marker panels, along with any model tuning, was performed exclusively on the training data within each fold and repeat; the held-out test fold was reserved for performance evaluation. Model performance was assessed using the Pearson correlation between predicted and observed phenotypic values. Analyses included SNP sets pruned for linkage disequilibrium as well as the GWAS-informed panels.

Results

Phenotypic data analysis

Phenotypic data collected over three years, together with the BLUP values, showed continuous variation and an approximately normal distribution, indicating that flowering time is a typical quantitative trait (Fig. 1A–D). The average PTU values for the three years were 88.78, 151.61, and 200.58, respectively (Table S1). Notably, the PTU value in 2023 was significantly lower than in the other years, likely because of the lower average daily temperature during that year, which affected the estimation of PTU. Furthermore, the broad-sense heritability estimates exceeded 63%, indicating a strong genetic contribution to these traits under field conditions. Correlation analysis revealed significant positive correlations among flowering time in the three years and the BLUP values (Fig. 1E).

Fig. 1 Phenotypic analysis of flowering time in 2023, 2024, and 2025. A–D: Histograms of flowering time in each year and of the BLUP values across the three years. E: Correlations among traits; color gradients and circle sizes represent the strength of the correlations. Orange indicates positive correlations and green indicates negative correlations, with darker colors denoting stronger correlations. Significance levels: * p < 0.05, ** p < 0.01, and *** p < 0.001.
Genome-wide association study for flowering time In this study, a total of 2,043,025 high-quality SNPs were retained to conduct GWAS for flowering time in alfalfa. Using a significance threshold of LOD > 5.5, we identified 42 significant SNPs associated with flowering time across the three-year phenotypic datasets (2023–2025 PTU) and the BLUP values (Fig. 2A–D; Table S2). These significant markers were distributed across all chromosomes. Chromosome 5 harbored the largest number of associated SNPs (13 SNPs), whereas chromosome 6 contained the fewest (2 SNPs) (Fig. 2A–D; Table S2). Notably, several loci were detected in multiple analyses: chr2_45365897 and chr2_45365876 were significant for both 2023 PTU and PTU_BLUP, and chr1_78229679 and chr1_78303028 were significant for both 2025 PTU and PTU_BLUP. To further resolve these association signals and narrow candidate intervals, we performed local association mapping and linkage disequilibrium (LD) block analyses for the lead loci (Fig. 2E, F). For 2023 PTU, GWAS identified two adjacent significant SNPs on chromosome 2 (chr2_45365876 and chr2_45365897), forming a prominent association peak (Fig. 2C, E). The regional association pattern and the LD heatmap (D′) supported a high-LD block surrounding the lead SNPs, consistent with a single underlying locus; importantly, chr2_45365876 was also significant for PTU_BLUP (Fig. 2E). Based on gene annotation in proximity to this LD-defined interval, we prioritized Msa.H.0093180 , annotated as a heme oxygenase 1 ( HO1 )-encoding gene, as a candidate gene. In addition, another significant SNP (chr8_88013709) was detected on chromosome 8 (Fig. 2C, F). Local association and LD block analysis indicated a distinct candidate interval around this lead SNP (Fig. 2F), within/near which Msa.H.0492010 , annotated as an AINTEGUMENTA-like 5 gene, was proposed as a candidate gene. 
Based on the GWAS results, we retrieved genes located within ±50 kb of the 42 significant SNPs (i.e., a 100-kb window per SNP, approximating local linkage disequilibrium), yielding 210 candidate genes in total (Table S3). To infer potential functions, these candidate genes were further annotated by identifying putative Arabidopsis thaliana orthologs based on sequence similarity. For example, Msa.H.0261670 , located upstream of SNP chr5_17284544, showed high similarity to RADIATION SENSITIVE23 (RAD23), whereas Msa.H.0491940 , located near SNP chr8_88013709, was annotated as a glucose-6-phosphate/phosphate translocator family gene. These annotations provided a functional framework for subsequent integration with transcriptomic evidence. Fig. 2 GWAS of flowering time in 176 alfalfa accessions. A−D : Manhattan plots for flowering time in 2025 (PTU), 2024 (PTU), 2023 (PTU), and the BLUP across years (PTU_BLUP), respectively. E, F : Regional association plots and linkage disequilibrium (LD) heatmaps for the loci highlighted by blue dashed boxes in (C) and (D), showing local LD structure and the genomic positions of candidate genes. Haplotype analysis and favorable haplotype identification Seven SNPs that were significantly associated with PTU_BLUP defined three major haplotypes among the 176 accessions (Fig. 3A). Accessions carrying Hap2 showed significantly higher PTU values, indicating a delayed-flowering phenotype (Fig. 3C). In addition, two lead SNPs (chr2_45365876 and chr2_45365897), which were jointly associated with both 2023 PTU and PTU_BLUP, were located within the 45.343–45.387 Mb interval on chromosome 2 based on the GWAS results. Haplotype construction using these two SNPs identified three haplotypes (Hap4–Hap6) across the 176 accessions (Fig. 3B). Accessions carrying Hap5 had significantly higher PTU values than those carrying Hap4 or Hap6 (Fig. 3D), suggesting that Hap5 represents a delayed-flowering haplotype. 
To further quantify the allelic effects on flowering time, we examined the relationship between allele composition and PTU_BLUP. The number of early-flowering alleles per accession was negatively correlated with PTU_BLUP (Pearson’s r = −0.48, P < 0.001), whereas the number of delayed-flowering alleles was positively correlated with PTU_BLUP (Pearson’s r = 0.52, P < 0.001) (Fig. 3E, F). We next assessed whether delayed-flowering allele counts differed among geographic origins. Accessions from South America carried the highest number of delayed-flowering alleles, exceeding those from Africa and Europe (Fig. 3G). Finally, based on PTU_BLUP, we selected 15 delayed-flowering and 15 early-flowering accessions and compared the presence/absence patterns of delayed-flowering haplotypes across 13 significant PTU_BLUP-associated SNPs. Delayed-flowering haplotypes occurred more frequently in the delayed-flowering group than in the early-flowering group (Fig. 3H, I). Haplotype patterns for other SNPs significantly associated with PTU_BLUP are shown in Fig. S1.

Fig. 3 Haplotype construction and favorable-allele analysis for flowering time. A: Three haplotypes (Hap1–Hap3) defined by seven significant SNPs associated with PTU_BLUP. B: Three haplotypes (Hap4–Hap6) defined by two significant SNPs jointly associated with 2023 PTU and PTU_BLUP. In (A) and (B), columns indicate SNPs and rows indicate haplotypes; colors represent genotypes. C–D: Box plots displaying phenotypic differences among haplotypes. * p < 0.05; ** p < 0.01; *** p < 0.001. E–F: Scatter plots showing the relationship between PTU_BLUP and the number of early-flowering (E) or delayed-flowering (F) alleles per accession. r indicates the Pearson correlation coefficient. G: Bar plot showing the distribution of delayed-flowering allele counts among accessions originating from different geographic regions.
H−I : Presence/absence heatmaps of early-flowering (H) and delayed-flowering (I) alleles (or haplotypes) across 13 significant SNP loci identified in the PTU_BLUP GWAS; darker shading indicates presence and lighter shading indicates absence. RNA-seq analysis A total of 49,553 differentially expressed genes (DEGs) were identified across the three comparisons (E1_L1, E2_L2, and E3_L3). In the E1_L1 group, 4,984 DEGs were upregulated and 5,144 were downregulated. In the E2_L2 group, 3,849 DEGs were upregulated and 4,339 were downregulated. In the E3_L3 group, 4,527 DEGs were upregulated and 4,143 were downregulated (Fig. 4A). Venn analysis showed that 2,393 DEGs were shared among all three groups (Fig. 4B), including 127 commonly upregulated genes and 125 commonly downregulated genes (Fig. 4C, D). To integrate transcriptomic signals with GWAS-based candidates, we intersected RNA-seq DEGs with the 210 genes within ±50 kb of the significant SNPs and identified 10 overlapping genes (Fig. 4E; Table S3). The expression patterns of these 10 genes were further assessed by RT‒qPCR across different accessions. Five genes ( Msa.H.0261670 , Msa.H.0185980 , Msa.H.0068080 , Msa.H.0093180 , and Msa.H.0402120 ) showed lower expression levels in early-flowering accessions than in delayed-flowering accessions, suggesting their potential involvement in promoting delayed flowering. In addition, Msa.H.0049820 , Msa.H.0402090 , Msa.H.0491940 , Msa.H.0360090 , and Msa.H.0165600 also displayed higher expression in delayed-flowering lines (Fig. S4), indicating that these genes may be associated with flowering-time regulation. In parallel, we constructed a weighted gene co-expression network (WGCNA) based on the RNA-seq expression matrix and identified distinct co-expression modules represented by different colors (Fig. 4F). The eigengene adjacency heatmap revealed clear relationships among modules, indicating coordinated expression patterns across the transcriptome (Fig. 4G). 
This network-level organization provides an additional framework for prioritizing candidate regulators associated with flowering-time variation. Fig. 4 Transcriptomic analysis. A : Upregulated (orange) and downregulated (green) DEGs in three groups. B : Venn diagrams illustrating the overlap between DEGs in three groups. C−D : Venn diagrams illustrating the overlap between upregulated ( C ) DEGs and downregulated DEGs ( D ) in three groups. E : Venn diagram illustrating the overlap between DEGs and candidate genes within 50 kb upstream and downstream of the SNPs identified by GWAS. F : Hierarchical clustering tree of genes and construction of modules. G : Cluster Dendrogram and heatmap of modules. GO and KEGG analysis A total of 2,393 differentially expressed genes (DEGs) were subjected to Gene Ontology (GO) enrichment analysis and were assigned to three domains: biological process (BP), cellular component (CC), and molecular function (MF) (Fig. S2). In total, 54 GO terms were significantly enriched (24 BP, 18 CC, and 12 MF), and the top 25 enriched terms are shown in Fig. 5A. The DEGs were predominantly enriched in metabolism- and redox-related processes, including oxidation–reduction process (GO:0055114), secondary metabolite biosynthetic process (GO:0044550), secondary metabolic process (GO:0019748), and isoflavonoid biosynthetic process (GO:0009717) (Fig. 5C). Notably, six DEGs were annotated to the isoflavonoid biosynthetic process (GO:0009717) (Fig. 5A). Given the well-established roles of flavonoids and isoflavonoids in plant development, particularly in legumes, these results implicate isoflavonoid metabolism as a potential component of flowering-time variation in alfalfa. 
Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis of the same 2,393 DEGs identified 131 significantly enriched pathways, which were grouped into five categories: Metabolism (96 pathways), Genetic Information Processing (22), Cellular Processes (9), Environmental Information Processing (4), and Organismal Systems (2) (Fig. 5B; Fig. S3). The top 25 enriched pathways are shown in Fig. 5D. DEGs were significantly enriched in several key metabolic and signaling pathways, including valine, leucine and isoleucine degradation (ko00280), tropane, piperidine and pyridine alkaloid biosynthesis (ko00960), glycerolipid metabolism (ko00561), phenylpropanoid biosynthesis (ko00940), and flavone and flavonol biosynthesis (ko00944) (Fig. 5D). Collectively, the enrichment of pathways involved in branched-chain amino acid catabolism and phenylpropanoid/flavonoid-related secondary metabolism suggests that metabolic reprogramming accompanies flowering-time divergence during alfalfa development.

Fig. 5 GO and KEGG enrichment analyses. A: GO enrichment shown as a circular plot. B: KEGG pathway enrichment shown as a circular plot. C: Top 10 significantly enriched GO terms. D: Top 10 significantly enriched KEGG pathways. In (A) and (B), bar length indicates the number of genes in each category, and color intensity reflects enrichment significance (darker red indicates smaller P values); RichFactor indicates the ratio of DEGs to background genes.

Genomic prediction for flowering time in alfalfa

We evaluated genomic prediction for flowering time-related traits (PTU in 2023–2025 and PTU_BLUP) using five machine-learning models (SVR_poly, PLSRegression, LASSO, ElasticNetCV, and ElasticNet) across seven GWAS-informed SNP panels (Set1–Set7; Fig. 6). Mean prediction accuracy for each model–trait combination was calculated across 100 replicated validations (Table S5).
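The evaluation scheme described above (ranked marker panels, a regression model, and accuracy averaged over replicated validations) can be sketched in a few lines. The sketch below is an illustration under stated assumptions, not the pipeline used in this study: genotypes and phenotypes are simulated, the 80/20 random split and the 10 replicates are arbitrary, ranking markers by absolute correlation stands in for GWAS P values, accuracy is taken as the Pearson correlation between predicted and observed values, and a small pure-Python elastic net stands in for the five models. Markers are ranked on the training accessions only, an assumption made here so that validation accessions do not inform marker choice. All function names are hypothetical.

```python
import random


def pearson(a, b):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5 if va > 0 and vb > 0 else 0.0


def fit_elastic_net(X, y, alpha=0.05, l1_ratio=0.5, n_iter=30):
    """Elastic net fitted by cyclic coordinate descent on centred markers."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    cols = [[row[j] - means[j] for row in X] for j in range(p)]
    scale = [sum(c * c for c in col) / n for col in cols]
    y_mean = sum(y) / n
    resid = [v - y_mean for v in y]  # residuals while every beta is zero
    beta = [0.0] * p
    for _ in range(n_iter):
        for j, col in enumerate(cols):
            if scale[j] == 0.0:
                continue  # marker is monomorphic in this training split
            rho = sum(c * (e + beta[j] * c) for c, e in zip(col, resid)) / n
            sign = 1.0 if rho > 0 else -1.0
            shrunk = sign * max(abs(rho) - alpha * l1_ratio, 0.0)
            new = shrunk / (scale[j] + alpha * (1 - l1_ratio))
            delta = new - beta[j]
            if delta:
                resid = [e - delta * c for e, c in zip(resid, col)]
                beta[j] = new
    intercept = y_mean - sum(b * m for b, m in zip(beta, means))
    return beta, intercept


def replicated_accuracy(G, y, top_k, n_reps=10, test_frac=0.2, seed=0):
    """Mean Pearson accuracy over repeated random train/validation splits."""
    rng = random.Random(seed)
    n, p = len(y), len(G[0])
    n_test = max(2, int(n * test_frac))
    accs = []
    for _ in range(n_reps):
        idx = list(range(n))
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        y_tr = [y[i] for i in train]
        # Rank markers using training accessions only (stand-in for GWAS).
        score = [abs(pearson([G[i][j] for i in train], y_tr)) for j in range(p)]
        top = sorted(range(p), key=score.__getitem__, reverse=True)[:top_k]
        beta, b0 = fit_elastic_net([[G[i][j] for j in top] for i in train], y_tr)
        pred = [b0 + sum(b * G[i][j] for b, j in zip(beta, top)) for i in test]
        accs.append(pearson(pred, [y[i] for i in test]))
    return sum(accs) / len(accs)


def simulate(n=120, p=200, n_causal=10, seed=1):
    """Simulated allele dosages (0-4) and a phenotype with 10 causal markers."""
    rng = random.Random(seed)
    G = [[rng.randint(0, 4) for _ in range(p)] for _ in range(n)]
    effects = [rng.gauss(0, 1) for _ in range(n_causal)]
    y = [sum(e * row[j] for j, e in enumerate(effects)) + rng.gauss(0, 1) for row in G]
    return G, y


G, y = simulate()
acc = {k: replicated_accuracy(G, y, top_k=k) for k in (5, 20, 50)}
for k, value in acc.items():
    print(f"top {k} markers: mean accuracy {value:.2f}")
```

Swapping `fit_elastic_net` for another regressor while keeping `replicated_accuracy` unchanged gives a like-for-like comparison of models on the same marker panels.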
Using the top 100 GWAS-ranked SNPs (Set1), prediction accuracies ranged from 0.62 (SVR_poly) to 0.76 (ElasticNet), indicating that even a small, highly informative marker set enabled moderate prediction. As marker density increased, prediction accuracy improved consistently across models; with the top 5,000 SNPs (Set7), accuracies increased to 0.72–0.91, and PLSRegression achieved the highest accuracy. Notably, the spread of accuracies among models decreased at larger SNP panels (Fig. 6), suggesting improved robustness of prediction when more GWAS-prioritized markers were included.

Fig. 6 Genomic prediction of flowering time in alfalfa using five machine-learning models. Prediction accuracies for flowering time in 2023–2025 (PTU) and the across-year BLUP (PTU_BLUP) were evaluated using five machine-learning models (SVR_poly, PLSRegression, LASSO, ElasticNetCV, and ElasticNet). For each trait, seven GWAS-informed SNP panels (Set1–Set7) were constructed using the top 100, 500, 1,000, 2,000, 3,000, 4,000, or 5,000 most significant SNPs, respectively. The circular heatmap shows the mean prediction accuracy across 100 replicated validations; numbers in tiles indicate accuracy values and color intensity reflects performance. The outer colored band denotes the four traits, and concentric rings correspond to different prediction models (as labeled).

Discussion

Alfalfa is an important forage crop that supplies livestock with high levels of protein and other essential nutrients. For farmers, optimizing harvest timing is critical to maximize both forage quality and economic returns. For breeders, identifying genetic loci that regulate flowering time clarifies the genetic basis of phenological development and facilitates the deployment of cultivars adapted to diverse geographical and agroecological conditions. GWAS provide a robust framework for dissecting the genetic architecture of complex traits (Tibbs Cortes et al., 2021).
In this study, we performed a GWAS of flowering time in alfalfa and identified multiple candidate genes that represent promising targets for cultivar improvement.

Analysis of flowering time across three years revealed significant interannual variation in phenotypic values. This result underscores the environmental sensitivity of quantitative traits (Dechaine et al., 2014). Moreover, PTU values were strongly influenced by both daily and accumulated temperature. For example, the mean temperature from 5 April to 5 May was 12.42°C in 2023 and 15.16°C in 2025. Consequently, PTU values in 2023 were substantially lower than in 2025.

We analyzed flowering time in 176 alfalfa accessions using 875,023 high-quality, high-density SNP markers in a genome-wide association study (GWAS). This analysis identified multiple genome-wide significant loci and candidate genes. Integrating GWAS signals with transcriptome profiling (RNA-seq) prioritized ten candidate genes related to flowering time. Notably, SNP chr2_45365876 was significantly associated with both 2023 PTU and PTU_BLUP, suggesting pleiotropic effects. A gene located downstream of this SNP, Msa.H.0093180, encodes heme oxygenase 1 (HO1). Heme oxygenases are required for phytochrome chromophore biosynthesis and are essential for normal photomorphogenesis in higher plants. In Arabidopsis, homozygous ho2-1 mutants exhibit reduced chlorophyll accumulation, slower growth, accelerated flowering, and impaired de-etiolation (Davis et al., 2001). Consistent with these observations, our RNA-seq and RT-qPCR analyses showed lower Msa.H.0093180 expression in early-flowering accessions and higher expression in delayed-flowering accessions. Together, these findings suggest that Msa.H.0093180 may act as a negative regulator of flowering in alfalfa.

On chromosome 8, Msa.H.0491940 and Msa.H.0492010 are located downstream of SNP chr8_88013709. Msa.H.0491940 encodes a member of the glucose-6-phosphate/phosphate translocator (GPT) family.
In Arabidopsis, GPT2 mediates net import of glucose-6-phosphate from the cytosol into chloroplasts, thereby increasing starch synthesis (Dyson et al., 2015). RT-qPCR results showed a higher relative expression level of Msa.H.0491940 in early-flowering accessions. Previous studies have shown that floral induction is delayed in starch-deficient mutants under day lengths shorter than 16 h. These results suggest that Msa.H.0491940 may promote early flowering by enhancing starch accumulation under short-day conditions.

Beyond the candidate genes above, additional genes identified in this study likely contribute to flowering-time regulation in alfalfa. Msa.H.0217970, located downstream of SNP chr4_61490681, encodes an RNA-binding glycine-rich protein. In Arabidopsis, AtGRP7 modulates flowering time by affecting the MADS-box repressor FLOWERING LOCUS C (FLC); loss of AtGRP7 elevates FLC expression and delays flowering in the atgrp7-1 mutant (Steffen et al., 2019). Msa.H.0492010, located downstream of SNP chr8_88013709, encodes an AINTEGUMENTA-like (AIL) protein. In Arabidopsis, AINTEGUMENTA (ANT) and AINTEGUMENTA-LIKE6 act in parallel with MONOPTEROS to upregulate LEAFY (LFY) in response to auxin, thereby promoting the onset of flower formation (Yamaguchi et al., 2016). These observations support potential roles for Msa.H.0217970 and Msa.H.0492010 in flowering-time control in alfalfa.

Haplotype analysis of the significant SNPs further revealed strong associations between specific haplotypes and flowering-related phenotypes. Across accessions, the cumulative number of haplotypes associated with delayed flowering was positively correlated with flowering-time phenotypes, corroborating the GWAS results. Notably, accessions from South America carried the greatest allelic load for delayed flowering and generally flowered later. This pattern likely reflects both climatic adaptation and germplasm history (Annicchiarico et al., 2015).
The relatively low genetic diversity of alfalfa germplasm in the Americas is likely attributable to its recent introduction (approximately 200–300 years ago) (Prosperi et al., 2014). A short cultivation history and limited initial diversity may have contributed to the higher prevalence of delayed-flowering alleles in South American varieties. In recent years, identifying favorable haplotypes has become an effective strategy for improving quantitative traits in maize (Bhat et al., 2021), soybean (Bhat et al., 2022), and rice (Yu et al., 2025). Accordingly, our results provide a basis for haplotype-based breeding to optimize flowering-related traits in alfalfa.

Genomic prediction (GP) is an efficient technique that uses the association between phenotypic traits and whole-genome markers to predict phenotypic performance without prior knowledge of trait-associated genes or QTL effects (Crossa et al., 2017, Meuwissen et al., 2001). The most critical step in GP is model construction using a well-characterized training population. A range of prediction models is available for GP, including Lasso, ridge regression, and machine-learning methods. Compared with other approaches, machine-learning methods have recently been applied to predict crop traits and, in several studies, have achieved higher accuracies (Ogutu et al., 2011, Tong & Nikoloski, 2021). Progress has also been made in GP for alfalfa breeding. For example, the prediction accuracy for yield was 0.32–0.35 in two genetically contrasting populations using genotyping-by-sequencing (GBS) markers (Annicchiarico et al., 2015). The prediction accuracy for fall dormancy reached 64.1% using support vector machine (SVM) regression with a linear kernel (Zhang et al., 2023). Notably, using all available markers often yields relatively low prediction accuracy when the number of markers substantially exceeds the number of samples.
Under such conditions, a large proportion of markers that are either unrelated to the trait or highly collinear (because of redundancy from linkage disequilibrium) introduce noise and inflate the estimation variance, thereby obscuring true genetic signals (Jeong et al., 2020, Xu et al., 2025b). Therefore, we evaluated seven SNP sets containing 100 to 5,000 markers in this study. The results showed that the SVR_poly model achieved the highest prediction accuracy when using the top 5,000 SNP markers.

Conclusion

This study evaluated flowering time over three years in a panel of 176 alfalfa accessions and detected considerable genetic variation across four traits. GWAS identified 42 significant SNPs and 210 associated genes located within ±50 kb of these loci. Haplotype analysis delineated the distribution of favorable alleles, providing a theoretical basis for molecular design breeding through stacking of superior haplotypes. Integration of GWAS with RNA-seq highlighted ten candidate genes. Lastly, SNP-based genomic prediction achieved prediction accuracies of 0.62–0.92. In summary, this study integrates GWAS and GP to deliver genetic resources and establish a framework for the genetic improvement of alfalfa.

Supplementary Materials

Figure S1: Boxplots of the phenotypic differences among haplotypes of significantly associated SNPs in PTU_BLUP. * p < 0.05; ** p < 0.01; *** p < 0.001.
Figure S2: Histogram of the GO enrichment results.
Figure S3: Histogram of the KEGG enrichment results.
Figure S4: RT-qPCR validation of expression levels for six candidate genes. Bar colors denote different materials: orange indicates early flowering, green indicates delayed flowering, and blue denotes CK. Different letters (a, b, c, etc.) indicate significant differences between groups at the 0.05 significance level.
Figure S5: Boxplots of the prediction accuracy of models for 2023 PTU (A), 2024 PTU (B), 2025 PTU (C), and PTU_BLUP (D).
Table S1: Phenotypic statistical analysis of flowering time across three years (2023–2025 PTU) and the BLUP values.
Table S2: Details of significant SNPs identified by GWAS.
Table S3: Details of candidate genes identified by integration of GWAS and RNA-seq.
Table S4: Primers of candidate genes used for RT-qPCR.
Table S5: Average prediction accuracy of five machine learning-based GP models.

CRediT authorship contribution statement

Kai Zhu: Writing – original draft, Software, Methodology, Data curation. Ming Xu: Methodology, Software, Formal analysis. Tiejun Zhang: Conceptualization, Supervision, Investigation. Xueqian Jiang: Investigation, Data curation. Yanchao Xu: Validation, Investigation. Junmei Kang: Methodology, Resources. Qingchuan Yang: Resources, Funding acquisition. Ruicai Long: Project administration, Funding acquisition. Fei He: Conceptualization, Supervision, Visualization, Writing – review & editing. All authors approved the final version of the manuscript.

Declaration of competing interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors declare financial support was received for the research, authorship, and/or publication of this article. This study was financially supported by the Central Public-interest Scientific Institution Basal Research Fund (Grant No. 2025-YWF-ZYSQ-04); the Central Public-interest Scientific Institution Basal Research Fund (Grant No. Y2025YC44); and the China Agriculture Research System of MOF and MARA (Grant No. CARS-34).

Data availability

All data generated or analyzed during this study are included in this published article (and its supporting information files). The RNA sequence data from this study have been deposited in the NCBI database under accession code BioProject PRJNA1416271.

References

Adhikari L., Makaju S.O. & Missaoui A.M. (2019) QTL mapping of flowering time and biomass yield in tetraploid alfalfa (Medicago sativa L.). BMC Plant Biology, 19, 359.
Anders S. & Huber W.
(2010) Differential expression analysis for sequence count data. Nature Precedings, 1.
Annicchiarico P., Barrett B., Brummer E.C., Julier B. & Marshall A.H. (2015) Achievements and challenges in improving temperate perennial forage legumes. Critical Reviews in Plant Sciences, 34, 327-380.
Annicchiarico P., Nazzicari N., Li X., Wei Y., Pecetti L. & Brummer E.C. (2015) Accuracy of genomic selection for alfalfa biomass yield in different reference populations. BMC Genomics, 16, 1020.
Arzani H., Zohdi M., Fish E., Amiri G.Z., Nikkhah A. & Wester D. (2004) Phenological effects on forage quality of five grass species. Journal of Range Management, 57, 624-629.
Benjamini Y. & Hochberg Y. (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B (Methodological), 57, 289-300.
Bhat J.A., Karikari B., Adeboye K.A., Ganie S.A., Barmukh R., Hu D., Varshney R.K. & Yu D. (2022) Identification of superior haplotypes in a diverse natural population for breeding desirable plant height in soybean. Theoretical and Applied Genetics, 135, 2407-2422.
Bhat J.A., Yu D., Bohra A., Ganie S.A. & Varshney R.K. (2021) Features and applications of haplotypes in crop breeding. Communications Biology, 4, 1266.
Chao Y., Zhang T., Yang Q., Kang J., Sun Y., Gruber M.Y. & Qin Z. (2014) Expression of the alfalfa CCCH-type zinc finger protein gene MsZFN delays flowering time in transgenic Arabidopsis thaliana. Plant Science, 215, 92-99.
Chen C., Chen H., Zhang Y., Thomas H.R., Frank M.H., He Y. & Xia R. (2020) TBtools: an integrative toolkit developed for interactive analyses of big biological data. Molecular Plant, 13, 1194-1202.
Chen H., Zeng Y., Yang Y., Huang L., Tang B., Zhang H., Hao F., Liu W., Li Y. & Liu Y. (2020) Allele-aware chromosome-level genome assembly and efficient transgene-free genome editing for the autotetraploid cultivated alfalfa.
Nature Communications, 11, 2494.
Cheng X., Li G., Krom N., Tang Y. & Wen J. (2021) Genetic regulation of flowering time and inflorescence architecture by MtFDa and MtFTa1 in Medicago truncatula. Plant Physiology, 185, 161-178.
Crossa J., Pérez-Rodríguez P., Cuevas J., Montesinos-López O., Jarquín D., De Los Campos G., Burgueño J., González-Camacho J.M., Pérez-Elizalde S. & Beyene Y. (2017) Genomic selection in plant breeding: methods, models, and perspectives. Trends in Plant Science, 22, 961-975.
Danecek P., Auton A., Abecasis G., Albers C.A., Banks E., DePristo M.A., Handsaker R.E., Lunter G., Marth G.T. & Sherry S.T. (2011) The variant call format and VCFtools. Bioinformatics, 27, 2156-2158.
Davis S.J., Bhoo S.H., Durski A.M., Walker J.M. & Vierstra R.D. (2001) The heme-oxygenase family required for phytochrome chromophore biosynthesis is necessary for proper photomorphogenesis in higher plants. Plant Physiology, 126, 656-669.
Dechaine J.M., Brock M.T., Iniguez Luy F.L. & Weinig C. (2014) Quantitative trait loci × environment interactions for plant morphology vary over ontogeny in Brassica rapa. New Phytologist, 201, 657-669.
Du J., Lu S., Chai M., Zhou C., Sun L., Tang Y., Nakashima J., Kolape J., Wen Z. & Behzadirad M. (2021) Functional characterization of PETIOLULE-LIKE PULVINUS (PLP) gene in abscission zone development in Medicago truncatula and its application to genetic improvement of alfalfa. Plant Biotechnology Journal, 19, 351-364.
Dyson B.C., Allwood J.W., Feil R., Xu Y., Miller M., Bowsher C.G., Goodacre R., Lunn J.E. & Johnson G.N. (2015) Acclimation of metabolism to light in Arabidopsis thaliana: the glucose 6-phosphate/phosphate translocator GPT2 directs metabolic acclimation. Plant, Cell & Environment, 38, 1404-1417.
Gao R., Gruber M.Y., Amyot L. & Hannoufa A. (2018) SPL13 regulates shoot branching and flowering time in Medicago sativa.
Plant Molecular Biology, 96, 119-133.
Grabowski P.P., Evans J., Daum C., Deshpande S., Barry K.W., Kennedy M., Ramstein G., Kaeppler S.M., Buell C.R. & Jiang Y. (2017) Genome-wide associations with flowering time in switchgrass using exome-capture sequencing data. New Phytologist, 213, 154-169.
He F., Chen S., Zhang Y., Chai K., Zhang Q., Kong W., Qu S., Chen L., Zhang F. & Li M. (2025) Pan-genomic analysis highlights genes associated with agronomic traits and enhances genomics-assisted breeding in alfalfa. Nature Genetics, 1-12.
He F., Kang J., Zhang F., Long R., Yu L., Wang Z., Zhao Z., Zhang T. & Yang Q. (2019) Genetic mapping of leaf-related traits in autotetraploid alfalfa (Medicago sativa L.). Molecular Breeding, 39, 147.
He F., Wei C., Zhang Y., Long R., Li M., Wang Z., Yang Q., Kang J. & Chen L. (2022) Genome-wide association analysis coupled with transcriptome analysis reveals candidate genes related to salt stress in alfalfa (Medicago sativa L.). Frontiers in Plant Science, 12, 826584.
He F., Yang T., Zhang F., Jiang X., Li X., Long R., Wang X., Gao T., Wang C. & Yang Q. (2023) Transcriptome and GWAS analyses reveal candidate gene for root traits of alfalfa during germination under salt stress. International Journal of Molecular Sciences, 24, 6271.
He F., Zhang F., Jiang X., Long R., Wang Z., Chen Y., Li M., Gao T., Yang T. & Wang C. (2022) A genome-wide association study coupled with a transcriptomic analysis reveals the genetic loci and candidate genes governing the flowering time in alfalfa (Medicago sativa L.). Frontiers in Plant Science, 13, 913947.
Hecht V., Foucher F., Ferrándiz C., Macknight R., Navarro C., Morin J., Vardy M.E., Ellis N., Beltrán J.P. & Rameau C. (2005) Conservation of Arabidopsis flowering genes in model legumes. Plant Physiology, 137, 1420-1434.
Jeong S., Kim J. & Kim N. (2020) GMStool: GWAS-based marker selection tool for genomic prediction from genomic data.
Scientific Reports, 10, 19653.
Jiang X., Yang T., He F., Zhang F., Jiang X., Wang C., Gao T., Long R., Li M. & Yang Q. (2024) A genome-wide association study reveals novel loci and candidate genes associated with plant height variation in Medicago sativa. BMC Plant Biology, 24, 544.
Kang J., Zhang T., Guo T., Ding W., Long R., Yang Q. & Wang Z. (2019) Isolation and functional characterization of MsFTa, a FLOWERING LOCUS T homolog from alfalfa (Medicago sativa). International Journal of Molecular Sciences, 20, 1968.
Koornneef M., Alonso-Blanco C., Vries H.B., Hanhart C.J. & Peeters A. (1998) Genetic interactions among late-flowering mutants of Arabidopsis. Genetics, 148, 885-892.
Lacefield D.G. (2004) Alfalfa quality: What it is? What can we do about it? And will it pay? Paper presented at the Proceedings, National Alfalfa Symposium.
Laurie R.E., Diwadkar P., Jaudal M., Zhang L., Hecht V., Wen J., Tadege M., Mysore K.S., Putterill J. & Weller J.L. (2011) The Medicago FLOWERING LOCUS T homolog, MtFTa1, is a key regulator of flowering time. Plant Physiology, 156, 2207-2224.
Li H. & Durbin R. (2009) Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics, 25, 1754-1760.
Li H., Handsaker B., Wysoker A., Fennell T., Ruan J., Homer N., Marth G., Abecasis G., Durbin R. & 1000 Genome Project Data Processing Subgroup (2009) The sequence alignment/map format and SAMtools. Bioinformatics, 25, 2078-2079.
Li H., Peng Z., Yang X., Wang W., Fu J., Wang J., Han Y., Chai Y., Guo T. & Yang N. (2013) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels. Nature Genetics, 45, 43-50.
Li M., Liu Y., Tao Y., Xu C., Li X., Zhang X., Han Y., Yang X., Sun J. & Li W. (2019) Identification of genetic loci and candidate genes related to soybean flowering through genome wide association study. BMC Genomics, 20, 987.
Li X. & Brummer E.C. (2012) Applied genetics and genomics in alfalfa breeding.
Agronomy, 2, 40-61.
Lin S., Medina C.A., Boge B., Hu J., Fransen S., Norberg S. & Yu L. (2020) Identification of genetic loci associated with forage quality in response to water deficit in autotetraploid alfalfa (Medicago sativa L.). BMC Plant Biology, 20, 303.
Long R., Zhang F., Zhang Z., Li M., Chen L., Wang X., Liu W., Zhang T., Yu L. & He F. (2021) Assembly of chromosome-scale and allele-aware autotetraploid genome of the Chinese alfalfa cultivar Zhongmu-4 and identification of SNP loci associated with 27 agronomic traits. bioRxiv, 2021-2022.
Lorenzo C.D., García Gagliardi P., Antonietti M.S., Sánchez Lamas M., Mancini E., Dezar C.A., Vazquez M., Watson G., Yanovsky M.J. & Cerdán P.D. (2020) Improvement of alfalfa forage quality and management through the down-regulation of MsFTa1. Plant Biotechnology Journal, 18, 944-954.
Luccioni L., Krzymuski M., Sánchez Lamas M., Karayekov E., Cerdán P.D. & Casal J.J. (2019) CONSTANS delays Arabidopsis flowering under short days. The Plant Journal, 97, 923-932.
Lv A., Su L., Wen W., Fan N., Zhou P. & An Y. (2021) Analysis of the function of the alfalfa MsLEA-D34 gene in abiotic stress responses and flowering time. Plant and Cell Physiology, 62, 28-42.
Medina C.A., Hansen J., Crawford J., Viands D., Sapkota M., Xu Z., Peel M.D. & Yu L. (2025) Genome-Wide Association and Genomic Prediction of Alfalfa (Medicago sativa L.) Biomass Yield Under Drought Stress. International Journal of Molecular Sciences, 26, 608.
Meuwissen T.H., Hayes B.J. & Goddard M.E. (2001) Prediction of total genetic value using genome-wide dense marker maps. Genetics, 157, 1819-1829.
Ogutu J.O., Piepho H. & Schulz-Streeck T. (2011) A comparison of random forests, boosting and support vector machines for genomic selection. Paper presented at the BMC Proceedings.
Park S.J., Jiang K., Tal L., Yichie Y., Gar O., Zamir D., Eshed Y. & Lippman Z.B. (2014) Optimization of crop productivity in tomato using induced mutations in the florigen pathway.
Nature Genetics, 46, 1337-1342.
Poland J.A., Bradbury P.J., Buckler E.S. & Nelson R.J. (2011) Genome-wide nested association mapping of quantitative resistance to northern leaf blight in maize. Proceedings of the National Academy of Sciences, 108, 6893-6898.
Prosperi J., Jenczewski E., Muller M., Fourtier S., Sampoux J. & Ronfort J. (2014) Alfalfa domestication history, genetic diversity and genetic resources. Legume Perspectives, 4, 13-14.
Rajendran S., Heo J., Kim Y.J., Kim D.H., Ko K., Lee Y.K., Oh S.K., Kim C.M., Bae J.H. & Park S.J. (2021) Optimization of tomato productivity using flowering time variants. Agronomy, 11, 285.
Raman H., Raman R., Coombes N., Song J., Prangnell R., Bandaranayake C., Tahira R., Sundaramoorthi V., Killian A. & Meng J. (2016) Genome-wide association analyses reveal complex genetic architecture underlying natural variation for flowering time in canola. Plant, Cell & Environment, 39, 1228-1239.
Reeves P.H. & Coupland G. (2001) Analysis of flowering time control in Arabidopsis by comparison of double and triple mutants. Plant Physiology, 126, 1085-1091.
Russelle M.P. (2001) Alfalfa: After an 8,000-year journey, the "Queen of Forages" stands poised to enjoy renewed popularity. American Scientist, 89, 252-261.
Sakiroglu M. & Brummer E.C. (2017) Identification of loci controlling forage yield and nutritive value in diploid alfalfa using GBS-GWAS. Theoretical and Applied Genetics, 130, 261-268.
Salami M., Heidari B., Alizadeh B., Batley J., Wang J., Tan X., Dadkhodaie A. & Richards C. (2024) Dissection of quantitative trait nucleotides and candidate genes associated with agronomic and yield-related traits under drought stress in rapeseed varieties: integration of genome-wide association study and transcriptomic analysis. Frontiers in Plant Science, 15, 1342359.
Salami M., Heidari B., Batley J., Wang J., Tan X., Richards C. & Tan H.
(2024) Integration of genome-wide association studies, metabolomics, and transcriptomics reveals phenolic acid- and flavonoid-associated genes and their regulatory elements under drought stress in rapeseed flowers. Frontiers in Plant Science, 14, 1249142.
Saxena M.S., Bajaj D., Das S., Kujur A., Kumar V., Singh M., Bansal K.C., Tyagi A.K. & Parida S.K. (2014) An integrated genomic approach for rapid delineation of candidate genes regulating agro-morphological traits in chickpea. DNA Research, 21, 695-710.
Shen C., Du H., Chen Z., Lu H., Zhu F., Chen H., Meng X., Liu Q., Liu P. & Zheng L. (2020) The chromosome-level genome sequence of the autotetraploid alfalfa and resequencing of core germplasms provide genomic resources for alfalfa research. Molecular Plant, 13, 1250-1261.
Singer S.D., Hannoufa A. & Acharya S. (2018) Molecular improvement of alfalfa for enhanced productivity and adaptability in a changing environment. Plant, Cell & Environment, 41, 1955-1971.
Steffen A., Elgner M. & Staiger D. (2019) Regulation of flowering time by the RNA-binding proteins AtGRP7 and AtGRP8. Plant and Cell Physiology, 60, 2040-2050.
Tamaki S., Matsuo S., Wong H.L., Yokoi S. & Shimamoto K. (2007) Hd3a protein is a mobile flowering signal in rice. Science, 316, 1033-1036.
Tibbs Cortes L., Zhang Z. & Yu J. (2021) Status and prospects of genome-wide association studies in plants. The Plant Genome, 14, e20077.
Tong H. & Nikoloski Z. (2021) Machine learning approaches for crop improvement: Leveraging phenotypic and genotypic big data. Journal of Plant Physiology, 257, 153354.
Trapnell C., Roberts A., Goff L., Pertea G., Kim D., Kelley D.R., Pimentel H., Salzberg S.L., Rinn J.L. & Pachter L. (2012) Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nature Protocols, 7, 562-578.
Willige B.C., Ghosh S., Nill C., Zourelidou M., Dohmann E.M., Maier A. & Schwechheimer C.
(2007) The DELLA domain of GA INSENSITIVE mediates the interaction with the GA INSENSITIVE DWARF1A gibberellin receptor of Arabidopsis. The Plant Cell, 19, 1209-1220.
Wilson R.N., Heckman J.W. & Somerville C.R. (1992) Gibberellin is required for flowering in Arabidopsis thaliana under short days. Plant Physiology, 100, 403-408.
Wolabu T.W., Cong L., Park J., Bao Q., Chen M., Sun J., Xu B., Ge Y., Chai M. & Liu Z. (2020) Development of a highly efficient multiplex genome editing system in outcrossing tetraploid alfalfa (Medicago sativa). Frontiers in Plant Science, 11, 1063.
Xu M., Zhu K., Jiang X., Zhang F., Sod B., Leng H., Zhang T., Xu Y., Yang T. & Li M. (2025a) Determining the Genetic Architecture and Breeding Potential of Quality Traits in Alfalfa (Medicago sativa L.) Through Genome-Wide Association Study and Genomic Prediction. Agronomy, 15, 2679.
Xu M., Zhu K., Jiang X., Zhang F., Sod B., Leng H., Zhang T., Xu Y., Yang T. & Li M. (2025b) Determining the Genetic Architecture and Breeding Potential of Quality Traits in Alfalfa (Medicago sativa L.) Through Genome-Wide Association Study and Genomic Prediction. Agronomy, 15, 2679.
Yamaguchi N., Jeong C.W., Nole-Wilson S., Krizek B.A. & Wagner D. (2016) AINTEGUMENTA and AINTEGUMENTA-LIKE6/PLETHORA3 induce LEAFY expression in response to auxin to promote the onset of flower formation in Arabidopsis. Plant Physiology, 170, 283-293.
Yu J., Suo S., Zhou H., Peng Y., Wang Z., Cao H., Liu Y., Shi X., Liu L. & Yuan D. (2025) Haplotype analysis and molecular marker development for the cold tolerance gene OsCTS11 at the seedling stage of rice. Theoretical and Applied Genetics, 138, 315.
Zhang F., Kang J., Long R., Li M., Sun Y., He F., Jiang X., Yang C., Yang X. & Kong J. (2023) Application of machine learning to explore the genomic prediction accuracy of fall dormancy in autotetraploid alfalfa. Horticulture Research, 10, uhac225.
Zhang F., Kang J., Long R., Yu L.X., Sun Y., Wang Z., Zhao Z., Zhang T. & Yang Q.
(2020) Construction of high-density genetic linkage map and mapping quantitative trait loci (QTL) for flowering time in autotetraploid alfalfa (Medicago sativa L.) using genotyping by sequencing. The Plant Genome, 13, e20045.
Zhang J., Song Q., Cregan P.B., Nelson R.L., Wang X., Wu J. & Jiang G. (2015) Genome-wide association study for flowering time, maturity dates and plant height in early maturing soybean (Glycine max) germplasm. BMC Genomics, 16, 217.
Zhang T., Chao Y., Kang J., Ding W. & Yang Q. (2013) Molecular cloning and characterization of a gene regulating flowering time from Alfalfa (Medicago sativa L.). Molecular Biology Reports, 40, 4597-4603.
Zhang T., Yu L., Zheng P., Li Y., Rivera M., Main D. & Greene S.L. (2015) Identification of loci associated with drought resistance traits in heterozygous autotetraploid alfalfa (Medicago sativa L.) using genome-wide association studies with genotyping by sequencing. PLoS ONE, 10, e0138931.

Version history: Version 1 (V1), 03 February 2026.
Copyright: This work is licensed under a Non Exclusive No Reuse License.
Keywords: alfalfa, flowering time, GP, GWAS, genome, transcriptome

Authors and affiliations: Kai Zhu, Ming Xu, Tiejun Zhang, Xueqian Jiang, Yanchao Xu, Junmei Kang, Qingchuan Yang, Ruicai Long, and Fei He (all: Chinese Academy of Agricultural Sciences, Institute of Animal Science)

Citation: Kai Zhu, Ming Xu, Tiejun Zhang, et al. Integrative GWAS, transcriptomics, and genomic prediction identify loci and candidate genes for flowering time in alfalfa. Authorea. 03 February 2026. DOI: https://doi.org/10.22541/au.177014924.43030651/v1