SLAF-seq Efficiently Identifies SNP Markers for Wheat (Triticum aestivum L.) Development

Research Square · Research Article

Dongsheng Yang, Hao Liang, Shijun Sun, Haiwei Wang, Chao Cui, and 6 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-8808792/v1
This work is licensed under a CC BY 4.0 License.
Status: Under Review. Version 1 posted 10

Abstract

Molecular markers are indispensable tools for identifying genetic variation among plant individuals and enhancing breeding efficiency. In this study, we developed SNP markers, assessed genetic diversity, and established fingerprint maps for 306 wheat germplasm accessions from China using SLAF-seq technology. We obtained 4978.16 Mb of clean reads after quality control of individual sample sequencing data. The number of reads obtained per sample ranged from 7.03 to 35.92 million.
A total of 554,315 SLAF tags were identified, including 356,643 polymorphic tags. After population-level SNP filtering, 52,228 highly consistent and effective SNP markers were retained. Genetic diversity analysis revealed relatively close genetic relationships among the wheat varieties, with an average observed heterozygosity of 0.090 and a mean polymorphism information content (PIC) of 0.251. Population structure analysis (K = 4) indicated that most accessions shared close ancestral relationships, with evidence of admixture. Cluster analysis grouped the 306 wheat germplasm resources into four distinct clusters. Further filtering identified 114 core SNP markers, enabling the successful construction of a fingerprint database encompassing all 306 accessions. This study demonstrates that SLAF-seq is a cost-effective and efficient method for high-throughput SNP marker development and a powerful tool for wheat germplasm genetic analysis. The SNP markers identified here can facilitate germplasm identification, varietal improvement, protection, utilization, and QTL mapping of important yield and quality traits, significantly advancing molecular breeding efforts in wheat.

Keywords: Wheat, SLAF-seq, SNP, Genetic diversity, Fingerprint map

1. INTRODUCTION

Wheat (Triticum aestivum L.) is one of the world's major food crops and the second-largest staple crop in China, providing approximately 20% of protein intake and 21% of caloric intake for humans. By 2050, the global population is projected to exceed 9 billion (https://population.un.org/wpp/); wheat production will need to be sufficient to ensure food security [1]. Breeders have been gradually developing wheat varieties adapted to various cultivation environments through artificial selection based on landraces [2, 3].
Researchers often focus on identifying and utilizing genes closely associated with important and complex agronomic traits, as this targeted trait-based selective breeding approach yields higher efficiency [4, 5]. Traditional linkage analysis and association mapping are effective methods for identifying variation at genetic loci, but they remain limited in detecting genetic variation related to domestication and improvement [6, 7]. Studying regions of genetic variation across different populations or within populations can help identify loci under selection [8, 9].

The cultivation of excellent varieties is crucial for ensuring high and stable wheat yields in China. However, when breeding for wheat varieties with superior agronomic traits, breeders tend to rely too heavily on a limited number of elite wheat germplasms. This has led to an increasingly narrow genetic base in the wheat germplasms that have been bred, which in turn has slowed progress in wheat breeding. Consequently, research on germplasm resources, as a fundamental aspect of developing new wheat varieties, has garnered increasing attention among researchers. Analyzing the genetic diversity of germplasms, understanding the genetic basis and kinship of existing germplasms, and identifying and utilizing superior germplasm materials are of great significance for broadening the genetic base of wheat breeding, accelerating the breeding process, and enhancing breeding efficiency.

Wheat has a large and complex genome with a high proportion of repetitive sequences or transposons [10]. As such, high-throughput, high-density genotyping strategies are often used to reduce research costs [11]. Single nucleotide polymorphisms (SNPs), which are third-generation DNA molecular markers, are highly accurate, offer high throughput, and are low-cost. SNPs are widely used in plant molecular biology research [12–14].
Many technical methods for SNP marker development exist, including GBS, SLAF, and RAD [15–17]. Specific-locus amplified fragment sequencing (SLAF-seq) is a reduced-representation genome sequencing technology that combines high-throughput sequencing to screen representative SLAF fragments [18]. First, in silico enzyme digestion is predicted based on the reference genome of the species or a closely related one. Then, two optimal restriction enzymes are chosen for double digestion. Finally, 300–500-bp-long enzyme-digested fragments are identified as SLAF tags. The advantages of SLAF-seq are applicability to species with or without a reference genome, uniform tag distribution across the genome, suitably sized enzyme-digested fragments, minimal sequencing data redundancy, and high data utilization efficiency [19]. SLAF-seq-based SNP marker development has been widely used in major crops such as soybean (Glycine max (L.) Merr.), maize (Zea mays L.), and rice (Oryza sativa L.) for germplasm resource kinship identification, genetic diversity analysis, high-density genetic map construction, and gene mapping [20–22].

The wheat genome is extremely large and complex, with significant heterozygosity. Reduced-representation genome sequencing can overcome these issues, creating favorable conditions for large-scale SNP development. In this study, we used SLAF-seq to develop wheat genome-wide SNP markers and analyzed the genetics of 306 wheat samples. Sequences were examined against the wheat reference genome. Using 17,233 novel genome-wide SNPs, we gauged genetic diversity and population structure. We provide key data for wheat-related research such as genome-wide association studies and germplasm evaluation. Our results could aid in wheat resource conservation and utilization, as well as genetic mapping, important-trait association analysis, and marker-assisted breeding.

2. MATERIALS AND METHODS

2.1 Test Materials

The experimental materials consisted of 306 wheat germplasm resources with spring and weak winter characteristics collected from various regions in China, all of which could grow normally in the Bayannur area (S1 Table). They were planted at the Hetao College Crop Experimental Field in April 2024.

2.2 DNA Extraction and Detection

From each sample, around 150 mg of tender leaf tissue was collected, quickly frozen in liquid nitrogen, and stored at −80°C in an ultra-cold freezer for later use. Wheat genomic DNA was extracted via the CTAB method. A NanoDrop ND 2000 ultra-micro spectrophotometer was used to measure the DNA concentration, and 1.5% agarose gel electrophoresis was used to check that the DNA quality was sufficient for library preparation and sequencing.

2.3 Library Construction

The Wheat_KN9204 genome (https://ngdc.cncb.ac.cn/gwh/Assembly/25997/show) was selected as the wheat reference genome for restriction enzyme digestion prediction, with a genome size of 14.47 Gb and a GC content of 46.16%. The optimal restriction enzyme digestion strategy was developed to ensure the following: 1) a minimal proportion of restriction fragments located in repetitive sequences; 2) a relatively uniform distribution of restriction fragments across the genome; 3) fragment lengths that matched the specific experimental system [23]; and 4) a sufficient number of restriction fragments (SLAF tags) to meet the expected tag count. Qualified genomic DNA samples were individually digested, and the resulting restriction fragments (SLAF tags) underwent 3′-end A-tailing, dual-index adapter ligation, PCR amplification, purification, pooling, gel excision, and target fragment selection [24].

2.4 Illumina HiSeq Sequencing and Data Quality Assessment

After passing the library quality inspection, high-throughput sequencing was performed using Illumina HiSeq.
Paired-end sequencing was employed to generate the raw data and obtain the read count for each sample. Complex structural regions on the genome (such as loop domains and consecutive restriction sites), low purity of individual genomic DNA samples, insufficient digestion time, and other factors may reduce restriction enzyme activity, leaving some restriction sites uncut. The sequencing quality score (Q) is a crucial metric for evaluating the single-base error rate in high-throughput sequencing. A higher sequencing quality score corresponds to a lower base-calling error rate. Because Q = −10 log10(P), where P is the probability of a base-calling error, an error probability of 0.001 corresponds to a quality score of Q30.

2.5 Development of SNP Markers

Reads were sequenced from restriction fragments produced by the same restriction enzyme acting on different samples. Reads derived from SLAF-seq were clustered based on sequence similarity. SLAF tags whose sequences differed between samples were considered polymorphic SLAF tags. SNP markers were developed based on the reference sequence with the highest depth within each SLAF tag. Sequencing reads were aligned to the reference genome using BWA [25], and SNPs were called using GATK [26] and SAMtools [27]. The intersection of the SNP markers obtained from both methods was considered a reliable SNP marker dataset. Finally, SNPs with a minor allele frequency (MAF) > 0.05 and completeness > 0.8 were selected for subsequent analysis.

2.6 Phylogenetic Tree Construction

The developed SNP markers were preliminarily filtered according to the criteria of completeness > 0.5 and MAF > 0.05. Based on the initially screened population-consistent SNPs, a phylogenetic tree of all samples was constructed using MEGA X [28] software with the neighbor-joining method under the Kimura 2-parameter model, with 1,000 bootstrap replicates.
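The MAF and completeness thresholds used in Sections 2.5 and 2.6 can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the genotype representation (a tuple of two alleles per sample, `None` for a missing call) and the function names `site_stats` and `passes_filter` are assumptions made for illustration, and completeness is assumed to mean the fraction of samples with a called genotype at the site.

```python
from collections import Counter


def site_stats(genotypes):
    """Return (completeness, MAF) for one SNP site.

    genotypes: one entry per sample, either a 2-tuple of alleles such as
    ("A", "G") or None for a missing call (hypothetical representation).
    """
    called = [g for g in genotypes if g is not None]
    completeness = len(called) / len(genotypes) if genotypes else 0.0

    # Count alleles over all called diploid genotypes.
    counts = Counter(allele for g in called for allele in g)
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return completeness, 0.0  # monomorphic or uncalled site

    # MAF is taken here as the frequency of the second most common allele.
    second_most_common = sorted(counts.values())[-2]
    return completeness, second_most_common / total


def passes_filter(genotypes, min_maf=0.05, min_completeness=0.8):
    """Apply the strict 'greater than' thresholds quoted in the text."""
    completeness, maf = site_stats(genotypes)
    return completeness > min_completeness and maf > min_maf


# Toy site: 8 A/A samples, 1 A/G sample, 1 missing call.
site = [("A", "A")] * 8 + [("A", "G")] + [None]
print(site_stats(site), passes_filter(site))
```

Calling `passes_filter(site, min_completeness=0.5)` applies the looser preliminary threshold from Section 2.6 to the same data.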
2.7 Analysis of Genetic Diversity

SNPs developed from populations composed of non-genetic group materials should possess characteristics such as high marker quality, strong representativeness, high discriminative power among materials, uniform genomic distribution, and strong specificity. Based on these principles, markers should be uniformly distributed across the genome and should have no missing loci (i.e., a locus completeness of 100%). Loci were discarded if any of the following were true: MAF < 20%, PIC < 0.35, or a P-value from Hardy–Weinberg equilibrium testing < 0.01. Also, if there were no other locus mutations within 100 bp upstream and downstream of the selected markers, the marker was discarded. The PowerMarker V3.25 software was used to calculate the MAF, expected number of alleles, observed number of alleles, expected heterozygosity (He), observed heterozygosity (Ho), Nei's diversity index (H), PIC, and Shannon–Wiener index (I).

2.8 Population Genetic Structure and Principal Component Analysis

Population structure analysis was conducted using Admixture (v1.22) [29]. The number of subgroups (K value) was preset from 1 to 10 for clustering, and the clustering results were cross-validated. The optimal number of clusters was determined based on the trough of the cross-validation error rate [30]. The R package Pophelper was used to generate stacked Q-matrix plots for each K value (http://royfrancis.github.io/pophelper). The smartPCA program in the Eigensoft (v6.0) [31] package was employed to perform principal component analysis (PCA) based on SNP data, revealing the clustering patterns of the samples. PCA transforms multiple indicators into a few comprehensive indicators for analyzing genetic relationships among individuals.

2.9 Construction of SNP Fingerprint Map

Based on the high-quality core SNP markers selected, genotyping results were obtained for all samples.
SNP loci with consistent genotypes among replicate samples were identified and converted into binary-encoded data to construct the SNP fingerprint map of 306 germplasm resources containing wheat varieties and high-generation wheat lines.

3. RESULTS

3.1 SLAF Library Quality Assessment

SLAF-seq reads are genomic DNA restriction fragments with a base distribution influenced by restriction enzyme sites and PCR amplification. The first two bases of the sequencing reads exhibited base separation consistent with the restriction sites, while subsequent bases showed varying degrees of fluctuation (Fig. 1A). In silico restriction digestion prediction on the wheat reference genome with the SLAF-predict software identified RsaI + HaeIII as the restriction enzyme combination. Digestion efficiency is a key indicator for evaluating the success of SLAF experiments. In this study, the digestion efficiency was 88.72%, with residual restriction sites accounting for 11.28%, indicating normal SLAF tag digestion. Sequences with fragment lengths between 464 and 494 bp were defined as SLAF tags, and 560,935 SLAF tags were predicted, which were generally evenly distributed across the genome.

To assess the normality of the SLAF experimental process and the effectiveness of the restriction digestion protocol, the SOAP software was used to align sample reads to the reference genome. The results demonstrated a paired-end mapping efficiency of 91.66%, a single-end mapping efficiency of 4.23%, and an unmapped rate of 4.11%. Additionally, the insert size distribution fell within the expected range (Fig. 1B), confirming that the SLAF library construction was normal and effective.

3.2 Sequencing Data Statistics and Evaluation

To ensure analytical quality, 126 bp × 2 read lengths were used for subsequent data evaluation and analysis.
Sequencing of 306 wheat materials yielded read counts ranging from 7,032,048 to 35,918,412, totaling 4978.16 Mb reads, with sample W33 exhibiting the longest sequences. The sequencing Q30 values ranged between 85.03% and 92.27%, with an average of 90.65%. The GC content varied from 47.38% to 51.53%, averaging 48.55%, indicating high sequencing quality and reliable results. Furthermore, alignment with the reference genome showed that the percentage of clean reads mapped to the reference genome ranged from 92.82% to 99.70%, with an average of 99.11%. The proportion of paired-end sequences properly aligned to the reference genome with insert sizes matching the expected fragment length distribution ranged from 77.32% to 96.58%, averaging 86.53%.

3.3 Development of SLAF Tags and SNP Markers

A total of 554,315 SLAF tags were obtained from the genome sequencing of 306 wheat samples, with an average sequencing depth of 10.97×, among which 356,643 SLAF tags were polymorphic. The SLAF tags were mapped to the reference genome using BWA software, and their distribution across chromosomes was visualized (Fig. 2A). A total of 5,232,218 population SNPs were developed, with the number of SNP markers per sample ranging between 1,862,063 and 3,095,277. The completeness of these SNP markers was 35.59%–59.16%, and the heterozygosity rate was 3.63%–14.50%. The distribution of SNPs across chromosomes was also plotted (Fig. 2B). The developed SLAF tags and SNPs covered nearly all chromosomes, with the highest SNP density observed on chromosomes GWHBJWI00000004, GWHBJWI00000005, GWHBJWI00000008, and GWHBJWI000000020. After filtering the population SNPs, 52,228 highly consistent and effective SNP markers were obtained for subsequent germplasm genetic evolution analysis.
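The per-sample completeness and heterozygosity rate reported in Section 3.3 could be computed along the following lines. This is a hedged sketch only: it assumes that completeness means the fraction of population SNP sites with a called genotype in a sample and that the heterozygosity rate is the fraction of called genotypes that are heterozygous. The function name `sample_summary` and the genotype representation are hypothetical.

```python
def sample_summary(calls):
    """Summarize one sample across all population SNP sites.

    calls: one entry per SNP site, either a 2-tuple of alleles or None
    for a missing call (hypothetical representation).
    Returns (completeness, heterozygosity_rate).
    """
    called = [g for g in calls if g is not None]
    completeness = len(called) / len(calls) if calls else 0.0

    # A genotype is heterozygous when its two alleles differ.
    n_het = sum(1 for a, b in called if a != b)
    het_rate = n_het / len(called) if called else 0.0
    return completeness, het_rate


# Toy sample with five sites: three called (one heterozygous), two missing.
toy_sample = [("A", "A"), ("A", "G"), None, ("C", "C"), None]
completeness, het_rate = sample_summary(toy_sample)
print(f"completeness={completeness:.2%}, heterozygosity={het_rate:.2%}")
```

Under these assumptions, a sample with 35.59% completeness would have called genotypes at roughly a third of the 5,232,218 population SNP sites.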
3.4 Genetic Diversity and Principal Component Analysis

Through rigorous screening of indicators such as the heterozygosity rate, the missing rate, and polymorphism, high-quality SNP markers were obtained and used to estimate genetic diversity parameters for the 306 wheat samples. The expected allele number was 1.105–2.000, with an average of 1.509, and the observed allele number was 2.000 for all markers. The expected heterozygosity was 0.095–0.500, with an average of 0.308, and the observed heterozygosity ranged from 0.003 to 1.000, with an average of 0.090. The Nei diversity index ranged from 0.095 to 0.502, with an average of 0.309. The PIC ranged from 0.090 to 0.375, with an average of 0.251. The Shannon–Wiener index ranged from 0.199 to 0.693, with an average of 0.475 (Table 1). These results indicate that the dataset is suitable for genotyping and fingerprint construction of the 306 wheat samples.

Table 1. Genetic diversity indices of the wheat population samples

Index                                Range          Average
Expected allele number               1.105–2.000    1.509
Expected heterozygosity              0.095–0.500    0.308
Observed allele number               2.000–2.000    2.000
Observed heterozygosity              0.003–1.000    0.090
Nei diversity index                  0.095–0.502    0.309
Polymorphism information content     0.090–0.375    0.251
Shannon–Wiener index                 0.199–0.693    0.475

Using the neighbor-joining algorithm of the MEGA software, a phylogenetic tree of the 306 wheat samples was constructed, dividing the samples into four groups (Fig. 3). Group 1 contained the largest number of samples (159) and was further divided into two subgroups: subgroup 1 comprised 14 samples, while subgroup 2 was further divided into three smaller groups containing 25, 25, and 95 samples, respectively. Group 2 contained 50 samples, Group 3 contained 56 samples, and Group 4 contained 41 samples. PCA was conducted based on high-quality population SNPs, and the clustering of individuals is shown in Fig. 4.
Principal components PC1, PC2, and PC3 accounted for 9.59%, 4.29%, and 3.01% of the genetic variation among wheat individuals, respectively. The wheat individuals were divided into four main clusters, with Cluster A containing the largest number of individuals, indicating significant differences in genetic structure among individuals.

3.5 Population Structure Analysis

To further elucidate the genetic background relationships of wheat, the identified highly consistent SNP markers were used to analyze the population genetic structure of the 306 wheat germplasms using the Admixture software. The results revealed a relatively distinct population structure among the 306 wheat germplasms. The lowest cross-validation error rate corresponded to K = 4, so the optimal cluster number was set to four, which was highly consistent with the phylogenetic tree classification (Fig. 5). At K = 4, Group I comprised 75 samples dominated by ancestral component Q1, Group II included 102 samples dominated by Q2, Group III consisted of 76 samples dominated by Q3, and Group IV contained 53 samples dominated by Q4. Most populations exhibited signs of admixture, indicating close relationships among individuals and the presence of some genetic exchange (Fig. 6).

Population structure analysis (K = 4) results indicated that the genetic heterozygosity proportion was greater than that of morphological analysis. According to traditional classification methods, wheat seeds are divided into hard wheat and soft wheat based on endosperm texture. Among the 306 materials, 121 were hard wheat with a compact endosperm structure, translucent appearance, and a hardness index above 60. The remaining materials were soft wheat with hardness indices ranging from 33 to 42, a loose endosperm structure, and a gypsum-like appearance.
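The two decisions described in Section 3.5, choosing K at the trough of the cross-validation error and assigning each accession to its dominant ancestral component, can be sketched in a few lines of Python. The cross-validation errors and Q-matrix rows below are invented toy values, not the study's results, and the 0.8 admixture cutoff and all function names are illustrative assumptions.

```python
def best_k(cv_errors):
    """Return the K with the lowest cross-validation error.

    cv_errors: dict mapping K (int) to its cross-validation error.
    """
    return min(cv_errors, key=cv_errors.get)


def dominant_component(q_row):
    """Index (0-based) of the largest ancestry proportion in one Q-matrix row."""
    return max(range(len(q_row)), key=lambda i: q_row[i])


def is_admixed(q_row, threshold=0.8):
    """Flag a sample whose largest ancestry proportion is below the cutoff.

    The 0.8 cutoff is a hypothetical choice, not taken from the paper.
    """
    return max(q_row) < threshold


# Toy cross-validation errors for K = 1..6 (hypothetical values).
cv = {1: 0.60, 2: 0.55, 3: 0.52, 4: 0.50, 5: 0.51, 6: 0.53}
k = best_k(cv)

# Toy Q-matrix rows for two samples at K = 4 (each row sums to 1).
q_matrix = [
    [0.90, 0.05, 0.03, 0.02],  # mostly component Q1
    [0.10, 0.60, 0.20, 0.10],  # dominated by Q2 but admixed
]
groups = [dominant_component(row) + 1 for row in q_matrix]
print(k, groups, [is_admixed(row) for row in q_matrix])
```

Counting the samples per dominant component in a full Q matrix would give group sizes of the kind reported above (75, 102, 76, and 53).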
3.6 Core Germplasm Screening

Core germplasm represents the maximum range of genetic diversity in morphological characteristics, geographical distribution, genes, and genotypes of a particular plant species or its wild relatives. It is of considerable academic and practical importance for promoting germplasm exchange, utilization, and gene bank management. Core Hunter II software can extract diverse, representative, and minimally redundant subsets from a large number of germplasm resources to construct core or micro-core germplasm. Based on the genetic variation marker (SNP) data combined with multiple weighted evaluation measures (modified Rogers distance, Shannon's diversity index, etc.), we determined that the screened materials have high diversity, high representativeness, and high locus richness. Using Core Hunter II with weights of 0.7 for the modified Rogers distance and 0.3 for Shannon's diversity index, screening was conducted at sampling ratios of 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, and 0.9 of the total germplasm resources. The gene coverage of the screened materials was evaluated, and 61 materials were finally determined as core germplasms.

3.7 SNP Fingerprint Construction

Based on phylogenetic analysis, population structure, and PCA, the genotypes of 114 SNP loci from 306 wheat varieties were determined to construct fingerprint profiles. The nucleotide bases at each locus exhibited four possible types: A, T, G, and C. According to SNP principles, each SNP locus may present four variant types: transition, transversion, deletion, and insertion. The four homozygous genotypes are A/A, G/G, C/C, and T/T, while the heterozygous genotypes include A/C, A/T, A/G, C/T, C/G, and T/G. Yellow, green, blue, and purple were assigned to the four homozygous genotypes, respectively, while white was used to denote all heterozygous genotypes (Fig. 7).
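The color scheme just described can be expressed as a small lookup, shown here as a Python sketch. The color mapping follows the text; the one-character fingerprint code (base letter for homozygous genotypes, "H" for heterozygous ones), the "A/G" genotype string format, and the function names are hypothetical choices for illustration and are not the binary encoding used by the authors.

```python
# Colors for the four homozygous genotypes, as listed in the text.
HOMOZYGOUS_COLORS = {"A": "yellow", "G": "green", "C": "blue", "T": "purple"}


def genotype_color(genotype):
    """Map a genotype string such as 'A/A' or 'A/G' to its display color."""
    first, second = genotype.split("/")
    if first == second:
        return HOMOZYGOUS_COLORS[first]
    return "white"  # all heterozygous genotypes share one color


def fingerprint(genotypes):
    """Build a compact fingerprint string from genotypes at the core loci.

    Hypothetical encoding: the base letter for a homozygous genotype,
    'H' for any heterozygous genotype.
    """
    codes = []
    for genotype in genotypes:
        first, second = genotype.split("/")
        codes.append(first if first == second else "H")
    return "".join(codes)


def all_distinct(fingerprints):
    """Check that every accession has a unique fingerprint.

    fingerprints: dict mapping accession name to fingerprint string.
    """
    return len(set(fingerprints.values())) == len(fingerprints)


# Toy example with three loci and two made-up accessions.
profiles = {
    "W1": fingerprint(["A/A", "C/T", "G/G"]),
    "W2": fingerprint(["A/A", "C/T", "C/C"]),
}
print(profiles, all_distinct(profiles))
```

With 114 loci, each accession would receive a 114-character string, and `all_distinct` would confirm whether the marker set separates all 306 accessions.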
Finally, individual and composite fingerprint profiles were constructed for the 306 wheat varieties using the 114 SNP loci. These profiles visually demonstrate genomic-level differences among common cultivated wheat varieties, providing a foundation for the identification, protection, and utilization of wheat germplasm resources.

4. DISCUSSION

4.1 Development of Molecular Markers

Interspecies genetic diversity and population evolution can be analyzed using various molecular markers, such as SSR, AFLP, SRAP, and SCoT [32–34]. However, most of these molecular marker methods suffer from drawbacks such as cumbersome operation, poor stability, high cost, and insufficient accuracy. The use of different molecular markers for analyzing the same species can yield varying genetic diversity results, which is often directly related to the sensitivity and detection characteristics of the molecular markers themselves. SNP markers, as third-generation molecular markers based on genome sequencing, exhibit advantages such as high polymorphism, high density, large numbers, genetic stability, wide distribution, and high-throughput detection [35, 36]. They are ideal markers for studies on population genetic evolution, gene mapping, genetic linkage map construction, and genome-wide association analysis.

The key to SLAF-seq technology lies in selecting the optimal enzyme digestion scheme to digest the species' genome, followed by sequencing to obtain genomic variation information of the target species. Unlike whole-genome sequencing, this approach avoids a large number of repetitive sequences, significantly reducing sequencing costs. It has played an important role in SNP mining for species such as hollyhock (Alcea rosea L.) [37], rose (Rosa chinensis Jacq.) [38], maize [39], rice [40], Miscanthus [41], and eggplant (Solanum melongena) [42].
In this study, SLAF-seq technology was employed to conduct reduced-representation genome sequencing on 306 wheat materials with an average sequencing depth of 10.97×. To improve marker quality, GATK and SAMtools were used for joint screening, resulting in 5,232,218 population SNP markers. After filtering, 52,228 high-quality SNP markers were obtained, fully meeting the requirements for specific locus analysis and differential genetic information discovery among the tested materials. The substantial number of SNP markers obtained can be utilized for genome-wide association studies (GWAS) on key traits related to wheat yield, quality, and stress resistance. The results will provide a valuable reference for future molecular genetic research in wheat.

4.2 Analysis of Genetic Diversity

Genetic diversity is one of the core subjects in biological research and serves as the foundation for species survival and evolution [43]. Studying the genetic diversity of populations can provide deeper insights into a species' adaptability to the environment and its genetic structure, facilitating the conservation of germplasm resources for important species. Genetic diversity evaluation parameters reflect the levels of genetic variation among and within populations. In this study, the expected number of alleles (Ne) ranged from 1.105 to 2.000, with an average of 1.509, which differed from the observed number of alleles and indicates an uneven distribution of alleles within the population. MAF, which is commonly used to distinguish between common and rare variants in populations [44], was moderate at 0.224. PIC reflects the degree of variation at SNP loci. According to the theory proposed by Botstein [24], PIC ≥ 0.5 indicates high polymorphism; PIC between 0.25 and 0.5 indicates moderate polymorphism; and PIC ≤ 0.25 indicates low polymorphism.
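To make these parameters concrete, the following Python sketch computes the expected (effective) number of alleles, expected heterozygosity, PIC, and Shannon index for one locus from its allele frequencies, using the standard textbook formulas, and applies the PIC classes quoted above. It is an illustration under those assumptions rather than a reproduction of PowerMarker's internals, and the function names are hypothetical.

```python
import math


def diversity_indices(freqs):
    """Per-locus diversity indices from allele frequencies (summing to 1).

    Returns a dict with:
      ne      = 1 / sum(p_i^2)                  (effective allele number)
      he      = 1 - sum(p_i^2)                  (expected heterozygosity)
      pic     = he - sum_{i<j} 2 p_i^2 p_j^2    (Botstein-style PIC)
      shannon = -sum(p_i * ln p_i)
    """
    homozygosity = sum(p * p for p in freqs)
    he = 1.0 - homozygosity
    cross = sum(
        2.0 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    shannon = -sum(p * math.log(p) for p in freqs if p > 0)
    return {"ne": 1.0 / homozygosity, "he": he, "pic": he - cross, "shannon": shannon}


def classify_pic(pic):
    """Polymorphism class using the thresholds quoted in the text."""
    if pic >= 0.5:
        return "high"
    if pic > 0.25:
        return "moderate"
    return "low"


# A biallelic SNP with equal allele frequencies gives the maximum values.
balanced = diversity_indices([0.5, 0.5])
print({name: round(value, 3) for name, value in balanced.items()})
print(classify_pic(0.251))
```

For a biallelic locus with equal allele frequencies the formulas give 2.000, 0.500, 0.375, and 0.693, which coincide with the upper bounds of the corresponding ranges reported in this study. Because a biallelic SNP cannot exceed a PIC of 0.375, no SNP locus can reach the "high" class under these thresholds.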
Nei’s genetic diversity index reflects the level of genetic diversity within a population, with higher values indicating lower genetic uniformity and higher genetic diversity. SNP analysis results showed that the 306 wheat accessions tested, originating from 17 regions in China, exhibited a PIC ranging from 0.090 to 0.375, a Nei’s diversity index between 0.095 and 0.502, and a total of 1,191,222 polymorphic markers, indicating a relatively high level of genetic diversity among the tested germplasms. Of the wheat accessions tested, 170 were spring wheat distributed in northeastern and northwestern China, including Heilongjiang, Inner Mongolia, Ningxia, Xinjiang, and Gansu, regions characterized by lower temperatures and shorter growing seasons. The remaining 136 wheat accessions were weak winter wheat, mainly distributed in Shandong, Henan, Hebei, Jiangsu, and Anhui, regions with milder climates and longer growing seasons. Generally, the richer the genetic diversity of a species, the stronger its adaptability to the environment [ 45 ]. In this study, all 136 weak winter wheat varieties were able to grow normally in Bayannur, a spring wheat cultivation region, further demonstrating that higher genetic diversity is the basis for better environmental adaptation in species. In recent years, despite the development of numerous wheat varieties and increasingly abundant germplasm resources, the high-intensity artificial selection during breeding and the frequent use of the same parental lines have significantly reduced population genetic diversity, leading to a narrow genetic base among varieties and making it difficult to select new breakthrough cultivars. This may be related to breeders’ converging selection criteria for agronomic traits, quality traits, and resistance, as well as the low diversity of climate types. 
4.3 Phylogenetic Tree Construction and Population Structure Analysis

Phylogenetic trees can clearly reflect the genetic relationships among different groups but cannot reveal the genetic composition of individuals. In contrast, genetic structure analysis can display the genetic structure of test materials and the genetic composition of individuals at the DNA level, clarifying the ancestral origins of individuals and the exchange of genetic information among different genotypes. This serves as an effective method for analyzing the genetic backgrounds of diverse test materials and constitutes a fundamental task in studies on genetic diversity, varietal evolution, and association analysis of quantitative traits. It lays the foundation for understanding genetic evolution, resource conservation, and the utilization of species [46].

In this study, the classification results of the constructed evolutionary tree and the cross-validation error rate trough (K value) corresponding to the population structure both indicated that the optimal number of clusters for the 306 wheat samples was four. The constructed evolutionary tree showed that the 159 samples in Group 1 and the 50 samples in Group 2 were genetically close, suggesting a relatively close kinship between these two groups. The results demonstrated that genotypes could not be precisely clustered based on their geographic origins. On the one hand, some genotypes from different geographic origins were grouped into the same cluster, indicating a certain degree of correlation among them, possibly due to germplasm exchange or cross-border trade that facilitated the transfer of germplasm from one region to another. On the other hand, some genotypes from the same geographic origin clustered into different groups, suggesting a certain degree of genetic differentiation among populations.
Genotype grouping that does not reflect geographic origins has also been observed in studies on the genetic diversity of cowpea germplasm [47].

5. CONCLUSIONS

This study utilized SLAF-seq technology to develop SNP markers for 306 spring and weak winter wheat samples adapted to the Hetao region, obtaining a total of 554,315 SLAF tags, among which 356,643 were polymorphic. A total of 5,232,218 population SNP markers were developed, and after population filtering, 52,228 highly consistent and effective SNP markers were obtained. Calculation of population genetic diversity indices indicated that wheat exhibits high genetic diversity at the population level. Using the screened SNP markers, a phylogenetic tree was constructed, classifying the 306 wheat sample resources into four groups. The findings of this study can provide a reference for the exploitation, utilization, and scientific conservation of wheat resources.

Authors' contributions

Dongsheng Yang and Shuiyuan Hao conceived and designed the research. Dongsheng Yang and Xu Gao performed the experiments. Dongsheng Yang and Hao Liang completed the writing of the article. Haiwei Wang, Shijun Sun, Chao Cui, Ruinian Xu, Yulei Liu, Che Liu and Lei Wang assisted in the completion of the experiments. All authors have read and agreed to the published version of the manuscript.

Declarations
Funding
Science and Technology Plan Project of Inner Mongolia Autonomous Region (2025YFHH0252); Science and Technology Plan Project of Bayannur (NMKJXM202408); Development of Inner Mongolia Through Science and Technology of China (NMKJXM202201); Development of Inner Mongolia Through Science and Technology of China (NMKJXM202302).

Data Availability
The raw sequencing data supporting the findings of this study have been deposited in the NCBI Sequence Read Archive (SRA) under BioProject accession number PRJNA1314945. These data are publicly available through the NCBI SRA database.

Ethics approval and consent to participate
The methods used in this study were carried out in compliance with institutional and national regulations.

Consent for publication
Not applicable.

Conflicts of Interest
The authors declare no conflicts of interest.

References
1. Afzal F, Li HH, Gul A, Subhani A, Ali A, Mujeeb-Kazi A, Ogbonnaya F, Trethowan R, Xia XC, He ZH, Rasheed A: Genome-wide analyses reveal footprints of divergent selection and drought adaptive traits in synthetic-derived wheats. Genes-Genomes-Genetics 2019, 9(6): 1957-1973.
2. Lopes MS, El-Basyoni I, Baenziger PS, Singh S, Royo C, Ozbek K, Aktas H, Ozer E, Ozdemir F, Manickavelu A, Ban T, Vikram P: Exploiting genetic diversity from landraces in wheat breeding for adaptation to climate change. Journal of Experimental Botany 2015, 66(12): 3477-3486.
3. Rasheed A, Mujeeb-Kazi A, Ogbonnaya FC, He ZH, Rajaram S: Wheat genetic resources in the post-genomics era: promise and challenges. Annals of Botany 2018, 121(4): 603-616.
4. Swarts K, Gutaker RM, Benz B, Blake M, Bukowski R, Holland J, Kruse-Peeples M, Lepak N, Prim L, Romay MC, Ross-Ibarra J, Sanchez-Gonzalez JDJ, Schmidt C, Schuenemann VJ, Krause J, Matson RG, Weigel D, Buckler ES, Burbano HA: Genomic estimation of complex traits reveals ancient maize adaptation to temperate North America. Science 2017, 357(6350): 512-515.
5. Zhou Y, Chen ZX, Cheng MP, Chen J, Zhu TT, Wang R, Liu YX, Qi PF, Chen GY, Jiang QT, Wei YM, Luo MC, Nevo E, Allaby RG, Liu DC, Wang JR, Dvorák J, Zheng YL: Uncovering the dispersion history, adaptive evolution and selection of wheat in China. Plant Biotechnology Journal 2018, 16(1): 280-291.
6. Heslot N, Jannink JL, Sorrells ME: Perspectives for genomic selection applications and research in plants. Crop Science 2015, 55(1): 1-12.
7. Morrell PL, Buckler ES, Ross-Ibarra J: Crop genomics: advances and applications. Nature Reviews Genetics 2011, 13(2): 85-96.
8. Wei DY, Cui YX, He YJ, Xiong Q, Qian LW, Tong CB, Lu GY, Ding YJ, Li JN, Jung C, Qian W: A genome-wide survey with different rapeseed ecotypes uncovers footprints of domestication and breeding. Journal of Experimental Botany 2017, 68(17): 4791-4801.
9. Stephan W: Signatures of positive selection: from selective sweeps at individual loci to subtle allele frequency changes in polygenic adaptation. Molecular Ecology 2016, 25(1): 79-88.
10. Pfeifer M, Kugler KG, Sandve SR, Zhan B, Rudi H, Hvidsten TR, Mayer KFX, Olsen OA: Genome interplay in the grain transcriptome of hexaploid bread wheat. Science 2014, 345(6194): 1250091.
11. Allen AM, Winfield MO, Burridge AJ, Downie RC, Benbow HR, Barker G, Wilkinson PA, Coghill JA, Waterfall C, Davassi A, Scopes G, Pirani A, Webster T, Brew F, Bloor CA, Griffiths S, Bentley AR, Alda M, Jack P, Phillips AL, Edwards KJ: Characterization of a wheat breeders’ array suitable for high-throughput SNP genotyping of global accessions of hexaploid bread wheat (Triticum aestivum). Plant Biotechnology Journal 2017, 15(3): 390-401.
12. Eltaher S, Li J, Freeman B, Singh S, Ali GS: A genome-wide association study identified SNP markers and candidate genes associated with morphometric fruit quality traits in mangoes. BMC Genomics 2025, 26(1): 120.
13. Lippolis A, Hollebrands B, Acierno V, de Jong C, Pouvreau L, Paulo J, Gezan SA, Trindade LM: GWAS identifies SNP markers and candidate genes for off-flavours and protein content in faba bean (Vicia faba L.). Plants 2025, 14(2): 193.
14. Yang FY, Lang T, Wu JY, Zhang C, Qu HJ, Pu ZG, Yang F, Yu M, Feng JY: SNP loci identification and KASP marker development system for genetic diversity, population structure, and fingerprinting in sweetpotato (Ipomoea batatas L.). BMC Genomics 2024, 25(1): 1245.
15. Adedugba AA, Adeyemo OA, Adetumbi AJ, Ilesanmi OJ, Ogunkanmi LA: Genetic diversity and population structure of some Nigerian and four African countries' sorghum landraces [Sorghum bicolor (L.) Moench] using Genotyping-By-Sequencing (GBS) SNP markers. South African Journal of Botany 2023, 162: 495-504.
16. Feng JY, Zhao S, Li M, Zhang C, Qu HJ, Li Q, Li JW, Lin Y, Pu ZG: Genome-wide genetic diversity detection and population structure analysis in sweetpotato (Ipomoea batatas) using RAD-seq. Genomics 2020, 112(2): 1978-1987.
17. Wang XY, Wang WJ, Zhan DM, Ge SS, Tang LQ: Genome-wide SNP markers based on SLAF-Seq uncover genetic diversity of Saccharina cultivars in Shandong, China. Frontiers in Marine Science 2022, 9: 849502.
18. Sun XW, Liu DY, Zhang XF, Li WB, Liu H, Hong WG, Jiang CB, Guan N, Ma CX, Zeng HP, Xu CH, Song J, Huang L, Wang CM, Shi JJ, Wang R, Zheng XH, Lu CY, Wang XW, Zheng HK: SLAF-seq: an efficient method of large-scale de novo SNP discovery and genotyping using high-throughput sequencing. PLoS One 2013, 8(3): e58700.
19. Li J, Lin JY, Ou HB, Liu XS, Jiang Y, Liang RL: Marker development and analysis of genetic diversity of Phoebe bournei germplasms using SLAF-seq technology. Molecular Plant Breeding 2021, 19(13): 4517-4524.
20. Ren HL, Han JN, Wang XR, Zhang B, Yu LL, Gao HW, Hong HL, Sun RJ, Tian Y, Qi XS, Liu ZX, Wu XX, Qiu LJ: QTL mapping of drought tolerance traits in soybean with SLAF sequencing. The Crop Journal 2020, 8(6): 977-989.
21. Wen Y, Fang YX, Hu P, Tan YQ, Wang YY, Hou LL, Deng XM, Wu H, Zhu LX, Zhu L, Chen G, Zeng DL, Guo LB, Zhang GH, Gao ZY, Dong GJ, Ren DY, Shen L, Zhang Q, Xue DW, Qian Q, Hu J: Construction of a high-density genetic map based on SLAF markers and QTL analysis of leaf size in rice. Frontiers in Plant Science 2020, 11: 1143.
22. Xia C, Chen LL, Rong TZ, Li R, Xiang Y, Wang P, Liu CH, Dong XQ, Liu B, Zhao D, Wei RJ, Lan H: Identification of a new maize inflorescence meristem mutant and association analysis using SLAF-seq method. Euphytica 2015, 202(1): 35-44.
23. Davey JW, Cezard T, Fuentes-Utrilla P, Eland C, Gharbi K, Blaxter ML: Special features of RAD Sequencing data: implications for genotyping. Molecular Ecology 2013, 22(11): 3151-3164.
24. Kozich JJ, Westcott SL, Baxter NT, Highlander SK, Schloss PD: Development of a dual-index sequencing strategy and curation pipeline for analyzing amplicon sequence data on the MiSeq Illumina sequencing platform. Applied and Environmental Microbiology 2013, 79(17): 5112-5120.
25. Li H, Durbin R: Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 2009, 25(14): 1754-1760.
26. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, DePristo MA: The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Research 2010, 20(9): 1297-1303.
27. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R: The sequence alignment/map format and SAMtools. Bioinformatics 2009, 25(16): 2078-2079.
28. Kumar S, Stecher G, Li M, Knyaz C, Tamura K: MEGA X: Molecular evolutionary genetics analysis across computing platforms. Molecular Biology and Evolution 2018, 35(6): 1547-1549.
29. Alexander DH, Novembre J, Lange K: Fast model-based estimation of ancestry in unrelated individuals. Genome Research 2009, 19(9): 1655-1664.
30. Pritchard JK, Stephens M, Donnelly P: Inference of population structure using multilocus genotype data. Genetics 2000, 155(2): 945-959.
31. Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D: Principal components analysis corrects for stratification in genomewide association studies. Nature Genetics 2006, 38(8): 904-909.
32. Deepika C, Venkatachalam SR, Yuvaraja A, Arutchenthil P, Indra N, Ravichandran V, Veeramani P, Kathirvelan P: Comparison of genetic diversity using morphometric, molecular and digital imaging parameters in castor (Ricinus communis L.). Biocatalysis and Agricultural Biotechnology 2025, 65: 103525.
33. Hu SJ, Wang MZ, Yan XH, Cheng XM: Genetic diversity and population structure of endangered orchid Cypripedium flavum in fragmented habitat using fluorescent AFLP markers. Plants 2024, 13(20): 2851.
34. Sinjare DYK, Abdulrahman SS, Khalid NS, Ismail RY, Naeem MY, Selamoglu Z, Issayev G, Ahmed AMN: Molecular characterization and cuticular stomatal anatomy of Punica granatum L. cultivars study in Dohuk governorate. Molecular Biology Reports 2025, 52(1): 287-287.
35. Sathishkumar R, Mohanrao MD, Geethanjali S, Prasad MSL, Senthilvel S: A simple and cost-effective SNP genotyping assay for marker-assisted selection of wilt resistance in castor breeding. Industrial Crops and Products 2025, 226: 120693.
36. Muturi P, Kyallo M, Gasura E, Yao N: Diversity analysis and genome-wide association studies of seed weight trait in Bambara groundnut (Vigna subterranea (L.) Verdc.) using diversity array technology sequence derived single nucleotide polymorphism markers. Euphytica 2025, 221(4): 34-34.
37. Deng WQ, Li YF, Chen X, Luo YZ, Pan YZ, Li X, Zhu ZS, Li FW, Liu XL, Jia Y: Development of single nucleotide polymorphism (SNP) markers and construction of DNA fingerprinting of Alcea rosea L. based on specific-locus amplified fragment sequencing (SLAF-seq) technology. Genetic Resources and Crop Evolution 2024, 72(2): 1-15.
38. Xia AN, Yang AA, Meng XS, Dong GZ, Tang XJ, Lei SM, Liu YG: Development and application of rose (Rosa chinensis Jacq.) SNP markers based on SLAF-seq technology. Genetic Resources and Crop Evolution 2022, 69: 173-182.
39. Wen TT, Zhang XF, Zhu JJ, Zhang SS, Rhaman MS, Zeng W: A SLAF-based high-density genetic map construction and genetic architecture of thermotolerant traits in maize (Zea mays L.). Frontiers in Plant Science 2024, 15: 1338086.
40. Liu X, Zhang N, Sun YR, Fu ZX, Han YH, Yang Y, Jia JC, Hou SY, Zhang BJ: QTL mapping of downy mildew resistance in foxtail millet by SLAF-seq and BSR-seq analysis. Theoretical and Applied Genetics 2024, 137(7): 168.
41. Chen ZY, He YC, Iqbal Y, Shi YL, Huang HM, Yi ZL: Investigation of genetic relationships within three Miscanthus species using SNP markers identified with SLAF-seq. BMC Genomics 2022, 23(1): 43.
42. Wei QZ, Wang WH, Hu TH, Hu HJ, Wang JL, Bao CL: Construction of a SNP-based genetic map using SLAF-seq and QTL analysis of morphological traits in eggplant. Frontiers in Genetics 2020, 11: 178.
43. Souza IGB, Souza VAB, Lima PSC: Molecular characterization of Platonia insignis Mart. ("bacurizeiro") using inter simple sequence repeat (ISSR) markers. Molecular Biology Reports 2013, 40(5): 3835-3845.
44. Dussault FM, Boulding EG: Effect of minor allele frequency on the number of single nucleotide polymorphisms needed for accurate parentage assignment: A methodology illustrated using Atlantic salmon. Aquaculture Research 2018, 49: 1368-1372.
45. Pauls SU, Nowak C, Bálint M, Pfenninger M: The impact of global climate change on genetic diversity within populations and species. Molecular Ecology 2013, 22(4): 925-946.
46. Wei SP, Liu XF, Yang SX, Lv HY, Niu Y, Zhang YM: Comparison of various clustering methods for population structure in Chinese cultivated soybean [Glycine max (L.) Merr.]. Journal of Nanjing Agricultural University 2011, 34(2): 13-17.
47. Chipeta MM, Kafwambira J, Yohane E: Cowpea genetic diversity, population structure and genome-wide association studies in Malawi: insights for breeding programs. Frontiers in Plant Science 2025, 15: 1461631.

Additional Declarations
No competing interests reported.

Supplementary Files
STable.docx

Status: Under Review
Editorial decision: Revision requested, 06 Mar, 2026
Reviews received at journal, 04 Mar, 2026
Reviews received at journal, 25 Feb, 2026
Reviewers agreed at journal, 25 Feb, 2026
Reviewers agreed at journal, 24 Feb, 2026
Reviewers invited by journal, 24 Feb, 2026
Editor invited by journal, 10 Feb, 2026
Editor assigned by journal, 09 Feb, 2026
Submission checks completed at journal, 09 Feb, 2026
First submitted to journal, 06 Feb, 2026
Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-8808792","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":596845027,"identity":"f392b3ef-4133-4dbc-886e-7a7e14b3b345","order_by":0,"name":"Dongsheng Yang","email":"","orcid":"","institution":"Hetao College","correspondingAuthor":false,"prefix":"","firstName":"Dongsheng","middleName":"","lastName":"Yang","suffix":""},{"id":596845028,"identity":"c76b6007-0876-41aa-83b6-18b5c5affb20","order_by":1,"name":"Hao Liang","email":"","orcid":"","institution":"Hetao College","correspondingAuthor":false,"prefix":"","firstName":"Hao","middleName":"","lastName":"Liang","suffix":""},{"id":596845029,"identity":"275c0531-6985-477c-85af-98171cf5ceda","order_by":2,"name":"Shijun Sun","email":"","orcid":"","institution":"Hetao College","correspondingAuthor":false,"prefix":"","firstName":"Shijun","middleName":"","lastName":"Sun","suffix":""},{"id":596845030,"identity":"e889ece4-db77-4d59-bedd-04f2bb27e834","order_by":3,"name":"Haiwei Wang","email":"","orcid":"","institution":"Hetao College","correspondingAuthor":false,"prefix":"","firstName":"Haiwei","middleName":"","lastName":"Wang","suffix":""},{"id":596845031,"identity":"f7aab4bd-df0e-4bed-9ec8-12572740ca29","order_by":4,"name":"Chao Cui","email":"","orcid":"","institution":"Hetao College","correspondingAuthor":false,"prefix":"","firstName":"Chao","middleName":"","lastName":"Cui","suffix":""},{"id":596845032,"identity":"d20cd3b6-e05d-41c6-b151-7028fe58695e","order_by":5,"name":"Ruinian Xu","email":"","orcid":"","institution":"Hetao 
College","correspondingAuthor":false,"prefix":"","firstName":"Ruinian","middleName":"","lastName":"Xu","suffix":""},{"id":596845033,"identity":"406e38ed-499c-4f3c-9685-90b2401c0cef","order_by":6,"name":"Yulei Liu","email":"","orcid":"","institution":"Hetao College","correspondingAuthor":false,"prefix":"","firstName":"Yulei","middleName":"","lastName":"Liu","suffix":""},{"id":596845034,"identity":"2ac5cec6-9af1-456c-84d7-2d07f8a14402","order_by":7,"name":"Che Liu","email":"","orcid":"","institution":"Hetao College","correspondingAuthor":false,"prefix":"","firstName":"Che","middleName":"","lastName":"Liu","suffix":""},{"id":596845035,"identity":"93e2c03e-09ae-44af-bdb3-cfc4c0eddd9d","order_by":8,"name":"Lei Wang","email":"","orcid":"","institution":"Hetao College","correspondingAuthor":false,"prefix":"","firstName":"Lei","middleName":"","lastName":"Wang","suffix":""},{"id":596845036,"identity":"0f949847-4f9a-4203-b548-11fa42f007d2","order_by":9,"name":"Shuiyuan Hao","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA1UlEQVRIiWNgGAWjYBACAyBmZmCQALEZHyRU2JCmhdngwZk0orWAAZvkw7ZDhLWYsx8+/LmwzSJxw/GzxyoS2A4w8Ld3J+DVYtmTliY9s00iccOZvLQbCTx3GCTOnN2A32EHcsyYeUFagIwbCRLPGAwkcgloOf/G+DNYy/k3ZgUJBoeJ0HIjx0AarOVGjhlDQgJRWp6lSfOckzCeeeONsUTCgTQewn45n3z4M09ZnWzf+RzDjz//2cjxt/fi1wIDjg1QBg9RykHAnmiVo2AUjIJRMPIAAOocTGGMpbEiAAAAAElFTkSuQmCC","orcid":"","institution":"Hetao College","correspondingAuthor":true,"prefix":"","firstName":"Shuiyuan","middleName":"","lastName":"Hao","suffix":""},{"id":596845037,"identity":"de99764a-87dd-465a-a391-43b6797a70aa","order_by":10,"name":"Xu Gao","email":"","orcid":"","institution":"Inner Mongolia Tianfu Hetao Germplasm Science and Technology Development Co., Ltd.","correspondingAuthor":false,"prefix":"","firstName":"Xu","middleName":"","lastName":"Gao","suffix":""}],"badges":[],"createdAt":"2026-02-06 
15:25:56","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-8808792/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-8808792/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":103605929,"identity":"7f31155b-2f2a-4cd2-9830-30ce8d62b4b5","added_by":"auto","created_at":"2026-02-27 14:46:49","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":73144,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eDistribution of insert fragments of control sequences\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"1.png","url":"https://assets-eu.researchsquare.com/files/rs-8808792/v1/4d2ad1351f88aaec3b15671f.png"},{"id":104398302,"identity":"0c73deb7-70f7-44a1-bd8b-05bb24b4de59","added_by":"auto","created_at":"2026-03-11 12:01:31","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":293913,"visible":true,"origin":"","legend":"\u003cp\u003eThe distribution of SLAF tags and SNPs on chromosomes\u003c/p\u003e","description":"","filename":"2.png","url":"https://assets-eu.researchsquare.com/files/rs-8808792/v1/449755c0d16ac40555d4cdd0.png"},{"id":104399267,"identity":"75874755-b63a-446d-920e-dbbfe9b16a12","added_by":"auto","created_at":"2026-03-11 12:05:19","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":331980,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003ePhylogenetic tree of 306 wheat germplasm resources\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"3.png","url":"https://assets-eu.researchsquare.com/files/rs-8808792/v1/ca9efb4b585ae4e4e063af66.png"},{"id":103605933,"identity":"807398fc-3aee-497b-92e9-99214d84fe43","added_by":"auto","created_at":"2026-02-27 14:46:49","extension":"png","order_by":4,"title":"Figure 
4","display":"","copyAsset":false,"role":"figure","size":104792,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003ePCA plot of 306 wheat germplasm resources\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"4.png","url":"https://assets-eu.researchsquare.com/files/rs-8808792/v1/c46653da6457567a72f469bd.png"},{"id":103605930,"identity":"4c4283bc-a473-456e-95bc-5266a481380c","added_by":"auto","created_at":"2026-02-27 14:46:49","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":41365,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eCross-validation error rates corresponding to different \u003c/strong\u003e\u003cem\u003e\u003cstrong\u003eK\u003c/strong\u003e\u003c/em\u003e\u003cstrong\u003e values\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"5.png","url":"https://assets-eu.researchsquare.com/files/rs-8808792/v1/f44a45537cfb2429639c4c5e.png"},{"id":104399548,"identity":"41aea797-4e17-4c75-bbb0-d56b302ba71d","added_by":"auto","created_at":"2026-03-11 12:06:36","extension":"png","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":688372,"visible":true,"origin":"","legend":"\u003cp\u003eAdmixture individual cluster values corresponding to each \u003cem\u003eK\u003c/em\u003e value\u003c/p\u003e","description":"","filename":"6.png","url":"https://assets-eu.researchsquare.com/files/rs-8808792/v1/961416e4b74a0645c702dc13.png"},{"id":103605935,"identity":"b814ae54-4055-4156-92ae-2bc17c27ce00","added_by":"auto","created_at":"2026-02-27 14:46:49","extension":"png","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":792988,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eDNA fingerprints of 306 wheat samples. 
Each row represents the selected candidate loci, and each column represents a sample\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"7.png","url":"https://assets-eu.researchsquare.com/files/rs-8808792/v1/bf16d95f75446a4747380632.png"},{"id":106723532,"identity":"8c6ab7a1-3f05-4ee3-bd24-917c4101e49d","added_by":"auto","created_at":"2026-04-12 18:05:09","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":5663070,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-8808792/v1/dd77c89d-8e46-44f6-b7b0-77d35056452e.pdf"},{"id":104399291,"identity":"d37e705b-ec00-419b-880b-7883cd6e16fe","added_by":"auto","created_at":"2026-03-11 12:05:24","extension":"docx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":75130,"visible":true,"origin":"","legend":"","description":"","filename":"STable.docx","url":"https://assets-eu.researchsquare.com/files/rs-8808792/v1/85b00931d43aa02c02d4d95d.docx"}],"financialInterests":"No competing interests reported.","formattedTitle":"SLAF-seq Efficiently Identifies SNP Markers for Wheat (Triticum aestivum L.) Development","fulltext":[{"header":"1. INTRODUCTION","content":"\u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eWheat (\u003cem\u003eTriticum aestivum\u003c/em\u003e L.) is one of the world\u0026rsquo;s major food crops and the second-largest staple crop in China, providing approximately 20% of protein intake and 21% of caloric intake for humans. 
By 2050, the global population is projected to exceed 9\u0026nbsp;billion (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://population.un.org/wpp/\u003c/span\u003e\u003cspan address=\"https://population.un.org/wpp/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e); wheat production will need to be sufficient to ensure food security [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e]. Breeders have been gradually developing wheat varieties adapted to various cultivation environments through artificial selection based on landraces [\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e, \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e]. Researchers often focus on identifying and utilizing genes closely associated with important and complex agronomic traits, as this targeted trait-based selective breeding approach yields higher efficiency [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e, \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e]. Traditional linkage analysis and association mapping are effective methods for identifying genetic locus variations, but they still exhibit limitations in detecting genetic variations related to domestication and improvement [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e, \u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e]. Studying genetic variation regions across different populations or within populations can help identify loci under selection [\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e, \u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eThe cultivation of excellent varieties is crucial for ensuring high and stable wheat yields in China. 
However, when breeding for wheat varieties with superior agronomic traits, breeders have a tendency to overemphasize the utilization of a limited number of elite wheat germplasms. This has led to an increasingly narrow genetic base in the wheat germplasms that have been bred, which in turn has resulted in slow progress and limited advancement in wheat breeding. Consequently, research on germplasm resources, as a fundamental aspect of developing new wheat varieties, has garnered increasing attention among researchers. Analyzing the genetic diversity of germplasms, understanding the genetic basis and kinship of existing germplasms, and identifying and utilizing superior germplasm materials are of great significance for broadening the genetic base of wheat breeding, accelerating the breeding process, and enhancing breeding efficiency.\u003c/p\u003e \u003cp\u003eWheat has a large and complex genome with a high proportion of repetitive sequences or transposons [\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e]. As such, high-throughput, high-density genotyping strategies are often used to reduce research costs [\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e]. Single nucleotide polymorphisms (SNPs), which are third-generation DNA molecular markers, are highly accurate, offer high throughput, and are low-cost. SNPs are widely used in plant molecular biology research [\u003cspan additionalcitationids=\"CR13\" citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e]. Many technical methods for SNP marker development exist, including GBS, SLAF, and RAD [\u003cspan additionalcitationids=\"CR16\" citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e]. 
Specific-locus amplified fragment sequencing (SLAF-seq) is a reduced-representation genome sequencing technology that combines high-throughput sequencing to screen representative SLAF fragments [\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e]. First, in-silico enzyme digestion is predicted based on the reference genome of the species or a closely-related one. Then, two optimal restriction enzymes are chosen for double digestion. Finally, 300\u0026ndash;500-bp-long enzyme-digested fragments are identified as SLAF tags. The advantages of SLAF-seq are applicability to species with or without a reference genome, uniform tag distribution across the genome, suitably-sized enzyme-digested fragments, minimal sequencing data redundancy, and high data utilization efficiency [\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e]. SLAF-seq-based SNP marker development has been widely used in major crops such as soybean (\u003cem\u003eGlycine max\u003c/em\u003e (L.) Merr.), maize (\u003cem\u003eZea mays\u003c/em\u003e L.), and rice (\u003cem\u003eOryza sativa\u003c/em\u003e L.), for germplasm resource kinship identification, genetic diversity analysis, high-density genetic map construction, and gene mapping [\u003cspan additionalcitationids=\"CR21\" citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e]. The wheat genome is extremely large and complex, with significant heterozygosity. Reduced-representation genome sequencing can overcome these issues, creating favorable conditions for large-scale SNP development.\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003cp\u003eIn this study, we used SLAF-seq to develop wheat genome-wide SNP markers and analyzed the genetics of 306 wheat samples. Sequences were examined against the wheat reference genome. 
Using 17,233 novel genome-wide SNPs, we gauged genetic diversity and population structure. We provide key data for wheat-related research such as genome-wide association studies and germplasm evaluation. Our results could aid in wheat resource conservation and utilization, as well as genetic mapping, important-trait association analysis, and marker-assisted breeding.\u003c/p\u003e"},{"header":"2. MATERIALS AND METHODS","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003e2.1 Test Materials\u003c/h2\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eThe experimental materials consisted of 306 wheat germplasm resources with spring and weak winter characteristics collected from various regions in China, all of which could grow normally in the Bayannur area (S1 Table ). They were planted at the Hetao College Crop Experimental Field in April 2024.\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec4\" class=\"Section2\"\u003e \u003ch2\u003e2.2 DNA Extraction and Detection\u003c/h2\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eFrom each sample, around 150 mg of tender leaf tissue was gathered, swiftly frozen in liquid nitrogen, and stored at \u0026minus;\u0026thinsp;80\u0026deg;C in an ultra-cold freezer for later use. Genomic DNA of wheat was extracted via the CTAB method. 
An ultra-micro spectrophotometer, the NanoDrop ND 2000, was employed to assay the DNA concentration and 1.5% agarose gel electrophoresis was used to check that the DNA quality was sufficient for library preparation and sequencing.\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec5\" class=\"Section2\"\u003e \u003ch2\u003e2.3 Library Construction\u003c/h2\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eThe Wheat_KN9204 genome (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://ngdc.cncb.ac.cn/gwh/Assembly/25997/show\u003c/span\u003e\u003cspan address=\"https://ngdc.cncb.ac.cn/gwh/Assembly/25997/show\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e) was selected as the wheat reference genome for restriction enzyme digestion prediction, with a genome size of 14.47 Gb and a GC content of 46.16%. The optimal restriction enzyme digestion strategy was developed to ensure the following: 1) a minimal proportion of restriction fragments located in repetitive sequences; 2) a relatively uniform distribution of restriction fragments across the genome; 3) fragment lengths that matched with the specific experimental system [\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e]; and 4) a sufficient number of restriction fragments (SLAF tags) to meet the expected tag count. 
Qualified genomic DNA samples were individually digested and the resulting restriction fragments (SLAF tags) underwent 3\u0026prime;-end A-tailing, dual-index adapter ligation, PCR amplification, purification, pooling, gel excision, and target fragment selection [\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e].\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec6\" class=\"Section2\"\u003e \u003ch2\u003e2.4 Illumina HiSeq Sequencing and Data Quality Assessment\u003c/h2\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eAfter passing the library quality inspection, high-throughput sequencing was performed using Illumina HiSeq.\u0026nbsp;Paired-end sequencing was employed to identify raw data and obtain the read count for each sample. Complex structural regions on the genome (such as loop domains and consecutive restriction sites), low individual purity of genomic DNA, insufficient digestion time of restriction enzymes, and other factors may affect the activity of restriction enzymes, leading to some restriction sites remaining uncut. The sequencing quality score (Q) is a crucial metric for evaluating the single-base error rate in high-throughput sequencing. A higher sequencing quality score corresponds to a lower base-calling error rate. If the probability of a base-calling error is 0.001, the quality score of that base should be Q30.\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec7\" class=\"Section2\"\u003e \u003ch2\u003e2.5 Development of SNP Markers\u003c/h2\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eReads were sequenced from restriction fragments produced by the same restriction enzyme acting on different samples. Reads derived from SLAF-seq were clustered based on sequence similarity. When SLAF tag sequences differed between samples, these were considered polymorphic SLAF tags. 
SNP markers were developed based on the reference sequence with the highest depth within each SLAF tag. Sequencing reads were aligned to the reference genome using BWA [\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e], and SNPs were called using GATK [\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e] and SAMtools [\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e]. The intersection of the SNP sets called by the two tools was taken as the reliable SNP marker dataset. Finally, SNPs with a minor allele frequency (MAF)\u0026thinsp;\u0026gt;\u0026thinsp;0.05 and completeness\u0026thinsp;\u0026gt;\u0026thinsp;0.8 were selected for subsequent analysis.\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003e2.6 Phylogenetic Tree Construction\u003c/h2\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eThe developed SNP markers were preliminarily filtered using the criteria of completeness\u0026thinsp;\u0026gt;\u0026thinsp;0.5 and MAF\u0026thinsp;\u0026gt;\u0026thinsp;0.05. Based on the initially screened population-consistent SNPs, a phylogenetic tree of all samples was constructed using MEGA X [\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e] software with the neighbor-joining method under the Kimura 2-parameter model, with 1,000 bootstrap replicates.\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec9\" class=\"Section2\"\u003e \u003ch2\u003e2.7 Analysis of Genetic Diversity\u003c/h2\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eSNPs developed from populations composed of non-genetic group materials should possess characteristics such as high marker quality, strong representativeness, high discriminative power among materials, uniform genomic distribution, and strong specificity. 
Based on these principles, markers should be uniformly distributed across the genome and should have no missing loci (i.e., a locus completeness of 100%). Loci were discarded if any of the following were true: MAF\u0026thinsp;\u0026lt;\u0026thinsp;20%, PIC\u0026thinsp;\u0026lt;\u0026thinsp;0.35, or a P-value from Hardy-Weinberg equilibrium testing\u0026thinsp;\u0026lt;\u0026thinsp;0.01. Also, if there were no other locus mutations within 100 bp upstream and downstream of the selected markers, the marker was discarded.\u003c/p\u003e \u003cp\u003ePowerMarker V3.25 was used to calculate the MAF, expected number of alleles, observed number of alleles, expected heterozygosity (He), observed heterozygosity (Ho), Nei\u0026rsquo;s diversity index (H), PIC, and Shannon-Wiener index (I).\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec10\" class=\"Section2\"\u003e \u003ch2\u003e2.8 Population Genetic Structure and Principal Component Analysis\u003c/h2\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003ePopulation structure analysis was conducted using Admixture (v1.22) [\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e]. The number of subgroups (K value) was preset from 1 to 10 for clustering, and the clustering results were cross-validated. The optimal number of clusters was determined based on the trough of the cross-validation error rate [\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e]. The R package Pophelper was used to generate stacked Q-matrix plots for each \u003cem\u003eK\u003c/em\u003e value (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://royfrancis.github.io/pophelper\u003c/span\u003e\u003cspan address=\"http://royfrancis.github.io/pophelper\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e). 
The smartPCA program in the Eigensoft (v6.0) [\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e] package was employed to perform principal component analysis (PCA) based on SNP data, revealing the clustering patterns of the samples. Principal component analysis was utilized to transform multiple indicators into several comprehensive indicators for analyzing genetic relationships among individuals.\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003e2.9 Construction of SNP Fingerprint Map\u003c/h2\u003e \u003cp\u003eBased on the high-quality core SNP markers selected, genotyping results were obtained for all samples. SNP loci with consistent genotypes among replicate samples were identified and converted into binary-encoded data to construct the SNP fingerprint map of 306 germplasm resources containing wheat varieties and high-generation wheat lines.\u003c/p\u003e \u003c/div\u003e"},{"header":"3. RESULTS","content":"\u003cdiv id=\"Sec13\" class=\"Section2\"\u003e \u003ch2\u003e3.1 SLAF Library Quality Assessment\u003c/h2\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eSLAF-seq reads are genomic DNA restriction fragments with a base distribution influenced by restriction enzyme sites and PCR amplification. The first two bases of the sequencing reads exhibited base separation consistent with the restriction sites, while subsequent bases showed varying degrees of fluctuation (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eA). In silico restriction digestion prediction on the wheat reference genome with the SLAF-predict software identified RsaI\u0026thinsp;+\u0026thinsp;HaeIII as the restriction enzyme combination. Digestion efficiency is a key indicator for evaluating the success of SLAF experiments. 
In this study, the digestion efficiency was 88.72%, with residual restriction sites accounting for 11.28%, indicating normal SLAF tag digestion. Sequences with fragment lengths between 464 and 494 bp were defined as SLAF tags, and 560,935 SLAF tags were predicted, which were generally evenly distributed across the genome. To check that the SLAF experimental process ran normally and that the restriction digestion protocol was effective, the SOAP software was used to align sample reads to the reference genome. The results demonstrated a paired-end mapping efficiency of 91.66%, single-end mapping efficiency of 4.23%, and unmapped efficiency of 4.11%. Additionally, the insert size distribution fell within the expected range (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eB), confirming that the SLAF library construction was normal and effective.\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec14\" class=\"Section2\"\u003e \u003ch2\u003e3.2 Sequencing Data Statistics and Evaluation\u003c/h2\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eTo ensure analytical quality, 126 bp \u0026times; 2 read lengths were used for subsequent data evaluation and analysis. Sequencing of 306 wheat materials yielded read counts ranging from 7,032,048 to 35,918,412, totaling 4978.16 Mb of reads, with sample W33 exhibiting the longest sequences. The sequencing Q30 values ranged between 85.03% and 92.27%, with an average of 90.65%. The GC content varied from 47.38% to 51.53%, averaging 48.55%, indicating high sequencing quality and reliable results. Furthermore, alignment with the reference genome showed that the percentage of clean reads mapped to the reference genome ranged from 92.82% to 99.70%, with an average of 99.11%. 
The proportion of paired-end sequences properly aligned to the reference genome with insert sizes matching the expected fragment length distribution ranged from 77.32% to 96.58%, averaging 86.53%.\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec15\" class=\"Section2\"\u003e \u003ch2\u003e3.3 Development of SLAF Tags and SNP Markers\u003c/h2\u003e \u003cp\u003eA total of 554,315 SLAF tags were obtained from the genome sequencing of 306 wheat samples, with an average sequencing depth of 10.97\u0026times;, among which 356,643 SLAF tags were polymorphic. The SLAF tags were mapped to the reference genome using BWA software, and their distribution across chromosomes was visualized (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eA). A total of 5,232,218 population SNPs were developed, with the number of SNP markers per sample ranging between 1,862,063 and 3,095,277. The completeness of these SNP markers was 35.59%\u0026ndash;59.16%, and the heterozygosity rate was 3.63%\u0026ndash;14.50%. The distribution of SNPs across chromosomes was also plotted (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eB). The developed SLAF tags and SNPs covered nearly all chromosomes, with the highest SNP density observed on chromosomes GWHBJWI00000004, GWHBJWI00000005, GWHBJWI00000008, and GWHBJWI000000020. 
After filtering the population SNPs, 52,228 highly consistent and effective SNP markers were obtained for subsequent germplasm genetic evolution analysis.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec16\" class=\"Section2\"\u003e \u003ch2\u003e3.4 Genetic Diversity and Principal Component Analysis\u003c/h2\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eThrough rigorous screening of indicators, such as the heterozygosity rate, the missing rate, and polymorphism, high-quality SNP markers were obtained and used to estimate genetic diversity parameters for 306 wheat samples. The expected allele number was 1.105\u0026ndash;2.000, with an average of 1.509, and the observed allele number was 2.000 for all markers. The expected heterozygosity was 0.095\u0026ndash;0.500, with an average of 0.308, and the observed heterozygosity ranged from 0.003 to 1.000, with an average of 0.090. The Nei diversity index ranged from 0.095 to 0.502, with an average of 0.309. The PIC ranged from 0.090 to 0.375, with an average of 0.251. The Shannon-Wiener index ranged from 0.199 to 0.693, with an average of 0.475 (Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e). 
These results indicate that the dataset is suitable for genotyping and fingerprint construction of the 306 wheat samples.\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eGenetic diversity index of wheat population samples\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"8\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c8\" colnum=\"8\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eIndex\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eExpected allele number\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eExpected\u003c/p\u003e \u003cp\u003eheterozygosity\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eObserved allele number\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eObserved heterozygosity\u003c/p\u003e \u003c/th\u003e 
\u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003eNei diversity index\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c7\"\u003e \u003cp\u003ePolymorphism information content\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c8\"\u003e \u003cp\u003eShannon Wiener index\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRange\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e1.105\u0026ndash;2.000\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.095\u0026ndash;0.500\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e2.000\u0026ndash;2.000\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.003\u0026ndash;1.000\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.095\u0026ndash;0.502\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.090\u0026ndash;0.375\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c8\"\u003e \u003cp\u003e0.199\u0026ndash;0.693\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAverage\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e1.509\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.308\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e2.000\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.090\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.309\u003c/p\u003e 
\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.251\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c8\"\u003e \u003cp\u003e0.475\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eUsing the neighbor-joining algorithm of the MEGA software, a phylogenetic tree of 306 wheat samples was constructed, with the 306 samples divided into four groups (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e). Group 1 contained the largest number of samples at 159, which were further divided into two subgroups: subgroup 1 comprised 14 samples, while subgroup 2 was further divided into three smaller groups containing 25, 25, and 95 samples, respectively. Group 2 contained 50 samples; Group 3 contained 56 samples; and Group 4 contained 41 samples. PCA was conducted based on high-quality group SNPs, and the clustering of individuals is shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e. Principal components PC1, PC2, and PC3 accounted for 9.59%, 4.29%, and 3.01% of wheat individual genetic variation, respectively. The wheat individuals were divided into four main clusters, with Cluster A containing the largest number of individuals, indicating significant differences in genetic structure among individuals.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec17\" class=\"Section2\"\u003e \u003ch2\u003e3.5 Population Structure Analysis\u003c/h2\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eTo further elucidate the genetic background relationships of wheat, the identified highly consistent SNP molecular markers were used to analyze the population genetic structure of 306 wheat germplasms using the Admixture software. 
The results revealed a relatively distinct population structure among the 306 wheat germplasms, with the lowest cross-validation error rate corresponding to \u003cem\u003eK\u003c/em\u003e\u0026thinsp;=\u0026thinsp;4, thus determining the optimal cluster number as four, which was highly consistent with the phylogenetic tree classification results (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e). When \u003cem\u003eK\u003c/em\u003e\u0026thinsp;=\u0026thinsp;4, Group I comprised 75 samples dominated by ancestral Q1, Group II included 102 samples dominated by ancestral Q2, Group III consisted of 76 samples dominated by ancestral Q3, and Group IV contained 53 samples dominated by ancestral Q4. Most populations exhibited signs of admixture, indicating close relationships among individuals and the presence of certain genetic exchanges (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e).\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e\u003cdiv class=\"BlockQuote\"\u003e\u003cp\u003ePopulation structure analysis (\u003cem\u003eK\u003c/em\u003e\u0026thinsp;=\u0026thinsp;4) results indicated that the genetic heterozygosity proportion was greater than that of morphological analysis. According to traditional classification methods, wheat seeds are divided into hard wheat and soft wheat based on endosperm texture. Among the 306 materials, 121 were hard wheat with compact endosperm structure, translucent appearance, and a hardness index above 60. 
The remaining accessions were soft wheat with hardness indices ranging from 33 to 42, loose endosperm structure, and gypsum-like appearance.\u003c/p\u003e\u003c/div\u003e\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec18\" class=\"Section2\"\u003e \u003ch2\u003e3.6 Core Germplasm Screening\u003c/h2\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eCore germplasm represents the maximum range of genetic diversity in morphological characteristics, geographical distribution, genes, and genotypes of a particular plant species or its wild relatives. It has considerable academic and practical significance for promoting germplasm exchange, utilization, and gene bank management. Core Hunter II software can extract diverse, representative, and minimally redundant subsets from a large number of germplasm resources to construct core germplasm or micro-core germplasm. Based on the genetic variation marker (SNP) data combined with multiple evaluation measures (modified Rogers distance, Shannon\u0026rsquo;s diversity index, etc.) for weighted processing, we determined that the screened materials have high diversity, high representativeness, and high locus richness. Using Core Hunter II with modified Rogers distance (weight 0.7) and Shannon\u0026rsquo;s diversity index (weight 0.3), screening was conducted at sampling ratios of 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, and 0.9 of the total germplasm resources. The gene coverage of the screened materials was evaluated, and finally 61 materials were determined as core germplasms.\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec19\" class=\"Section2\"\u003e \u003ch2\u003e3.7 SNP Fingerprint Construction\u003c/h2\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eBased on phylogenetic analysis, population structure, and PCA, the genotypes of 114 SNP loci from 306 wheat varieties were determined to construct fingerprint profiles. 
The nucleotide bases at each locus exhibited four possible types: A, T, G, and C. According to SNP principles, each SNP locus may present four variant types: transition, transversion, deletion, and insertion. The four homozygous loci types are A/A, G/G, C/C, and T/T, while the heterozygous loci include A/C, A/T, A/G, C/T, C/G, and T/G. Different colors were assigned to represent the four homozygous loci: yellow, green, blue, and purple, respectively, while white was used to denote all heterozygous loci (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003e). Finally, individual and composite fingerprint profiles were constructed for the 306 wheat varieties using the 114 SNP loci. These profiles visually demonstrate genomic-level differences among common cultivated wheat varieties, providing a foundation for the identification, protection, and utilization of wheat germplasm resources.\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e"},{"header":"4. DISCUSSION","content":"\u003cdiv id=\"Sec21\" class=\"Section2\"\u003e \u003ch2\u003e4.1 Development of Molecular Markers\u003c/h2\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eInterspecies genetic diversity and population evolution can be analyzed using various molecular markers, such as SSR, AFLP, SRAP, and SCoT [\u003cspan additionalcitationids=\"CR33\" citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e]. However, most of these molecular marker methods suffer from drawbacks such as cumbersome operation, poor stability, high cost, and insufficient accuracy. The use of different molecular markers for analyzing the same species can yield varying genetic diversity results, which is often directly related to the sensitivity and detection characteristics of the molecular markers themselves. 
SNP markers, as third-generation molecular markers based on genome sequencing, exhibit advantages such as high polymorphism, high density, large numbers, genetic stability, wide distribution, and high-throughput detection [\u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e, \u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e]. They are ideal markers for studies on population genetic evolution, gene mapping, genetic linkage map construction, and genome-wide association analysis. The key to SLAF-seq technology lies in selecting the optimal enzyme digestion scheme to digest the species\u0026rsquo; genome, followed by sequencing to obtain genomic variation information of the target species. Unlike whole-genome sequencing, this approach avoids a large number of repetitive sequences, significantly reducing sequencing costs. It has played an important role in SNP mining for species such as hollyhock (\u003cem\u003eAlcea rosea\u003c/em\u003e L.) [\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e], rose (\u003cem\u003eRosa chinensis\u003c/em\u003e Jacq.) [\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e], maize [\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e], rice [\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e], \u003cem\u003eMiscanthus\u003c/em\u003e [\u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e], and eggplant (\u003cem\u003eSolanum melongena\u003c/em\u003e) [\u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e42\u003c/span\u003e]. In this study, SLAF-seq technology was employed to conduct reduced-representation genome sequencing on 306 wheat materials with an average sequencing depth of 10.97\u0026times;. To improve marker quality, GATK and SAMtools were used for joint screening, resulting in 5,232,218 population SNP markers. 
After filtering, 52,228 high-quality SNP markers were obtained, fully meeting the requirements for specific locus analysis and differential genetic information discovery among the tested materials. The substantial number of SNP markers obtained can be utilized for genome-wide association studies (GWAS) on key traits related to wheat yield, quality, and stress resistance. These results will provide a valuable reference for future molecular genetic research in wheat.\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec22\" class=\"Section2\"\u003e \u003ch2\u003e4.2 Analysis of Genetic Diversity\u003c/h2\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eGenetic diversity is one of the core subjects in biological research and serves as the foundation for species survival and evolution [\u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e43\u003c/span\u003e]. Studying the genetic diversity of populations can provide deeper insights into a species\u0026rsquo; adaptability to the environment and its genetic structure, facilitating the conservation of germplasm resources for important species. Genetic diversity evaluation parameters reflect the levels of genetic variation among and within populations. In this study, the expected number of alleles (Ne) ranged from 1.105 to 2.000, with an average of 1.509, which differed from the observed number of alleles (2.000 for all markers), indicating an uneven distribution of alleles within the population. MAF, which is commonly used to distinguish between common and rare variants in populations [\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e], was moderate at 0.224. PIC reflects the degree of variation at SNP loci. 
According to the theory proposed by Botstein [\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e], PIC\u0026thinsp;\u0026ge;\u0026thinsp;0.5 indicates high polymorphism; PIC between 0.25 and 0.5 indicates moderate polymorphism; and PIC\u0026thinsp;\u0026le;\u0026thinsp;0.25 indicates low polymorphism. Nei\u0026rsquo;s genetic diversity index reflects the level of genetic diversity within a population, with higher values indicating lower genetic uniformity and higher genetic diversity. SNP analysis results showed that the 306 wheat accessions tested, originating from 17 regions in China, exhibited a PIC ranging from 0.090 to 0.375, a Nei\u0026rsquo;s diversity index between 0.095 and 0.502, and a total of 1,191,222 polymorphic markers, indicating a relatively high level of genetic diversity among the tested germplasms. Of the wheat accessions tested, 170 were spring wheat distributed in northeastern and northwestern China, including Heilongjiang, Inner Mongolia, Ningxia, Xinjiang, and Gansu, regions characterized by lower temperatures and shorter growing seasons. The remaining 136 wheat accessions were weak winter wheat, mainly distributed in Shandong, Henan, Hebei, Jiangsu, and Anhui, regions with milder climates and longer growing seasons. Generally, the richer the genetic diversity of a species, the stronger its adaptability to the environment [\u003cspan citationid=\"CR45\" class=\"CitationRef\"\u003e45\u003c/span\u003e]. In this study, all 136 weak winter wheat varieties were able to grow normally in Bayannur, a spring wheat cultivation region, further demonstrating that higher genetic diversity is the basis for better environmental adaptation in species. 
In recent years, despite the development of numerous wheat varieties and increasingly abundant germplasm resources, the high-intensity artificial selection during breeding and the frequent use of the same parental lines have significantly reduced population genetic diversity, leading to a narrow genetic base among varieties and making it difficult to select new breakthrough cultivars. This may be related to breeders\u0026rsquo; converging selection criteria for agronomic traits, quality traits, and resistance, as well as the low diversity of climate types.\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec23\" class=\"Section2\"\u003e \u003ch2\u003e4.3 Phylogenetic Tree Construction and Population Structure Analysis\u003c/h2\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003ePhylogenetic trees can clearly reflect the genetic relationships among different groups but cannot reveal the genetic composition of individuals. In contrast, genetic structure analysis can display the genetic structure of test materials and the genetic composition of individuals at the DNA level, clarifying the ancestral origins of individuals and the exchange of genetic information among different genotypes. This serves as an effective method for analyzing the genetic backgrounds of diverse test materials and constitutes a fundamental task in studies on genetic diversity, varietal evolution, and association analysis of quantitative traits. It lays the foundation for understanding genetic evolution, resource conservation, and the utilization of species [\u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e46\u003c/span\u003e]. In this study, the classification results of the constructed evolutionary tree and the cross-validation error rate trough (\u003cem\u003eK\u003c/em\u003e value) corresponding to the population structure both indicated that the optimal number of clusters for the 306 wheat samples was four groups. 
The constructed evolutionary tree showed that the 159 samples in Group 1 and the 50 samples in Group 2 were genetically close, suggesting a relatively close kinship between these two groups.\u003c/p\u003e \u003cp\u003eThe results demonstrated that genotypes could not be precisely clustered based on their geographic origins. On the one hand, some genotypes from different geographic origins were grouped into the same cluster, indicating a certain degree of correlation among them, possibly due to germplasm exchange or cross-border trade that facilitated the transfer of germplasm from one region to another. On the other hand, some genotypes from the same geographic origin clustered into different groups, suggesting a certain degree of genetic differentiation among populations. Genotype grouping that does not reflect geographic origins has also been observed in studies on the genetic diversity of cowpea germplasm [\u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e47\u003c/span\u003e].\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003c/div\u003e"},{"header":"5. CONCLUSIONS","content":"\u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eThis study utilized SLAF-seq technology to develop SNP markers for 306 spring and weak winter wheat samples adapted to the Hetao region, obtaining a total of 554,315 SLAF tags, among which 356,643 were polymorphic. A total of 5,232,218 population SNP markers were developed, and after population filtering, 52,228 highly consistent and effective SNP markers were obtained. Calculation of population genetic diversity indices indicated that wheat exhibits high genetic diversity at the population level. Using the screened SNP markers, a phylogenetic tree was constructed, classifying the 306 wheat sample resources into four groups. 
The findings of this study can provide a reference for the exploitation, utilization, and scientific conservation of wheat resources.\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003cb\u003eAuthors\u0026rsquo; contributions\u003c/b\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eDongsheng Yang and Shuiyuan Hao conceived and designed the research. Dongsheng Yang and Xu Gao performed the experiments. Dongsheng Yang and Hao Liang completed the writing of the article. Haiwei Wang, Shijun Sun, Chao Cui, Ruinian Xu, Yulei Liu, Che Liu and Lei Wang assisted in the completion of the experiments. All authors have read and agreed to the published version of the manuscript.All authors have read and agreed to the published version of the manuscript.\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eAuthors\u0026rsquo; contributions\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eDongsheng Yang and Shuiyuan Hao conceived and designed the research. Dongsheng Yang and Xu Gao performed the experiments. Dongsheng Yang and Hao Liang completed the writing of the article. Haiwei Wang, Shijun Sun, Chao Cui, Ruinian Xu, Yulei Liu, Che Liu and Lei Wang assisted in the completion of the experiments. 
All authors have read and agreed to the published version of the manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eScience and Technology Plan Project of Inner Mongolia Autonomous Region (2025YFHH0252); Science and Technology Plan Project of Bayannur (NMKJXM202408); Development of Inner Mongolia Through Science and Technology of China (NMKJXM202201); Development of Inner Mongolia Through Science and Technology of China (NMKJXM202302).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eData Availability\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe raw sequencing data supporting the findings of this study have been deposited in the NCBI Sequence Read Archive (SRA) under the BioProject accession number PRJNA1314945. These data are publicly available and can be accessed through the NCBI SRA database.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eEthics approval and consent to participate\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe methods involved in the current study were carried out in accordance with institutional and national regulations.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConsent for publication\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConflicts of Interest\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare no conflicts of interest.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eAfzal F, Li HH, Gul A, Subhani A, Ali A, Mujeeb-Kazi A, Ogbonnaya F, Trethowan R, Xia XC, He ZH, Rasheed A: \u003cstrong\u003eGenome-wide analyses reveal footprints of divergent selection and drought adaptive traits in synthetic-derived wheats\u003c/strong\u003e. 
\u003cem\u003eGenes-Genomes-Genetics\u003c/em\u003e 2019, \u003cstrong\u003e9\u003c/strong\u003e(6): 1957-1973.\u003c/li\u003e\n\u003cli\u003eLopes MS, El-Basyoni I, Baenziger PS, Singh S, Royo C, Ozbek K, Aktas H, Ozer E, Ozdemir F, Manickavelu A, Ban T, Vikram P: \u003cstrong\u003eExploiting genetic diversity from landraces in wheat breeding for adaptation to climate change\u003c/strong\u003e. \u003cem\u003eJournal of Experimental Botany\u003c/em\u003e 2015, \u003cstrong\u003e66\u003c/strong\u003e(12): 3477-3486.\u003c/li\u003e\n\u003cli\u003eRasheed A, Mujeeb-Kazi A, Ogbonnaya FC, He ZH, Rajaram S: \u003cstrong\u003eWheat genetic resources in the post-genomics era: promise and challenges\u003c/strong\u003e. \u003cem\u003eAnnals of Botany\u003c/em\u003e 2018, \u003cstrong\u003e121\u003c/strong\u003e(4): 603-616.\u003c/li\u003e\n\u003cli\u003eSwarts K, Gutaker RM, Benz B, Blake M, Bukowski R, Holland J, Kruse-Peeples M, Lepak N, Prim L, Romay MC, Ross-Ibarra J, Sanchez-Gonzalez JDJ, Schmidt C, Schuenemann VJ, Krause J, Matson RG, Weigel D, Buckler ES, Burbano HA: \u003cstrong\u003eGenomic estimation of complex traits reveals ancient maize adaptation to temperate North America\u003c/strong\u003e. \u003cem\u003eScience\u003c/em\u003e 2017, \u003cstrong\u003e357\u003c/strong\u003e(6350): 512-515.\u003c/li\u003e\n\u003cli\u003eZhou Y, Chen ZX, Cheng MP, Chen J, Zhu TT, Wang R, Liu YX, Qi PF, Chen GY, Jiang QT, Wei YM, Luo MC, Nevo E, Allaby RG, Liu DC, Wang JR, Dvor\u0026aacute;k J, Zheng YL: \u003cstrong\u003eUncovering the dispersion history, adaptive evolution and selection of wheat in China\u003c/strong\u003e. \u003cem\u003ePlant Biotechnology Journal\u003c/em\u003e 2018, \u003cstrong\u003e16\u003c/strong\u003e(1): 280-291.\u003c/li\u003e\n\u003cli\u003eHeslot N, Jannink JL, Sorrells ME: \u003cstrong\u003ePerspectives for genomic selection applications and research in plants\u003c/strong\u003e. 
\u003cem\u003eCrop Science\u003c/em\u003e 2015, \u003cstrong\u003e55\u003c/strong\u003e(1): 1-12.\u003c/li\u003e\n\u003cli\u003eMorrell PL, Buckler ES, Ross-Ibarra J: \u003cstrong\u003eCrop genomics: advances and applications\u003c/strong\u003e. \u003cem\u003eNature Reviews Genetics\u003c/em\u003e 2011, \u003cstrong\u003e13\u003c/strong\u003e(2): 85-96.\u003c/li\u003e\n\u003cli\u003eWei DY, Cui YX, He YJ, Xiong Q, Qian LW, Tong CB, Lu GY, Ding YJ, Li JN, Jung C, Qian W: \u003cstrong\u003eA genome-wide survey with different rapeseed ecotypes uncovers footprints of domestication and breeding\u003c/strong\u003e. \u003cem\u003eJournal of Experimental Botany\u003c/em\u003e 2017, \u003cstrong\u003e68\u003c/strong\u003e(17): 4791-4801.\u003c/li\u003e\n\u003cli\u003eStephan W: \u003cstrong\u003eSignatures of positive selection: from selective sweeps at individual loci to subtle allele frequency changes in polygenic adaptation\u003c/strong\u003e. \u003cem\u003eMolecular Ecology\u003c/em\u003e 2016, \u003cstrong\u003e25\u003c/strong\u003e(1): 79-88.\u003c/li\u003e\n\u003cli\u003ePfeifer M, Kugler KG, Sandve SR, Zhan B, Rudi H, Hvidsten TR, Mayer KFX, Olsen OA: \u003cstrong\u003eGenome interplay in the grain transcriptome of hexaploid bread wheat\u003c/strong\u003e. 
\u003cem\u003eScience\u003c/em\u003e 2014, \u003cstrong\u003e345\u003c/strong\u003e(6194): 1250091.\u003c/li\u003e\n\u003cli\u003eAllen AM, Winfield MO, Burridge AJ, Downie RC, Benbow HR, Barker G, Wilkinson PA, Coghill JA, Waterfall C, Davassi A, Scopes G, Pirani A, Webster T, Brew F, Bloor CA, Griffiths S, Bentley AR, Alda M, Jack P, Phillips AL, Edwards KJ: \u003cstrong\u003eCharacterization of a wheat breeders\u0026rsquo; array suitable for high-throughput SNP genotyping of global accessions of hexaploid bread wheat (\u003cem\u003eTriticum aestivum\u003c/em\u003e)\u003c/strong\u003e. \u003cem\u003ePlant Biotechnology Journal\u003c/em\u003e 2017, \u003cstrong\u003e15\u003c/strong\u003e(3): 390-401.\u003c/li\u003e\n\u003cli\u003eEltaher S, Li J, Freeman B, Singh S, Ali GS: \u003cstrong\u003eA genome-wide association study identified SNP markers and candidate genes associated with morphometric fruit quality traits in mangoes\u003c/strong\u003e. \u003cem\u003eBMC Genomics\u003c/em\u003e 2025, \u003cstrong\u003e26\u003c/strong\u003e(1): 120.\u003c/li\u003e\n\u003cli\u003eLippolis A, Hollebrands B, Acierno V, de Jong C, Pouvreau L, Paulo J, Gezan SA, Trindade LM: \u003cstrong\u003eGWAS identifies SNP markers and candidate genes for off-flavours and protein content in faba bean (\u003cem\u003eVicia faba\u003c/em\u003e L.)\u003c/strong\u003e. 
\u003cem\u003ePlants\u003c/em\u003e 2025, \u003cstrong\u003e14\u003c/strong\u003e(2): 193.\u003c/li\u003e\n\u003cli\u003eYang FY, Lang T, Wu JY, Zhang C, Qu HJ, Pu ZG, Yang F, Yu M, Feng JY: \u003cstrong\u003eSNP loci identification and KASP marker development system for genetic diversity, population structure, and fingerprinting in sweetpotato (\u003cem\u003eIpomoea batatas\u003c/em\u003e L.)\u003c/strong\u003e. \u003cem\u003eBMC Genomics\u003c/em\u003e 2024, \u003cstrong\u003e25\u003c/strong\u003e(1): 1245.\u003c/li\u003e\n\u003cli\u003eAdedugba AA, Adeyemo OA, Adetumbi AJ, Ilesanmi OJ, Ogunkanmi LA: \u003cstrong\u003eGenetic diversity and population structure of some Nigerian and four African countries\u0026apos; sorghum landraces [\u003cem\u003eSorghum bicolor\u003c/em\u003e (L.) Moench] using Genotyping-By-Sequencing (GBS) SNP markers\u003c/strong\u003e. \u003cem\u003eSouth African Journal of Botany\u003c/em\u003e 2023, \u003cstrong\u003e162\u003c/strong\u003e: 495-504.\u003c/li\u003e\n\u003cli\u003eFeng JY, Zhao S, Li M, Zhang C, Qu HJ, Li Q, Li JW, Lin Y, Pu ZG: \u003cstrong\u003eGenome-wide genetic diversity detection and population structure analysis in sweetpotato (\u003cem\u003eIpomoea batatas\u003c/em\u003e) using RAD-seq\u003c/strong\u003e. 
\u003cem\u003eGenomics\u003c/em\u003e 2020, \u003cstrong\u003e112\u003c/strong\u003e(2): 1978-1987.\u003c/li\u003e\n\u003cli\u003eWang XY, Wang WJ, Zhan DM, Ge SS, Tang LQ: \u003cstrong\u003eGenome-wide SNP markers based on SLAF-Seq uncover genetic diversity of \u003cem\u003eSaccharina\u003c/em\u003e cultivars in Shandong, China\u003c/strong\u003e. \u003cem\u003eFrontiers in Marine Science\u003c/em\u003e 2022, \u003cstrong\u003e9\u003c/strong\u003e: 849502.\u003c/li\u003e\n\u003cli\u003eSun XW, Liu DY, Zhang XF, Li WB, Liu H, Hong WG, Jiang CB, Guan N, Ma CX, Zeng HP, Xu CH, Song J, Huang L, Wang CM, Shi JJ, Wang R, Zheng XH, Lu CY, Wang XW, Zheng HK: \u003cstrong\u003eSLAF-seq: an efficient method of large-scale de novo SNP discovery and genotyping using high-throughput sequencing\u003c/strong\u003e. 
\u003cem\u003ePLoS One\u003c/em\u003e 2013, \u003cstrong\u003e8\u003c/strong\u003e(3): e58700.\u003c/li\u003e\n\u003cli\u003eLi J, Lin JY, Ou HB, Liu XS, Jiang Y, Liang RL: \u003cstrong\u003eMarker development and analysis of genetic diversity of \u003cem\u003ePhoebe bournei\u003c/em\u003e germplasms using SLAF-seq technology\u003c/strong\u003e. \u003cem\u003eMolecular Plant Breeding\u003c/em\u003e 2021, \u003cstrong\u003e19\u003c/strong\u003e(13): 4517-4524.\u003c/li\u003e\n\u003cli\u003eRen HL, Han JN, Wang XR, Zhang B, Yu LL, Gao HW, Hong HL, Sun RJ, Tian Y, Qi XS, Liu ZX, Wu XX, Qiu LJ: \u003cstrong\u003eQTL mapping of drought tolerance traits in soybean with SLAF sequencing\u003c/strong\u003e. 
\u003cem\u003eThe Crop Journal\u003c/em\u003e 2020, \u003cstrong\u003e8\u003c/strong\u003e(6): 977-989.\u003c/li\u003e\n\u003cli\u003eWen Y, Fang YX, Hu P, Tan YQ, Wang YY, Hou LL, Deng XM, Wu H, Zhu LX, Zhu L, Chen G, Zeng DL, Guo LB, Zhang GH, Gao ZY, Dong GJ, Ren DY, Shen L, Zhang Q, Xue DW, Qian Q, Hu J: \u003cstrong\u003eConstruction of a high-density genetic map based on SLAF markers and QTL analysis of leaf size in rice\u003c/strong\u003e. \u003cem\u003eFrontiers in Plant Science\u003c/em\u003e 2020, \u003cstrong\u003e11\u003c/strong\u003e: 1143.\u003c/li\u003e\n\u003cli\u003eXia C, Chen LL, Rong TZ, Li R, Xiang Y, Wang P, Liu CH, Dong XQ, Liu B, Zhao D, Wei RJ, Lan H: \u003cstrong\u003eIdentification of a new maize inflorescence meristem mutant and association analysis using SLAF-seq method\u003c/strong\u003e. \u003cem\u003eEuphytica\u003c/em\u003e 2015, \u003cstrong\u003e202\u003c/strong\u003e(1): 35-44.\u003c/li\u003e\n\u003cli\u003eDavey JW, Cezard T, Fuentes-Utrilla P, Eland C, Gharbi K, Blaxter ML: \u003cstrong\u003eSpecial features of RAD Sequencing data: implications for genotyping\u003c/strong\u003e. 
\u003cem\u003eMolecular Ecology\u003c/em\u003e 2013, \u003cstrong\u003e22\u003c/strong\u003e(11): 3151-3164.\u003c/li\u003e\n\u003cli\u003eKozich JJ, Westcott SL, Baxter NT, Highlander SK, Schloss PD: \u003cstrong\u003eDevelopment of a dual-index sequencing strategy and curation pipeline for analyzing amplicon sequence data on the MiSeq Illumina sequencing platform\u003c/strong\u003e. \u003cem\u003eApplied and Environmental Microbiology\u003c/em\u003e 2013, \u003cstrong\u003e79\u003c/strong\u003e(17): 5112-5120.\u003c/li\u003e\n\u003cli\u003eLi H, Durbin R: \u003cstrong\u003eFast and accurate short read alignment with Burrows-Wheeler transform\u003c/strong\u003e. \u003cem\u003eBioinformatics\u003c/em\u003e 2009, \u003cstrong\u003e25\u003c/strong\u003e(14): 1754-1760.\u003c/li\u003e\n\u003cli\u003eMcKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, DePristo MA: \u003cstrong\u003eThe Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data\u003c/strong\u003e. \u003cem\u003eGenome Research\u003c/em\u003e 2010, \u003cstrong\u003e20\u003c/strong\u003e(9): 1297-1303.\u003c/li\u003e\n\u003cli\u003eLi H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R: \u003cstrong\u003eThe sequence alignment/map format and SAMtools\u003c/strong\u003e. 
\u003cem\u003eBioinformatics\u003c/em\u003e 2009, \u003cstrong\u003e25\u003c/strong\u003e(16): 2078-2079.\u003c/li\u003e\n\u003cli\u003eKumar S, Stecher G, Li M, Knyaz C, Tamura K: \u003cstrong\u003eMEGA X: molecular evolutionary genetics analysis across computing platforms\u003c/strong\u003e. \u003cem\u003eMolecular Biology and Evolution\u003c/em\u003e 2018, \u003cstrong\u003e35\u003c/strong\u003e(6): 1547-1549.\u003c/li\u003e\n\u003cli\u003eAlexander DH, Novembre J, Lange K: \u003cstrong\u003eFast model-based estimation of ancestry in unrelated individuals\u003c/strong\u003e. \u003cem\u003eGenome Research\u003c/em\u003e 2009, \u003cstrong\u003e19\u003c/strong\u003e(9): 1655-1664.\u003c/li\u003e\n\u003cli\u003ePritchard JK, Stephens M, Donnelly P: \u003cstrong\u003eInference of population structure using multilocus genotype data\u003c/strong\u003e. \u003cem\u003eGenetics\u003c/em\u003e 2000, \u003cstrong\u003e155\u003c/strong\u003e(2): 945-959.\u003c/li\u003e\n\u003cli\u003ePrice AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D: \u003cstrong\u003ePrincipal components analysis corrects for stratification in genomewide association studies\u003c/strong\u003e. \u003cem\u003eNature Genetics\u003c/em\u003e 2006, \u003cstrong\u003e38\u003c/strong\u003e(8): 904-909.\u003c/li\u003e\n\u003cli\u003eDeepika C, Venkatachalam SR, Yuvaraja A, Arutchenthil P, Indra N, Ravichandran V, Veeramani P, Kathirvelan P: \u003cstrong\u003eComparison of genetic diversity using morphometric, molecular and digital imaging parameters in castor (\u003cem\u003eRicinus communis\u003c/em\u003e L.)\u003c/strong\u003e. \u003cem\u003eBiocatalysis and Agricultural Biotechnology\u003c/em\u003e 2025, \u003cstrong\u003e65\u003c/strong\u003e: 103525.\u003c/li\u003e\n\u003cli\u003eHu SJ, Wang MZ, Yan XH, Cheng XM: 
\u003cstrong\u003eGenetic diversity and population structure of endangered orchid \u003cem\u003eCypripedium flavum\u003c/em\u003e in fragmented habitat using fluorescent AFLP markers\u003c/strong\u003e. \u003cem\u003ePlants\u003c/em\u003e 2024, \u003cstrong\u003e13\u003c/strong\u003e(20): 2851.\u003c/li\u003e\n\u003cli\u003eSinjare DYK, Abdulrahman SS, Khalid NS, Ismail RY, Naeem MY, Selamoglu Z, Issayev G, Ahmed AMN: \u003cstrong\u003eMolecular characterization and cuticular stomatal anatomy of \u003cem\u003ePunica granatum\u003c/em\u003e L. cultivars study in Dohuk governorate\u003c/strong\u003e. \u003cem\u003eMolecular Biology Reports\u003c/em\u003e 2025, \u003cstrong\u003e52\u003c/strong\u003e(1): 287-287.\u003c/li\u003e\n\u003cli\u003eSathishkumar R, Mohanrao MD, Geethanjali S, Prasad MSL, Senthilvel S: \u003cstrong\u003eA simple and cost-effective SNP genotyping assay for marker-assisted selection of wilt resistance in castor breeding\u003c/strong\u003e. \u003cem\u003eIndustrial Crops and Products\u003c/em\u003e 2025, \u003cstrong\u003e226\u003c/strong\u003e: 120693.\u003c/li\u003e\n\u003cli\u003eMuturi P, Kyallo M, Gasura E, Yao N: 
\u003cstrong\u003eDiversity analysis and genome-wide association studies of seed weight trait in Bambara groundnut (\u003cem\u003eVigna subterranea\u003c/em\u003e (L.) Verdc.) using diversity array technology sequence derived single nucleotide polymorphism markers\u003c/strong\u003e. \u003cem\u003eEuphytica\u003c/em\u003e 2025, \u003cstrong\u003e221\u003c/strong\u003e(4): 34-34.\u003c/li\u003e\n\u003cli\u003eDeng WQ, Li YF, Chen X, Luo YZ, Pan YZ, Li X, Zhu ZS, Li FW, Liu XL, Jia Y: \u003cstrong\u003eDevelopment of single nucleotide polymorphism (SNP) markers and construction of DNA fingerprinting of \u003cem\u003eAlcea rosea\u003c/em\u003e L. based on specific-locus amplified fragment sequencing (SLAF-seq) technology\u003c/strong\u003e. \u003cem\u003eGenetic Resources and Crop Evolution\u003c/em\u003e 2024, \u003cstrong\u003e72\u003c/strong\u003e(2): 1-15.\u003c/li\u003e\n\u003cli\u003eXia AN, Yang AA, Meng XS, Dong GZ, Tang XJ, Lei SM, Liu YG: \u003cstrong\u003eDevelopment and application of rose (\u003cem\u003eRosa chinensis\u003c/em\u003e Jacq.) SNP markers based on SLAF-seq technology\u003c/strong\u003e. \u003cem\u003eGenetic Resources and Crop Evolution\u003c/em\u003e 2022, \u003cstrong\u003e69\u003c/strong\u003e: 173-182.\u003c/li\u003e\n\u003cli\u003eWen TT, Zhang XF, Zhu JJ, Zhang SS, Rhaman MS, Zeng W: \u003cstrong\u003eA SLAF-based high-density genetic map construction and genetic architecture of thermotolerant traits in maize (\u003cem\u003eZea mays\u003c/em\u003e L.)\u003c/strong\u003e. \u003cem\u003eFrontiers in Plant Science\u003c/em\u003e 2024, \u003cstrong\u003e15\u003c/strong\u003e: 1338086.\u003c/li\u003e\n\u003cli\u003eLiu X, Zhang N, Sun YR, Fu ZX, Han YH, Yang Y, Jia JC, Hou SY, Zhang BJ: \u003cstrong\u003eQTL mapping of downy mildew resistance in foxtail millet by SLAF-seq and BSR-seq analysis\u003c/strong\u003e. 
\u003cem\u003eTheoretical and Applied Genetics\u003c/em\u003e 2024, \u003cstrong\u003e137\u003c/strong\u003e(7): 168.\u003c/li\u003e\n\u003cli\u003eChen ZY, He YC, Iqbal Y, Shi YL, Huang HM, Yi ZL: \u003cstrong\u003eInvestigation of genetic relationships within three Miscanthus species using SNP markers identified with SLAF-seq\u003c/strong\u003e. \u003cem\u003eBMC Genomics\u003c/em\u003e 2022, \u003cstrong\u003e23\u003c/strong\u003e(1): 43.\u003c/li\u003e\n\u003cli\u003eWei QZ, Wang WH, Hu TH, Hu HJ, Wang JL, Bao CL: \u003cstrong\u003eConstruction of a SNP-based genetic map using SLAF-seq and QTL analysis of morphological traits in eggplant\u003c/strong\u003e. \u003cem\u003eFrontiers in Genetics\u003c/em\u003e 2020, \u003cstrong\u003e11\u003c/strong\u003e: 178.\u003c/li\u003e\n\u003cli\u003eSouza IGB, Souza VAB, Lima PSC: \u003cstrong\u003eMolecular characterization of \u003cem\u003ePlatonia insignis\u003c/em\u003e Mart. (\u0026quot;\u003cem\u003ebacurizeiro\u003c/em\u003e\u0026quot;) using inter simple sequence repeat (ISSR) markers\u003c/strong\u003e. \u003cem\u003eMolecular Biology Reports\u003c/em\u003e 2013, \u003cstrong\u003e40\u003c/strong\u003e(5): 3835-3845.\u003c/li\u003e\n\u003cli\u003eDussault FM, Boulding EG: \u003cstrong\u003eEffect of minor allele frequency on the number of single nucleotide polymorphisms needed for accurate parentage assignment: A methodology illustrated using Atlantic salmon\u003c/strong\u003e. \u003cem\u003eAquaculture Research\u003c/em\u003e 2018, \u003cstrong\u003e49\u003c/strong\u003e: 1368-1372.\u003c/li\u003e\n\u003cli\u003ePauls SU, Nowak C, B\u0026aacute;lint M, Pfenninger M: \u003cstrong\u003eThe impact of global climate change on genetic diversity within populations and species\u003c/strong\u003e. 
\u003cem\u003eMolecular Ecology\u003c/em\u003e 2013, \u003cstrong\u003e22\u003c/strong\u003e(4): 925-946.\u003c/li\u003e\n\u003cli\u003eWei SP, Liu XF, Yang SX, Lv HY, Niu Y, Zhang YM: \u003cstrong\u003eComparison of various clustering methods for population structure in Chinese cultivated soybean [\u003cem\u003eGlycine max\u003c/em\u003e (L.) Merr.]\u003c/strong\u003e. \u003cem\u003eJournal of Nanjing Agricultural University\u003c/em\u003e 2011, \u003cstrong\u003e34\u003c/strong\u003e(2): 13-17.\u003c/li\u003e\n\u003cli\u003eChipeta MM, Kafwambira J, Yohane E: \u003cstrong\u003eCowpea genetic diversity, population structure and genome-wide association studies in Malawi: insights for breeding programs\u003c/strong\u003e. \u003cem\u003eFrontiers in Plant Science\u003c/em\u003e 2025, \u003cstrong\u003e15\u003c/strong\u003e: 1461631.\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"bmc-genomics","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"gics","sideBox":"Learn more about [BMC Genomics](http://bmcgenomics.biomedcentral.com/)","snPcode":"","submissionUrl":"https://www.editorialmanager.com/gics","title":"BMC Genomics","twitterHandle":"#BMCGenomics","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"em","reportingPortfolio":"BMC Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"Wheat, SLAF-seq, SNP, Genetic diversity, Fingerprint 
map","lastPublishedDoi":"10.21203/rs.3.rs-8808792/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-8808792/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eMolecular markers are indispensable tools for identifying genetic variation among plant individuals and enhancing breeding efficiency. In this study, we developed SNP markers, assessed genetic diversity, and established fingerprint maps for 306 wheat germplasm accessions from China using SLAF-seq technology. We obtained 4978.16 Mb of clean reads after quality control of individual sample sequencing data. The number of SNP markers detected per sample ranged from 7.03 to 35.92\u0026nbsp;million. A total of 554,315 SLAF tags were identified, including 356,643 polymorphic tags. After population-level SNP filtering, 52,228 highly consistent and effective SNP markers were retained. Genetic diversity analysis revealed relatively close genetic relationships among the wheat varieties, with an average observed heterozygosity of 0.090 and a mean polymorphism information content (PIC) of 0.251. Population structure analysis (\u003cem\u003eK\u003c/em\u003e\u0026thinsp;=\u0026thinsp;4) indicated that most accessions shared close ancestral relationships, with evidence of admixture. Cluster analysis grouped the 306 wheat germplasm resources into four distinct clusters. Further filtering identified 114 core SNP markers, enabling the successful construction of a fingerprint database encompassing all 306 accessions. This study demonstrates that SLAF-seq is a cost-effective and efficient method for high-throughput SNP marker development and a powerful tool for wheat germplasm genetic analysis. 
The SNP markers identified here can facilitate germplasm identification, varietal improvement, protection, utilization, and QTL mapping of important traits with yield and quality, significantly advancing molecular breeding efforts in wheat.\u003c/p\u003e","manuscriptTitle":"SLAF-seq Efficiently Identifies SNP Markers for Wheat (Triticum aestivum L.) Development","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2026-02-27 14:46:44","doi":"10.21203/rs.3.rs-8808792/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"Revision requested","date":"2026-03-06T05:02:10+00:00","index":"","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2026-03-05T02:48:18+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2026-02-25T10:15:34+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"77162041378850240628295733505925846164","date":"2026-02-25T08:55:46+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"120327562041769857487031978755690258656","date":"2026-02-25T00:46:52+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2026-02-24T12:07:21+00:00","index":"","fulltext":""},{"type":"editorInvited","content":"","date":"2026-02-10T09:11:30+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2026-02-09T23:04:39+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2026-02-09T23:02:46+00:00","index":"","fulltext":""},{"type":"submitted","content":"BMC Genomics","date":"2026-02-06T15:10:01+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"bmc-genomics","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"gics","sideBox":"Learn more about [BMC 
Genomics](http://bmcgenomics.biomedcentral.com/)","snPcode":"","submissionUrl":"https://www.editorialmanager.com/gics","title":"BMC Genomics","twitterHandle":"#BMCGenomics","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"em","reportingPortfolio":"BMC Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"a0672986-391b-40fb-b082-23dd71c07963","owner":[],"postedDate":"February 27th, 2026","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"under-review","subjectAreas":[],"tags":[],"updatedAt":"2026-05-04T05:55:36+00:00","versionOfRecord":[],"versionCreatedAt":"2026-02-27 14:46:44","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-8808792","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-8808792","identity":"rs-8808792","version":["v1"]},"buildId":"XKTyCvWXoU3ODBz1xrDgd","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}

License: CC-BY-4.0