Multi-tissue Transcriptomic and Pan-genomic Analyses Reveal Reciprocal Selective Retention Driving the Phenotypic Trade-off in Gossypium hirsutum and Gossypium barbadense
Research Article | Research Square
Yifan Xu, Zaoyang Gong, Qinglin Shen, Rongzheng Zhao, Chuankang Cheng, and 13 more
This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-8754476/v1 This work is licensed under a CC BY 4.0 License. Status: Under Review. Version 1 posted 13
Abstract The long-standing phenotypic trade-off between Upland cotton (Gossypium hirsutum, high yield) and Sea Island cotton (G. barbadense, high quality) represents a major bottleneck in cotton genetic improvement and domestication.
Despite rapid advances in genomics, the genomic structural variations driving the divergence of these two domestication strategies, and their subsequent impact on downstream transcriptional networks, remain unclear. This study integrates multi-tissue transcriptomic atlases spanning the full growth period with pan-genomic analyses of both Upland and Sea Island cotton, proposing a "Reciprocal Selective Retention (RSR)" strategy adopted by the two species during evolution. Transcriptomic analysis revealed that while the transcriptional chassis of the two species is highly conserved, a drastic "phase inversion" occurs during fiber development. Sea Island cotton appears to trade off for quality by specifically activating a "delayed elongation" module (at 20 DPA), whereas Upland cotton initiates a "precocious filling" program to pursue yield. Further Weighted Gene Co-expression Network Analysis (WGCNA), combined with genomic Presence/Absence Variation (PAV) analysis, identified a set of key transcription factors exhibiting a "3-vs-3" reciprocal loss pattern. Upland cotton specifically retained CRF10 , WIND1 , and MYB93 , constructing a genetic foundation of "robust root system and strong regeneration" to support high yield. Conversely, Sea Island cotton specifically retained MYB111 and ERF105/017 to maintain long-staple characteristics and environmental buffering. Whole-genome structural variation and microsynteny analyses confirmed that these differences stem from asymmetric physical deletions (e.g., a 5 kb deletion in MYB111 ) or fragment sequence collapse (e.g., a 15.9 kb sequence divergence in MYB93 ) on homologous chromosomes. The RSR model proposed in this study not only offers novel insights into the genetic architecture of the yield-quality trade-off but also provides precise genomic targets for molecular design breeding. 
Keywords: Cotton; Pan-genome; Multi-tissue comparative transcriptomics; Reciprocal Selective Retention (RSR); Structural variation; Phenotypic trade-off
1. Introduction Polyploidization and domestication are two major forces driving crop evolution, often accompanied by profound genomic reshaping and phenotypic trade-offs. As a classic allotetraploid model, cultivated cotton (Gossypium spp.) presents a fascinating case of "divergent domestication" (Wen et al. 2023). The two dominant cultivated species, Upland cotton (Gossypium hirsutum) and Sea Island cotton (G. barbadense), have evolved distinct adaptive strategies under artificial selection: the former was selected for broad environmental adaptability and high yield potential (typified by standard line TM-1), whereas the latter was selected for superior fiber quality at the cost of yield and robustness (typified by H7124) (Avci et al. 2013; Alagarsamy 2023). Unraveling the genomic architecture underlying this "yield-quality" trade-off is not only crucial for cotton genetic improvement but also provides fundamental insights into how homologous genomes differentiate functionally during polyploid evolution (Liu et al. 2024). With the advent of the "Pan-genomic Era," cotton genomics has witnessed milestone breakthroughs. Following the release of initial draft genomes (Li et al. 2015; Zhang et al. 2015), high-quality reference assemblies (Yuan et al. 2015; Hu et al. 2019) and recent "Telomere-to-Telomere" (T2T) assemblies (Hu et al. 2025a; Hu et al. 2025b) have provided unprecedented resolution for dissecting complex allotetraploid genomes. Despite this abundance of genomic data, the precise molecular mechanisms driving the phenotypic divergence between the two species remain incompletely understood.
Traditional Genome-Wide Association Studies (GWAS) based on Single Nucleotide Polymorphisms (SNPs) often face the "missing heritability" challenge when explaining inter-specific phenotypic divergence (Manolio et al. 2009 ). Increasing evidence suggests that large-scale Structural Variations (SVs), such as Presence/Absence Variations (PAVs), act as major drivers of adaptive evolution in polyploid crops by physically reshaping regulatory landscapes (Pinglay et al. 2025 ). A critical knowledge gap remains: how do these "hard" genomic variations (SVs) structurally remodel downstream "soft" transcriptional networks to establish the canalized trait barriers between Upland and Sea Island cotton (Wendel 2000 )? Integrating multi-tissue transcriptomics with pan-genomics offers a powerful strategy to bridge this gap. While comparative transcriptomics has been applied to understand tissue-specific regulation in mammals (Yao et al. 2022 ), such systematic analyses remain limited in polyploid crops, often confined to single tissues like fibers (Chen et al. 2012 ) or leaves (Cheng et al. 2025 ). Unlike diploid models, the domestication of allotetraploid cotton involves complex subgenome interactions and the differentiation of homologous chromosomes. Therefore, constructing a spatiotemporal transcriptomic atlas across the full growth cycle, combined with pan-genomic SV analysis, is essential for revealing how "genomic hardware" drives the remodeling of "transcriptomic software." In the complex transcriptional regulatory networks of plants, transcription factors serve as key bridges connecting genomic variation to phenotypic remodeling. Among them, the MYB transcription factor superfamily occupies an irreplaceable core position in cotton fiber development due to its vast membership and high functional diversity (Dubos et al., 2010; Zhang et al., 2025 a). 
However, cotton domestication involves not only the improvement of single fiber traits but also a fundamental shift in adaptive strategies at the whole-plant level - most notably, the "broad adaptability" of Upland cotton versus the "delicate" nature of Sea Island cotton. This strongly suggests that, in addition to the MYB family, other key factors regulating cell fate determination and environmental response - such as the AP2/ERF family - may play indispensable synergistic roles in this process. AP2/ERF family members (e.g., WIND1 , ERF ) act not only as primary responders to hormone signals like ethylene but also as "master switches" regulating cell dedifferentiation and regeneration (Iwase et al., 2011). Based on multi-tissue transcriptomic and pan-genomic analyses, this study reveals the molecular mechanisms driving the phenotypic divergence between Upland and Sea Island cotton. We discovered that: (1) although the global transcriptional chassis remains highly conserved, significant divergence occurs in specific tissues; (2) a distinct "phase inversion" of transcriptional programs takes place during critical stages of fiber development; (3) guided by WGCNA, we pinpointed a key regulatory module driving the "delayed elongation" trait and identified its core regulators; and (4) genomic analyses confirmed that these losses stem from asymmetric structural variations targeting genes that were originally intact in the ancestral gene pool, with pan-genomic profiling further verifying the universality of this "retention vs. loss" pattern across broad cultivated germplasm resources. Based on these findings, we propose a "Reciprocal Selective Retention (RSR)" evolutionary model in cultivated cottons, which suggests that the two species shaped their distinct adaptive strategies by asymmetrically retaining or eliminating specific functional genetic modules from the ancestral gene pool. 
Based on this model, we dissect the molecular logic underlying the trade-off between "high yield" and "high quality" at the level of genomic structure, providing a theoretical basis and critical genomic targets for breeding novel cotton varieties that combine both traits.
2. Materials and Methods
2.1 Plant Materials and Data Acquisition In this study, we utilized the Telomere-to-Telomere (T2T) reference genome of Upland cotton (G. hirsutum TM-1) published by Yan et al. (2025), and the high-quality reference genome of Sea Island cotton (G. barbadense H7124) sequenced by Hu et al. (2019). The genomic data for G. barbadense were downloaded from the CottonMD database (Yang et al. 2023). Additionally, for pan-genomic evolutionary analysis, we collected chromosome-level genome assemblies of 26 Gossypium species, including wild diploids and tetraploids, from CottonGen (Yu et al. 2021) and the NCBI database (Supplementary Table S1).
2.2 Transcriptome Data Acquisition and Preprocessing The gene expression data used in this study were retrieved from the CottonMD database (Yang et al. 2023). The raw high-throughput sequencing data originated from the genome sequencing project of Upland cotton (TM-1) and Sea Island cotton (H7124) by Hu et al. (2019) (BioProject: PRJNA490626). We downloaded the normalized gene expression matrix (TPM values) and selected sample data covering 21 representative tissues/stages, including roots, stems, leaves, floral organs, and ovules and fibers at various developmental stages. To enhance analysis reliability and reduce background noise, genes with extremely low expression levels (TPM < 0.1) across all samples were filtered out prior to downstream analysis.
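The low-expression filter described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' script: the data layout (a dict mapping gene ID to a list of per-sample TPM values), the function names, and the example gene IDs are all hypothetical. The log2(TPM + 1) transform is included because Section 2.3 uses it for PCA.

```python
import math

def filter_low_expression(tpm, threshold=0.1):
    """Keep genes that reach `threshold` TPM in at least one sample.

    Equivalent to removing genes with TPM < threshold across all samples.
    `tpm` maps a gene ID to a list of per-sample TPM values (assumed layout).
    """
    return {gene: values for gene, values in tpm.items()
            if any(v >= threshold for v in values)}

def log2_transform(tpm):
    """Apply log2(TPM + 1) to every value, as used for PCA in Section 2.3."""
    return {gene: [math.log2(v + 1.0) for v in values]
            for gene, values in tpm.items()}

# Hypothetical toy matrix: three genes, three samples.
toy_tpm = {
    "geneA": [0.0, 0.05, 0.02],   # below 0.1 everywhere, so removed
    "geneB": [0.0, 3.0, 0.0],     # expressed in one sample, so kept
    "geneC": [1.0, 7.0, 15.0],    # expressed everywhere, so kept
}
filtered = filter_low_expression(toy_tpm)
logged = log2_transform(filtered)
```

Filtering on "all samples below threshold" rather than on a mean keeps genes that are silent in most tissues but active in one, which matters for the tissue-specificity analyses that follow.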
2.3 Gene Expression Characterization and Comparative Transcriptomic Analysis To evaluate global expression patterns across samples, Principal Component Analysis (PCA) was visualized using the ggplot2 package (Wickham 2016 ) in R (R Core Team 2025 ), based on a log2(TPM + 1) transformed expression matrix. Additionally, t-SNE (Van der Maaten and Hinton 2008 ) and UMAP (McInnes et al. 2018 ) dimensionality reduction analyses were performed using the Rtsne and umap packages, respectively. To quantify transcriptional conservation between species, expressed genes were filtered with a threshold of TPM ≥ 0.1, and the Pearson correlation coefficient ( R ) of the number of expressed genes between corresponding tissues of TM-1 and H7124 was calculated using the “cor.test” function in R. Furthermore, to dissect divergence in expression abundance, expressed genes in each tissue were ranked by TPM values and categorized into five expression windows (Very Low, Low, Medium, High, Very High). The number of "shared genes" falling into the same expression window in both species for a given tissue was counted using custom Python scripts. Finally, to eliminate the bias of total gene number differences, we calculated the relative proportions of common genes versus species-specific genes in each tissue. For tissue specificity analysis, the Tissue Specificity Index (Tau) was calculated following the method of Yanai et al. ( 2005 ). A Tau value closer to 1 indicates stronger tissue specificity. Specific genes were defined as those with a Specificity Measure (SPM) > 0.5 (Kryuchkova-Mostacci and Robinson-Rechavi, 2017 ), and UpSet plots were generated using the R package UpSetR (Conway et al. 2017 ) to visualize the intersections of specific genes across different tissues. 2.4 Weighted Gene Co-expression Network Construction and Unbiased Functional Annotation To mine key co-expression gene modules driving the delayed fiber elongation in Sea Island cotton (Avci et al. 
2013 ), we constructed a weighted gene co-expression network using the WGCNA package in R (Langfelder and Horvath 2008 ). Considering computational efficiency and noise reduction, the top 6,000 genes with the highest variance in fiber development samples were selected as input data. First, a Pearson correlation matrix between genes was calculated, and a scale-free network was constructed based on a soft threshold. Subsequently, the adjacency matrix was converted into a Topological Overlap Matrix (TOM), and co-expression modules were identified using the Dynamic Tree Cut algorithm. Module expression patterns were visualized via Module Eigengenes (ME) to identify target modules specifically highly expressed in 20 DPA fibers of G. barbadense. For unbiased functional annotation of the target module, protein sequences of all genes within the module were extracted and scanned against the Pfam-A database (Mistry et al. 2021 ) using the hmmscan program in HMMER 3.0 software (Eddy 2011 ). The filtering criterion was set to an e-value < 1e-5. Based on the scan results, the types of domains contained in each gene were counted, and gene families were ranked by domain frequency to determine the most enriched categories of transcriptional regulators in the module. 2.5 Genome-wide Identification of Key Transcription Factor Families (MYB and AP2/ERF) To construct a high-coverage transcription factor family atlas and mine potential unannotated loci, we performed genome-wide identification of the MYB and AP2/ERF families in the TM-1 and H7124 genomes using the Bitacora pipeline (V1.4.2) (Vizueta et al. 2020 ). We employed the software's "Full Mode," integrating genomic sequences, genome annotation files, and protein sequences for comprehensive analysis. 
The specific workflow was as follows: Query libraries containing Hidden Markov Models (HMM) for MYB (Pfam: PF00249) and AP2 (Pfam: PF00847) domains were constructed, with Pfam seed files downloaded from the Pfam database (Mistry et al. 2021). HMMER (Eddy 2011) was first used to scan the input reference proteomes to identify annotated family members. For Homology-based Gene Prediction, the integrated GeMoMa software (Keilwagen et al. 2016) was used to deeply mine loci potentially missed in the reference annotation. This step used homologous proteins identified in the first step as templates to predict gene structures directly from genomic DNA sequences, thereby recovering "Putative New Genes" that were missed or incompletely annotated. Subsequently, Bitacora automatically merged results from "proteome search" and "genome prediction" and removed redundancy based on coordinates. The final candidate sequences were further verified for domain integrity using the Pfam database (Mistry et al. 2021); sequences that failed the e-value < 1e-5 threshold or lacked key domains were discarded to ensure high confidence for downstream analysis.
2.6 Screening and Reciprocal Validation of Species-Specific Transcription Factors To pinpoint key candidate factors undergoing "Reciprocal Selective Retention" from vast gene families, we established a rigorous "PAV Intersection-Reciprocal Validation" screening workflow. Initial Screening: First, genome-wide Presence/Absence Variation (PAV) lists generated by comparative genomic analysis using MUMmer4 (Marçais et al. 2018) and Minimap2 (Li 2018) were intersected with the identified MYB and AP2/ERF family members. This preliminarily screened for candidate transcription factors present in H7124 but absent in TM-1 (H7124-specific), or present in TM-1 but absent in H7124 (TM-1-specific).
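The initial screening step amounts to an interval intersection between PAV regions and gene coordinates. The sketch below shows one way this could be done. It is an assumption-laden illustration: the source does not state the overlap criterion or file formats, so "any overlap" is assumed, coordinates are treated as 1-based inclusive, and all gene IDs and positions are hypothetical.

```python
from collections import defaultdict

def genes_in_pav(genes, pav_regions):
    """Return IDs of genes that overlap at least one PAV interval.

    genes: iterable of (gene_id, chrom, start, end), 1-based inclusive
    pav_regions: iterable of (chrom, start, end), 1-based inclusive
    Overlap rule (assumed): any shared base counts as a hit.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in pav_regions:
        by_chrom[chrom].append((start, end))

    hits = []
    for gene_id, chrom, g_start, g_end in genes:
        regions = by_chrom.get(chrom, ())
        # Two closed intervals overlap when each starts before the other ends.
        if any(g_start <= p_end and p_start <= g_end
               for p_start, p_end in regions):
            hits.append(gene_id)
    return hits

# Hypothetical family members and PAV regions (illustrative values only).
family_genes = [
    ("tf_1", "A12", 1000, 2500),
    ("tf_2", "A12", 9000, 9800),
    ("tf_3", "D02", 1000, 2500),
]
pav = [("A12", 2400, 7400), ("A02", 1, 200000)]
candidates = genes_in_pav(family_genes, pav)
```

A stricter rule, such as requiring that most of the coding sequence falls inside the PAV, would reduce false positives at this stage, although the reciprocal BLAST step that follows is designed to catch them anyway.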
Reciprocal BLAST Validation: To exclude false positives caused by genome assembly errors, missing annotations, or pseudogenization, we implemented strict sequence-level validation for all candidate genes. The complete CDS sequences of candidate genes were extracted as queries and reciprocally aligned to the counterpart reference genome using BLASTN (Camacho et al. 2009 ) (e.g., aligning H7124-specific candidates back to the TM-1 genome, and vice versa). Retention Criteria: We retained only those genes for which no valid homologous match was detected in the counterpart genome, or where detected matches exhibited severe sequence truncation (Coverage < 60%) and high sequence divergence (Identity < 85%). 2.7 Genomic Structural Variation and Microsynteny Analysis To elucidate the physical mechanisms leading to the loss of the aforementioned key transcription factors (e.g., physical deletion/NOTAL or sequence collapse/HDR), we combined whole-genome alignment with local microsynteny analysis. Whole-Genome Structural Variation (SV) Identification: The SyRI (Synteny and Rearrangement Identifier) tool (Goel et al. 2019 ) was used to systematically identify syntenic regions, inversions, translocations, and local variations (including NOTAL and HDR) based on NUCmer (Marçais et al. 2018 ) alignment results of TM-1 and H7124 (parameters: --maxmatch -l 100 -c 1000). The Python tool plotsr (Goel and Schneeberger 2022 ) was used to visualize the SV status of candidate gene loci. Targeted Microsynteny Mapping: To reconstruct the evolutionary history of the six candidate loci, we selected eight representative Gossypium genomes (Supplementary Table S1 ) to construct microsynteny maps. Centered on the target gene (e.g., MYB111 or WIND1), 2–5 single-copy, highly conserved genes upstream and downstream were selected as physical anchors. 
Genomic sequences between anchors were extracted, and the presence form (intact, truncated, or completely lost) of the target gene in different species was determined using BLASTN (Camacho et al. 2009 ). The Python tool plotsr (Goel and Schneeberger 2022 ) was then used to plot microsynteny structures to illustrate the specific types of gene loss. 2.8 Pan-genome Homology Search and Evolutionary Trajectory Analysis To verify the universality of the RSR pattern in Gossypium germplasm resources, we extended the analysis to 26 Gossypium pan-genomes. Using the CDS sequences of the six candidate genes as probes, homology searches were performed against all pan-genomes using the BLAST+ package (Camacho et al. 2009 ) (threshold: e-value < 1e-5). A custom Python script was used to extract the best hit locus in each genome and calculate its Identity and Coverage relative to the query sequence. Simultaneously, chromosomal location information (A-subgenome or D-subgenome) of the best hit was recorded to distinguish between orthologs and partially retained homeologs. The final results were visualized as bubble plots using the Python tool plotsr (Goel and Schneeberger 2022 ) to display the lineage-specific retention and loss trajectories of these key factors during the domestication from wild species to cultivars. 3. Results 3.1 Global Transcriptome Landscape Reveals Evolutionary Conservatism and Tissue-Specific Divergence Between Upland and Sea Island Cotton To systematically dissect the regulatory divergence driving the phenotypic trade-off between yield and quality, we constructed a comprehensive transcriptomic atlas covering 21 tissues throughout the full growth period of Upland cotton (G. hirsutum, TM-1) and Sea Island cotton (G. barbadense, H7124). Dimensionality reduction analysis (Fig. 1 A) showed that samples clustered tightly by tissue type rather than species. 
Vegetative tissues (roots, stems, leaves) and floral organs of both species clustered together, indicating that despite divergent domestication, the core transcriptional chassis maintaining basic physiological functions remains highly conserved. However, beneath this global conservation, we observed significant lineage-specific differentiation in transcriptional activity. Comparison of expressed gene numbers across homologous tissues showed moderate correlation (Pearson’s R = 0.63) but highlighted unique outliers closely related to agronomic traits (Fig. 1 B). Notably, TM-1 exhibited significantly more expressed genes in roots and anthers, reflecting its robust resource acquisition capability and reproductive potential. In contrast, H7124 maintained a more active transcriptome in developing fibers (especially at 20 DPA) and late-stage ovules, suggesting a more persistent regulatory program underlying its superior fiber quality. Furthermore, the distribution of the Tissue Specificity Index (Tau) revealed a shift in regulatory strategies (Fig. 1 C). While both species exhibited a bimodal distribution, TM-1 showed an expansion in the number of highly tissue-specific genes (Tau > 0.5). To verify this differentiation at the gene level, we further performed pairwise comparisons of Tau indices for orthologous genes (Supplementary Fig. 1). The analysis showed that despite extremely high evolutionary conservation genome-wide (R = 0.951), confirming that functional specialization of most genes has been "canalized," we still identified a series of "outlier genes" significantly deviating from the diagonal. These genes, which underwent drastic drift in expression breadth between species, combined with the expansion observed in Fig. 1 C, strongly suggest that the evolutionary driving force for the "high yield" trait of TM-1 stems from the directional specialization and remodeling of key functional modules in specific organs (especially underground roots and reproductive sinks). 
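For readers unfamiliar with the Tau index used in Fig. 1C, the definition from Yanai et al. (2005) is tau = sum_i (1 - x_i / x_max) / (n - 1), where x_i is the expression of a gene in tissue i and n is the number of tissues. A minimal sketch follows. The source does not say whether Tau was computed on raw or log-transformed TPM, so the function simply uses the values it is given; the function name and the choice to return None for unexpressed genes are assumptions.

```python
def tau_index(expression):
    """Tissue Specificity Index (Tau) for one gene.

    expression: non-negative expression values, one per tissue.
    Returns a float in [0, 1]; 0 means uniform expression, 1 means
    expression confined to a single tissue. Returns None when Tau is
    undefined (fewer than two tissues, or no expression at all).
    """
    n = len(expression)
    x_max = max(expression) if n else 0.0
    if n < 2 or x_max <= 0:
        return None
    return sum(1.0 - x / x_max for x in expression) / (n - 1)

# Toy profiles across four tissues (illustrative values only).
housekeeping = tau_index([10.0, 10.0, 10.0, 10.0])
tissue_specific = tau_index([0.0, 0.0, 0.0, 8.0])
intermediate = tau_index([4.0, 2.0, 0.0])
```

With 21 tissues, a gene expressed in a single tissue scores exactly 1, while the Tau > 0.5 cut-off quoted above captures genes whose expression is concentrated in a minority of tissues.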
3.2 Asymmetric Expansion of Tissue-Specific Gene Repertoires and Phase Inversion of Fiber Development Programs To dissect the fine-tuning mechanisms determining "yield-quality" divergence, we first systematically deconstructed the distribution patterns of tissue-specific gene repertoires (SPM > 0.5) in the two species using UpSet plots (Supplementary Figs. 2 & 3). The analysis revealed that Upland cotton and Sea Island cotton diverged in their investment directions for "functionally specialized genes" during domestication. Upland cotton (TM-1) exhibited a significant "underground-priority" strategy (Supplementary Fig. 3). Its root-specific gene number underwent an explosive expansion, constituting the most significant specialized module among all tissues, suggesting the evolution of a complex root network to support high yield potential. However, as an evolutionary trade-off, its number of specific genes significantly contracted during the middle to late stages of fiber development (10–25 DPA). In contrast, although Sea Island cotton (H7124) had fewer root genes, it maintained sustained high investment in reproductive sink organs (Supplementary Fig. 2); particularly in 20 DPA Fiber (elongation-thickening transition period) and 25 DPA Fiber, H7124 still retained an extremely vast specific gene repertoire, providing the genetic basis for the fine shaping of "extra-long staple" quality. This asymmetric distribution of gene repertoires—"TM-1 strong roots" vs. "H7124 strong fibers"—further manifested as a drastic "phase inversion" in the spatiotemporal dynamics of the transcriptome (Fig. 2 ): First, the overall distribution of gene expression (Fig. 2 b) and stratified sharing analysis (Fig. 2 c) reconfirmed the evolutionary conservation of vegetative growth. 
In vegetative organs such as roots, stems, and leaves, the two species not only shared highly consistent overall distribution shapes of gene expression abundance but also retained a high proportion of common genes in all expression windows (especially medium-to-high expression intervals) (green bars, approx. 30%-50%, Fig. 2 a), indicating that the transcriptional network maintaining basic biomass accumulation in cotton plants is stable across species. However, entering the critical stages of fiber and ovule development, the transcriptional program underwent fundamental remodeling, presenting a significant "species-biased phase inversion" (Fig. 2 a): H7124-dominated "Delayed Elongation" Phase (20 DPA): At the critical node of transition from fiber elongation to secondary wall thickening, Sea Island cotton-specific expressed genes (red bars) occupied absolute dominance (approx. 70%), while common genes and TM-1 specific genes were compressed to extremely low proportions. Combined with the wider high-expression interval of H7124 in this period in Fig. 2 b, this indicates that Sea Island cotton extends the developmental window of fiber elongation by specifically activating a vast transcriptional network (i.e., the gene repertoire observed in Supplementary Fig. 2), thereby achieving the high-quality phenotype. TM-1-dominated "Precocious Filling" Phase (25 DPA): Subsequently, during the high-speed secondary wall synthesis period, the trend reversed. The proportion of Upland cotton-specific genes (blue bars) significantly rebounded and took dominance. This suggests that Upland cotton initiates the secondary wall deposition program centered on biomass accumulation earlier, reflecting its breeding selection strategy for "early maturity and high yield (high lint percentage)". 
In summary, although the basic transcriptional chassis of cotton is conservative, Upland and Sea Island cotton likely achieved spatiotemporal separation of developmental programs by recruiting distinct specific gene modules: H7124 chose a delay strategy of "trading time for quality," while TM-1 chose a precocious strategy of "efficiency first". This phase inversion at the transcriptional level (Fig. 2 evidence), caused by the asymmetric expansion of gene repertoires (UpSet evidence), provides systematic molecular insights for understanding the evolutionary trade-off between "high yield" and "high quality". 3.3 WGCNA Reveals a "Delayed Elongation" Regulatory Module Centered on MYB-AP2 To deconstruct the "fiber development phase inversion" phenomenon observed in Fig. 2 and mine the core drivers maintaining high transcriptional activity in Sea Island cotton (H7124) at 20 DPA, we constructed a gene co-expression network using WGCNA. The analysis pinpointed Module 6 as the key target module: the expression profile of this module highly matches the "delayed elongation" phenotype of Sea Island cotton (Fig. 3 a), maintaining extremely high activity in H7124 fibers at 10–20 DPA, while rapidly declining in Upland cotton (TM-1) during the same period. This significant species-specific expression pattern may imply that Module 6 is the transcriptional basis for Sea Island cotton fibers breaking through the conventional elongation time limit. Statistics on the number of gene families contained in this module showed (Fig. 3 b) that it is enriched with a large number of Protein Kinase (Pkinase) and Leucine-Rich Repeat (LRR) domains, indicating that cells are receiving continuous growth signals to maintain an active metabolic state. More critically, at the transcriptional regulation level (Fig. 3 c), the MYB and AP2/ERF families showed the highest recruitment abundance. 
Among them, the MYB family (38 members) ranked first as recognized fiber development regulators, followed closely by the AP2/ERF family (37 members), which typically acts as ethylene response factors regulating cell elongation and stress adaptation. Considering the classic functions of the MYB family in secondary wall and fiber elongation, and the key role of AP2/ERF in coordinating hormone signals and environmental adaptation, the high co-enrichment of these two families in Module 6 implies that the Sea Island cotton-specific fiber elongation network may be orchestrated synergistically by these two types of key factors. To verify this hypothesis and trace the genetic roots leading to the expression divergence of Module 6 between species, we subsequently performed systematic identification of MYB and AP2 families genome-wide. Through intersection screening of gene family members with whole-genome PAV (Presence/Absence Variation) lists, supplemented by strict reciprocal BLAST validation (Supplementary Figs. 4 & 5), we finally locked down 6 key transcription factors that underwent "reciprocal loss" between the two cotton species from the genome. To parse the physical mechanisms of these loss events, we further performed microscopic synteny and structural variation analyses on them. 3.4 Reciprocal Genomic Structural Variation and Physical Loss Mechanisms of Key Transcription Factors To dissect the physical basis of expression divergence for the key transcription factors identified by WGCNA, PAV, and reciprocal BLAST, we employed a "Reciprocal Whole-genome Alignment" strategy. First, SyRI analysis constructed a macroscopic structural variation map using each genome as a reference (Supplementary Figs. 6 & 7). Chromosome-scale visualization clearly displayed a highly collinear background (gray links) between TM-1 and H7124 at the whole-genome level (A01-A13, D01-D13), confirming the high quality and overall stability of the genome assemblies for both species. 
Based on this robust macroscopic framework, we precisely extracted local alignment information for the six candidate gene loci from the whole-genome alignment results and plotted high-resolution microscopic structural variation maps (Fig. 4 ). This progressive "zoom-in" analysis from "whole-chromosome background" to "specific loci" revealed two lineage-specific loss mechanisms: First, regarding MYB111 (A12), which is specifically retained in Sea Island cotton and serves as the core factor driving "delayed elongation" as mentioned earlier, SyRI analysis revealed the root cause of its silencing in Upland cotton: a physical deletion (NOTAL) of 5.06 kb occurred at the corresponding chromosomal position in TM-1 (Fig. 4 b). This "clean" genomic excision directly removed the coding region of MYB111 , genetically terminating the potential for further fiber elongation in Upland cotton, thereby shifting the strategy toward early maturity and high yield. In a mirror contrast, WIND1 (A02), specifically retained in Upland cotton and a potential regulator of cell regeneration and root plasticity, underwent a catastrophic loss event in Sea Island cotton. The corresponding region in H7124 suffered a massive deletion of 201.90 kb (Fig. 4 a), resulting in the complete removal of the gene and its surrounding regulatory regions. This may explain the defects in root adaptability (i.e., the "delicate" nature) of Sea Island cotton. In addition to physical deletion, Sequence Collapse (HDR) represents another major loss mechanism. For instance, the Upland-specific MYB93 (A11) and CRF10 (A12) both fell into Highly Diverged Regions (HDR) at their corresponding loci in Sea Island cotton, indicating that the original gene sequences underwent drastic rearrangement or degeneration, leading to loss of function. In summary, Fig. 4 implies that this "double dissociation" phenomenon between the two cultivated species is not a sequencing error but likely a genuine genomic event. 
Upland cotton "discarded" quality genes like MYB111 in exchange for early maturity, while Sea Island cotton "discarded" adaptability genes like WIND1 in exchange for specialized development. This asymmetric structural variation on homologous chromosomes constitutes the solid genetic basis for the "yield-quality" phenotypic trade-off between the two. 3.5 Pan-genome Microsynteny Reveals Lineage-Specific Loss Trajectories of Key Factors To further confirm that the structural variations identified in Fig. 4 were driven by directional selection during domestication rather than random events, we expanded our scope to the evolutionary history of the Gossypium genus. By incorporating the genomes of diploid ancestors ( G. raimondii , G. arboreum ) and wild tetraploids ( G. tomentosum , G. mustelinum ), we constructed high-resolution microsynteny maps for the six key loci (Fig. 5 ). The analysis clearly revealed an evolutionary pattern of "ancestral omnipotence vs. descendant specialization," strongly supporting the RSR hypothesis. "Upland-style Elimination" of Quality Genes: Focusing on MYB111 (determining delayed fiber elongation) and ERF105/ERF017 (environmental buffering factors) (Fig. 5 right panel), we found that these genes maintained intact gene structures and collinear relationships in diploid ancestors and wild tetraploids. This implies that "high quality/long staple" was originally an ancient trait potential of the Gossypium genus. However, on the divergence branch of Upland cotton (TM-1, J668), these loci underwent precise physical excision. This suggests that Upland cotton may have "actively" discarded these high-energy-consuming quality regulatory genes during domestication to achieve early maturity and high yield. "Sea Island-style Elimination" of Adaptive Genes: Conversely, at the MYB93 (root architecture) and WIND1 (regeneration capacity) loci (Fig. 5 left panel), the Sea Island lineage exhibited specific loss. 
In particular, while WIND1 is intact in Upland cotton and all wild relatives, a significant genomic gap appears in H7124 and 3-79. This "functional deficiency", which occurs specifically in the Sea Island lineage, is likely the genetic root of its poor root adaptability and environmental sensitivity (i.e., being "delicate").

In summary, microsynteny evidence indicates that the phenotypic divergence between Upland and Sea Island cotton did not stem from the evolution of new genes. Rather, both species inherited a complete gene pool from a common ancestor and then underwent divergent "genomic subtraction" in opposite directions. This complementary trajectory of gene loss ultimately fixed their distinct agronomic traits.

3.6 Pan-genomic Evolutionary Footprints Confirm the Universality of Reciprocal Selective Retention (RSR)

To verify whether the RSR pattern revealed in Fig. 4 and Fig. 5 is universal across cotton germplasm resources, we extended the analysis to 26 Gossypium pan-genomes, covering diploid ancestors, wild tetraploids, and modern cultivars. Using the six key genes as probes, we constructed high-resolution phylogenomic footprints (Fig. 6). Pan-genomic alignment categorized these six genes into two distinct evolutionary trajectories, revealing that the "3-vs-3" reciprocal loss pattern is likely a fixed genetic divergence at the species level between Upland and Sea Island cotton.

"Alternative Detection" of Homeologs Reveals Subgenome-Specific Loss: In the "Upland-Retained Group" (Fig. 6a, e.g., WIND1), all Upland cotton accessions (TM-1, J668, etc.) showed high-identity matches on chromosome A02 (red bubbles, identity > 99%), indicating that this gene is core and conserved within the Upland cotton population. However, in all Sea Island cotton accessions (3-79, H7124, Giza7, etc.), BLAST detected only lower-identity matches on chromosome D02 (blue bubbles, identity ~98%).
This "chromosomal drift" from A02 to D02 is crucial: it implies that the original A-subgenome functional copy (WIND1) has been completely lost from the Sea Island cotton genome, forcing the probe to "settle for second best" and match the D-subgenome homeolog. This population-wide loss pattern indicates that the absence of WIND1 is a species characteristic of Sea Island cotton rather than a mutation in individual varieties.

Collective Silencing of "Quality Genes" in the Upland Population: Conversely, in the "Sea Island-Retained Group" (Fig. 6b, e.g., MYB111), the Sea Island cotton population consistently retained the intact gene on chromosome A12. In the Upland cotton population, however, this locus showed extremely low coverage (small bubbles) or homeologous drift. This further supports the view that Upland cotton, in pursuit of early maturity and high yield, "purged" this fiber elongation factor at the whole-population level during domestication.

In summary, pan-genomic evidence indicates that the RSR mechanism is not accidental genetic drift but a "core genetic imprint" fixed in the genomes of the two species after long-term domestication selection. This highly consistent "retention vs. loss" pattern between species provides a theoretical basis for improving Upland cotton using wild germplasm resources or the Sea Island cotton gene pool.

4. Discussion

4.1 Reciprocal Selective Retention (RSR): A New Genomic Evolutionary Perspective Breaking the "Yield-Quality" Negative Correlation

The negative correlation between "high yield" (typified by Upland cotton) and "high quality" (typified by Sea Island cotton) has long been regarded as an insurmountable bottleneck in cotton breeding, often attributed to linkage drag or physiological energy constraints (Yang et al., 2026). However, our integrated multi-omics analysis offers an alternative genomic explanation: the Reciprocal Selective Retention (RSR) evolutionary model.
Unlike the gradual accumulation of Single Nucleotide Polymorphisms (SNPs), RSR describes a radical "genomic subtraction" strategy, in which species eliminated genetic modules inconsistent with their survival strategies (Olson, 1999). Specifically, this model reveals the divergent evolutionary logics adopted by the two species:

(1) Upland Cotton's "High-Yield Robustness Module": To support its superior yield potential, Upland cotton specifically retained CRF10 (Cytokinin Response Factor, potentially regulating sink capacity and biomass) (Rashotte et al., 2006), WIND1 (cell regeneration factor) (Iwase et al., 2011), and MYB93 (root system architecture factor) (Gibbs et al., 2014). These genes collectively construct a genetic chassis integrating "strong acquisition (roots), strong sink capacity (CRF), and strong repair (WIND)".

(2) Sea Island Cotton's "Quality Specialization Module": As an evolutionary cost, Sea Island cotton lost the aforementioned broad-adaptability genes and instead specifically retained MYB111 (maintaining high flavonoid levels to fine-tune or delay the end of elongation, thereby achieving longer fibers) (Stracke et al., 2007; Tan et al., 2013) and ERF105/ERF017 (stress defense factors) (Bolt et al., 2017). We speculate that these ERF members may act as "environmental buffers" during the prolonged fiber development window, providing the physiological homeostasis needed for the precise synthesis of high-quality fibers.

This asymmetric "hardware" fixation on homologous chromosomes constitutes the key genomic structural basis for the phenotypic trade-off between the two species.

4.2 Transcriptome "Phase Inversion": A Developmental Trade-off of "Time" for "Quality"

Our comparative transcriptomic landscape reveals an intriguing paradox in cotton domestication: vegetative organs are highly conserved, while reproductive organs undergo drastic remodeling.
This phenomenon implies that the cotton genome is, on the one hand, subject to strong evolutionary constraints that maintain basic life activities (the transcriptional chassis) and, on the other hand, highly plastic in key traits such as fiber development. The most significant finding is the "Phase Inversion" of the transcriptional program during fiber development. The "delayed elongation" network specifically activated by Sea Island cotton at 20 DPA is essentially a strategy of "trading time for quality." Through the sustained expression of key factors such as MYB111, Sea Island cotton delays the initiation of secondary wall thickening and thereby gains a longer elongation window, which is consistent with previous physiological observations (Avci et al., 2013). In contrast, the transcriptional profile of Upland cotton reflects an "efficiency-first" strategy: the physical loss of MYB111 leads to premature termination of the elongation program, forcing cells to rapidly enter the secondary wall filling stage (the blue dominant peak at 25 DPA). Although this "precocious mode" limits fiber length, it significantly shortens the growth cycle and increases boll weight (biomass) (Haigler et al., 2012), fitting the demand for "high lint percentage and short accumulated temperature" in extensive cultivation.

4.3 Evolutionary Constraints and Adaptive Differentiation: Evolutionary Trade-off between "Resource Acquisition" and "Sink Development Timing"

Our comparative transcriptomic landscape and pan-genomic analysis further reveal the co-evolutionary logic of "resource acquisition (Source)" and "sink capacity construction (Sink)" during cotton domestication (White et al., 2016). The first element is the "underground cornerstone" for building efficient environmental adaptability. Our data indicate that TM-1 exhibits a significant expansion of specific genes in roots and specifically retains MYB93.
Previous studies have shown that the homolog of MYB93 in Arabidopsis acts as a negative regulator of lateral root development (Gibbs et al., 2014). The retention of this "suppressor" by Upland cotton may not be accidental but an optimization strategy for Root System Architecture (RSA): limiting ineffective proliferation of shallow lateral roots to promote deep rooting of the main root or to improve water and fertilizer use efficiency (Lynch, 1995). In contrast to the sensitivity of Sea Island cotton roots to specific environments, this root remodeling based on lineage-specific genomic retention provides a foundation for Upland cotton to establish a robust nutrient absorption network in diverse environments, offering sufficient material and energy support for the subsequent reproductive burst (Hu et al., 2019).

Second, supported by a robust "Source", Upland cotton achieved an explosive expansion of "Sink Capacity". The core of the yield difference lies in the increase in boll number per plant and seed number per boll, the latter of which mainly depends on the number of locules. Upland cotton typically has 4–5 locules, more than the 3 locules characteristic of Sea Island cotton (Zhang et al., 2015; Viot and Wendel, 2023). Our analysis captured the transcriptomic imprint associated with this trait: a fundamental remodeling of the reproductive organ development program. On the one hand, TM-1 showed significant specific gene amplification in anthers, which may enhance pollen viability and fertilization efficiency to meet the high demand for male gametes imposed by multi-locule, multi-ovule fertilization. On the other hand, our common gene analysis revealed that the pistil, the maternal tissue determining carpel fusion and locule number, is the most drastically differentiated floral organ between the two species (common gene proportion < 5%) (Fig. 2).
This highly specific transcriptional profile suggests that Upland cotton may have broken the "3 locules" genetic limit of Sea Island cotton by reconstructing the meristem maintenance pathway or the floral organ identity gene network, thereby establishing a high sink capacity architecture of "4–5 locules", which represents a significant advantage for yield.

4.4 Implications for Molecular Module Breeding: Reassembling "High-Yield Chassis" and "Quality Modules"

The RSR model provides novel insights for cotton genetic improvement. Since the divergence of "high yield" and "high quality" stems from the lineage-specific loss of functional modules, future breeding strategies should shift from traditional hybridization to "Genomic Design" based on pan-genomic information.

Strategy 1: Reshaping the adaptability of Sea Island cotton. To address the restricted cultivation range of Sea Island cotton, gene editing or precise introgression could be employed to "reintroduce" the Upland cotton-specific MYB93-WIND1 root module. Given the conserved roles of WIND1 and MYB93 in cell regeneration and root architecture (Iwase et al., 2011; Gibbs et al., 2014), this strategy has the potential to reconstruct the underground robustness of Sea Island cotton, thereby enhancing its environmental plasticity.

Strategy 2: Breaking the quality ceiling of Upland cotton. To address the limited fiber length of Upland cotton, the MYB111 delay module from Sea Island cotton could be introduced. Previous studies have confirmed that flavonoid metabolism is closely related to fiber elongation (Tan et al., 2013), and MYB111 is a key regulator of this pathway (Stracke et al., 2007). However, it should be noted that simple replenishment may lead to late maturity.
Therefore, using fiber-specific promoters (such as SCW promoters) to fine-tune its expression window (Huang et al., 2021), so that the elongation phase is moderately extended without significantly prolonging the total growth period, will be key to creating new "high-yield and high-quality" Upland cotton varieties.

5. Conclusion

This study systematically dissected the genetic basis of the "yield-quality" phenotypic trade-off between Upland and Sea Island cotton by integrating multi-tissue transcriptomics and pan-genomic structural variation analysis. We found that:

(1) The two species share a conserved transcriptional chassis but achieve functional specialization through asymmetric expansion of gene repertoires in specific tissues (Upland roots/anthers vs. Sea Island fibers).

(2) The "Phase Inversion" of the transcriptional program during fiber development is key to fiber quality differences, with Sea Island cotton gaining a longer elongation window via a delayed expression strategy.

(3) The "high yield" characteristic of Upland cotton benefits from its specifically retained adaptive genetic modules. For instance, the retained MYB93 and WIND1 may have optimized root architecture and environmental adaptability, building a robust "underground resource acquisition" system to support the explosive expansion of the above-ground reproductive sink.

(4) This process is driven by the Reciprocal Selective Retention (RSR) mechanism at the genomic level, i.e., the complementary loss of key members of the MYB and AP2/ERF families on homologous chromosomes (e.g., Upland cotton discarding MYB111, Sea Island cotton discarding WIND1).

In summary, the RSR mechanism reveals the evolutionary logic of "genomic subtraction" in polyploid domestication. Our findings provide a theoretical framework and genomic targets for breaking the long-standing negative correlation between yield and quality through future genomic design breeding.
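The homeolog-aware presence/absence call described in Section 3.6, which underlies conclusion (4), can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the names Hit, call_presence and retained_in are hypothetical, the identity values are invented to mirror the reported "> 99%" (A02) and "~98%" (D02) matches, and alignment coverage is not modelled.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Hit:
    """Best BLAST hit of a probe gene in one accession (hypothetical record)."""
    accession: str   # e.g. "TM-1"
    species: str     # "upland" or "sea_island"
    chrom: str       # chromosome of the best hit, e.g. "A02"
    identity: float  # percent identity of the best hit


def call_presence(hit: Optional[Hit], expected_chrom: str,
                  min_identity: float = 99.0) -> str:
    """Classify one accession as 'retained', 'homeolog_only' or 'absent'."""
    if hit is None:
        return "absent"
    # High-identity hit on the expected chromosome: the locus is retained.
    if hit.chrom == expected_chrom and hit.identity > min_identity:
        return "retained"
    # Same chromosome number but the other subgenome (A02 vs D02):
    # the probe only found the homeolog, so the original copy is called lost.
    same_number = hit.chrom[1:] == expected_chrom[1:]
    other_subgenome = hit.chrom[:1] != expected_chrom[:1]
    if same_number and other_subgenome:
        return "homeolog_only"
    return "absent"


def retained_in(hits: Iterable[Hit], expected_chrom: str) -> set:
    """Species in which every accession retains the locus."""
    calls = {}
    for h in hits:
        calls.setdefault(h.species, []).append(call_presence(h, expected_chrom))
    return {sp for sp, c in calls.items() if all(x == "retained" for x in c)}


# Illustrative (invented) WIND1 hits; accession names are taken from the text.
wind1_hits = [
    Hit("TM-1", "upland", "A02", 99.8),
    Hit("J668", "upland", "A02", 99.6),
    Hit("H7124", "sea_island", "D02", 98.1),
    Hit("3-79", "sea_island", "D02", 98.0),
]
print(retained_in(wind1_hits, "A02"))  # {'upland'}
```

Requiring every accession of a species to carry the retained call is a deliberately strict choice that matches the population-wide argument in the text; a real analysis would also need coverage and annotation checks before calling a locus lost.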
Declarations

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Funding

This work was supported by the National Key Research and Development Program of China (Grant No. 2024YFD1200300), the National Natural Science Foundation of China (Grant No. 31701471), and the State Key Laboratory of Cotton Bio-breeding and Integrated Utilization (Grant No. CB2025A13).

Author Contribution

Yifan Xu: Conceptualization, Methodology, Software, Formal analysis, Investigation, Data curation, Writing – original draft, Visualization. Zaoyang Gong: Validation, Data curation. Qinglin Shen: Validation, Investigation. Rongzheng Zhao: Data curation, Software. Chuankang Cheng: Validation, Investigation. Wanting Su: Data curation, Visualization. Yibing Li: Validation, Investigation. Xingrui Yang: Data curation. Xin Ruan: Investigation. Fengyun Li: Validation. Kai Guo: Resources, Investigation. Dajun Liu: Resources, Investigation. Xueying Liu: Supervision, Writing – review & editing. Zhonghua Teng: Supervision, Writing – review & editing. Fang Liu: Supervision, Writing – review & editing. Zhengsheng Zhang: Resources, Supervision. Yanchao Xu: Conceptualization, Methodology, Software, Supervision, Writing – review & editing. Dexin Liu: Funding acquisition, Project administration, Supervision, Writing – review & editing.

Acknowledgement

We acknowledge the High Performance Computing (HPC) clusters at Southwest University for their support.

Data Availability

Data will be made available on request.

References

Alagarsamy M (2023) Assessing genetic variation in Gossypium barbadense L. germplasm based on fibre characters. J Cotton Res 6:15.
https://doi.org/10.1186/s42397-023-00153-y Avci U, Pattathil S, Singh B, Brown VL, Hahn MG, Haigler CH (2013) Cotton fiber cell walls of Gossypium hirsutum and Gossypium barbadense have differences related to loosely-bound xyloglucan. PLoS ONE 8:e56315. https://doi.org/10.1371/journal.pone.0056315 Bolt S, Zuther E, Zintl S et al (2017) ERF105 is a transcription factor gene of Arabidopsis thaliana required for freezing tolerance and cold acclimation. Plant Cell Environ 40:108–120. https://doi.org/10.1111/pce.12838 Camacho C, Coulouris G, Avagyan V et al (2009) BLAST+: architecture and applications. BMC Bioinformatics 10:421. https://doi.org/10.1186/1471-2105-10-421 Chang X, He X, Li J et al (2024) High-quality Gossypium hirsutum and Gossypium barbadense genome assemblies reveal the landscape and evolution of centromeres. Plant Commun 5:100722. https://doi.org/10.1016/j.xplc.2023.100722 Chen X, Chen W, Li X et al (2012) Molecular mechanisms of fiber differential development between G. barbadense and G. hirsutum revealed by genetical genomics. PLoS ONE 7:e30056. https://doi.org/10.1371/journal.pone.0030056 Chen ZJ, Sreedasyam A, Ando A et al (2020) Genomic diversifications of five Gossypium allopolyploid species and their impact on cotton improvement. Nat Genet 52:525–533. https://doi.org/10.1038/s41588-020-0614-5 Cheng H, Liu S, Zhang Y et al (2025) Comparative single-cell transcriptomic map reveals divergence in leaves between two cotton species at cell type resolution. J Adv Res. https://doi.org/10.1016/j.jare.2025.04.012 Cock PJ, Antao T, Chang JT et al (2009) Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics 25:1422–1423 Conway JR, Lex A, Gehlenborg N (2017) UpSetR: an R package for the visualization of intersecting sets and their properties. Bioinformatics 33:2938–2940. https://doi.org/10.1093/bioinformatics/btx364 Eddy SR (2011) Accelerated profile HMM searches. PLoS Comput Biol 7:e1002195. 
https://doi.org/10.1371/journal.pcbi.1002195 Gibbs DJ, Voß U, Harding SA et al (2014) AtMYB93 is a novel negative regulator of lateral root development in Arabidopsis. New Phytol 203:1194–1207. https://doi.org/10.1111/nph.12879 Goel M, Schneeberger K (2022) plotsr: visualizing structural similarities and rearrangements between multiple genomes. Bioinformatics 38:2922–2926. https://doi.org/10.1093/bioinformatics/btac196 Goel M, Sun H, Jiao WB et al (2019) SyRI: finding genomic rearrangements and local sequence differences from whole-genome assemblies. Genome Biol 20:277. https://doi.org/10.1186/s13059-019-1911-0 Grover CE, Yuan D, Arick MA et al (2019) The genome sequence of Gossypioides kirkii illustrates a descending dysploidy in plants. Front Plant Sci 10:1541. https://doi.org/10.3389/fpls.2019.01541 Haigler CH, Betancur L, Stiff MR, Tuttle JR (2012) Cotton fiber: a powerful single-cell model for cell wall research. Front Plant Sci 3:104. https://doi.org/10.3389/fpls.2012.00104 Hu G, Wang Z, Tian Z et al (2025) A telomere-to-telomere genome assembly of cotton provides insights into centromere evolution and short-season adaptation. Nat Genet 57:1031–1043. https://doi.org/10.1038/s41588-025-02130-4 Hu Y, Chen J, Fang L et al (2019) Gossypium barbadense and Gossypium hirsutum genomes provide insights into the origin and evolution of allotetraploid cotton. Nat Genet 51:739–748. https://doi.org/10.1038/s41588-019-0371-5 Huang G, Bao Z, Feng L et al (2024) A telomere-to-telomere cotton genome assembly reveals centromere evolution and a Mutator transposon-linked module regulating embryo development. Nat Genet 56:1953–1963. https://doi.org/10.1038/s41588-024-01877-6 Huang J, Chen F, Guo Y et al (2021) GhMYB7 promotes secondary wall cellulose deposition in cotton fibres by regulating GhCesA gene expression through three distinct cis-elements. New Phytol 232:1718–1737. 
https://doi.org/10.1111/nph.17612 Keilwagen J, Wenk M, Erickson JL et al (2016) Using intron position conservation for homology-based gene prediction. Nucleic Acids Res 44:e89. https://doi.org/10.1093/nar/gkw092 Kryuchkova-Mostacci N, Robinson-Rechavi M (2017) A benchmark of tissue-specificity metrics for RNA-seq data. Brief Bioinform 18:205–214. https://doi.org/10.1093/bib/bbw008 Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9:559. https://doi.org/10.1186/1471-2105-9-559 Li F, Fan G, Lu C et al (2015) Genome sequence of cultivated Upland cotton (Gossypium hirsutum TM-1) provides insights into genome evolution. Nat Biotechnol 33:524–530. https://doi.org/10.1038/nbt.3208 Li H (2018) Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34:3094–3100. https://doi.org/10.1093/bioinformatics/bty191 Li Y, Tu L, Pettolino FA et al (2016) GbEXPATR, a species-specific expansin, enhances cotton fibre elongation through cell wall restructuring. Plant Biotechnol J 14:1006–1017. https://doi.org/10.1111/pbi.12450 Liu R, Xiao X, Gong J et al (2024) Genetic linkage analysis of stable QTLs in Gossypium hirsutum RIL population revealed function of GhCesA4 in fiber development. J Adv Res 60:133–148. https://doi.org/10.1016/j.jare.2023.12.005 Lynch J (1995) Root architecture and plant productivity. Plant Physiol 109:7–13. https://doi.org/10.1104/pp.109.1.7 Manolio TA, Collins FS, Cox NJ et al (2009) Finding the missing heritability of complex diseases. Nature 461:747–753. https://doi.org/10.1038/nature08494 Marçais G, Delcher AL, Phillippy AM et al (2018) MUMmer4: A fast and versatile genome alignment system. PLoS Comput Biol 14:e1005944. https://doi.org/10.1371/journal.pcbi.1005944 McInnes L, Healy J, Melville J (2018) UMAP: uniform manifold approximation and projection for dimension reduction. 
arXiv preprint arXiv:1802.03426 Meng Q, Xie P, Xu Z et al (2025) Pangenome analysis reveals yield- and fiber-related diversity and interspecific gene flow in Gossypium barbadense L. Nat Commun 16:4995. https://doi.org/10.1038/s41467-025-60254-x Mistry J, Chuguransky S, Williams L et al (2021) Pfam: the protein families database in 2021. Nucleic Acids Res 49:D412–D419. https://doi.org/10.1093/nar/gkaa913 Olson MV (1999) When less is more: gene loss as an engine of evolutionary change. Am J Hum Genet 64:18–23. https://doi.org/10.1086/302219 Peng R, Xu Y, Tian S et al (2022) Evolutionary divergence of duplicated genomes in newly described allotetraploid cottons. Proc Natl Acad Sci USA 119:e2208496119. https://doi.org/10.1073/pnas.2208496119 Perkin LC, Bell A, Hinze LL et al (2021) Genome assembly of two nematode-resistant cotton lines (Gossypium hirsutum L.). G3 Genes Genomes Genet. 11:jkab276. https://doi.org/10.1093/g3journal/jkab276 Pinglay S, Lalanne J-B, Daza RM et al (2025) Multiplex generation and single-cell analysis of structural variants in mammalian genomes. Science 387:ado5978. https://doi.org/10.1126/science.ado5978 R Core Team (2025) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna. https://www.R-project.org/ Rashotte AM, Mason MG, Hutchison CE et al (2006) A subset of Arabidopsis AP2 transcription factors mediates cytokinin responses in concert with a two-component pathway. Proc Natl Acad Sci USA 103:11081–11085. https://doi.org/10.1073/pnas.0602038103 Sheri V, Mohan H, Jogam P et al (2025) CRISPR/Cas genome editing for cotton precision breeding: mechanisms, advances, and prospects. J Cotton Res 8:4. https://doi.org/10.1186/s42397-024-00206-w Stracke R, Ishihara H, Huep G et al (2007) Differential regulation of closely related R2R3-MYB transcription factors controls flavonol accumulation in different parts of the Arabidopsis thaliana seedling. Plant J 50:660–677. 
https://doi.org/10.1111/j.1365-313X.2007.03078.x Tan J, Tu L, Deng F et al (2013) A genetic and metabolic analysis revealed that cotton fiber cell development was retarded by flavonoid naringenin. Plant Physiol 162:86–95. https://doi.org/10.1104/pp.112.212142 Van der Maaten L, Hinton G (2008) Visualizing data using t-SNE. J Mach Learn Res 9:2579–2605 Viot CR, Wendel JF (2023) Evolution of the cotton genus, Gossypium, and its domestication in the Americas. Crit Rev Plant Sci 42:1–33. https://doi.org/10.1080/07352689.2022.2156061 Vizueta J, Sánchez-Gracia A, Rozas J (2020) BITACORA: a comprehensive tool for the identification and annotation of gene families in genome assemblies. Mol Ecol Resour 20:1445–1452. https://doi.org/10.1111/1755-0998.13202 Wang J, Liang Y, Gong Z et al (2023) Genomic and epigenomic insights into the mechanism of cold response in upland cotton (Gossypium hirsutum). Plant Physiol Biochem 205:108206. https://doi.org/10.1016/j.plaphy.2023.108206 Wang M, Li J, Qi Z et al (2022) Genomic innovation and regulatory rewiring during evolution of the cotton genus Gossypium. Nat Genet 54:1959–1971. https://doi.org/10.1038/s41588-022-01237-2 Wang M, Li J, Wang P et al (2021) Comparative genome analyses highlight transposon-mediated genome expansion and the evolutionary architecture of 3D genomic folding in cotton. Mol Biol Evol 38:3621–3636. https://doi.org/10.1093/molbev/msab128 Wen X, Chen Z, Yang Z et al (2023) A comprehensive overview of cotton genomics, biotechnology and molecular biological studies. Sci China Life Sci 66:2214–2256. https://doi.org/10.1007/s11427-022-2278-0 Wendel JF (2000) Genome evolution in polyploids. Plant Mol Biol 42:225–249. https://doi.org/10.1023/A:1006392424384 White AC, Rogers A, Rees M et al (2016) How can we make plants grow faster? A source–sink perspective on growth rate. J Exp Bot 67:31–45. https://doi.org/10.1093/jxb/erv447 Wickham H (2016) ggplot2: elegant graphics for data analysis. 
Springer, New York Wilkins D (2020) gggenes: Draw Gene Arrow Maps in 'ggplot2'. R package version 0.4.1. https://CRAN.R-project.org/package=gggenes Xu Y, Wei Y, Zhou Z et al (2024) Widespread incomplete lineage sorting and introgression shaped adaptive radiation in the Gossypium genus. Plant Commun 5:100728. https://doi.org/10.1016/j.xplc.2023.100728 Xu Z, Wang G, Zhu X et al (2025) Genome assembly of two allotetraploid cotton germplasms reveals mechanisms of somatic embryogenesis and enables precise genome editing. Nat Genet 57:2028–2039. https://doi.org/10.1038/s41588-025-02258-3 Yan H, Han J, Jin S et al (2025) Post-polyploidization centromere evolution in cotton. Nat Genet. https://doi.org/10.1038/s41588-025-02115-3 Yanai I, Benjamin H, Shmoish M et al (2005) Genome-wide midrange transcription profiles reveal expression level relationships in human tissue specification. Bioinformatics 21:650–659. https://doi.org/10.1093/bioinformatics/bti042 Yang Z, Wang J, Huang Y et al (2023) CottonMD: a multi-omics database for cotton biological study. Nucleic Acids Res 51:D1446–D1456. https://doi.org/10.1093/nar/gkac863 Yang Z, Yang Z, Gao C et al (2026) Graph pan-genome illuminates evolutionary trajectories and agronomic trait architecture in allotetraploid cotton. Nat Genet 58:218–229. https://doi.org/10.1038/s41588-025-02462-1 Yao Y, Liu S, Xia C et al (2022) Comparative transcriptome in large-scale human and cattle populations. Genome Biol 23:176. https://doi.org/10.1186/s13059-022-02745-4 Yu J, Jung S, Cheng CH et al (2021) CottonGen: the community database for cotton genomics, genetics, and breeding research. Plants 10:2805. https://doi.org/10.3390/plants10122805 Yuan D, Tang Z, Wang M et al (2015) The genome sequence of Sea-Island cotton (Gossypium barbadense) provides insights into the allopolyploidization and development of superior spinnable fibres. Sci Rep 5:17662. 
https://doi.org/10.1038/srep17662 Zhang D, Zhou H, Zhang Y et al (2025) Diverse roles of MYB transcription factors in plants. J Integr Plant Biol. https://doi.org/10.1111/jipb.13869 Zhang J, Yu M, Guo Y et al (2025) Evolutionary divergence of an ethylene-responsive transcriptional cascade governs a dose-dependent balance between cotton fiber length and strength. Adv Sci. https://doi.org/10.1002/advs.202514154 Zhang T, Hu Y, Jiang W et al (2015) Sequencing of allotetraploid cotton (Gossypium hirsutum L. acc. TM-1) provides a resource for fiber improvement. Nat Biotechnol 33:531–537. https://doi.org/10.1038/nbt.3207 Zhang X, Tian X, Gao X et al (2025) Integrated metabolomic and transcriptomic analyses identify MYB genes regulating key metabolites and agronomic traits in upland cotton Gossypium hirsutum. Nat Genet 57:2819–2830. https://doi.org/10.1038/s41588-025-02363-3

Additional Declarations

No competing interests reported.

Supplementary Files

SupplementaryMaterials.docx

Editorial timeline: First submitted to journal 01 Feb, 2026. Submission checks completed at journal 11 Feb, 2026. Editor assigned by journal 12 Feb, 2026. Reviewers invited by journal 13 Feb, 2026. Reviewers agreed at journal 13 Feb, 14 Feb, and 16 Feb, 2026. Reviews received at journal 27 Feb, 2026. Reviewers agreed at journal 01 Apr, 13 Apr, and 16 Apr, 2026. Reviews received at journal 23 Apr and 29 Apr, 2026.
Liu","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAAzElEQVRIiWNgGAWjYDCCAyBUIMHAD+EyE6vFQIJBsoEULQwMBkB0gFgtfMd7DA8XGFgkbj5/xkyCocI6sYH97AG8WiTPHEs4PMNAInHbjRygljPpiQ08eQl4tRjcSD5wmAeshXebBGPb4cQGCR4D/FruP2wAa9ncfxao5R8xWm4wQ2zZwJAL1NJAhBbJM2kJIC3GM27kf7ZIOJZu3MaTg18L3/Ezxp95Kupk+/uPJd74UGMt289+Br8WVJAAxGwkqB8Fo2AUjIJRgAMAAJhuRpPzzoBiAAAAAElFTkSuQmCC","orcid":"","institution":"Southwest University","correspondingAuthor":true,"prefix":"","firstName":"Dexin","middleName":"","lastName":"Liu","suffix":""}],"badges":[],"createdAt":"2026-02-01 07:38:27","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-8754476/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-8754476/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":102964384,"identity":"879fb025-535d-4fa0-9a12-a85807f49750","added_by":"auto","created_at":"2026-02-19 04:22:11","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":159775,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eGlobal transcriptome landscape of Upland cotton (TM-1) and Sea Island cotton (H7124) reveals evolutionary conservation and tissue-specific divergence.\u003c/strong\u003e \u0026nbsp;(a) Dimensionality reduction analyses using Principal Component Analysis (PCA), t-SNE, and UMAP based on gene expression profiles across 21 tissues. Samples clustered mainly by tissue type (color) rather than species (shape), indicating a shared conservative Transcriptional Chassis between the two species. (b) Scatter plot comparing the number of expressed genes (TPM ≥ 0.1) in homologous tissues. 
Although a significant positive correlation was observed (Pearson’s R = 0.63, P = 0.0027), outliers highlight lineage-specific differences in metabolic activity: TM-1 exhibits a higher number of expressed genes in roots and anthers, whereas H7124 retains more active genes in developing fibers (e.g., 20 DPA). The dashed line represents the linear regression fit. (c) Frequency distribution of the Tissue Specificity Index (Tau). Both species display a bimodal distribution; however, TM-1 shows an expansion of highly tissue-specific genes (Tau \u0026gt; 0.5), while H7124 retains a higher proportion of broadly expressed genes (Tau \u0026lt; 0.4).\u003c/p\u003e","description":"","filename":"image1.png","url":"https://assets-eu.researchsquare.com/files/rs-8754476/v1/0984fa7ccec6a9d7891fb575.png"},{"id":102963900,"identity":"53ae051c-7e0e-41ad-8777-fae2c817bba8","added_by":"auto","created_at":"2026-02-19 04:20:49","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":194130,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eConservation and spatiotemporal divergence in tissue-specific gene expression between Upland and Sea Island cotton.\u003c/strong\u003e (a) Stacked bar chart showing the relative proportions of \"Common\" and \"Species-Specific\" genes (TPM ≥ 0.1) in each tissue. Green represents expressed genes common to both species, red represents H7124-specific expressed genes, and blue represents TM-1-specific expressed genes. Note the marked inversion of these proportions (phase inversion) during fiber development (20 DPA to 25 DPA). (b) Violin plots of gene expression abundance (TPM) across tissues for H7124 (solid lines, solid fill) and TM-1 (dashed lines, hollow). (c) Number of shared genes within each expression abundance window (from Very Low to Very High). 
The charts reveal high conservation in the transcriptomes of vegetative organs, contrasting with the drastic transcriptional remodeling in reproductive organs (particularly fibers and ovules).\u003c/p\u003e","description":"","filename":"image2.png","url":"https://assets-eu.researchsquare.com/files/rs-8754476/v1/41e4cb060e763076c33dcb97.png"},{"id":102919228,"identity":"0c570c8d-4a96-487f-9a73-331ba5ffd674","added_by":"auto","created_at":"2026-02-18 12:07:07","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":144260,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eWeighted Gene Co-expression Network Analysis (WGCNA) identifies key modules driving delayed fiber elongation in Sea Island cotton.\u003c/strong\u003e (a) Spatiotemporal expression heatmap of WGCNA Module Eigengenes. Colors represent relative expression levels (Red: High; Blue: Low). The arrow points to Module 6, which exhibits specific high transcriptional activity in H7124 20 DPA fibers, presenting a \"delayed expression\" characteristic. (b) Top 20 enriched Pfam domains in Module 6. This module is rich in domains involved in signal transduction (e.g., Pkinase, LRR) and transcriptional regulation, indicating that the cells are in an active state of metabolism and signal response. (c) Distribution of member counts for major transcription factor families in Module 6. 
The MYB and AP2/ERF families show the highest enrichment, implying their core regulatory roles in this module.\u003c/p\u003e","description":"","filename":"image3.png","url":"https://assets-eu.researchsquare.com/files/rs-8754476/v1/878228caa5207c511f26dfa2.png"},{"id":102964054,"identity":"a2c91984-1d30-4998-adb7-71b6e07f4080","added_by":"auto","created_at":"2026-02-19 04:21:19","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":158660,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eReciprocal genomic structural variation mechanisms of key transcription factors (MYB and AP2) between Upland and Sea Island cotton. \u003c/strong\u003eDetailed Structural Variation (SV) maps of six candidate loci based on SyRI analysis reveal two distinct lineage-specific loss mechanisms: Physical Deletion (NOTAL) and Sequence Collapse (HDR). (a) Left panel: Upland cotton (TM-1) specific retained genes (lost in H7124). Blue arrows represent intact genes in TM-1 (\u003cem\u003eMYB93, WIND1, CRF10\u003c/em\u003e). Corresponding regions in the H7124 genome are identified as large-fragment deletions (e.g., a ~201.90 kb NOTAL deletion at the WIND1 locus) or Highly Diverged Regions (HDR), resulting in the loss of functional copies. (b) Right panel: Sea Island cotton (H7124) specific retained genes (lost in TM-1). Red arrows represent intact genes in H7124 (\u003cem\u003eMYB111, ERF105, ERF017\u003c/em\u003e). In TM-1, the region corresponding to \u003cem\u003eMYB111\u003c/em\u003e underwent a precise ~5.06 kb physical deletion, while \u003cem\u003eERF105/ERF017\u003c/em\u003e are located in sequence collapse regions. 
This genomic \"Double Dissociation\" constitutes the physical basis of the RSR mechanism.\u003c/p\u003e","description":"","filename":"image4.png","url":"https://assets-eu.researchsquare.com/files/rs-8754476/v1/3e15b101eb98bddb4b363742.png"},{"id":102963653,"identity":"25b56760-11e6-4c84-b18a-a7a11b5602f8","added_by":"auto","created_at":"2026-02-19 04:19:46","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":244053,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eLineage-specific loss and microsynteny maps of six key transcription factors in allotetraploid cotton. \u003c/strong\u003eHigh-resolution microsynteny maps constructed using eight representative Gossypium genomes (covering diploid ancestors, wild tetraploids, and cultivars) reveal a \"3-vs-3\" reciprocal loss evolutionary trajectory. Group 1 (G. hirsutum-specific loss): Includes \u003cem\u003eMYB111\u003c/em\u003e, \u003cem\u003eERF017\u003c/em\u003e, and \u003cem\u003eERF105\u003c/em\u003e. These genes are structurally intact and syntenically conserved (red arrows) in diploid ancestors (D5, A2) and wild tetraploids (AD3, AD5), but underwent physical deletion in the Upland cotton lineage (TM-1, J668), leaving only flanking anchor genes (gray arrows). Group 2 (G. barbadense-specific loss): Includes \u003cem\u003eMYB93\u003c/em\u003e, \u003cem\u003eWIND1\u003c/em\u003e, and \u003cem\u003eCRF10\u003c/em\u003e. These genes are highly conserved in Upland cotton but specifically underwent collapse or large-fragment deletion (e.g., the genomic gap at the \u003cem\u003eWIND1\u003c/em\u003e locus) in the Sea Island cotton lineage (H7124, 3-79). 
This analysis, referenced against wild species, confirms that the phenotypic differentiation of modern cultivars stems from differential purging of the ancient ancestral gene pool.\u003c/p\u003e","description":"","filename":"image5.png","url":"https://assets-eu.researchsquare.com/files/rs-8754476/v1/8f17c3e10bb3e56b7d751eb9.png"},{"id":102964389,"identity":"12c444d7-ab3e-4d65-b00f-184062a80325","added_by":"auto","created_at":"2026-02-19 04:22:11","extension":"png","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":184248,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003ePhylogenomic footprinting of six RSR candidate genes across 26 Gossypium pan-genomes.\u003c/strong\u003e Bubble plots display the presence and sequence identity of target genes in diploids, wild tetraploids, and cultivars, validating the population-wide universality of the RSR pattern. (a) Upland Cotton Specific Retention Group (Group A): Includes \u003cem\u003eMYB93\u003c/em\u003e(A11), \u003cem\u003eWIND1\u003c/em\u003e (A02), and \u003cem\u003eCRF10\u003c/em\u003e (A12). These genes maintain high sequence identity (red bubbles) and are located on the A-subgenome in all Upland cotton germplasms (e.g., TM-1, J668, ZM113). In contrast, in all Sea Island cotton germplasms (e.g., 3-79, H7124), the A-subgenome copies are lost, and only D-subgenome homeologs with lower sequence identity (blue bubbles) are detected. (b) Sea Island Cotton Specific Retention Group (Group B): Includes \u003cem\u003eMYB111\u003c/em\u003e(A12), \u003cem\u003eERF105\u003c/em\u003e (D03), and \u003cem\u003eERF017\u003c/em\u003e (D13). These genes are highly conserved in Sea Island cotton populations (red bubbles) but appear as low coverage or only detectable homeologs in Upland cotton populations. Bubble size represents query Coverage, and color represents sequence Identity. 
The bottom axis labels chromosomal positions, revealing the drift from \"orthologs\" to \"paralogs/homeologs.\"\u003c/p\u003e","description":"","filename":"image6.png","url":"https://assets-eu.researchsquare.com/files/rs-8754476/v1/271e3c0d65cda02a04fe8139.png"},{"id":102965558,"identity":"86e18543-e751-47cd-9279-990cf9e27bd0","added_by":"auto","created_at":"2026-02-19 04:31:56","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":2506020,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-8754476/v1/783f5bbf-6d8a-451a-a838-1bcba2e5c85d.pdf"},{"id":102963627,"identity":"5213cd65-41a9-4c49-9f05-5a73a8551fc6","added_by":"auto","created_at":"2026-02-19 04:19:30","extension":"docx","order_by":0,"title":"","display":"","copyAsset":false,"role":"supplement","size":1494414,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryMaterials.docx","url":"https://assets-eu.researchsquare.com/files/rs-8754476/v1/3056d3a9d6cddfa1edb86492.docx"}],"financialInterests":"No competing interests reported.","formattedTitle":"Multi-tissue Transcriptomic and Pan-genomic Analyses Reveal Reciprocal Selective Retention Driving the Phenotypic Trade-off in Gossypium hirsutum and Gossypium barbadense","fulltext":[{"header":"1. Introduction","content":"\u003cp\u003ePolyploidization and domestication are two major forces driving crop evolution, often accompanied by profound genomic reshaping and phenotypic trade-offs. As a classic allotetraploid model, cultivated cotton (Gossypium spp.) presents a fascinating case of \"divergent domestication\" (Wen et al. \u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e2023\u003c/span\u003e). The two dominant cultivated species, Upland cotton (Gossypium hirsutum) and Sea Island cotton (G. 
barbadense), have evolved distinct adaptive strategies under artificial selection: the former was selected for broad environmental adaptability and high yield potential (typified by standard line TM-1), whereas the latter was selected for superior fiber quality at the cost of yield and robustness (typified by H7124) (Avci et al. \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2013\u003c/span\u003e; Alagarsamy \u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e2023\u003c/span\u003e). Unraveling the genomic architecture underlying this \"yield-quality\" trade-off is not only crucial for cotton genetic improvement but also provides fundamental insights into how homologous genomes differentiate functionally during polyploid evolution (Liu et al. \u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e2024\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eWith the advent of the \"Pan-genomic Era,\" cotton genomics has witnessed milestone breakthroughs. Following the release of initial draft genomes (Li et al. \u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e2015\u003c/span\u003e; Zhang et al. \u003cspan citationid=\"CR65\" class=\"CitationRef\"\u003e2015\u003c/span\u003e), high-quality reference assemblies (Yuan et al. \u003cspan citationid=\"CR62\" class=\"CitationRef\"\u003e2015\u003c/span\u003e; Hu et al. \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e2019\u003c/span\u003e) and recent \"Telomere-to-Telomere\" (T2T) assemblies (Hu et al. \u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e2025\u003c/span\u003ea; Hu et al. \u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e2025\u003c/span\u003eb) have provided unprecedented resolution for dissecting complex allotetraploid genomes. Despite this abundance of genomic data, the precise molecular mechanisms driving the phenotypic divergence between the two species remain incompletely understood. 
Traditional Genome-Wide Association Studies (GWAS) based on Single Nucleotide Polymorphisms (SNPs) often face the \"missing heritability\" challenge when explaining inter-specific phenotypic divergence (Manolio et al. \u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e2009\u003c/span\u003e). Increasing evidence suggests that large-scale Structural Variations (SVs), such as Presence/Absence Variations (PAVs), act as major drivers of adaptive evolution in polyploid crops by physically reshaping regulatory landscapes (Pinglay et al. \u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e2025\u003c/span\u003e). A critical knowledge gap remains: how do these \"hard\" genomic variations (SVs) structurally remodel downstream \"soft\" transcriptional networks to establish the canalized trait barriers between Upland and Sea Island cotton (Wendel \u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e2000\u003c/span\u003e)?\u003c/p\u003e \u003cp\u003eIntegrating multi-tissue transcriptomics with pan-genomics offers a powerful strategy to bridge this gap. While comparative transcriptomics has been applied to understand tissue-specific regulation in mammals (Yao et al. \u003cspan citationid=\"CR60\" class=\"CitationRef\"\u003e2022\u003c/span\u003e), such systematic analyses remain limited in polyploid crops, often confined to single tissues like fibers (Chen et al. \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e2012\u003c/span\u003e) or leaves (Cheng et al. \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e2025\u003c/span\u003e). Unlike diploid models, the domestication of allotetraploid cotton involves complex subgenome interactions and the differentiation of homologous chromosomes. 
Therefore, constructing a spatiotemporal transcriptomic atlas across the full growth cycle, combined with pan-genomic SV analysis, is essential for revealing how \"genomic hardware\" drives the remodeling of \"transcriptomic software.\"\u003c/p\u003e \u003cp\u003eIn the complex transcriptional regulatory networks of plants, transcription factors serve as key bridges connecting genomic variation to phenotypic remodeling. Among them, the MYB transcription factor superfamily occupies an irreplaceable core position in cotton fiber development due to its vast membership and high functional diversity (Dubos et al., 2010; Zhang et al., \u003cspan citationid=\"CR63\" class=\"CitationRef\"\u003e2025\u003c/span\u003ea). However, cotton domestication involves not only the improvement of single fiber traits but also a fundamental shift in adaptive strategies at the whole-plant level - most notably, the \"broad adaptability\" of Upland cotton versus the \"delicate\" nature of Sea Island cotton. This strongly suggests that, in addition to the MYB family, other key factors regulating cell fate determination and environmental response - such as the AP2/ERF family - may play indispensable synergistic roles in this process. AP2/ERF family members (e.g., \u003cem\u003eWIND1\u003c/em\u003e, \u003cem\u003eERF\u003c/em\u003e) act not only as primary responders to hormone signals like ethylene but also as \"master switches\" regulating cell dedifferentiation and regeneration (Iwase et al., 2011).\u003c/p\u003e \u003cp\u003eBased on multi-tissue transcriptomic and pan-genomic analyses, this study reveals the molecular mechanisms driving the phenotypic divergence between Upland and Sea Island cotton. 
We found that (1) although the global transcriptional chassis remains highly conserved, significant divergence occurs in specific tissues; (2) a distinct \"phase inversion\" of transcriptional programs takes place during critical stages of fiber development; (3) WGCNA pinpoints a key regulatory module underlying the \"delayed elongation\" trait, together with its core regulators; and (4) the lineage-specific losses of these regulators stem from asymmetric structural variations targeting genes that were originally intact in the ancestral gene pool, with pan-genomic profiling confirming that this \"retention vs. loss\" pattern holds across a broad range of cultivated germplasm. Based on these findings, we propose a \"Reciprocal Selective Retention (RSR)\" evolutionary model for cultivated cottons, in which the two species shaped their distinct adaptive strategies by asymmetrically retaining or eliminating specific functional genetic modules from the ancestral gene pool. Using this model, we dissect the molecular logic underlying the trade-off between \"high yield\" and \"high quality\" at the level of genomic structure, providing a theoretical basis and candidate genomic targets for breeding novel cotton varieties that combine both traits.\u003c/p\u003e"},{"header":"2. Materials and Methods","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003e2.1 Plant Materials and Data Acquisition\u003c/h2\u003e \u003cp\u003eIn this study, we used the Telomere-to-Telomere (T2T) reference genome of Upland cotton (\u003cem\u003eG. hirsutum\u003c/em\u003e TM-1) published by Yan et al. (\u003cspan citationid=\"CR56\" class=\"CitationRef\"\u003e2025\u003c/span\u003e) and the high-quality reference genome of Sea Island cotton (\u003cem\u003eG. barbadense\u003c/em\u003e H7124) sequenced by Hu et al. (\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e2019\u003c/span\u003e). The genomic data for G. 
barbadense were downloaded from the CottonMD database (Yang et al. \u003cspan citationid=\"CR58\" class=\"CitationRef\"\u003e2023\u003c/span\u003e). Additionally, for pan-genomic evolutionary analysis, we collected chromosome-level genome assemblies of 26 Gossypium species, including wild diploids and tetraploids, from CottonGen (Yu et al. \u003cspan citationid=\"CR61\" class=\"CitationRef\"\u003e2021\u003c/span\u003e) and the NCBI database (Supplementary Table \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e).\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec4\" class=\"Section2\"\u003e \u003ch2\u003e2.2 Transcriptome Data Acquisition and Preprocessing\u003c/h2\u003e \u003cp\u003eThe gene expression data used in this study were retrieved from the CottonMD database (Yang et al. \u003cspan citationid=\"CR58\" class=\"CitationRef\"\u003e2023\u003c/span\u003e). The raw high-throughput sequencing data originated from the genome sequencing project of Upland cotton (TM-1) and Sea Island cotton (H7124) by Hu et al. (\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e2019\u003c/span\u003e) (BioProject: PRJNA490626). We downloaded the normalized gene expression matrix (TPM values) and selected sample data covering 21 representative tissues/stages, including roots, stems, leaves, floral organs, and ovules and fibers at various developmental stages. 
To enhance analysis reliability and reduce background noise, genes with extremely low expression levels (TPM\u0026thinsp;\u0026lt;\u0026thinsp;0.1) across all samples were filtered out prior to downstream analysis.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec5\" class=\"Section2\"\u003e \u003ch2\u003e2.3 Gene Expression Characterization and Comparative Transcriptomic Analysis\u003c/h2\u003e \u003cp\u003eTo evaluate global expression patterns across samples, Principal Component Analysis (PCA) was visualized using the ggplot2 package (Wickham \u003cspan citationid=\"CR52\" class=\"CitationRef\"\u003e2016\u003c/span\u003e) in R (R Core Team \u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e2025\u003c/span\u003e), based on a log2(TPM\u0026thinsp;+\u0026thinsp;1) transformed expression matrix. Additionally, t-SNE (Van der Maaten and Hinton \u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e2008\u003c/span\u003e) and UMAP (McInnes et al. \u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e2018\u003c/span\u003e) dimensionality reduction analyses were performed using the Rtsne and umap packages, respectively. To quantify transcriptional conservation between species, expressed genes were filtered with a threshold of TPM\u0026thinsp;\u0026ge;\u0026thinsp;0.1, and the Pearson correlation coefficient (\u003cem\u003eR\u003c/em\u003e) of the number of expressed genes between corresponding tissues of TM-1 and H7124 was calculated using the \u0026ldquo;cor.test\u0026rdquo; function in R. Furthermore, to dissect divergence in expression abundance, expressed genes in each tissue were ranked by TPM values and categorized into five expression windows (Very Low, Low, Medium, High, Very High). The number of \"shared genes\" falling into the same expression window in both species for a given tissue was counted using custom Python scripts. 
Finally, to eliminate the bias of total gene number differences, we calculated the relative proportions of common genes versus species-specific genes in each tissue. For tissue specificity analysis, the Tissue Specificity Index (Tau) was calculated following the method of Yanai et al. (\u003cspan citationid=\"CR57\" class=\"CitationRef\"\u003e2005\u003c/span\u003e). A Tau value closer to 1 indicates stronger tissue specificity. Specific genes were defined as those with a Specificity Measure (SPM)\u0026thinsp;\u0026gt;\u0026thinsp;0.5 (Kryuchkova-Mostacci and Robinson-Rechavi, \u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e2017\u003c/span\u003e), and UpSet plots were generated using the R package UpSetR (Conway et al. \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e2017\u003c/span\u003e) to visualize the intersections of specific genes across different tissues.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec6\" class=\"Section2\"\u003e \u003ch2\u003e2.4 Weighted Gene Co-expression Network Construction and Unbiased Functional Annotation\u003c/h2\u003e \u003cp\u003eTo mine key co-expression gene modules driving the delayed fiber elongation in Sea Island cotton (Avci et al. \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2013\u003c/span\u003e), we constructed a weighted gene co-expression network using the WGCNA package in R (Langfelder and Horvath \u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e2008\u003c/span\u003e). Considering computational efficiency and noise reduction, the top 6,000 genes with the highest variance in fiber development samples were selected as input data. First, a Pearson correlation matrix between genes was calculated, and a scale-free network was constructed based on a soft threshold. Subsequently, the adjacency matrix was converted into a Topological Overlap Matrix (TOM), and co-expression modules were identified using the Dynamic Tree Cut algorithm. 
Module expression patterns were visualized via Module Eigengenes (ME) to identify target modules specifically highly expressed in 20 DPA fibers of G. barbadense. For unbiased functional annotation of the target module, protein sequences of all genes within the module were extracted and scanned against the Pfam-A database (Mistry et al. \u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e2021\u003c/span\u003e) using the hmmscan program in HMMER 3.0 software (Eddy \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e2011\u003c/span\u003e). The filtering criterion was set to an e-value\u0026thinsp;\u0026lt;\u0026thinsp;1e-5. Based on the scan results, the types of domains contained in each gene were counted, and gene families were ranked by domain frequency to determine the most enriched categories of transcriptional regulators in the module.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec7\" class=\"Section2\"\u003e \u003ch2\u003e2.5 Genome-wide Identification of Key Transcription Factor Families (MYB and AP2/ERF)\u003c/h2\u003e \u003cp\u003eTo construct a high-coverage transcription factor family atlas and mine potential unannotated loci, we performed genome-wide identification of the MYB and AP2/ERF families in the TM-1 and H7124 genomes using the Bitacora pipeline (V1.4.2) (Vizueta et al. \u003cspan citationid=\"CR45\" class=\"CitationRef\"\u003e2020\u003c/span\u003e). We employed the software's \"Full Mode,\" integrating genomic sequences, genome annotation files, and protein sequences for comprehensive analysis. The specific workflow was as follows: Query libraries containing Hidden Markov Models (HMM) for MYB (Pfam: PF00249) and AP2 (Pfam: PF00847) domains were constructed, with Pfam seed files downloaded from the Pfam database (Mistry et al. \u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e2021\u003c/span\u003e). 
HMMER (Eddy \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e2011\u003c/span\u003e) was first used to scan the input reference proteomes to identify annotated family members. For homology-based gene prediction, the integrated GeMoMa software (Keilwagen et al. \u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e2016\u003c/span\u003e) was used to search for loci potentially missed in the reference annotation. This step used the homologous proteins identified in the first step as templates to predict gene structures directly from genomic DNA sequences, thereby recovering \"Putative New Genes\" that were missed or incompletely annotated. Subsequently, Bitacora automatically merged results from \"proteome search\" and \"genome prediction\" and removed redundancy based on coordinates. The final candidate sequences were further verified for domain integrity using the Pfam database (Mistry et al. \u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e2021\u003c/span\u003e); sequences with an e-value\u0026thinsp;\u0026ge;\u0026thinsp;1e-5 or lacking key domains were discarded to ensure high confidence for downstream analysis.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003e2.6 Screening and Reciprocal Validation of Species-Specific Transcription Factors\u003c/h2\u003e \u003cp\u003eTo pinpoint key candidate factors undergoing \"Reciprocal Selective Retention\" within these large gene families, we established a rigorous \"PAV Intersection-Reciprocal Validation\" screening workflow.\u003c/p\u003e \u003cp\u003eInitial Screening: First, genome-wide Presence/Absence Variation (PAV) lists generated by comparative genomic analysis using MUMmer4 (Mar\u0026ccedil;ais et al. \u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e2018\u003c/span\u003e) and Minimap2 (Li \u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e2018\u003c/span\u003e) were intersected with the identified MYB and AP2/ERF family members. 
This preliminarily screened for candidate transcription factors present in H7124 but absent in TM-1 (H7124-specific), or present in TM-1 but absent in H7124 (TM-1-specific). Reciprocal BLAST Validation: To exclude false positives caused by genome assembly errors, missing annotations, or pseudogenization, we implemented strict sequence-level validation for all candidate genes. The complete CDS sequences of candidate genes were extracted as queries and reciprocally aligned to the counterpart reference genome using BLASTN (Camacho et al. \u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e2009\u003c/span\u003e) (e.g., aligning H7124-specific candidates back to the TM-1 genome, and vice versa). Retention Criteria: We retained only those genes for which no valid homologous match was detected in the counterpart genome, or where detected matches exhibited severe sequence truncation (Coverage\u0026thinsp;\u0026lt;\u0026thinsp;60%) and high sequence divergence (Identity\u0026thinsp;\u0026lt;\u0026thinsp;85%).\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec9\" class=\"Section2\"\u003e \u003ch2\u003e2.7 Genomic Structural Variation and Microsynteny Analysis\u003c/h2\u003e \u003cp\u003eTo elucidate the physical mechanisms leading to the loss of the aforementioned key transcription factors (e.g., physical deletion/NOTAL or sequence collapse/HDR), we combined whole-genome alignment with local microsynteny analysis.\u003c/p\u003e \u003cp\u003eWhole-Genome Structural Variation (SV) Identification: The SyRI (Synteny and Rearrangement Identifier) tool (Goel et al. \u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e2019\u003c/span\u003e) was used to systematically identify syntenic regions, inversions, translocations, and local variations (including NOTAL and HDR) based on NUCmer (Mar\u0026ccedil;ais et al. \u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e2018\u003c/span\u003e) alignment results of TM-1 and H7124 (parameters: --maxmatch -l 100 -c 1000). 
The Python tool plotsr (Goel and Schneeberger \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2022\u003c/span\u003e) was used to visualize the SV status of candidate gene loci. Targeted Microsynteny Mapping: To reconstruct the evolutionary history of the six candidate loci, we selected eight representative Gossypium genomes (Supplementary Table \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e) to construct microsynteny maps. Centered on the target gene (e.g., MYB111 or WIND1), 2\u0026ndash;5 single-copy, highly conserved genes upstream and downstream were selected as physical anchors. Genomic sequences between anchors were extracted, and the presence form (intact, truncated, or completely lost) of the target gene in different species was determined using BLASTN (Camacho et al. \u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e2009\u003c/span\u003e). The Python tool plotsr (Goel and Schneeberger \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2022\u003c/span\u003e) was then used to plot microsynteny structures to illustrate the specific types of gene loss.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec10\" class=\"Section2\"\u003e \u003ch2\u003e2.8 Pan-genome Homology Search and Evolutionary Trajectory Analysis\u003c/h2\u003e \u003cp\u003eTo verify the universality of the RSR pattern in Gossypium germplasm resources, we extended the analysis to 26 Gossypium pan-genomes. Using the CDS sequences of the six candidate genes as probes, homology searches were performed against all pan-genomes using the BLAST+ package (Camacho et al. \u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e2009\u003c/span\u003e) (threshold: e-value\u0026thinsp;\u0026lt;\u0026thinsp;1e-5). A custom Python script was used to extract the best hit locus in each genome and calculate its Identity and Coverage relative to the query sequence. 
Simultaneously, chromosomal location information (A-subgenome or D-subgenome) of the best hit was recorded to distinguish between orthologs and partially retained homeologs. The final results were visualized as bubble plots using the Python tool plotsr (Goel and Schneeberger \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2022\u003c/span\u003e) to display the lineage-specific retention and loss trajectories of these key factors during the domestication from wild species to cultivars.\u003c/p\u003e \u003c/div\u003e"},{"header":"3. Results","content":"\u003cdiv id=\"Sec12\" class=\"Section2\"\u003e \u003ch2\u003e3.1 Global Transcriptome Landscape Reveals Evolutionary Conservatism and Tissue-Specific Divergence Between Upland and Sea Island Cotton\u003c/h2\u003e \u003cp\u003eTo systematically dissect the regulatory divergence driving the phenotypic trade-off between yield and quality, we constructed a comprehensive transcriptomic atlas covering 21 tissues throughout the full growth period of Upland cotton (G. hirsutum, TM-1) and Sea Island cotton (G. barbadense, H7124).\u003c/p\u003e \u003cp\u003eDimensionality reduction analysis (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eA) showed that samples clustered tightly by tissue type rather than species. Vegetative tissues (roots, stems, leaves) and floral organs of both species clustered together, indicating that despite divergent domestication, the core transcriptional chassis maintaining basic physiological functions remains highly conserved. However, beneath this global conservation, we observed significant lineage-specific differentiation in transcriptional activity. Comparison of expressed gene numbers across homologous tissues showed moderate correlation (Pearson\u0026rsquo;s R\u0026thinsp;=\u0026thinsp;0.63) but highlighted unique outliers closely related to agronomic traits (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eB). 
Notably, TM-1 exhibited significantly more expressed genes in roots and anthers, reflecting its robust resource acquisition capability and reproductive potential. In contrast, H7124 maintained a more active transcriptome in developing fibers (especially at 20 DPA) and late-stage ovules, suggesting a more persistent regulatory program underlying its superior fiber quality. Furthermore, the distribution of the Tissue Specificity Index (Tau) revealed a shift in regulatory strategies (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eC). While both species exhibited a bimodal distribution, TM-1 showed an expansion in the number of highly tissue-specific genes (Tau\u0026thinsp;\u0026gt;\u0026thinsp;0.5). To verify this differentiation at the gene level, we further performed pairwise comparisons of Tau indices for orthologous genes (Supplementary Fig.\u0026nbsp;1). The analysis showed that despite extremely high evolutionary conservation genome-wide (R\u0026thinsp;=\u0026thinsp;0.951), confirming that functional specialization of most genes has been \"canalized,\" we still identified a series of \"outlier genes\" significantly deviating from the diagonal. 
These genes, which underwent drastic drift in expression breadth between species, combined with the expansion observed in Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eC, strongly suggest that the evolutionary driving force for the \"high yield\" trait of TM-1 stems from the directional specialization and remodeling of key functional modules in specific organs (especially underground roots and reproductive sinks).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec13\" class=\"Section2\"\u003e \u003ch2\u003e3.2 Asymmetric Expansion of Tissue-Specific Gene Repertoires and Phase Inversion of Fiber Development Programs\u003c/h2\u003e \u003cp\u003eTo dissect the fine-tuning mechanisms determining \"yield-quality\" divergence, we first systematically deconstructed the distribution patterns of tissue-specific gene repertoires (SPM\u0026thinsp;\u0026gt;\u0026thinsp;0.5) in the two species using UpSet plots (Supplementary Figs.\u0026nbsp;2 \u0026amp; 3). The analysis revealed that Upland cotton and Sea Island cotton diverged in their investment directions for \"functionally specialized genes\" during domestication. Upland cotton (TM-1) exhibited a significant \"underground-priority\" strategy (Supplementary Fig.\u0026nbsp;3). Its root-specific gene number underwent an explosive expansion, constituting the most significant specialized module among all tissues, suggesting the evolution of a complex root network to support high yield potential. However, as an evolutionary trade-off, its number of specific genes significantly contracted during the middle to late stages of fiber development (10\u0026ndash;25 DPA). 
In contrast, although Sea Island cotton (H7124) had fewer root genes, it maintained sustained high investment in reproductive sink organs (Supplementary Fig.\u0026nbsp;2); particularly in 20 DPA Fiber (elongation-thickening transition period) and 25 DPA Fiber, H7124 still retained an extremely vast specific gene repertoire, providing the genetic basis for the fine shaping of \"extra-long staple\" quality. This asymmetric distribution of gene repertoires\u0026mdash;\"TM-1 strong roots\" vs. \"H7124 strong fibers\"\u0026mdash;further manifested as a drastic \"phase inversion\" in the spatiotemporal dynamics of the transcriptome (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e):\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eFirst, the overall distribution of gene expression (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eb) and stratified sharing analysis (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003ec) reconfirmed the evolutionary conservation of vegetative growth. In vegetative organs such as roots, stems, and leaves, the two species not only shared highly consistent overall distribution shapes of gene expression abundance but also retained a high proportion of common genes in all expression windows (especially medium-to-high expression intervals) (green bars, approx. 
30%-50%, Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003ea), indicating that the transcriptional network maintaining basic biomass accumulation in cotton plants is stable across species.\u003c/p\u003e \u003cp\u003eHowever, entering the critical stages of fiber and ovule development, the transcriptional program underwent fundamental remodeling, presenting a significant \"species-biased phase inversion\" (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003ea): H7124-dominated \"Delayed Elongation\" Phase (20 DPA): At the critical node of transition from fiber elongation to secondary wall thickening, Sea Island cotton-specific expressed genes (red bars) occupied absolute dominance (approx. 70%), while common genes and TM-1 specific genes were compressed to extremely low proportions. Combined with the wider high-expression interval of H7124 in this period in Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eb, this indicates that Sea Island cotton extends the developmental window of fiber elongation by specifically activating a vast transcriptional network (i.e., the gene repertoire observed in Supplementary Fig.\u0026nbsp;2), thereby achieving the high-quality phenotype. TM-1-dominated \"Precocious Filling\" Phase (25 DPA): Subsequently, during the high-speed secondary wall synthesis period, the trend reversed. The proportion of Upland cotton-specific genes (blue bars) significantly rebounded and took dominance. This suggests that Upland cotton initiates the secondary wall deposition program centered on biomass accumulation earlier, reflecting its breeding selection strategy for \"early maturity and high yield (high lint percentage)\". 
In summary, although the basic transcriptional chassis of cotton is conserved, Upland and Sea Island cotton likely achieved spatiotemporal separation of developmental programs by recruiting distinct specific gene modules: H7124 adopted a delay strategy of \"trading time for quality,\" while TM-1 adopted a precocious strategy of \"efficiency first\". This phase inversion at the transcriptional level (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e), which is associated with the asymmetric expansion of gene repertoires (Supplementary Figs.\u0026nbsp;2 \u0026amp; 3), provides systematic molecular insights for understanding the evolutionary trade-off between \"high yield\" and \"high quality\".\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec14\" class=\"Section2\"\u003e \u003ch2\u003e3.3 WGCNA Reveals a \"Delayed Elongation\" Regulatory Module Centered on MYB-AP2\u003c/h2\u003e \u003cp\u003eTo deconstruct the \"fiber development phase inversion\" phenomenon observed in Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e and identify the core drivers maintaining high transcriptional activity in Sea Island cotton (H7124) at 20 DPA, we constructed a gene co-expression network using WGCNA. The analysis pinpointed Module 6 as the key target module: the expression profile of this module closely matches the \"delayed elongation\" phenotype of Sea Island cotton (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003ea), maintaining extremely high activity in H7124 fibers at 10\u0026ndash;20 DPA, while rapidly declining in Upland cotton (TM-1) during the same period. 
This significant species-specific expression pattern may imply that Module 6 is the transcriptional basis for Sea Island cotton fibers breaking through the conventional elongation time limit.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eCounts of the gene families contained in this module (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eb) showed that it is enriched in Protein Kinase (Pkinase) and Leucine-Rich Repeat (LRR) domains, suggesting that cells may be receiving continuous growth signals to maintain an active metabolic state. More critically, at the transcriptional regulation level (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003ec), the MYB and AP2/ERF families were the most abundant. The MYB family (38 members), whose members are recognized fiber development regulators, ranked first, followed closely by the AP2/ERF family (37 members), whose members typically act as ethylene response factors regulating cell elongation and stress adaptation. Considering the classic functions of the MYB family in secondary wall formation and fiber elongation, and the key role of AP2/ERF in coordinating hormone signals and environmental adaptation, the high co-enrichment of these two families in Module 6 implies that the Sea Island cotton-specific fiber elongation network may be orchestrated synergistically by these two types of key factors. To test this hypothesis and trace the genetic roots of the expression divergence of Module 6 between species, we subsequently performed a systematic genome-wide identification of the MYB and AP2/ERF families. 
By intersecting gene family members with whole-genome PAV (Presence/Absence Variation) lists, supplemented by strict reciprocal BLAST validation (Supplementary Figs.\u0026nbsp;4 \u0026amp; 5), we identified six key transcription factors that underwent \"reciprocal loss\" between the two cotton species. To resolve the physical mechanisms of these loss events, we further performed microsynteny and structural variation analyses on them.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec15\" class=\"Section2\"\u003e \u003ch2\u003e3.4 Reciprocal Genomic Structural Variation and Physical Loss Mechanisms of Key Transcription Factors\u003c/h2\u003e \u003cp\u003eTo dissect the physical basis of expression divergence for the key transcription factors identified by WGCNA, PAV, and reciprocal BLAST, we employed a \"Reciprocal Whole-genome Alignment\" strategy. First, SyRI analysis was used to construct a macroscopic structural variation map with each genome in turn as the reference (Supplementary Figs.\u0026nbsp;6 \u0026amp; 7). Chromosome-scale visualization displayed a highly collinear background (gray links) between TM-1 and H7124 at the whole-genome level (A01-A13, D01-D13), consistent with high-quality and structurally stable genome assemblies for both species. Based on this macroscopic framework, we extracted local alignment information for the six candidate gene loci from the whole-genome alignment results and plotted high-resolution microscopic structural variation maps (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e). 
This progressive \"zoom-in\" analysis from \"whole-chromosome background\" to \"specific loci\" revealed two lineage-specific loss mechanisms:\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eFirst, regarding \u003cem\u003eMYB111\u003c/em\u003e (A12), which is specifically retained in Sea Island cotton and serves as the core factor driving \"delayed elongation\" as mentioned earlier, SyRI analysis revealed the root cause of its silencing in Upland cotton: a physical deletion (NOTAL) of 5.06 kb occurred at the corresponding chromosomal position in TM-1 (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eb). This \"clean\" genomic excision directly removed the coding region of \u003cem\u003eMYB111\u003c/em\u003e, genetically terminating the potential for further fiber elongation in Upland cotton, thereby shifting the strategy toward early maturity and high yield. In a mirror contrast, \u003cem\u003eWIND1\u003c/em\u003e (A02), specifically retained in Upland cotton and a potential regulator of cell regeneration and root plasticity, underwent a catastrophic loss event in Sea Island cotton. The corresponding region in H7124 suffered a massive deletion of 201.90 kb (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003ea), resulting in the complete removal of the gene and its surrounding regulatory regions. This may explain the defects in root adaptability (i.e., the \"delicate\" nature) of Sea Island cotton. In addition to physical deletion, Sequence Collapse (HDR) represents another major loss mechanism. 
For instance, the Upland-specific \u003cem\u003eMYB93\u003c/em\u003e (A11) and \u003cem\u003eCRF10\u003c/em\u003e (A12) both fell into Highly Diverged Regions (HDR) at their corresponding loci in Sea Island cotton, indicating that the original gene sequences underwent drastic rearrangement or degeneration, leading to loss of function.\u003c/p\u003e \u003cp\u003eIn summary, Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e implies that this \"double dissociation\" phenomenon between the two cultivated species is not a sequencing error but likely a genuine genomic event. Upland cotton \"discarded\" quality genes like \u003cem\u003eMYB111\u003c/em\u003e in exchange for early maturity, while Sea Island cotton \"discarded\" adaptability genes like \u003cem\u003eWIND1\u003c/em\u003e in exchange for specialized development. This asymmetric structural variation on homologous chromosomes constitutes the solid genetic basis for the \"yield-quality\" phenotypic trade-off between the two.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec16\" class=\"Section2\"\u003e \u003ch2\u003e3.5 Pan-genome Microsynteny Reveals Lineage-Specific Loss Trajectories of Key Factors\u003c/h2\u003e \u003cp\u003eTo further confirm that the structural variations identified in Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e were driven by directional selection during domestication rather than random events, we expanded our scope to the evolutionary history of the Gossypium genus. By incorporating the genomes of diploid ancestors (\u003cem\u003eG. raimondii\u003c/em\u003e, \u003cem\u003eG. arboreum\u003c/em\u003e) and wild tetraploids (\u003cem\u003eG. tomentosum\u003c/em\u003e, \u003cem\u003eG. 
mustelinum\u003c/em\u003e), we constructed high-resolution microsynteny maps for the six key loci (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe analysis revealed an evolutionary pattern of \"ancestral omnipotence vs. descendant specialization,\" supporting the RSR hypothesis. \"Upland-style Elimination\" of Quality Genes: Focusing on \u003cem\u003eMYB111\u003c/em\u003e (associated with delayed fiber elongation) and \u003cem\u003eERF105/ERF017\u003c/em\u003e (environmental buffering factors) (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e right panel), we found that these genes maintained intact gene structures and collinear relationships in diploid ancestors and wild tetraploids. This suggests that the potential for \"high quality/long staple\" is an ancient trait of the Gossypium genus. However, on the divergence branch of Upland cotton (TM-1, J668), these loci underwent precise physical excision. This suggests that Upland cotton may have \"actively\" discarded these high-energy-consuming quality regulatory genes during domestication to achieve early maturity and high yield.\u003c/p\u003e \u003cp\u003e\"Sea Island-style Elimination\" of Adaptive Genes: Conversely, at the \u003cem\u003eMYB93\u003c/em\u003e (root architecture) and \u003cem\u003eWIND1\u003c/em\u003e (regeneration capacity) loci (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e left panel), the Sea Island lineage exhibited specific loss. In particular, while \u003cem\u003eWIND1\u003c/em\u003e existed intact in Upland cotton and all wild relatives, a significant genomic gap appeared in H7124 and 3-79. This \"functional deficiency\", which occurs specifically in the Sea Island lineage, may underlie its poor root adaptability and environmental sensitivity (i.e., being \"delicate\"). 
In summary, microsynteny evidence indicates that the phenotypic divergence between Upland and Sea Island cotton did not stem from the evolution of new genes, but rather from their inheritance of a complete gene pool from a common ancestor, followed by divergent \"genomic subtraction\" in opposite directions. This complementary trajectory of gene loss ultimately solidified their distinct agronomic traits.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec17\" class=\"Section2\"\u003e \u003ch2\u003e3.6 Pan-genomic Evolutionary Footprints Confirm the Universality of Reciprocal Selective Retention (RSR)\u003c/h2\u003e \u003cp\u003eTo test whether the RSR pattern revealed in Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e and Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e is universal across cotton germplasm resources, we extended the analysis to 26 Gossypium pan-genomes, covering diploid ancestors, wild tetraploids, and modern cultivars. Using the six key genes as probes, we constructed high-resolution phylogenomic footprints (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e). Pan-genomic alignment categorized these six genes into two distinct evolutionary trajectories, suggesting that the \"3-vs-3\" reciprocal loss pattern is likely a fixed genetic divergence at the species level between Upland and Sea Island cotton.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e\"Alternative Detection\" of Homeologs Reveals Subgenome-Specific Loss: In the \"Upland-Retained Group\" (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003ea, e.g., \u003cem\u003eWIND1\u003c/em\u003e), all Upland cotton accessions (TM-1, J668, etc.) 
showed high-identity matches located on chromosome A02 (red bubbles, Identity\u0026thinsp;\u0026gt;\u0026thinsp;99%), indicating that this gene is core and conserved within the Upland cotton population. However, in all Sea Island cotton accessions (3-79, H7124, Giza7, etc.), BLAST could only detect lower-identity matches located on chromosome D02 (blue bubbles, Identity\u0026thinsp;~\u0026thinsp;98%). This \"chromosomal drift\" from A02 to D02 is crucial: it implies that the original A-subgenome functional copy (\u003cem\u003eWIND1\u003c/em\u003e) in the Sea Island cotton genome has been completely lost, forcing the probe to \"settle for the second best\" and match the D-subgenome homeolog. This population-wide loss pattern supports the view that the absence of \u003cem\u003eWIND1\u003c/em\u003e is a species characteristic of Sea Island cotton, rather than a mutation in individual varieties.\u003c/p\u003e \u003cp\u003eCollective Silencing of \"Quality Genes\" in Upland Population: Conversely, in the \"Sea Island-Retained Group\" (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003eb, e.g., \u003cem\u003eMYB111\u003c/em\u003e), the Sea Island cotton population consistently retained the intact gene on chromosome A12. However, in the Upland cotton population, this locus showed extremely low coverage (small bubbles) or homeologous drift. This further increases the likelihood that Upland cotton, in pursuit of early maturity and high yield, \"purged\" this fiber elongation factor at the whole-population level during domestication. In summary, pan-genomic evidence indicates that the RSR mechanism is not accidental genetic drift but a \"core genetic imprint\" solidified in the genomes of the two species after long-term domestication selection. This highly consistent \"retention vs. 
loss\" pattern between species provides a solid theoretical basis for improving Upland cotton using wild germplasm resources or the Sea Island cotton gene pool.\u003c/p\u003e \u003c/div\u003e"},{"header":"4. Discussion","content":"\u003cdiv id=\"Sec19\" class=\"Section2\"\u003e \u003ch2\u003e4.1 Reciprocal Selective Retention (RSR): A New Genomic Evolutionary Perspective Breaking the \"Yield-Quality\" Negative Correlation\u003c/h2\u003e \u003cp\u003eThe negative correlation between \"high yield\" (typified by Upland cotton) and \"high quality\" (typified by Sea Island cotton) has long been regarded as an insurmountable bottleneck in cotton breeding, often attributed to linkage drag or physiological energy constraints (Yang et al., \u003cspan citationid=\"CR59\" class=\"CitationRef\"\u003e2026\u003c/span\u003e). However, our multi-omics integrated analysis offers an alternative genomic explanation: the Reciprocal Selective Retention (RSR) evolutionary model. Unlike the gradual accumulation of Single Nucleotide Polymorphisms (SNPs), RSR reveals a radical \"genomic subtraction\" strategy, where species actively eliminated genetic modules inconsistent with their survival strategies (Olson, \u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e1999\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eSpecifically, this model reveals the significantly divergent evolutionary logics adopted by the two species: (1) Upland Cotton's \"High-Yield Robustness Module\": To support its superior yield potential, Upland cotton specifically retained \u003cem\u003eCRF10\u003c/em\u003e (Cytokinin Response Factor, potentially regulating sink capacity and biomass) (Rashotte et al. \u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e2006\u003c/span\u003e), \u003cem\u003eWIND1\u003c/em\u003e (cell regeneration factor) (Iwase et al. 2011), and \u003cem\u003eMYB93\u003c/em\u003e (root system architecture factor) (Gibbs et al. \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e2014\u003c/span\u003e). 
These genes collectively construct a genetic chassis integrating \"strong acquisition (roots), strong sink capacity (CRF), and strong repair (WIND)\". (2) Sea Island Cotton's \"Quality Specialization Module\": As an evolutionary cost, Sea Island cotton lost the aforementioned broad-adaptability genes, instead specifically retaining \u003cem\u003eMYB111\u003c/em\u003e (maintaining high flavonoid levels to finely tune/delay the end of elongation, thereby achieving longer fibers) (Stracke et al. \u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e2007\u003c/span\u003e; Tan et al. \u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e2013\u003c/span\u003e) and \u003cem\u003eERF105/ERF017\u003c/em\u003e (stress defense factors) (Bolt et al. \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e2017\u003c/span\u003e). We speculate that these ERF members may act as \"environmental buffers\" during the prolonged fiber development window, providing necessary physiological homeostasis protection for the precise synthesis of high-quality fibers. This asymmetric \"hardware\" solidification on homologous chromosomes constitutes the key genomic structural basis for the phenotypic trade-off between the two species.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec20\" class=\"Section2\"\u003e \u003ch2\u003e4.2 Transcriptome \"Phase Inversion\": A Developmental Trade-off of \"Time\" for \"Quality\"\u003c/h2\u003e \u003cp\u003eOur comparative transcriptomic landscape reveals an intriguing paradox in cotton domestication: vegetative organs are highly conserved, while reproductive organs undergo drastic remodeling. 
This phenomenon implies that the cotton genome is subject to strong evolutionary constraints to maintain basic life activities (transcriptional chassis) on one hand, while possessing extremely high plasticity in key traits like fiber development on the other.\u003c/p\u003e \u003cp\u003eThe most significant discovery is the \"Phase Inversion\" of the transcriptional program during fiber development. The \"delayed elongation\" network specifically activated by Sea Island cotton at 20 DPA is essentially a strategy of \"trading time for quality.\" Through the sustained expression of key factors like \u003cem\u003eMYB111\u003c/em\u003e, Sea Island cotton delays the initiation of secondary wall thickening, thereby gaining a longer elongation window, which is consistent with previous physiological observations (Avci et al., \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2013\u003c/span\u003e). In contrast, the transcriptional profile of Upland cotton reflects an \"efficiency-first\" strategy: the physical loss of \u003cem\u003eMYB111\u003c/em\u003e leads to the premature termination of the elongation program, forcing cells to rapidly enter the secondary wall filling stage (the blue dominant peak at 25 DPA). Although this \"precocious mode\" limits fiber length, it significantly shortens the growth cycle and increases boll weight (biomass) (Haigler et al. 
\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e2012\u003c/span\u003e), perfectly fitting the demand for \"high lint percentage and short accumulated temperature\" in extensive cultivation.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec21\" class=\"Section2\"\u003e \u003ch2\u003e\u003cb\u003e4.3 Evolutionary Constraints and Adaptive Differentiation: Evolutionary Trade-off between \"Resource Acquisition\" and \"Sink Development Timing\"\u003c/b\u003e\u003c/h2\u003e \u003cp\u003eOur comparative transcriptomic landscape and pan-genomic analysis further reveal the co-evolutionary logic of \"resource acquisition (Source)\" and \"sink capacity construction (Sink)\" during cotton domestication (White et al., \u003cspan citationid=\"CR51\" class=\"CitationRef\"\u003e2016\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eFirst is the \"underground cornerstone\" for building efficient environmental adaptability. Our data indicate that TM-1 exhibits a significant expansion of specific genes in roots and specifically retains \u003cem\u003eMYB93\u003c/em\u003e. Previous studies have shown that the homolog of \u003cem\u003eMYB93\u003c/em\u003e in Arabidopsis acts as a negative regulator of lateral root development (Gibbs et al., \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e2014\u003c/span\u003e). The retention of this \"suppressor\" by Upland cotton may not be accidental but an optimization strategy for Root System Architecture (RSA): limiting ineffective proliferation of shallow lateral roots to promote deep rooting of the main root or improve water and fertilizer use efficiency (Lynch, \u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e1995\u003c/span\u003e). 
Compared to the sensitivity of Sea Island cotton roots to specific environments, this root remodeling based on genomic specific retention provides a foundation for Upland cotton to establish a robust nutrient absorption network in diverse environments, offering sufficient material and energy support for the subsequent reproductive burst (Hu et al., \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e2019\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eSecondly, supported by a robust \"Source\", Upland cotton achieved an explosive expansion of \"Sink Capacity\". The core of the yield difference lies in the increase in boll number per plant and seed number per boll, which mainly depends on the number of locules. Upland cotton typically has 4\u0026ndash;5 locules, significantly higher than the 3-locule characteristic of Sea Island cotton (Zhang et al., \u003cspan citationid=\"CR65\" class=\"CitationRef\"\u003e2015\u003c/span\u003e; Viot and Wendel, \u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e2023\u003c/span\u003e). Our analysis precisely captured the transcriptomic imprint determining this trait: the fundamental remodeling of the reproductive organ development program. On one hand, TM-1 showed significant specific gene amplification in anthers, which may enhance pollen viability and fertilization efficiency to meet the high demand of multi-locule and multi-ovule fertilization for male gametes; on the other hand, our common gene analysis revealed that the pistil\u0026mdash;the maternal tissue determining carpel fusion and locule number\u0026mdash;is the most drastically differentiated floral organ between the two species (common gene proportion\u0026thinsp;\u0026lt;\u0026thinsp;5%) (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e). 
This highly specific transcriptional profile suggests that Upland cotton may have broken the genetic limit of \"3 locules\" in Sea Island cotton by reconstructing the meristem maintenance pathway or floral organ identity gene network, thereby establishing a high sink capacity architecture of \"4\u0026ndash;5 locules\", which represents a significant advantage for yield.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec22\" class=\"Section2\"\u003e \u003ch2\u003e4.4 Implications for Molecular Module Breeding: Reassembling \"High-Yield Chassis\" and \"Quality Modules\"\u003c/h2\u003e \u003cp\u003eThe RSR model suggests new directions for cotton genetic improvement. Since the divergence of \"high yield\" and \"high quality\" stems from the lineage-specific loss of functional modules, future breeding strategies should shift from traditional hybridization to \"Genomic Design\" based on pan-genomic information.\u003c/p\u003e \u003cp\u003eStrategy 1: Reshaping the adaptability of Sea Island cotton. To address the restricted cultivation range of Sea Island cotton, gene editing or precise introgression could be employed to \"reintroduce\" the Upland cotton-specific \u003cem\u003eMYB93\u003c/em\u003e-\u003cem\u003eWIND1\u003c/em\u003e root module. Given the conserved roles of \u003cem\u003eWIND1\u003c/em\u003e and \u003cem\u003eMYB93\u003c/em\u003e in cell regeneration and root architecture (Iwase et al., 2011; Gibbs et al., \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e2014\u003c/span\u003e), this strategy has the potential to restore the underground robustness of Sea Island cotton, thereby enhancing its environmental plasticity.\u003c/p\u003e \u003cp\u003eStrategy 2: Breaking the quality ceiling of Upland cotton. To address the fiber-length limitation of Upland cotton, one could introduce the \u003cem\u003eMYB111\u003c/em\u003e delay module from Sea Island cotton. 
Previous studies have linked flavonoid metabolism to fiber elongation (Tan et al., \u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e2013\u003c/span\u003e), and \u003cem\u003eMYB111\u003c/em\u003e is a key regulator of this pathway (Stracke et al., \u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e2007\u003c/span\u003e). However, simply restoring the gene may delay maturity. Therefore, using fiber-specific promoters (such as secondary cell wall, SCW, promoters) to fine-tune its expression window (Huang et al., \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e2021\u003c/span\u003e), so that the elongation phase is moderately extended without significantly prolonging the total growth period, will be key to creating new \"high-yield and high-quality\" Upland cotton varieties.\u003c/p\u003e \u003c/div\u003e"},{"header":"5. Conclusion","content":"\u003cp\u003eThis study systematically dissected the genetic basis of the \"yield-quality\" phenotypic trade-off between Upland and Sea Island cotton by integrating multi-tissue transcriptomics and pan-genomic structural variation analysis. We found that: (1) The two species share a conserved transcriptional chassis but achieve functional specialization through asymmetric expansion of gene repertoires in specific tissues (Upland roots/anthers vs. Sea Island fibers). (2) The \"Phase Inversion\" of the transcriptional program during fiber development is key to fiber quality differences, with Sea Island cotton securing a longer elongation window through a delayed expression strategy. (3) The \"high yield\" characteristic of Upland cotton benefits from its specifically retained adaptive genetic modules. 
For instance, the retained \u003cem\u003eMYB93\u003c/em\u003e and \u003cem\u003eWIND1\u003c/em\u003e may have optimized root architecture and environmental adaptability, building a robust \"underground resource acquisition\" system to support the explosive expansion of the above-ground reproductive sink. (4) This process is driven by the Reciprocal Selective Retention (RSR) mechanism at the genomic level, i.e., the complementary loss of key members of the MYB and AP2/ERF families on homologous chromosomes (e.g., Upland cotton discarding \u003cem\u003eMYB111\u003c/em\u003e, Sea Island cotton discarding \u003cem\u003eWIND1\u003c/em\u003e). In summary, the RSR mechanism illustrates how \"genomic subtraction\" can shape polyploid domestication. Our findings provide a theoretical framework and candidate genomic targets for breaking the long-standing negative correlation between yield and quality through future genomic design breeding.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e \u003ch2\u003eDeclaration of Competing Interest\u003c/h2\u003e \u003cp\u003eThe authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.\u003c/p\u003e \u003c/p\u003e\u003ch2\u003eFunding\u003c/h2\u003e \u003cp\u003eThis work was supported by the National Key Research and Development Program of China (Grant No. 2024YFD1200300), the National Natural Science Foundation of China (Grant No. 31701471), and the State Key Laboratory of Cotton Bio-breeding and Integrated Utilization (Grant No. CB2025A13).\u003c/p\u003e\u003ch2\u003eAuthor Contribution\u003c/h2\u003e\u003cp\u003eYifan Xu : Conceptualization, Methodology, Software, Formal analysis, Investigation, Data curation, Writing \u0026ndash; original draft, Visualization. Zaoyang Gong : Validation, Data curation. Qinglin Shen : Validation, Investigation. Rongzheng Zhao : Data curation, Software. 
Chuankang Cheng : Validation, Investigation. Wanting Su : Data curation, Visualization. Yibing Li : Validation, Investigation. Xingrui Yang : Data curation. Xin Ruan : Investigation. Fengyun Li : Validation. Kai Guo : Resources, Investigation. Dajun Liu : Resources, Investigation. Xueying Liu : Supervision, Writing \u0026ndash; review \u0026amp; editing. Zhonghua Teng : Supervision, Writing \u0026ndash; review \u0026amp; editing. Fang Liu : Supervision, Writing \u0026ndash; review \u0026amp; editing. Zhengsheng Zhang : Resources, Supervision. Yanchao Xu : Conceptualization, Methodology, Software, Supervision, Writing \u0026ndash; review \u0026amp; editing. Dexin Liu : Funding acquisition, Project administration, Supervision, Writing \u0026ndash; review \u0026amp; editing.\u003c/p\u003e\u003ch2\u003eAcknowledgement\u003c/h2\u003e\u003cp\u003eWe acknowledge the High Performance Computing (HPC) clusters at Southwest University for their support.\u003c/p\u003e\u003ch2\u003eData Availability\u003c/h2\u003e\u003cp\u003eData will be made available on request.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eAlagarsamy M (2023) Assessing genetic variation in Gossypium barbadense L. germplasm based on fibre characters. J Cotton Res 6:15. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1186/s42397-023-00153-y\u003c/span\u003e\u003cspan address=\"10.1186/s42397-023-00153-y\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAvci U, Pattathil S, Singh B, Brown VL, Hahn MG, Haigler CH (2013) Cotton fiber cell walls of Gossypium hirsutum and Gossypium barbadense have differences related to loosely-bound xyloglucan. PLoS ONE 8:e56315. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1371/journal.pone.0056315\u003c/span\u003e\u003cspan address=\"10.1371/journal.pone.0056315\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBolt S, Zuther E, Zintl S et al (2017) ERF105 is a transcription factor gene of Arabidopsis thaliana required for freezing tolerance and cold acclimation. Plant Cell Environ 40:108\u0026ndash;120. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1111/pce.12838\u003c/span\u003e\u003cspan address=\"10.1111/pce.12838\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCamacho C, Coulouris G, Avagyan V et al (2009) BLAST+: architecture and applications. BMC Bioinformatics 10:421. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1186/1471-2105-10-421\u003c/span\u003e\u003cspan address=\"10.1186/1471-2105-10-421\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChang X, He X, Li J et al (2024) High-quality Gossypium hirsutum and Gossypium barbadense genome assemblies reveal the landscape and evolution of centromeres. Plant Commun 5:100722. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.xplc.2023.100722\u003c/span\u003e\u003cspan address=\"10.1016/j.xplc.2023.100722\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChen X, Chen W, Li X et al (2012) Molecular mechanisms of fiber differential development between G. barbadense and G. hirsutum revealed by genetical genomics. PLoS ONE 7:e30056. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1371/journal.pone.0030056\u003c/span\u003e\u003cspan address=\"10.1371/journal.pone.0030056\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChen ZJ, Sreedasyam A, Ando A et al (2020) Genomic diversifications of five Gossypium allopolyploid species and their impact on cotton improvement. Nat Genet 52:525\u0026ndash;533. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41588-020-0614-5\u003c/span\u003e\u003cspan address=\"10.1038/s41588-020-0614-5\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCheng H, Liu S, Zhang Y et al (2025) Comparative single-cell transcriptomic map reveals divergence in leaves between two cotton species at cell type resolution. J Adv Res. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.jare.2025.04.012\u003c/span\u003e\u003cspan address=\"10.1016/j.jare.2025.04.012\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCock PJ, Antao T, Chang JT et al (2009) Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics 25:1422\u0026ndash;1423\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eConway JR, Lex A, Gehlenborg N (2017) UpSetR: an R package for the visualization of intersecting sets and their properties. Bioinformatics 33:2938\u0026ndash;2940. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/bioinformatics/btx364\u003c/span\u003e\u003cspan address=\"10.1093/bioinformatics/btx364\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eEddy SR (2011) Accelerated profile HMM searches. PLoS Comput Biol 7:e1002195. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1371/journal.pcbi.1002195\u003c/span\u003e\u003cspan address=\"10.1371/journal.pcbi.1002195\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGibbs DJ, Vo\u0026szlig; U, Harding SA et al (2014) AtMYB93 is a novel negative regulator of lateral root development in Arabidopsis. New Phytol 203:1194\u0026ndash;1207. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1111/nph.12879\u003c/span\u003e\u003cspan address=\"10.1111/nph.12879\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGoel M, Schneeberger K (2022) plotsr: visualizing structural similarities and rearrangements between multiple genomes. Bioinformatics 38:2922\u0026ndash;2926. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/bioinformatics/btac196\u003c/span\u003e\u003cspan address=\"10.1093/bioinformatics/btac196\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGoel M, Sun H, Jiao WB et al (2019) SyRI: finding genomic rearrangements and local sequence differences from whole-genome assemblies. Genome Biol 20:277. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1186/s13059-019-1911-0\u003c/span\u003e\u003cspan address=\"10.1186/s13059-019-1911-0\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGrover CE, Yuan D, Arick MA et al (2019) The genome sequence of Gossypioides kirkii illustrates a descending dysploidy in plants. Front Plant Sci 10:1541. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3389/fpls.2019.01541\u003c/span\u003e\u003cspan address=\"10.3389/fpls.2019.01541\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHaigler CH, Betancur L, Stiff MR, Tuttle JR (2012) Cotton fiber: a powerful single-cell model for cell wall research. Front Plant Sci 3:104. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3389/fpls.2012.00104\u003c/span\u003e\u003cspan address=\"10.3389/fpls.2012.00104\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHu G, Wang Z, Tian Z et al (2025) A telomere-to-telomere genome assembly of cotton provides insights into centromere evolution and short-season adaptation. Nat Genet 57:1031\u0026ndash;1043. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41588-025-02130-4\u003c/span\u003e\u003cspan address=\"10.1038/s41588-025-02130-4\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHu Y, Chen J, Fang L et al (2019) Gossypium barbadense and Gossypium hirsutum genomes provide insights into the origin and evolution of allotetraploid cotton. Nat Genet 51:739\u0026ndash;748. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41588-019-0371-5\u003c/span\u003e\u003cspan address=\"10.1038/s41588-019-0371-5\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHuang G, Bao Z, Feng L et al (2024) A telomere-to-telomere cotton genome assembly reveals centromere evolution and a Mutator transposon-linked module regulating embryo development. Nat Genet 56:1953\u0026ndash;1963. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41588-024-01877-6\u003c/span\u003e\u003cspan address=\"10.1038/s41588-024-01877-6\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHuang J, Chen F, Guo Y et al (2021) GhMYB7 promotes secondary wall cellulose deposition in cotton fibres by regulating GhCesA gene expression through three distinct cis-elements. New Phytol 232:1718\u0026ndash;1737. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1111/nph.17612\u003c/span\u003e\u003cspan address=\"10.1111/nph.17612\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKeilwagen J, Wenk M, Erickson JL et al (2016) Using intron position conservation for homology-based gene prediction. Nucleic Acids Res 44:e89. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/nar/gkw092\u003c/span\u003e\u003cspan address=\"10.1093/nar/gkw092\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKryuchkova-Mostacci N, Robinson-Rechavi M (2017) A benchmark of tissue-specificity metrics for RNA-seq data. Brief Bioinform 18:205\u0026ndash;214. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/bib/bbw008\u003c/span\u003e\u003cspan address=\"10.1093/bib/bbw008\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLangfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9:559. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1186/1471-2105-9-559\u003c/span\u003e\u003cspan address=\"10.1186/1471-2105-9-559\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLi F, Fan G, Lu C et al (2015) Genome sequence of cultivated Upland cotton (Gossypium hirsutum TM-1) provides insights into genome evolution. Nat Biotechnol 33:524\u0026ndash;530. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/nbt.3208\u003c/span\u003e\u003cspan address=\"10.1038/nbt.3208\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLi H (2018) Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34:3094\u0026ndash;3100. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/bioinformatics/bty191\u003c/span\u003e\u003cspan address=\"10.1093/bioinformatics/bty191\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLi Y, Tu L, Pettolino FA et al (2016) GbEXPATR, a species-specific expansin, enhances cotton fibre elongation through cell wall restructuring. Plant Biotechnol J 14:1006\u0026ndash;1017. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1111/pbi.12450\u003c/span\u003e\u003cspan address=\"10.1111/pbi.12450\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLiu R, Xiao X, Gong J et al (2024) Genetic linkage analysis of stable QTLs in Gossypium hirsutum RIL population revealed function of GhCesA4 in fiber development. J Adv Res 60:133\u0026ndash;148. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.jare.2023.12.005\u003c/span\u003e\u003cspan address=\"10.1016/j.jare.2023.12.005\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLynch J (1995) Root architecture and plant productivity. Plant Physiol 109:7\u0026ndash;13. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1104/pp.109.1.7\u003c/span\u003e\u003cspan address=\"10.1104/pp.109.1.7\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eManolio TA, Collins FS, Cox NJ et al (2009) Finding the missing heritability of complex diseases. Nature 461:747\u0026ndash;753. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/nature08494\u003c/span\u003e\u003cspan address=\"10.1038/nature08494\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMar\u0026ccedil;ais G, Delcher AL, Phillippy AM et al (2018) MUMmer4: A fast and versatile genome alignment system. PLoS Comput Biol 14:e1005944. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1371/journal.pcbi.1005944\u003c/span\u003e\u003cspan address=\"10.1371/journal.pcbi.1005944\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMcInnes L, Healy J, Melville J (2018) UMAP: uniform manifold approximation and projection for dimension reduction. arXiv preprint arXiv:1802.03426\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMeng Q, Xie P, Xu Z et al (2025) Pangenome analysis reveals yield- and fiber-related diversity and interspecific gene flow in Gossypium barbadense L. Nat Commun 16:4995. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41467-025-60254-x\u003c/span\u003e\u003cspan address=\"10.1038/s41467-025-60254-x\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMistry J, Chuguransky S, Williams L et al (2021) Pfam: the protein families database in 2021. Nucleic Acids Res 49:D412\u0026ndash;D419. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/nar/gkaa913\u003c/span\u003e\u003cspan address=\"10.1093/nar/gkaa913\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eOlson MV (1999) When less is more: gene loss as an engine of evolutionary change. Am J Hum Genet 64:18\u0026ndash;23. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1086/302219\u003c/span\u003e\u003cspan address=\"10.1086/302219\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePeng R, Xu Y, Tian S et al (2022) Evolutionary divergence of duplicated genomes in newly described allotetraploid cottons. Proc Natl Acad Sci USA 119:e2208496119. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1073/pnas.2208496119\u003c/span\u003e\u003cspan address=\"10.1073/pnas.2208496119\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePerkin LC, Bell A, Hinze LL et al (2021) Genome assembly of two nematode-resistant cotton lines (Gossypium hirsutum L.). G3 Genes Genomes Genet 11:jkab276. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/g3journal/jkab276\u003c/span\u003e\u003cspan address=\"10.1093/g3journal/jkab276\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePinglay S, Lalanne J-B, Daza RM et al (2025) Multiplex generation and single-cell analysis of structural variants in mammalian genomes. Science 387:ado5978. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1126/science.ado5978\u003c/span\u003e\u003cspan address=\"10.1126/science.ado5978\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eR Core Team (2025) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.R-project.org/\u003c/span\u003e\u003cspan address=\"https://www.R-project.org/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRashotte AM, Mason MG, Hutchison CE et al (2006) A subset of Arabidopsis AP2 transcription factors mediates cytokinin responses in concert with a two-component pathway. Proc Natl Acad Sci USA 103:11081\u0026ndash;11085. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1073/pnas.0602038103\u003c/span\u003e\u003cspan address=\"10.1073/pnas.0602038103\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSheri V, Mohan H, Jogam P et al (2025) CRISPR/Cas genome editing for cotton precision breeding: mechanisms, advances, and prospects. J Cotton Res 8:4. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1186/s42397-024-00206-w\u003c/span\u003e\u003cspan address=\"10.1186/s42397-024-00206-w\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eStracke R, Ishihara H, Huep G et al (2007) Differential regulation of closely related R2R3-MYB transcription factors controls flavonol accumulation in different parts of the Arabidopsis thaliana seedling. Plant J 50:660\u0026ndash;677. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1111/j.1365-313X.2007.03078.x\u003c/span\u003e\u003cspan address=\"10.1111/j.1365-313X.2007.03078.x\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTan J, Tu L, Deng F et al (2013) A genetic and metabolic analysis revealed that cotton fiber cell development was retarded by flavonoid naringenin. Plant Physiol 162:86\u0026ndash;95. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1104/pp.112.212142\u003c/span\u003e\u003cspan address=\"10.1104/pp.112.212142\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eVan der Maaten L, Hinton G (2008) Visualizing data using t-SNE. J Mach Learn Res 9:2579\u0026ndash;2605\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eViot CR, Wendel JF (2023) Evolution of the cotton genus, Gossypium, and its domestication in the Americas. Crit Rev Plant Sci 42:1\u0026ndash;33. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1080/07352689.2022.2156061\u003c/span\u003e\u003cspan address=\"10.1080/07352689.2022.2156061\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eVizueta J, S\u0026aacute;nchez-Gracia A, Rozas J (2020) BITACORA: a comprehensive tool for the identification and annotation of gene families in genome assemblies. Mol Ecol Resour 20:1445\u0026ndash;1452. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1111/1755-0998.13202\u003c/span\u003e\u003cspan address=\"10.1111/1755-0998.13202\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWang J, Liang Y, Gong Z et al (2023) Genomic and epigenomic insights into the mechanism of cold response in upland cotton (Gossypium hirsutum). Plant Physiol Biochem 205:108206. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.plaphy.2023.108206\u003c/span\u003e\u003cspan address=\"10.1016/j.plaphy.2023.108206\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWang M, Li J, Qi Z et al (2022) Genomic innovation and regulatory rewiring during evolution of the cotton genus Gossypium. Nat Genet 54:1959\u0026ndash;1971. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41588-022-01237-2\u003c/span\u003e\u003cspan address=\"10.1038/s41588-022-01237-2\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWang M, Li J, Wang P et al (2021) Comparative genome analyses highlight transposon-mediated genome expansion and the evolutionary architecture of 3D genomic folding in cotton. Mol Biol Evol 38:3621\u0026ndash;3636. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/molbev/msab128\u003c/span\u003e\u003cspan address=\"10.1093/molbev/msab128\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWen X, Chen Z, Yang Z et al (2023) A comprehensive overview of cotton genomics, biotechnology and molecular biological studies. 
Sci China Life Sci 66:2214\u0026ndash;2256. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s11427-022-2278-0\u003c/span\u003e\u003cspan address=\"10.1007/s11427-022-2278-0\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWendel JF (2000) Genome evolution in polyploids. Plant Mol Biol 42:225\u0026ndash;249. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1023/A:1006392424384\u003c/span\u003e\u003cspan address=\"10.1023/A:1006392424384\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWhite AC, Rogers A, Rees M et al (2016) How can we make plants grow faster? A source\u0026ndash;sink perspective on growth rate. J Exp Bot 67:31\u0026ndash;45. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/jxb/erv447\u003c/span\u003e\u003cspan address=\"10.1093/jxb/erv447\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWickham H (2016) ggplot2: elegant graphics for data analysis. Springer, New York\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWilkins D (2020) gggenes: Draw Gene Arrow Maps in 'ggplot2'. R package version 0.4.1. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://CRAN.R-project.org/package=gggenes\u003c/span\u003e\u003cspan address=\"https://CRAN.R-project.org/package=gggenes\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eXu Y, Wei Y, Zhou Z et al (2024) Widespread incomplete lineage sorting and introgression shaped adaptive radiation in the Gossypium genus. Plant Commun 5:100728. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.xplc.2023.100728\u003c/span\u003e\u003cspan address=\"10.1016/j.xplc.2023.100728\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eXu Z, Wang G, Zhu X et al (2025) Genome assembly of two allotetraploid cotton germplasms reveals mechanisms of somatic embryogenesis and enables precise genome editing. Nat Genet 57:2028\u0026ndash;2039. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41588-025-02258-3\u003c/span\u003e\u003cspan address=\"10.1038/s41588-025-02258-3\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYan H, Han J, Jin S et al (2025) Post-polyploidization centromere evolution in cotton. Nat Genet. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41588-025-02115-3\u003c/span\u003e\u003cspan address=\"10.1038/s41588-025-02115-3\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYanai I, Benjamin H, Shmoish M et al (2005) Genome-wide midrange transcription profiles reveal expression level relationships in human tissue specification. Bioinformatics 21:650\u0026ndash;659. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/bioinformatics/bti042\u003c/span\u003e\u003cspan address=\"10.1093/bioinformatics/bti042\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYang Z, Wang J, Huang Y et al (2023) CottonMD: a multi-omics database for cotton biological study. Nucleic Acids Res 51:D1446\u0026ndash;D1456. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/nar/gkac863\u003c/span\u003e\u003cspan address=\"10.1093/nar/gkac863\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYang Z, Yang Z, Gao C et al (2026) Graph pan-genome illuminates evolutionary trajectories and agronomic trait architecture in allotetraploid cotton. Nat Genet 58:218\u0026ndash;229. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41588-025-02462-1\u003c/span\u003e\u003cspan address=\"10.1038/s41588-025-02462-1\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYao Y, Liu S, Xia C et al (2022) Comparative transcriptome in large-scale human and cattle populations. Genome Biol 23:176. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1186/s13059-022-02745-4\u003c/span\u003e\u003cspan address=\"10.1186/s13059-022-02745-4\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYu J, Jung S, Cheng CH et al (2021) CottonGen: the community database for cotton genomics, genetics, and breeding research. Plants 10:2805. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3390/plants10122805\u003c/span\u003e\u003cspan address=\"10.3390/plants10122805\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYuan D, Tang Z, Wang M et al (2015) The genome sequence of Sea-Island cotton (Gossypium barbadense) provides insights into the allopolyploidization and development of superior spinnable fibres. Sci Rep 5:17662. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/srep17662\u003c/span\u003e\u003cspan address=\"10.1038/srep17662\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang D, Zhou H, Zhang Y et al (2025) Diverse roles of MYB transcription factors in plants. J Integr Plant Biol. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1111/jipb.13869\u003c/span\u003e\u003cspan address=\"10.1111/jipb.13869\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang J, Yu M, Guo Y et al (2025) Evolutionary divergence of an ethylene-responsive transcriptional cascade governs a dose-dependent balance between cotton fiber length and strength. Adv Sci. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1002/advs.202514154\u003c/span\u003e\u003cspan address=\"10.1002/advs.202514154\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang T, Hu Y, Jiang W et al (2015) Sequencing of allotetraploid cotton (Gossypium hirsutum L. acc. TM-1) provides a resource for fiber improvement. Nat Biotechnol 33:531\u0026ndash;537. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/nbt.3207\u003c/span\u003e\u003cspan address=\"10.1038/nbt.3207\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang X, Tian X, Gao X et al (2025) Integrated metabolomic and transcriptomic analyses identify MYB genes regulating key metabolites and agronomic traits in upland cotton Gossypium hirsutum. Nat Genet 57:2819\u0026ndash;2830. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41588-025-02363-3\u003c/span\u003e\u003cspan address=\"10.1038/s41588-025-02363-3\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"functional-and-integrative-genomics","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"fige","sideBox":"Learn more about [Functional \u0026 Integrative Genomics](http://link.springer.com/journal/10142)","snPcode":"10142","submissionUrl":"https://submission.nature.com/new-submission/10142/3","title":"Functional \u0026 Integrative Genomics","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"Springer Hybrid","inReviewEnabled":true,"inReviewRevisionsEnabled":false},"keywords":"Cotton, Pan-genome, Multi-tissue comparative transcriptomics, Reciprocal Selective Retention (RSR), Structural variation, Phenotypic trade-off","lastPublishedDoi":"10.21203/rs.3.rs-8754476/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-8754476/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eThe long-standing phenotypic trade-off between Upland cotton (\u003cem\u003eGossypium hirsutum\u003c/em\u003e, high yield) and Sea Island cotton (\u003cem\u003eG. barbadense\u003c/em\u003e, high quality) represents a major bottleneck in cotton genetic improvement and domestication. Despite rapid advances in genomics, the genomic structural variations driving the divergence of these two domestication strategies, and their subsequent impact on downstream transcriptional networks, remain unclear. This study integrates multi-tissue transcriptomic atlases spanning the full growth period with pan-genomic analyses of both Upland and Sea Island cotton, proposing a \"Reciprocal Selective Retention (RSR)\" strategy adopted by the two species during evolution. Transcriptomic analysis revealed that while the transcriptional chassis of the two species is highly conserved, a drastic \"phase inversion\" occurs during fiber development. 
Sea Island cotton appears to trade off for quality by specifically activating a \"delayed elongation\" module (at 20 DPA), whereas Upland cotton initiates a \"precocious filling\" program to pursue yield. Further Weighted Gene Co-expression Network Analysis (WGCNA), combined with genomic Presence/Absence Variation (PAV) analysis, identified a set of key transcription factors exhibiting a \"3-vs-3\" reciprocal loss pattern. Upland cotton specifically retained \u003cem\u003eCRF10\u003c/em\u003e, \u003cem\u003eWIND1\u003c/em\u003e, and \u003cem\u003eMYB93\u003c/em\u003e, constructing a genetic foundation of \"robust root system and strong regeneration\" to support high yield. Conversely, Sea Island cotton specifically retained \u003cem\u003eMYB111\u003c/em\u003e and \u003cem\u003eERF105/017\u003c/em\u003e to maintain long-staple characteristics and environmental buffering. Whole-genome structural variation and microsynteny analyses confirmed that these differences stem from asymmetric physical deletions (e.g., a 5 kb deletion in \u003cem\u003eMYB111\u003c/em\u003e) or fragment sequence collapse (e.g., a 15.9 kb sequence divergence in \u003cem\u003eMYB93\u003c/em\u003e) on homologous chromosomes. 
The RSR model proposed in this study not only offers novel insights into the genetic architecture of the yield-quality trade-off but also provides precise genomic targets for molecular design breeding.\u003c/p\u003e","manuscriptTitle":"Multi-tissue Transcriptomic and Pan-genomic Analyses Reveal Reciprocal Selective Retention Driving the Phenotypic Trade-off in Gossypium hirsutum and Gossypium barbadense","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2026-02-18 12:07:03","doi":"10.21203/rs.3.rs-8754476/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"editorInvitedReview","content":"","date":"2026-04-29T20:34:26+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2026-04-23T20:30:53+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"123827201468452350315120134896446877108","date":"2026-04-16T06:00:04+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"180135560572631573225654754270824975070","date":"2026-04-13T14:54:13+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"39214548876907071845893860364312272243","date":"2026-04-01T14:19:04+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2026-02-27T09:40:29+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"182011700601367194880074794335693680354","date":"2026-02-16T10:21:13+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"22458058772292995767685200855674899275","date":"2026-02-14T12:34:41+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"206764597949495639497458417715586578323","date":"2026-02-13T14:09:01+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2026-02-13T05:46:26+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2026-02-12T13:52:16+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","
date":"2026-02-12T03:29:57+00:00","index":"","fulltext":""},{"type":"submitted","content":"Functional \u0026 Integrative Genomics","date":"2026-02-01T07:29:28+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"functional-and-integrative-genomics","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"fige","sideBox":"Learn more about [Functional \u0026 Integrative Genomics](http://link.springer.com/journal/10142)","snPcode":"10142","submissionUrl":"https://submission.nature.com/new-submission/10142/3","title":"Functional \u0026 Integrative Genomics","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"Springer Hybrid","inReviewEnabled":true,"inReviewRevisionsEnabled":false}}],"origin":"","ownerIdentity":"a079ccc9-0c92-4b97-a687-c2c0a13c3cdf","owner":[],"postedDate":"February 18th, 2026","published":true,"recentEditorialEvents":[{"type":"editorInvitedReview","content":"","date":"2026-04-29T20:34:26+00:00","index":79,"fulltext":""}],"rejectedJournal":[],"revision":"","amendment":"","status":"under-review","subjectAreas":[],"tags":[],"updatedAt":"2026-02-18T12:07:03+00:00","versionOfRecord":[],"versionCreatedAt":"2026-02-18 12:07:03","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-8754476","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-8754476","identity":"rs-8754476","version":["v1"]},"buildId":"XKTyCvWXoU3ODBz1xrDgd","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}