Population-level analysis of rice transposon polymorphisms reveals an LTR-retrotransposon insertion linked to grain length
Research Article | Research Square
Noemia Morales-Díaz, Josep M Casacuberta, Raúl Castanera
This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-9057165/v1 This work is licensed under a CC BY 4.0 License. Status: Under Review. Version 1 posted 8
Abstract
Transposable elements (TEs) constitute a major source of genetic variability in plants by generating insertion polymorphisms that contribute to genome evolution. LTR-retrotransposons are the most abundant plant TEs, but their association with traits has not been comprehensively studied. 
In this study, we take advantage of the large available genomic resources of rice to characterize LTR-retrotransposon polymorphisms in the indica subspecies and uncover their relationship with trait variability. Using genome-wide association studies based on TE polymorphisms (TIP-GWAS), we identified a non-reference LTR-retrotransposon insertion strongly associated with grain length in a region without previously known QTLs. The insertion is present in the upstream region of the IQD19 gene, which belongs to the calmodulin-binding domain protein family. The insertion reduces IQD19 gene expression through a DNA-methylation-independent mechanism, likely by interfering with transcription factor regulation. In this study, we identify a potential novel grain length regulator in a region that can be targeted to design genetic markers for rice breeding. We also provide high-quality genome assemblies for two closely related indica accessions for which the TE is polymorphic. Finally, our results underscore the importance of incorporating TE polymorphisms into the pool of genetic variants used for rice association mapping.
Keywords: Genome Wide Association Study (GWAS), Transposable Elements, Grain length
Background
Rice yield is a complex trait influenced by the interaction of genetic and environmental factors, which regulate yield-related traits such as the number of panicles, the number of spikelets, grain weight or the ratio of filled grains (Hirai et al. 2012; Li and Nanseki 2021). Among these, grain weight is itself affected by grain thickness, width and length (Liu et al. 2017). Notably, the two most widely cultivated rice subspecies exhibit distinct grain morphologies: indica varieties typically produce longer and more slender grains, whereas japonica grains tend to be shorter and more ovate in shape (Ikehashi 2009). 
Natural genetic variants in two QTLs (GS3 and GW5) can explain most of the grain length and width differences between the two subspecies (Duan et al. 2017). However, more than 600 QTLs and 31 genes have been described as associated with grain size in the two rice subspecies (Zhang et al. 2025). Some of the known grain-size genes are related to more than one seed trait, especially those associated with cellular proliferation. For example, RGA1 is involved in grain length and weight (Oki et al. 2005; Oki et al. 2009), GS3 in grain length, width and weight (Fan et al. 2006; Mao et al. 2010), RGG1 in grain length and width (Tao et al. 2020), and OsMADS1 in grain length and width (Liu et al. 2018), which highlights the complexity and the pleiotropic nature of these traits (Zhang et al. 2025). Some of the genetic variants related to grain size variation are due to changes in the coding region of certain genes, such as the deletion in the first exon of GL10 that leads to shorter grain length and lower grain weight (Zhan et al. 2022). However, in most cases the causal variants were found in the promoter regions, leading to transcriptional changes associated with grain size variation. This is the case of the structural variant (deletion) located upstream of the GSE5 gene, which leads to reduced expression of GSE5 and results in the development of wider grains (Duan et al. 2017).
Structural variants (SVs) are a major driver of genome evolution and have a strong potential to regulate gene expression and agronomic trait variability. Their study has gained interest in recent years, because the availability of high-quality genome assemblies allows their analysis with high resolution. Insertions and deletions are by far the most common types of SV, and transposable elements (TEs) are the main drivers of structural variant formation in rice (Qin et al. 2021). 
Due to their mobile nature, TEs have a strong potential to generate genetic diversity linked with transcriptional regulation and agronomic trait variation, as previously described for rice and multiple other crops (Wang et al. 2013; Domínguez et al. 2020; Castanera et al. 2023). LTR-retrotransposons (LTR-RTs) are the most abundant TE order in plants, and in rice they account for ~25% of the genome space (Qin et al. 2021). Recent analyses of TE polymorphisms in large rice populations suggest that LTR-RTs have undergone recent amplification, generating many new copies after the split of indica and japonica (Carpentier et al. 2019; Castanera et al. 2021). In this study, we have conducted a transposon insertion polymorphism-based GWAS (TIP-GWAS) in a large indica panel to determine whether specific LTR-retrotransposon polymorphisms are associated with grain length variability. We describe a recent LTR-RT insertion in chromosome 5 that is strongly associated with grain length. This insertion alters the expression of IQD19, a new potential regulator of rice grain size.
Materials and Methods
Plant material
Rice seeds from 8 indica accessions (Supplemental Table S1) were obtained from the International Rice Research Institute (IRRI) and grown under controlled greenhouse conditions, with temperatures set at 28ºC during the 16-hour day and 25ºC during the 8-hour night (16/8h photoperiod).
Nucleic acids extraction and DNA sequencing
Genomic DNA for PCR purposes was extracted from young rice leaves using the CTAB protocol (Inglis et al. 2018). Three-week-old leaves from KIKUBA::IRGC 70722-1 (hereafter referred to as KIKUBA) were collected to perform high molecular weight DNA extraction according to the MATAB method with modifications for Oxford Nanopore Sequencing (Escolà et al. 2023). 
Three-week-old leaves from MANARE (PHOM)::IRGC 61437-1 (hereafter referred to as MANARE) were collected to perform high molecular weight DNA extraction according to Kang et al. (2023). RNA was isolated from ten-day-old seedlings, flowers at 1 and 2 days after flowering (DAF), and young seeds at 10 DAF using the Maxwell RSC Plant kit and the Maxwell RSC Instrument from Promega Corporation. DNA was removed from the RNA samples using the DNA-free DNA Removal Kit from Invitrogen™. DNA from the accessions KIKUBA and MANARE was sequenced on a PromethION flow cell using the V14 chemistry.
Genome assembly and whole-genome comparisons
KIKUBA reads (110 GB) were filtered by length (cutoff 25 Kb) and assembled with Flye (Kolmogorov et al. 2019). The assembly was subjected to one round of Pilon (Walker et al. 2014) polishing with 20X Illumina reads (SRA accession ERS468964), and the polished contigs were scaffolded into pseudomolecules using RagTag (Alonge et al. 2022) based on the high-quality LIMA genome assembly, which belongs to the same subpopulation group (indica-3) (Yu et al. 2023). Finally, gap-closing was performed with 27X of ultra-long reads retrieved from the PromethION run (N50 = 58 Kb) using TGS-GapCloser. In the case of MANARE, reads (34 GB) were filtered by length (cutoff 8 Kb), assembled with Flye (Kolmogorov et al. 2019) and scaffolded into pseudomolecules using RagTag (Alonge et al. 2022) based on the high-quality LIMA genome (Yu et al. 2023). Genome annotation was carried out by lifting the gene models of LIMA, allowing for the detection of extra gene copies arising from gene duplications (parameters -copies -sc 0.97 -polish). Whole-genome alignments were performed using Minimap2 (Li 2018), and structural variants were detected from the alignments using SVIM-asm (Heller and Vingron 2021). Paftools call (Li 2018) was used to detect SNPs and indels from the Minimap2 alignments.
Methylation analyses
DNA methylation was analyzed using ONT sequencing data. 
Basecalling was performed with Standalone Dorado v1.0.2 (Oxford Nanopore Technologies) using the SUP model (v5.2.0) optimized for R10.4 flow cells and compatible with cytosine modification detection. The resulting basecalled reads were used for downstream methylation calling with the dorado basecaller using --modified-bases-models. To extract the methylation information in the different contexts, the modkit pileup command (Oxford Nanopore Technologies 2024) was run with the parameters --motif CG 0 --motif CHH 0 --motif CHG 0. For methylation analysis, average methylation levels were calculated in sliding windows of 100 bp with a 50 bp overlap.
PCR and Real-time quantitative PCR
Polymerase chain reactions (PCR) were used for genotyping according to the standard protocol for DreamTaq™ DNA polymerase (Thermo Fisher). For reverse transcription quantitative PCR (RT-qPCR), RNA samples were used to synthesize cDNA with the SuperScript® III Reverse Transcriptase from Invitrogen™. Roche’s SYBR green Master Mix (Roche Applied Science) was employed with the standard protocol (initial denaturation at 95ºC for 5 min, followed by 40 cycles of 95ºC for 10 s, 56ºC for 10 s and 72ºC for 10 s). Three reference genes were used to normalize the expression levels according to the 2^-ΔCT method: UBQ5, ACT1 and eIF4A. The primers used for PCR and RT-qPCR are listed in Supplemental Table S2. PCR followed by agarose gel electrophoresis was performed to verify the amplified region. All the reactions were performed using three biological and three technical replicates.
TIP genotyping
Transposon insertion coordinates in each accession were obtained from Castanera et al. (2021). This dataset contains the raw output of the PoPoolationTE2 software in separate mode (Kofler et al. 2016) using re-sequencing data from the 420 indica accessions subsampled to 15X, publicly available as part of the 3,000 rice genomes project (3,000 rice genomes project 2014). The Nipponbare IRGSP genome assembly (Kawahara et al. 
2013) was used as the reference genome. Raw, accession-specific TE insertion files were combined into population TIP genotype matrices (TE presence/absence specified in binary format) by intersecting TE insertion coordinates with genome-wide windows as described in Castanera et al. (2021), but in this case TE insertions from different accessions were considered the same TIP only if they overlapped the same genomic window and belonged not only to the same TE order but also to the same TE family. To avoid false positives, we considered only the TIPs present in at least one sample with read support > 5 reads and zygosity > 0.7.
TIP-GWAS and SNP-GWAS
The LFMM 2 R package was used to perform genome-wide associations (Caye et al. 2019). Latent factor mixed models were used to correct for population structure in the indica accessions (K = 3). For SNP-GWAS, we used a filtered dataset of 228K bi-allelic SNPs (MAF > 1%, missing rate < 1%) derived from the rice 404k CoreSNP dataset available at SNP-Seek (Mansueto et al. 2017). A MAF threshold of 5% and Bonferroni corrections were applied for both TIP-GWAS and SNP-GWAS. Grain length data were downloaded from the SNP-Seek database (https://snp-seek.irri.org/).
Results
LTR-retrotransposon TIPs associate with grain length in indica rice
In this study, we looked for structural variants associated with the grain length phenotype by performing GWAS using transposon insertion polymorphisms (TIPs) as genetic information across a panel of 420 indica accessions. We focused on LTR-retrotransposon TIPs, as they are among the most abundant and recently active TEs in plants, and they have a strong potential to cause large genetic and phenotypic effects (Galindo‑González et al. 2017). We detected 50,135 TIPs from a total of 112 LTR-retrotransposon families (Supplemental Table S3) and used them for GWAS. 
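The TIP merging, filtering and multiple-testing steps described in the Methods can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the record layout (`accession`, `chrom`, `pos`, `family`, `reads`, `zygosity`), the 500 bp window size and the inclusive MAF comparison are all assumptions made for the example, since the text does not state them.

```python
from collections import defaultdict

def build_tip_matrix(calls, accessions, window=500, min_reads=5, min_zygosity=0.7):
    """Merge per-accession TE calls into a binary presence/absence matrix.

    Calls are treated as the same TIP when they fall in the same genomic
    window AND belong to the same TE family. The window size is hypothetical.
    """
    carriers = defaultdict(set)  # TIP key -> accessions with any call
    supported = set()            # TIP keys with at least one well-supported call
    for c in calls:
        key = (c["chrom"], c["pos"] // window, c["family"])
        carriers[key].add(c["accession"])
        if c["reads"] > min_reads and c["zygosity"] > min_zygosity:
            supported.add(key)
    # Keep only TIPs backed by at least one confident call
    return {
        key: [1 if acc in carriers[key] else 0 for acc in accessions]
        for key in sorted(supported)
    }

def filter_maf(matrix, maf=0.05):
    """Drop TIPs whose minor allele frequency is below the threshold."""
    kept = {}
    for key, row in matrix.items():
        freq = sum(row) / len(row)
        if min(freq, 1 - freq) >= maf:
            kept[key] = row
    return kept

def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests
```

For scale, `bonferroni_threshold(50135)` is roughly 1e-6; the number of TIPs actually tested after MAF filtering is not given here, so that figure is only an order-of-magnitude illustration.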
We identified eight significant associations (seven LTR-retrotransposon insertions belonging to the Gypsy superfamily and one belonging to the Copia superfamily of LTR-retrotransposons), which were present in two different genomic regions, in chromosome 3 and in chromosome 5 (Fig. 1a, Supplemental Table S4). The 7 insertions in chromosome 3 are located in an interval of 3.3 Mb. Two of these insertions are in a 0.75 Mb region previously characterized as a grain length QTL which contains GS3, the most important known grain length regulator in rice (Fan et al. 2006; Mao et al. 2010). This region is also identified by the SNP-GWAS performed in parallel (Fig. 1b, Supplemental Table S5). The remaining 5 TIPs form an LD block that falls outside the boundaries of this QTL (Supplemental Figure S1) and overlaps with other previously reported grain length QTLs, including qGL3 (Wan et al. 2005), qLWR (Wan et al. 2005) and qGL-3a (Wan et al. 2006), as well as with one grain weight QTL (gw3.1; Thomson et al. 2003) and one grain size QTL (qtaro_961; Redoña and Mackill 1998). Therefore, all the associations found in chromosome 3 probably correspond to already characterized genomic regions associated with grain traits. In contrast, the additional association detected on chromosome 5 does not overlap with any grain length QTL described so far, and no association was found in this region in the SNP-GWAS run in parallel (Fig. 1a, b). An analysis of the TIP effect size shows a strong, positive association of the presence of the TE insertion with grain length (Fig. 1c). Therefore, the association described here represents a new locus controlling grain length in indica rice. As the GS3 locus is known to be a major regulator of grain length, we analyzed the possible interaction of the two loci. 
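One simple way to set up such a two-locus comparison is to group accessions by their allele combination and compare group means against a purely additive expectation. The sketch below assumes a hypothetical encoding (1 = TE present at the chromosome 5 TIP, 1 = long-grain allele at the lead GS3 SNP) and is not the statistical test used in the study; any values passed to it in an example are invented.

```python
from statistics import mean

def summarize_by_genotype(records):
    """records: iterable of (tip_allele, gs3_allele, grain_length_mm) tuples.

    Returns the mean grain length for each observed allele combination.
    """
    groups = {}
    for tip, gs3, length in records:
        groups.setdefault((tip, gs3), []).append(length)
    return {combo: mean(values) for combo, values in groups.items()}

def additive_expectation(means):
    """Expected mean of the double long-grain class if the effects simply add."""
    base = means[(0, 0)]
    tip_effect = means[(1, 0)] - base
    gs3_effect = means[(0, 1)] - base
    return base + tip_effect + gs3_effect
```

If the observed mean of the `(1, 1)` class is close to `additive_expectation(means)`, the two loci behave additively; a marked departure would point to an interaction.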
We analyzed the phenotypes of the accessions with the different allele combinations at both loci (TIP at chr05 and leading SNP at GS3), and our data suggest that the effect of the newly described TIP in chromosome 5 is independent of the effect of GS3, with both loci exhibiting comparable effect sizes (Fig. 1d). Moreover, the effects of the long-grain alleles at the two loci are additive, as accessions carrying the long-grain alleles at both loci display significantly longer grains than those with only one of them (Fig. 1d).
A subpopulation-specific LTR-RT insertion linked with grain length variability
The LTR-retrotransposon insertion on chromosome 5 is present in 14.8% of the indica accessions analyzed, and it is restricted to the indica-3 subgroup, where it reaches a frequency of 30.7%, and to some admixed varieties (Fig. 1e,f). This suggests that it is a recent insertion that occurred well after the split of the two rice subspecies, accompanying the adaptation of indica rice to local conditions. As this insertion is not present in the reference genomes of the japonica and indica rice subspecies (Qin et al. 2021; Yu et al. 2023), we characterized it by sequencing with Oxford Nanopore technology two closely related indica-3 accessions that are polymorphic for the LTR-retrotransposon insertion (TE+, KIKUBA; TE-, MANARE) (Fig. 1e,f, Supplemental Table S1). We mapped the KIKUBA and MANARE long reads to the IRGSP genome assembly (Kawahara et al. 2013) and unequivocally confirmed the absence of the TE in MANARE and the presence of the insertion in KIKUBA, with 25 reads containing the full LTR-RT element and the flanking sites. The insertion sequence is 4,806 bp long, contains all the coding domains of a Copia LTR-retrotransposon, and includes long terminal repeats (LTRs) of 395 bp sharing 99.7% identity, with only a single nucleotide gap. 
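As a rough check on the LTR identity figure, pairwise identity can be computed directly from an alignment of the two LTRs. The sketch below assumes one common convention, in which a gapped column counts as a mismatch and the denominator is the alignment length; the text does not say which convention was used, so this is illustrative only.

```python
def pairwise_identity(aln_a, aln_b):
    """Fraction of identical columns between two aligned sequences.

    Gapped columns count as mismatches; columns that are gaps in both
    sequences are ignored. Both inputs must have the same aligned length.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aln_a, aln_b) if not (a == "-" and b == "-")]
    matches = sum(1 for a, b in columns if a == b)
    return matches / len(columns)
```

Under this convention, a 395-column alignment whose only difference is a single gap gives 394/395, about 99.7%, which agrees with the reported value.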
The insertion is accompanied by a target site duplication (TSD) of 5 bp, typical of LTR-retrotransposon insertions, and the sequence analysis of the internal domains indicates that it belongs to the Tork lineage and the COPI2 family (Jurka 2005). This high level of similarity between the LTRs strongly suggests that it is a very recent insertion, which is also compatible with its presence being almost completely restricted to a single subpopulation (indica-3 accessions and a few additional admixed varieties). To rule out the possibility that high-impact variants in the region other than the TIP were involved in the association with grain length, we assembled and compared the genomes of KIKUBA and MANARE using our previously generated Nanopore reads. Due to the high coverage and read length used, KIKUBA resulted in a near-T2T assembly with only 3 gaps in the whole assembly, and with 12 scaffolds representing near-complete chromosomes including telomeric repeats in 60% of the chromosome ends. MANARE was sequenced with less coverage but still produced a highly contiguous and complete assembly (Table 1). After scaffolding into pseudomolecules, the two genomes were aligned to identify structural variants, indels and SNPs between the two accessions, showing, as expected, very high synteny (Supplemental Figure S2). No genetic variants were found inside the IQD19 gene between the two accessions. In addition to the LTR-RT insertion 1,013 bp upstream, we only identified two additional nearby variants flanking the gene: a SNP located 809 bp downstream and a 9 bp insertion 5,850 bp upstream.
Table 1 Summary statistics of KIKUBA and MANARE genome assemblies. 
Metrics | KIKUBA | MANARE
Main genome contigs (> 50 Kb) | 15 | 1,373
Contig total sequence | 384.18 Mb (0% gap) | 380.71 Mb (0.033% gap)
Main genome scaffold N/L50 | 6 / 31.314 Mb | 6 / 30.822 Mb
Main genome contig N/L50 | 6 / 30.967 Mb | 114 / 0.84 Mb
Main genome scaffold N/L90 | 11 / 24.942 Mb | 11 / 25.142 Mb
Main genome contig N/L90 | 11 / 23.408 Mb | 590 / 124.867 Kb
Max contig length | 44.111 Mb | 7.826 Mb
Max scaffold length | 44.111 Mb | 44.381 Mb
BUSCO score | 98.7% | 98.6%
Gene models | 37,014 | 37,156

The Tork LTR-retrotransposon insertion downregulates IQD19 gene expression
An analysis of the region flanking the insertion shows that the LTR-retrotransposon is inserted in the upstream region (1,013 bp away) of an annotated coding region for a protein containing an IQ calmodulin-binding motif (RAPDB gene ID: Os05g0521900), identified as IQ-DOMAIN19 in NCBI GenBank (hereafter referred to as IQD19). The IQD19 gene is expressed during grain development in both the embryo and the endosperm, with an expression pattern that would be compatible with a grain size regulator. The IQ calmodulin-binding domain is also present in other regulators of grain development in rice such as GSE5 (Os05g0187500), a well-known regulator of grain width and length. Interestingly, IQD19 has an expression profile that is similar to that of GSE5 (Fig. 2a,b), reinforcing the idea that both genes could be regulators of related processes. To investigate whether the insertion of the LTR-retrotransposon in the upstream region of IQD19 could be altering its expression and affecting grain length, we analyzed IQD19 expression in seedlings and measured the length of the grains of 8 closely related indica-3 accessions (4 carrying the LTR-RT insertion and 4 lacking it) grown in the laboratory. The analysis of IQD19 expression showed a significant decrease in expression (p < 0.05) linked to the presence of the insertion (Fig. 2c), correlating with significantly longer grains (Fig. 2d). 
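The expression values compared here rest on the 2^-ΔCT normalization named in the Methods. A minimal sketch of that calculation follows; how the three reference genes were combined is not stated, so averaging their Ct values (equivalent to taking the geometric mean of their expression levels) is an assumption, and the Ct numbers in the usage note are invented.

```python
def relative_expression(ct_target, ct_refs):
    """Relative expression of a target gene by the 2^-dCt method.

    ct_target: Ct of the target gene (for example IQD19).
    ct_refs:   Ct values of the reference genes (for example UBQ5, ACT1, eIF4A).
    The reference Ct is taken as the arithmetic mean of ct_refs (assumption).
    """
    ref_ct = sum(ct_refs) / len(ct_refs)
    delta_ct = ct_target - ref_ct
    return 2.0 ** (-delta_ct)
```

With a target Ct of 25.0 and reference Cts of 22.0, 23.0 and 24.0, ΔCT is 2 and the relative expression is 0.25. A higher Ct in TE+ accessions at equal reference Ct would translate into lower relative expression.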
Transcriptional downregulation of IQD19 by the Tork LTR-RT is independent of DNA methylation
LTR-retrotransposons are often highly methylated, in particular the most recent ones (Vonholdt et al. 2012), and the spread of DNA methylation from LTR boundaries into gene promoter regions is a common source of transcriptional repression (Hirsch and Springer 2017). To determine whether that was the case for the Tork insertion, we used ONT reads from the KIKUBA (TE+) and MANARE (TE-) accessions to analyze the methylation status of the IQD19 gene and its promoter region in the CG, CHG and CHH contexts. As expected, CG and CHG methylation were high in the region of KIKUBA corresponding to the Tork retrotransposon insertion (Fig. 3). However, the IQD19 gene body and the immediate upstream region (1,013 bp) exhibited similarly low methylation levels in both accessions and in the three contexts, suggesting that the methylation at the LTR-retrotransposon sequence did not extend into the gene upstream region flanking the insertion. An alternative explanation for the repressive effect of the Tork insertion could be that it disrupts transcription factor binding sites (TFBS) or acts as a physical spacer, increasing their distance from the core promoter. These alterations can hinder the recruitment and assembly of the transcriptional machinery, thereby downregulating gene expression (Hirsch and Springer 2017). The analysis of computationally predicted cis-regulatory elements in the IQD19 upstream region, available in the RAPDB (PLACE) database, suggests that the Tork insertion acts as a spacer, displacing multiple cis-acting regulatory DNA elements. The analysis of experimental data from a multiDAP-seq experiment recently described by Baumgart et al. (2025) revealed that the upstream region of the IQD19 gene contains multiple TFBS from the NAC, B3, and Dof families. 
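The spacer effect can be made concrete with simple arithmetic: any binding site lying beyond the insertion point is pushed further from the gene by the length of the inserted element plus its duplicated target site. The sketch below takes the insertion point as 1,013 bp upstream of the gene, an element of about 4.8 kb and a 5 bp TSD; the TFBS offsets in the usage note are hypothetical and do not correspond to mapped sites.

```python
def tfbs_distance_to_gene(tfbs_offset, insertion_offset=1013,
                          insertion_len=4806, tsd_len=5, te_present=True):
    """Distance (bp) from a TFBS to the gene start on a given allele.

    tfbs_offset: distance upstream of the gene on the TE-free allele.
    Sites located beyond the insertion point are displaced by the element
    length plus the target site duplication when the TE is present.
    """
    if te_present and tfbs_offset > insertion_offset:
        return tfbs_offset + insertion_len + tsd_len
    return tfbs_offset
```

For a hypothetical site 1,500 bp upstream on the TE-free allele, the distance becomes 6,311 bp when the element is present, whereas a site 500 bp upstream is unaffected.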
These TFBS are located upstream of the insertion site of the LTR-retrotransposon and, therefore, when the insertion is present, the distance between these TFBS and the transcription start of IQD19 is altered (Supplemental Figure S3). These findings suggest that the LTR-retrotransposon insertion alters the landscape of cis-regulatory elements, thereby reshaping the transcriptional behavior of the adjacent IQD19 gene.
Conclusions
Transposable element insertion polymorphisms (TIPs) are an important yet underexplored source of genetic variation influencing agronomic traits. TIPs have historically been overlooked because they are difficult to detect, but recent evidence across multiple crops highlights their functional relevance and importance for breeding (Domínguez et al. 2020; Li et al. 2024; Dong et al. 2025). Our study demonstrates that using TIPs for genome-wide association can uncover additional regions missed by traditional SNP-GWAS. Our parallel analyses using TIPs and SNPs in indica rice identified three regions associated with grain length, two in chromosome 3 and one in chromosome 5. The significant TIPs found in the two neighboring regions of chromosome 3 are possibly related to the presence of GS3, a well-characterized grain length and weight regulator (Fan et al. 2006). Beyond this known region, we identified an LTR-retrotransposon insertion in chromosome 5 that is strongly associated with grain length and has not previously been described. The LTR-RT is inserted in the upstream region of the IQD19 gene, which encodes a protein with a putative calmodulin-binding domain (IQ domain). Calmodulin-binding proteins have been repeatedly described as regulators of agronomic traits related to organ morphology by modulating cell division. For example, the SUN gene, a major regulator of fruit elongation in tomato, encodes an IQ domain-containing protein (Xiao et al. 2008; Wu et al. 2011). 
The Arabidopsis IQD5 gene, a regulator of leaf pavement cell morphogenesis, also encodes an IQ domain-containing protein (Mitra et al. 2019). In rice, GSE5, which encodes a calmodulin-binding protein with an IQ domain, acts as a negative regulator of grain width and weight (Liu et al. 2017), and its homolog GW5L has also been shown to influence grain size (Tian et al. 2019), whereas IQD14 regulates grain width and length (Yang et al. 2020). The involvement of IQ domain-containing proteins in shaping organ architecture, together with the specific examples of related rice genes affecting grain size, suggests that IQD19 is the actual genetic factor contributing to grain length variation in this region. The expression pattern of IQD19, which is very similar to that of GSE5 during rice grain maturation, reinforces this hypothesis. Interestingly, the transcription of GSE5 is also affected by a structural variation in the promoter region, leading to lower GSE5 expression and wider grains (Duan et al. 2017). Transposable element insertions in upstream gene regions can have diverse regulatory consequences. In most cases, TE insertions close to genes are heavily methylated and this methylation can influence the expression of genes located nearby (Hollister and Gaut 2009). However, cases where the insertion influences the expression of downstream genes without affecting methylation also exist, as has been shown for the maize stiff1 gene (Zhang et al. 2020). In such cases, transposable elements disrupt, displace or modify existing cis-regulatory elements or alter the chromatin state of the region, thereby reshaping the transcriptional profiles of adjacent genes (Hirsch and Springer 2017). Our results showed lower IQD19 expression in the accessions carrying the Tork promoter insertion, and the analysis of the genome assemblies revealed that it is the only variant in the immediate upstream region of IQD19. 
Although the Tork insertion is heavily methylated, the methylation seems to be confined to the element boundaries without affecting the region closer to the transcription start site, suggesting that the repressive effect of the LTR-retrotransposon insertion is not due to an epigenetic change of the IQD19 promoter. The experimental data show that the region upstream of the LTR-retrotransposon insertion contains TFBS that may be regulating IQD19 expression. The distance between these TFBS and the transcription start of IQD19 is greatly increased in the accessions containing the insertion, suggesting that, as described for the maize stiff1 gene (Zhang et al. 2020), the alteration of the TFBS landscape could be the mechanism explaining the repression of IQD19. Collectively, these results suggest that the insertion of an LTR-retrotransposon is the actual cause of the IQD19 downregulation, and that IQD19 is a novel, negative grain length regulator. This novel region, whose impact on grain length is as strong as that of GS3, the most important grain length QTL, represents a promising target for rice breeding.
Abbreviations
GWAS: Genome-wide association study; LD: Linkage disequilibrium; LTR-RT: Long terminal repeat retrotransposon; ONT: Oxford Nanopore Technology; PCR: Polymerase chain reaction; PLACE: Plant cis-acting regulatory DNA elements; QTLs: Quantitative trait loci; SNP: Single nucleotide polymorphism; SRA: Sequence read archive; SV: Structural variant; TE: Transposable element; TSD: Target site duplication; RAPDB: Rice annotation project database; TIP: Transposon insertion polymorphism.
Declarations
Ethics approval and consent to participate
Not applicable
Consent for publication
Not applicable
Availability of data and materials
Raw genome sequencing data have been deposited in the Sequence Read Archive (SRA) under BioProject PRJNA1387227 (private link for reviewers: https://dataview.ncbi.nlm.nih.gov/object/PRJNA1387227?reviewer=cqpic98g2lvl0hkmudvpflintr). 
Genome assemblies are available under NCBI BioProject PRJNA1387416. In addition, genome assemblies, gene and LTR-RT annotations, along with the TIP/SNP genotype data used for the GWAS are available at Zenodo (private link for reviewers: https://zenodo.org/records/18221427?preview=1&token=eyJhbGciOiJIUzUxMiJ9.eyJpZCI6Ijc1ZDA2YmIwLWFkNWEtNDFhMy05NjBkLTI4YWM0ZDVmYTNiNSIsImRhdGEiOnt9LCJyYW5kb20iOiI3OTQxNzA2NTM5NWE5ZDUxYjRjOTlkNjE0OGZkNTE4YyJ9.38Li7l32RxpaqffZ-8rWTz_dn_IpAOJOu3YtlumCqB0LECjziE3qnE374AI60baypd5hHeR5I704pdngcI0jxA ). Raw PopoolationTE2 predictions are available at Zenodo (https://zenodo.org/records/4058696). All materials currently under restricted access will be made publicly available upon acceptance of the manuscript.
Competing interests
The authors declare that they have no competing interests.
Funding
The work done at CRAG was funded by grant PID2022-143167NB-I00, funded by MICIU/AEI/10.13039/501100011033 and by “ERDF/EU”, and by grant CEX2019-000902-S, funded by MICIU/AEI/10.13039/501100011033. NMD is funded by MCIU/AEI/10.13039/501100011033 and by “ESF Investing in your future” (reference PRE2020-095111). RC is a Ramón y Cajal contract holder, funded by MICIU/AEI/ https://doi.org/10.13039/501100011033 and by FSE+ (reference RYC2022-037459-I).
Authors' contributions
NM: Formal analysis, Methodology, Investigation, Writing; JC: Conceptualization, Writing, Supervision, Funding acquisition; RC: Conceptualization, Formal analysis, Writing, Supervision
Acknowledgements
Not applicable
Authors' information
Noemia Morales-Díaz: https://orcid.org/0000-0001-5258-486X Josep M. Casacuberta: https://orcid.org/0000-0002-5609-4152 Raúl Castanera: https://orcid.org/0000-0002-3772-7727
References
3,000 rice genomes project (2014) The 3,000 rice genomes project. Gigascience 3:7. 
https://doi.org/10.1186/2047-217X-3-7 Alonge M, Lebeigle L, Kirsche M, Jenike K, Ou S, Aganezov S, Wang X, Lippman ZB, Schatz MC, Soyk S (2022) Automated assembly scaffolding using RagTag elevates a new tomato system for high-throughput genome editing. Genome Biol 23(1):258. https://doi.org/10.1186/s13059-022-02823-7 Baumgart LA, Greenblum SI, Morales-Cruz A, Wang P, Zhang Y, Yang L, Chen C, Dilworth DJ, Garretson AC, Grosjean N, He G, Savage E, Yoshinaga Y, Blaby IK, Daum CG, O’Malley RC (2025) Recruitment, rewiring and deep conservation in flowering plant gene regulation. Nat Plants 11(8):1514–1527. https://doi.org/10.1038/s41477-025-02047-0 Carpentier M-C, Manfroi E, Wei F-J, Wu H-P, Lasserre E, Llauro C, Debladis E, Akakpo R, Hsing Y-I, Panaud O (2019) Retrotranspositional landscape of Asian rice revealed by 3000 genomes. Nat Commun 10(1):24. https://doi.org/10.1038/s41467-018-07974-5 Castanera R, Morales-Díaz N, Gupta S, Purugganan M, Casacuberta JM (2023) Transposons are important contributors to gene expression variability under selection in rice populations. eLife 12. https://doi.org/10.7554/eLife.86324 Castanera R, Vendrell-Mir P, Bardil A, Carpentier M-C, Panaud O, Casacuberta JM (2021) Amplification dynamics of miniature inverted-repeat transposable elements and their impact on rice trait variability. Plant J 107(1):118–135. https://doi.org/10.1111/tpj.15277 Caye K, Jumentier B, Lepeule J, François O (2019) LFMM 2: Fast and Accurate Inference of Gene-Environment Associations in Genome-Wide Studies. Mol Biol Evol 36(4):852–860. https://doi.org/10.1093/molbev/msz008 Domínguez M, Dugas E, Benchouaia M, Leduque B, Jiménez-Gómez JM, Colot V, Quadrana L (2020) The impact of transposable elements on tomato diversity. Nat Commun 11(1):4058. https://doi.org/10.1038/s41467-020-17874-2 Dong Z, Jin S, Hao Y, Zhao T, Shang H, Zhang Z, Fang L, Zheng Z, Li J (2025) Transposon dynamics drive genome evolution and regulate genetic mechanisms of agronomic traits in cotton. 
Plants 14(16). https://doi.org/10.3390/plants14162509 Duan P, Xu J, Zeng D, Zhang B, Geng M, Zhang G, Huang K, Huang L, Xu R, Ge S, Qian Q, Li Y (2017) Natural variation in the promoter of GSE5 contributes to grain size diversity in rice. Mol Plant 10(5):685–694. https://doi.org/10.1016/j.molp.2017.03.009 Escolà G, González-Miguel VM, Campo S, Catala-Forner M, Domingo C, Marqués L, San Segundo B (2023) Development and Genome-Wide Analysis of a Blast-Resistant japonica Rice Variety. Plants 12(20). https://doi.org/10.3390/plants12203536 Fan C, Xing Y, Mao H, Lu T, Han B, Xu C, Li X, Zhang Q (2006) GS3, a major QTL for grain length and weight and minor QTL for grain width and thickness in rice, encodes a putative transmembrane protein. Theor Appl Genet 112(6):1164–1171. https://doi.org/10.1007/s00122-006-0218-1 Galindo-González L, Mhiri C, Deyholos MK, Grandbastien M-A (2017) LTR-retrotransposons in plants: Engines of evolution. Gene 626:14–25. https://doi.org/10.1016/j.gene.2017.04.051 Heller D, Vingron M (2021) SVIM-asm: structural variant detection from haploid and diploid genome assemblies. Bioinformatics 36(22–23):5519–5521. https://doi.org/10.1093/bioinformatics/btaa1034 Hirai Y, Keisuke S, Hamagami K (2012) Evaluation of an analytical method to identify determinants of rice yield components and protein content. Comput Electron Agric 83:77–84. https://doi.org/10.1016/j.compag.2012.02.001 Hirsch CD, Springer NM (2017) Transposable element influences on gene expression in plants. Biochim Biophys Acta Gene Regul Mech 1860(1):157–165. https://doi.org/10.1016/j.bbagrm.2016.05.010 Hollister JD, Gaut BS (2009) Epigenetic silencing of transposable elements: a trade-off between reduced transposition and deleterious effects on neighboring gene expression. Genome Res 19(8):1419–1428. https://doi.org/10.1101/gr.091678.109 Ikehashi H (2009) Why are there indica type and japonica type in rice? — history of the studies and a view for origin of two types. Rice Sci 16(1):1–13. 
https://doi.org/10.1016/S1672-6308(08)60050-5 Inglis PW, Pappas MCR, Resende LV, Grattapaglia D (2018) Fast and inexpensive protocols for consistent extraction of high quality DNA and RNA from challenging plant and fungal samples for high-throughput SNP genotyping and sequencing applications. PLoS ONE 13(10):e0206085. https://doi.org/10.1371/journal.pone.0206085 Jurka J (2005) Copi2: Identification of a new Copia-type LTR retrotransposon from rice. Repbase Rep 5(10) Kawahara Y, de la Bastide M, Hamilton JP, Kanamori H, McCombie WR, Ouyang S, Schwartz DC, Tanaka T, Wu J, Zhou S, Childs KL, Davidson RM, Lin H, Quesada-Ocampo L, Vaillancourt B, Sakai H, Lee SS, Kim J, Numa H, Itoh T, Buell CR, Matsumoto T (2013) Improvement of the Oryza sativa Nipponbare reference genome using next generation sequence and optical map data. Rice (N Y) 6(1):4. https://doi.org/10.1186/1939-8433-6-4 Kofler R, Gómez-Sánchez D, Schlötterer C (2016) PoPoolationTE2: Comparative Population Genomics of Transposable Elements Using Pool-Seq. Mol Biol Evol 33(10):2759–2764. https://doi.org/10.1093/molbev/msw137 Kolmogorov M, Yuan J, Lin Y, Pevzner PA (2019) Assembly of long, error-prone reads using repeat graphs. Nat Biotechnol 37(5):540–546. https://doi.org/10.1038/s41587-019-0072-8 Liu J, Chen J, Zheng X, Wu F, Lin Q, Heng Y, Tian P, Cheng Z, Yu X, Zhou K, Zhang X, Guo X, Wang J, Wang H, Wan J (2017) GW5 acts in the brassinosteroid signalling pathway to regulate grain width and weight in rice. Nat Plants 3:17043. https://doi.org/10.1038/nplants.2017.43 Liu Q, Han R, Wu K, Zhang J, Ye Y, Wang S, Chen J, Pan Y, Li Q, Xu X, Zhou J, Tao D, Wu Y, Fu X (2018) G-protein βγ subunits determine grain size through interaction with MADS-domain transcription factors in rice. Nat Commun 9(1):852. https://doi.org/10.1038/s41467-018-03047-9 Li D, Nanseki T (eds) (2021) Empirical analyses on rice yield determinants of smart farming in Japan. 
Springer Singapore, Singapore Li H (2018) Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34(18):3094–3100. https://doi.org/10.1093/bioinformatics/bty191 Li X, Dai X, He H, Lv Y, Yang L, He W, Liu C, Wei H, Liu X, Yuan Q, Wang X, Wang T, Zhang B, Zhang H, Chen W, Leng Y, Yu X, Qian H, Zhang B, Guo M, Zhang Z, Shi C, Zhang Q, Cui Y, Xu Q, Cao X, Chen D, Zhou Y, Qian Q, Shang L (2024) A pan-TE map highlights transposable elements underlying domestication and agronomic traits in Asian rice. Natl Sci Rev 11(6):nwae188. https://doi.org/10.1093/nsr/nwae188 Mansueto L, Fuentes RR, Borja FN, Detras J, Abriol-Santos JM, Chebotarov D, Sanciangco M, Palis K, Copetti D, Poliakov A, Dubchak I, Solovyev V, Wing RA, Hamilton RS, Mauleon R, McNally KL, Alexandrov N (2017) Rice SNP-seek database update: new SNPs, indels, and queries. Nucleic Acids Res 45(D1):D1075–D1081. https://doi.org/10.1093/nar/gkw1135 Mao H, Sun S, Yao J, Wang C, Yu S, Xu C, Li X, Zhang Q (2010) Linking differential domain functions of the GS3 protein to natural variation of grain size in rice. Proc Natl Acad Sci USA 107(45):19579–19584. https://doi.org/10.1073/pnas.1014419107 Mitra D, Klemm S, Kumari P, Quegwer J, Möller B, Poeschl Y, Pflug P, Stamm G, Abel S, Bürstenbinder K (2019) Microtubule-associated protein IQ67 DOMAIN5 regulates morphogenesis of leaf pavement cells in Arabidopsis thaliana. J Exp Bot 70(2):529–543. https://doi.org/10.1093/jxb/ery395 Oki K, Fujisawa Y, Kato H, Iwasaki Y (2005) Study of the constitutively active form of the alpha subunit of rice heterotrimeric G proteins. Plant Cell Physiol 46(2):381–386. https://doi.org/10.1093/pcp/pci036 Oki K, Kitagawa K, Fujisawa Y, Kato H, Iwasaki Y (2009) Function of alpha subunit of heterotrimeric G protein in brassinosteroid response of rice plants. Plant Signal Behav 4(2):126–128. 
https://doi.org/10.4161/psb.4.2.7627 Qin P, Lu H, Du H, Wang H, Chen W, Chen Z, He Q, Ou S, Zhang H, Li X, Li X, Li Y, Liao Y, Gao Q, Tu B, Yuan H, Ma B, Wang Y, Qian Y, Fan S, Li W, Wang J, He M, Yin J, Li T, Jiang N, Chen X, Liang C, Li S (2021) Pan-genome analysis of 33 genetically diverse rice accessions reveals hidden genomic variations. Cell 184(13):3542–3558e16. https://doi.org/10.1016/j.cell.2021.04.046 Redoña ED, Mackill DJ (1998) Quantitative trait locus analysis for rice panicle and grain characteristics. Theor Appl Genet 96(6–7):957–963. https://doi.org/10.1007/s001220050826 Tao Y, Miao J, Wang J, Li W, Xu Y, Wang F, Jiang Y, Chen Z, Fan F, Xu M, Zhou Y, Liang G, Yang J (2020) RGG1, involved in the cytokinin regulatory pathway, controls grain size in rice. Rice (N Y) 13(1):76. https://doi.org/10.1186/s12284-020-00436-x Thomson MJ, Tai TH, McClung AM, Lai XH, Hinga ME, Lobos KB, Xu Y, Martinez CP, McCouch SR (2003) Mapping quantitative trait loci for yield, yield components and morphological traits in an advanced backcross population between Oryza rufipogon and the Oryza sativa cultivar Jefferson. Theor Appl Genet 107(3):479–493. https://doi.org/10.1007/s00122-003-1270-8 Tian P, Liu J, Mou C, Shi C, Zhang H, Zhao Z, Lin Q, Wang J, Wang J, Zhang X, Guo X, Cheng Z, Zhu S, Ren Y, Lei C, Wang H, Wan J (2019) GW5-Like, a homolog of GW5, negatively regulates grain width, weight and salt resistance in rice. J Integr Plant Biol 61(11):1171–1185. https://doi.org/10.1111/jipb.12745 Vonholdt BM, Takuno S, Gaut BS (2012) Recent retrotransposon insertions are methylated and phylogenetically clustered in japonica rice (Oryza sativa spp. japonica). Mol Biol Evol 29(10):3193–3203. https://doi.org/10.1093/molbev/mss129 Walker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, Cuomo CA, Zeng Q, Wortman J, Young SK, Earl AM (2014) Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS ONE 9(11):e112963. 
https://doi.org/10.1371/journal.pone.0112963 Wang W, Mauleon R, Hu Z, Chebotarov D, Tai S, Wu Z, Li M, Zheng T, Fuentes RR, Zhang F, Mansueto L, Copetti D, Sanciangco M, Palis KC, Xu J, Sun C, Fu B, Zhang H, Gao Y, Zhao X, Shen F, Cui X, Yu H, Li Z, Chen M, Detras J, Zhou Y, Zhang X, Zhao Y, Kudrna D, Wang C, Li R, Jia B, Lu J, He X, Dong Z, Xu J, Li Y, Wang M, Shi J, Li J, Zhang D, Lee S, Hu W, Poliakov A, Dubchak I, Ulat VJ, Borja FN, Mendoza JR, Ali J, Li J, Gao Q, Niu Y, Yue Z, Naredo MEB, Talag J, Wang X, Li J, Fang X, Yin Y, Glaszmann J-C, Zhang J, Li J, Hamilton RS, Wing RA, Ruan J, Zhang G, Wei C, Alexandrov N, McNally KL, Li Z, Leung H (2018) Genomic variation in 3,010 diverse accessions of Asian cultivated rice. Nature 557(7703):43–49. https://doi.org/10.1038/s41586-018-0063-9 Wang X, Weigel D, Smith LM (2013) Transposon variants and their effects on gene expression in Arabidopsis. PLoS Genet 9(2):e1003255. https://doi.org/10.1371/journal.pgen.1003255 Wan XY, Wan JM, Jiang L, Wang JK, Zhai HQ, Weng JF, Wang HL, Lei CL, Wang JL, Zhang X, Cheng ZJ, Guo XP (2006) QTL analysis for rice grain length and fine mapping of an identified QTL with stable and major effects. Theor Appl Genet 112(7):1258–1270. https://doi.org/10.1007/s00122-006-0227-0 Wan XY, Wan JM, Weng JF, Jiang L, Bi JC, Wang CM, Zhai HQ (2005) Stability of QTLs for rice grain dimension and endosperm chalkiness characteristics across eight environments. Theor Appl Genet 110(7):1334–1346. https://doi.org/10.1007/s00122-005-1976-x Wu S, Xiao H, Cabrera A, Meulia T, van der Knaap E (2011) SUN regulates vegetative and reproductive organ shape by changing cell division patterns. Plant Physiol 157(3):1175–1186. https://doi.org/10.1104/pp.111.181065 Xiao H, Jiang N, Schaffner E, Stockinger EJ, van der Knaap E (2008) A retrotransposon-mediated gene duplication underlies morphological variation of tomato fruit. Science 319(5869):1527–1530. 
https://doi.org/10.1126/science.1153040 Yang B, Wendrich JR, De Rybel B, Weijers D, Xue H-W (2020) Rice microtubule-associated protein IQ67-DOMAIN14 regulates grain shape by modulating microtubule cytoskeleton dynamics. Plant Biotechnol J 18(5):1141–1152. https://doi.org/10.1111/pbi.13279 Yu Z, Chen Y, Zhou Y, Zhang Y, Li M, Ouyang Y, Chebotarov D, Mauleon R, Zhao H, Xie W, McNally KL, Wing RA, Guo W, Zhang J (2023) Rice Gene Index: A comprehensive pan-genome database for comparative and functional genomics of Asian rice. Mol Plant 16(5):798–801. https://doi.org/10.1016/j.molp.2023.03.012 Zhang T, Wang Z, Liu Q, Zhao D (2025) Genetic Improvement of rice Grain size Using the CRISPR/Cas9 System. Rice (N Y) 18(1):3. https://doi.org/10.1186/s12284-025-00758-8 Zhang Z, Zhang X, Lin Z, Wang J, Liu H, Zhou L, Zhong S, Li Y, Zhu C, Lai J, Li X, Yu J, Lin Z (2020) A Large Transposon Insertion in the stiff1 Promoter Increases Stalk Strength in Maize. Plant Cell 32(1):152–165. https://doi.org/10.1105/tpc.19.00486 Zhan P, Ma S, Xiao Z, Li F, Wei X, Lin S, Wang X, Ji Z, Fu Y, Pan J, Zhou M, Liu Y, Chang Z, Li L, Bu S, Liu Z, Zhu H, Liu G, Zhang G, Wang S (2022) Natural variations in grain length 10 (GL10) regulate rice grain size. J Genet Genomics 49(5):405–413. https://doi.org/10.1016/j.jgg.2022.01.008 Additional Declarations No competing interests reported. 
Supplementary Files Supplementarymaterial.zip Status: Under Review Version 1 posted Reviews received at journal 13 May, 2026 Reviewers agreed at journal 30 Apr, 2026 Reviews received at journal 22 Apr, 2026 Reviewers agreed at journal 17 Mar, 2026 Reviewers invited by journal 12 Mar, 2026 Editor assigned by journal 09 Mar, 2026 Submission checks completed at journal 09 Mar, 2026 First submitted to journal 07 Mar, 2026 
© Research Square 2026 | ISSN 2693-5015 (online) {"props":{"pageProps":{"initialData":{"identity":"rs-9057165","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":610050040,"identity":"e66fe223-cf49-424f-bf1a-c1d5ebafefbd","order_by":0,"name":"Noemia Morales-Díaz","email":"","orcid":"","institution":"Centre for Research in Agricultural Genomics, CRAG (CSIC- IRTA-UAB-UB)","correspondingAuthor":false,"prefix":"","firstName":"Noemia","middleName":"","lastName":"Morales-Díaz","suffix":""},{"id":610050041,"identity":"0ba66b8e-618a-4779-9ad3-3b88d8316bf6","order_by":1,"name":"Josep M Casacuberta","email":"","orcid":"","institution":"Centre for Research in Agricultural Genomics, CRAG (CSIC- IRTA-UAB-UB)","correspondingAuthor":false,"prefix":"","firstName":"Josep","middleName":"M","lastName":"Casacuberta","suffix":""},{"id":610050042,"identity":"f4126e05-d407-400e-8d71-8d221bded099","order_by":2,"name":"Raúl Castanera","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA80lEQVRIiWNgGAWjYHACxgNIHBseovSAtUCVpvEwsIEZBkRrOcxAUIt5+xmDAww1dvb27GcPPq6oOS9jPr/5AcOPmj84tcicyQFqOZac2MOTl2x45thtHpljbAaMPcdw2yLBANTC2MCcwMOQYybZwHabR4KNwYCZgQ2PFv43IC319jz8b8x/Nvw7B9TC/oGZ4R8eLRJgWw4z9kjkmDE2th0AauExYGZsw6flWcGBhGPHE3tuvDGWbOxLBmrJKTjY22eMx2HJGx98qKm2Z+/PMfzY8M3OXoL5+MYHP77J4dQCBgnoAgfwqx8Fo2AUjIJRQAgAAArOS0hQJbw3AAAAAElFTkSuQmCC","orcid":"","institution":"Centre for Research in Agricultural Genomics, CRAG (CSIC- 
IRTA-UAB-UB)","correspondingAuthor":true,"prefix":"","firstName":"Raúl","middleName":"","lastName":"Castanera","suffix":""}],"badges":[],"createdAt":"2026-03-07 09:23:31","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-9057165/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-9057165/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":105900500,"identity":"80f6548c-de68-437c-b0b7-7aea1046958d","added_by":"auto","created_at":"2026-04-01 09:22:54","extension":"jpg","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":217464,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eAn LTR-retrotransposon associated with longer grains in indica rice. \u003c/strong\u003ea)\u003cstrong\u003e \u003c/strong\u003eTIP-GWAS, b) SNP-GWAS, c) Distribution of grain length in accessions not carrying (0) or carrying (1) the TE insertion at position Chr05:25,965,000. Differences were analyzed using the Wilcoxon signed-rank test, where * stands for p \u0026lt; 0.05 and **** for p \u0026lt; 0.0001. Phenotypic data are available in Supplemental Table S6. d) Distribution of grain length in accessions carrying the different combinations of the two alleles at the GS3 locus (C = short grain, T = long grain) and at the chromosome 5 TE insertion locus (0 = TE absence, 1 = TE presence). e) Principal component analysis (PCA) illustrating population structure within the indica subpopulation based on LTR-retrotransposon TIPs. f) PCA highlighting accessions carrying (TE+) or lacking (TE−) the TE insertion, including KIKUBA and MANARE, selected for further study. Subpopulation groups are based on \u003ca href=\"https://sciwheel.com/work/citation?ids=5172123\u0026amp;pre=\u0026amp;suf=\u0026amp;sa=0\u0026amp;dbf=0\"\u003e(Wang et al. 
2018)\u003c/a\u003e.\u003c/p\u003e","description":"","filename":"Picture1.jpg","url":"https://assets-eu.researchsquare.com/files/rs-9057165/v1/a2cdc406b27fcda8e55b4de1.jpg"},{"id":105905992,"identity":"8a503a69-22d0-4d5c-8997-e1573fa5d008","added_by":"auto","created_at":"2026-04-01 10:16:24","extension":"jpg","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":250743,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eCharacterization of the Tork LTR-RT insertion. \u003c/strong\u003e\u0026nbsp;Expression profile of \u003cem\u003eIQD19 \u003c/em\u003e(a) and \u003cem\u003eGSE5 \u003c/em\u003e(b) in different grain development stages, c) \u003cem\u003eIQD19 \u003c/em\u003erelative gene expression in seedlings of TE (+) and TE (-) varieties. Error bars represent the standard deviation of 3 biological replicates. d) Grain length distribution in TE (+) and TE (-) varieties.\u003c/p\u003e","description":"","filename":"Picture2.jpg","url":"https://assets-eu.researchsquare.com/files/rs-9057165/v1/d5fba88c48aef2ec14e352f4.jpg"},{"id":105900503,"identity":"2cd68e1c-99d1-481c-a632-affac595315a","added_by":"auto","created_at":"2026-04-01 09:22:54","extension":"jpg","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":136774,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eGenomic context and DNA methylation landscape of \u003c/strong\u003e\u003cem\u003e\u003cstrong\u003eIQD19 \u003c/strong\u003e\u003c/em\u003e\u003cstrong\u003ein KIKUBA (TE+) and MANARE (TE-) accessions. \u003c/strong\u003ea) Representation of \u003cem\u003eIQD19 \u003c/em\u003eand its surrounding regions in KIKUBA and MANARE accessions. Green boxes represent exons, and white boxes represent introns. 
b) Mean DNA methylation levels (%) in KIKUBA (TE+) and MANARE (TE-) accessions.\u003cstrong\u003e \u003c/strong\u003eFour different regions are distinguished, with letters referring to the regions indicated in panel A: (A) \u003cem\u003eIQD19 \u003c/em\u003egene body, (B) an immediate upstream region, (C) Tork LTR-RT insertion, exclusive of KIKUBA, and (D) an upstream region of 2 Kb.\u003c/p\u003e","description":"","filename":"Picture3.jpg","url":"https://assets-eu.researchsquare.com/files/rs-9057165/v1/c348ead452841983478c6d44.jpg"},{"id":105907459,"identity":"ac174dd1-ac89-4a30-a7d7-d5f728206cb6","added_by":"auto","created_at":"2026-04-01 10:31:41","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1308627,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-9057165/v1/5e29f4a0-891f-4d1e-a18e-026b3a9ee184.pdf"},{"id":105905580,"identity":"e4ed6285-c042-46a5-afc3-d77de3b6c050","added_by":"auto","created_at":"2026-04-01 10:12:49","extension":"zip","order_by":0,"title":"","display":"","copyAsset":false,"role":"supplement","size":2349931,"visible":true,"origin":"","legend":"","description":"","filename":"Supplementarymaterial.zip","url":"https://assets-eu.researchsquare.com/files/rs-9057165/v1/72ae996647ccae3d381a76d1.zip"}],"financialInterests":"No competing interests reported.","formattedTitle":"Population-level analysis of rice transposon polymorphisms reveals an LTR-retrotransposon insertion linked to grain length","fulltext":[{"header":"Background","content":"\u003cp\u003eRice yield is a complex trait influenced by the interaction of genetic and environmental factors, which regulate yield-related traits such as the number of panicles, the number of spikelets, grain weight or the ratio of filled grains (Hirai et al. 2012; Li and Nanseki 2021). 
Among these, grain weight is itself affected by grain thickness, width and length (Liu et al. 2017). Notably, the two most widely cultivated rice subspecies exhibit distinct grain morphologies: indica varieties typically produce longer and more slender grains, whereas japonica grains tend to be shorter and more ovate in shape (Ikehashi 2009). Natural genetic variants in two QTLs (GS3 and GW5) can explain most of the grain length and width differences between the two subspecies (Duan et al. 2017). However, more than 600 QTLs and 31 genes have been described as associated with grain size in the two rice subspecies (Zhang et al. 2025). Some of the known grain-size genes are related to more than one seed trait, especially those associated with cellular proliferation. For example, \u003cem\u003eRGA1\u003c/em\u003e is involved in grain length and weight (Oki et al. 2005; Oki et al. 2009), \u003cem\u003eGS3\u003c/em\u003e in grain length, width and weight (Fan et al. 2006; Mao et al. 2010), \u003cem\u003eRGG1\u003c/em\u003e in grain length and width (Tao et al. 2020), and \u003cem\u003eOsMADS1\u003c/em\u003e in grain length and width (Liu et al. 2018), which highlights the complexity and the pleiotropic nature of these traits (Zhang et al. 2025). Some of the genetic variants related to grain size variation are due to changes in the coding region of certain genes, such as the deletion in the first exon of \u003cem\u003eGL10\u003c/em\u003e that leads to shorter grain length and lower grain weight (Zhan et al. 2022). However, in most cases the causal variants were found in the promoter regions, leading to transcriptional changes associated with grain size variation. This is the case for the structural variant (deletion) located upstream of the \u003cem\u003eGSE5\u003c/em\u003e gene, which leads to reduced expression of \u003cem\u003eGSE5\u003c/em\u003e and results in the development of wider grains (Duan et al. 2017). 
Structural variants (SVs) are a major driver of genome evolution and have a strong potential to regulate gene expression and agronomic trait variability. Their study has gained interest in recent years, because the availability of high-quality genome assemblies allows their analysis with high resolution. Insertions and deletions are by far the most common type of SV, and transposable elements (TEs) are the main drivers of structural variant formation in rice (Qin et al. 2021). Due to their mobile nature, TEs have a strong potential to generate genetic diversity linked with transcriptional regulation and agronomic trait variation, as previously described for rice and multiple other crops (Wang et al. 2013; Dom\u0026iacute;nguez et al. 2020; Castanera et al. 2023). LTR-retrotransposons (LTR-RTs) are the most abundant TE order in plants, and in rice they account for ~\u0026thinsp;25% of the genome space (Qin et al. 2021). Recent analyses of TE polymorphisms in large rice populations suggest that LTR-RTs have undergone recent amplification, generating many new copies after the split of indica and japonica (Carpentier et al. 2019; Castanera et al. 2021). In this study, we conducted a transposon insertion polymorphism-based GWAS (TIP-GWAS) in a large indica panel to determine whether specific LTR-retrotransposon polymorphisms are associated with grain length variability. We describe a recent LTR-retrotransposon insertion on chromosome 5 that is strongly associated with grain length. 
This insertion alters the expression of \u003cem\u003eIQD19\u003c/em\u003e, a potential new regulator of rice grain size.\u003c/p\u003e"},{"header":"Materials and Methods","content":"\u003cdiv id=\"Sec4\" class=\"Section2\"\u003e \u003ch2\u003ePlant material\u003c/h2\u003e \u003cp\u003eRice seeds from eight indica accessions (Supplemental Table \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e) were obtained from the International Rice Research Institute (IRRI) and grown under controlled greenhouse conditions, with temperatures set at 28\u0026deg;C during the 16-hour day and 25\u0026deg;C during the 8-hour night (16/8 h photoperiod).\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eNucleic acid extraction and DNA sequencing\u003c/h3\u003e\n \u003cp\u003eGenomic DNA for PCR purposes was extracted from young rice leaves using the CTAB protocol (Inglis et al. 2018). Three-week-old leaves from KIKUBA::IRGC 70722-1 (hereafter referred to as KIKUBA) were collected to perform high molecular weight DNA extraction according to the MATAB method with modifications for Oxford Nanopore Sequencing (Escol\u0026agrave; et al. 2023). Three-week-old leaves from MANARE (PHOM)::IRGC 61437-1 (hereafter referred to as MANARE) were collected to perform high molecular weight DNA extraction according to Kang et al. (2023). RNA was isolated from ten-day-old seedlings, flowers at 1 and 2 days after flowering (DAF), and young seeds at 10 DAF using the Maxwell RSC Plant kit and the Maxwell RSC Instrument from Promega Corporation. Residual DNA was removed from the RNA samples using the DNA-free DNA Removal Kit from Invitrogen\u0026trade;. 
DNA from the KIKUBA and MANARE accessions was sequenced on a PromethION flow cell using V14 chemistry.\u003c/p\u003e\n\u003ch3\u003eGenome assembly and whole-genome comparisons\u003c/h3\u003e\n\u003cp\u003eKIKUBA reads (110 GB) were filtered by length (cutoff 25 kb) and assembled with Flye (Kolmogorov et al. 2019). The assembly was subjected to one round of Pilon (Walker et al. 2014) polishing with 20X Illumina reads (SRA accession ERS468964), and the polished contigs were scaffolded into pseudomolecules using RagTag (Alonge et al. 2022) based on the high-quality LIMA genome assembly, which belongs to the same subpopulation group (indica-3) (Yu et al. 2023). Finally, gap-closing was performed with 27X ultra-long reads retrieved from the PromethION run (N50\u0026thinsp;=\u0026thinsp;58 kb) using TGS-GapCloser. In the case of MANARE, reads (34 GB) were filtered by length (cutoff 8 kb), assembled with Flye (Kolmogorov et al. 2019) and scaffolded into pseudomolecules using RagTag (Alonge et al. 2022) based on the high-quality LIMA genome (Yu et al. 2023). Genome annotation was carried out by lifting the gene models of LIMA, allowing for the detection of extra gene copies arising from gene duplications (parameters -copies -sc 0.97 -polish). Whole genome alignments were performed using Minimap2 (Li 2018) and structural variations were detected from the alignments using SVIM-asm (Heller and Vingron 2021). Paftools call (Li 2018) was used to detect SNPs and indels from Minimap2 alignments.\u003c/p\u003e\n\u003ch3\u003eMethylation analyses\u003c/h3\u003e\n\u003cp\u003eDNA methylation was analyzed using ONT sequencing data. Basecalling was performed with Standalone Dorado v1.0.2 (Oxford Nanopore Technologies) using the SUP model (v5.2.0) optimized for R10.4 flow cells and compatible with cytosine modification detection. The resulting basecalled reads were used for downstream methylation calling with the dorado basecaller using --modified-bases-models. 
To extract the methylation information in the different contexts, the modkit pileup command (Oxford Nanopore Technologies 2024) was run with the parameters --motif CG 0 --motif CHH 0 --motif CHG 0. For methylation analysis, average methylation levels were calculated in sliding windows of 100 bp with a 50 bp overlap.\u003c/p\u003e \u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003ePCR and Real-time quantitative PCR\u003c/h2\u003e \u003cp\u003ePolymerase chain reaction (PCR) was used for genotyping according to the standard protocol for DreamTaq\u0026trade; DNA polymerase (Thermo Fisher). For reverse transcription quantitative PCR (RT-qPCR), RNA samples were used for synthesizing cDNA with the SuperScript\u0026reg; III Reverse Transcriptase from Invitrogen\u0026trade;. Roche\u0026rsquo;s SYBR green Master Mix (Roche Applied Science) was employed with the standard protocol (initial denaturation at 95\u0026deg;C for 5 min, followed by 40 cycles of 95\u0026deg;C for 10 s, 56\u0026deg;C for 10 s and 72\u0026deg;C for 10 s). Three reference genes were used to normalize the expression levels according to the 2\u003csup\u003e-ΔCT\u003c/sup\u003e method: \u003cem\u003eUBQ5\u003c/em\u003e, \u003cem\u003eACT1\u003c/em\u003e and \u003cem\u003eeIF4A\u003c/em\u003e. The primers used for PCR and RT-qPCR are listed in Supplemental Table S2. PCR followed by agarose gel electrophoresis was performed to verify the amplified region. All the reactions were performed using three biological and three technical replicates.\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eTIP genotyping\u003c/h3\u003e\n\u003cp\u003eTransposon insertion coordinates in each accession were obtained from Castanera et al. (2021). This dataset contains the raw output of the PoPoolationTE2 software in \u003cem\u003eseparate\u003c/em\u003e mode (Kofler et al. 
2016) using re-sequencing data from the 420 indica accessions subsampled to 15X, publicly available as part of the 3000 rice genomes project (3,000 rice genomes project 2014). The Nipponbare IRGSP genome assembly (Kawahara et al. 2013) was used as the reference genome. Raw, accession-specific TE insertion files were combined into population TIP genotype matrices (TE presence/absence specified in binary format) by intersecting TE insertion coordinates with genome-wide windows as described in Castanera et al. (2021), but in this case TE insertions from different accessions were considered the same TIP only if they overlapped the same genomic window and belonged to the same TE family, not just the same TE order. To avoid false positives, we considered only the TIPs present in at least one sample with read support\u0026thinsp;\u0026gt;\u0026thinsp;5 reads and zygosity\u0026thinsp;\u0026gt;\u0026thinsp;0.7.\u003c/p\u003e\n\u003ch3\u003eTIP-GWAS and SNP-GWAS\u003c/h3\u003e\n\u003cp\u003eThe LFMM 2 R package was used to perform genome-wide association analyses (Caye et al. 2019). Latent factor mixed models were used to correct for population structure among the indica accessions (K\u0026thinsp;=\u0026thinsp;3). For SNP-GWAS, we used a filtered dataset of 228K bi-allelic SNPs (MAF\u0026thinsp;\u0026gt;\u0026thinsp;1%, missing rate\u0026thinsp;\u0026lt;\u0026thinsp;1%) derived from the rice 404k CoreSNP dataset available at SNP-seek (Mansueto et al. 2017). A MAF threshold of 5% and Bonferroni correction were applied for both TIP-GWAS and SNP-GWAS. 
Grain length data were downloaded from the SNP-seek database (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://snp-seek.irri.org/\u003c/span\u003e\u003cspan address=\"https://snp-seek.irri.org/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e).\u003c/p\u003e"},{"header":"Results","content":"\u003cdiv id=\"Sec12\" class=\"Section2\"\u003e \u003ch2\u003eLTR-retrotransposon TIPs associate with grain length in indica rice\u003c/h2\u003e \u003cp\u003eIn this study, we looked for structural variants associated with the grain length phenotype by performing GWAS using transposon insertion polymorphisms (TIPs) as genetic markers across a panel of 420 indica accessions. We focused on LTR-retrotransposon TIPs, as they are among the most abundant and recently active TEs in plants, and they have a strong potential to cause large genetic and phenotypic effects (Galindo‑Gonz\u0026aacute;lez et al. 2017). We detected 50,135 TIPs from a total of 112 LTR-retrotransposon families (Supplemental Table S3) and used them for GWAS. We identified eight significant associations (seven LTR-retrotransposon insertions belonging to the Gypsy superfamily and one belonging to the Copia superfamily of LTR-retrotransposons), which were present in two different genomic regions, on chromosome 3 and on chromosome 5 (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003ea, Supplemental Table S4). The seven insertions on chromosome 3 are located within a 3.3 Mb interval. Two of these insertions are in a 0.75 Mb region previously characterized as a grain length QTL, which contains \u003cem\u003eGS3\u003c/em\u003e, the most important known grain length regulator in rice (Fan et al. 2006; Mao et al. 2010). This region was also identified by the SNP-GWAS performed in parallel (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eb, Supplemental Table S5). 
The remaining 5 TIPs form an LD block that falls outside the boundaries of this QTL (Supplemental Figure \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e) and overlaps with other previously reported grain length QTLs, including qGL3 (Wan et al. 2005), qLWR (Wan et al. 2005) and qGL-3a (Wan et al. 2006), as well as with one grain weight QTL (gw3.1, (Thomson et al. 2003)) and one grain size QTL (qtaro_961, (Redo\u0026ntilde;a and Mackill 1998)). Therefore, all the associations found on chromosome 3 probably correspond to already characterized genomic regions associated with grain traits. In contrast, the additional association detected on chromosome 5 does not overlap with any grain length QTL described so far, and no association was found in this region in the SNP-GWAS run in parallel (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003ea, b). An analysis of the TIP effect size shows a strong, positive association of the presence of the TE insertion with grain length (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003ec). Therefore, the association described here represents a new locus controlling grain length in indica rice.\u003c/p\u003e \u003cp\u003eAs the GS3 locus is known to be a major regulator of grain length, we analyzed the possible interaction of the two loci. We compared the phenotypes of the accessions carrying the different allele combinations at both loci (TIP at chr05 and leading SNP at GS3), and our data suggest that the effect of the newly described TIP on chromosome 5 is independent of the effect of GS3, with both loci exhibiting comparable effect sizes (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003ed). 
Moreover, the effects of the long-grain alleles at the two loci are additive, as accessions carrying the long-grain alleles at both loci display significantly longer grains than those carrying only one of them (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003ed).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec13\" class=\"Section2\"\u003e \u003ch2\u003eA subpopulation-specific LTR-RT insertion linked with grain length variability\u003c/h2\u003e \u003cp\u003eThe LTR-retrotransposon insertion on chromosome 5 is present in 14.8% of the indica accessions analyzed, and it is restricted to the indica-3 subgroup, where it reaches a frequency of 30.7%, and to some admixed varieties (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003ee,f). This suggests that it is a recent insertion that occurred well after the split of the two rice subspecies, accompanying the adaptation of indica rice to local conditions. As this insertion is not present in the reference genomes of the japonica and indica rice subspecies (Qin et al. 2021; Yu et al. 2023), we characterized it by using Oxford Nanopore technology to sequence two closely related indica-3 accessions that are polymorphic for the LTR-retrotransposon insertion (KIKUBA, TE+; MANARE, TE-) (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003ee,f, Supplemental Table \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e). We mapped the KIKUBA and MANARE long reads to the IRGSP genome assembly (Kawahara et al. 2013) and unequivocally confirmed the absence of the TE in MANARE and the presence of the insertion in KIKUBA, with 25 reads containing the full LTR-RT element and the flanking sites. 
The insertion sequence is 4,806 bp long, contains all the coding domains of a Copia LTR-retrotransposon, and includes long terminal repeats (LTRs) of 395 bp sharing 99.7% identity, with only a single nucleotide gap. The insertion is accompanied by a target site duplication (TSD) of 5 bp, typical of LTR-retrotransposon insertions, and the sequence analysis of the internal domains indicates that it belongs to the Tork lineage and the COPI2 family (Jurka 2005). The high level of similarity between the LTRs strongly suggests that it is a very recent insertion, which is also compatible with its presence being almost completely restricted to a single subpopulation (indica-3 accessions and a few additional admixed varieties).\u003c/p\u003e \u003cp\u003eTo rule out the possibility that high-impact variants in the region other than the TIP were involved in the association with grain length, we assembled and compared the genomes of KIKUBA and MANARE using our previously generated Nanopore reads. Due to the high coverage and read length used, KIKUBA resulted in a near T2T assembly with only 3 gaps in the whole assembly, and with 12 scaffolds representing near-complete chromosomes including telomeric repeats in 60% of the chromosome ends. MANARE was sequenced with less coverage but still produced a highly contiguous and complete assembly (Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e). After scaffolding into pseudomolecules, the two genomes were aligned to identify structural variants, indels and SNPs between the two accessions, showing, as expected, very high synteny (Supplemental Figure S2). No genetic variants were found inside the \u003cem\u003eIQD19\u003c/em\u003e gene between the two accessions. 
In addition to the LTR-RT insertion 1,013 bp upstream, we only identified two additional nearby variants flanking the gene, a SNP located 809 bp downstream and a 9 bp insertion 5,850 bp upstream.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eSummary statistics of KIKUBA and MANARE genome assemblies.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"3\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMetrics\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eKIKUBA\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eMANARE\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMain genome contigs (\u0026gt;\u0026thinsp;50Kb)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e15\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e1,373\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eContig total sequence\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e384.18 Mb (0% gap)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e380.71 Mb (0.033% gap)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd 
align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMain genome scaffold N/L50\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e6 / 31.314 Mb\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e6 / 30.822 Mb\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMain genome contig N/L50\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e6 / 30.967 Mb\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e114 / 0.84 Mb\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMain genome scaffold N/L90\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e11 / 24.942 Mb\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e11 / 25.142 Mb\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMain genome contig N/L90\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e11 / 23.408 Mb\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e590 / 124.867 Kb\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMax contig length\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e44.111 Mb\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e7.826 Mb\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMax scaffold length\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e44.111 Mb\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e44.381 
Mb\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eBUSCO score\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e98.7%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e98.6%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eGene models\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e37,014\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e37,156\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003cb\u003eThe Tork LTR-retrotransposon insertion downregulates\u003c/b\u003e \u003cb\u003eIQD19\u003c/b\u003e \u003cb\u003egene expression\u003c/b\u003e\u003c/p\u003e \u003cp\u003eAn analysis of the region flanking the insertion shows that the LTR-retrotransposon is inserted in the upstream region (1,013 bp away) of an annotated coding region for a protein containing an IQ calmodulin-binding motif (RAPDB gene ID: Os05g0521900), identified as IQ-DOMAIN19 in NCBI GenBank (hereafter referred to as \u003cem\u003eIQD19\u003c/em\u003e). The \u003cem\u003eIQD19\u003c/em\u003e gene is expressed during grain development in both the embryo and the endosperm, with an expression pattern that is compatible with a role as a grain size regulator. The IQ calmodulin-binding domain is also present in other regulators of grain development in rice such as \u003cem\u003eGSE5\u003c/em\u003e (Os05g0187500), a well-known regulator of grain width and length. 
Interestingly, \u003cem\u003eIQD19\u003c/em\u003e has an expression profile that is similar to that of GSE5 (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003ea,b), reinforcing the idea that both genes could be regulators of related processes. To investigate whether the insertion of the LTR-retrotransposon in the upstream region of \u003cem\u003eIQD19\u003c/em\u003e could be altering its expression and affecting grain length, we analyzed \u003cem\u003eIQD19\u003c/em\u003e expression in seedlings and measured the length of the grains of 8 closely related indica-3 accessions (4 carrying the LTR-RT insertion and 4 lacking it) grown in the laboratory. The analysis of \u003cem\u003eIQD19\u003c/em\u003e expression showed a significant decrease in expression (p\u0026thinsp;\u0026lt;\u0026thinsp;0.05) linked to the presence of the insertion (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003ec), correlating with significantly longer grains (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003ed).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003cb\u003eTranscriptional downregulation of\u003c/b\u003e \u003cb\u003eIQD19\u003c/b\u003e \u003cb\u003eby the Tork LTR-RT is independent of DNA methylation\u003c/b\u003e\u003c/p\u003e \u003cp\u003eLTR-retrotransposons are often highly methylated, in particular the most recent ones (Vonholdt et al. 2012), and the spread of DNA methylation from LTR boundaries into gene promoter regions is a common source of transcriptional repression (Hirsch and Springer 2017). To test whether this was the case for the Tork insertion, we used ONT reads from the KIKUBA (TE+) and MANARE (TE-) accessions to analyze the methylation status of the \u003cem\u003eIQD19\u003c/em\u003e gene and its promoter region in the CG, CHG and CHH contexts. 
As expected, CG and CHG methylation were high in the region of KIKUBA corresponding to the Tork retrotransposon insertion (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e). However, the \u003cem\u003eIQD19\u003c/em\u003e gene body and the immediate upstream region (1,013 bp) exhibited similarly low methylation levels in both accessions and in the three contexts, suggesting that the methylation of the LTR-retrotransposon sequence did not extend into the gene upstream region flanking the insertion. An alternative explanation for the repressive effect of the Tork insertion could be that it disrupts Transcription Factor Binding Sites (TFBS) or acts as a physical spacer, increasing their distance from the core promoter. These alterations can hinder the recruitment and assembly of the transcriptional machinery, thereby downregulating gene expression (Hirsch and Springer 2017). The analysis of computationally predicted cis-regulatory elements in the \u003cem\u003eIQD19\u003c/em\u003e upstream region, available in the RAPDB (PLACE) database, suggests that the Tork insertion acts as a spacer, displacing multiple cis-acting regulatory DNA elements. The analysis of experimental data from a multiDAP-seq experiment recently described by Baumgart et al. (2025) revealed that the upstream region of the \u003cem\u003eIQD19\u003c/em\u003e gene contains multiple TFBS from the NAC, B3, and Dof families. These TFBS are located upstream of the LTR-retrotransposon insertion site and, therefore, when the insertion is present, the distance between these TFBS and the transcription start site of \u003cem\u003eIQD19\u003c/em\u003e is altered (Supplemental Figure S3). 
These findings suggest that the LTR-retrotransposon insertion alters the landscape of cis-regulatory elements, thereby reshaping the transcriptional behavior of the adjacent \u003cem\u003eIQD19\u003c/em\u003e gene.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e"},{"header":"Conclusions","content":"\u003cp\u003eTransposable element insertion polymorphisms (TIPs) are an important yet underexplored source of genetic variation influencing agronomic traits. TIPs have historically been overlooked because they are difficult to detect, but recent evidence across multiple crops highlights their functional relevance and importance for breeding (Dom\u0026iacute;nguez et al. 2020; Li et al. 2024; Dong et al. 2025). Our study demonstrates that using TIPs for genome-wide association can uncover additional regions missed by traditional SNP-GWAS. Our parallel analyses using TIPs and SNPs in indica rice identified three regions associated with grain length, two on chromosome 3 and one on chromosome 5. The significant TIPs found in the two neighboring regions of chromosome 3 are possibly related to the presence of \u003cem\u003eGS3\u003c/em\u003e, a well-characterized grain length and weight regulator (Fan et al. 2006). Beyond this known region, we identified an LTR-retrotransposon insertion on chromosome 5 that is strongly associated with grain length and has not previously been described. The LTR-RT is inserted in the upstream region of the \u003cem\u003eIQD19\u003c/em\u003e gene, which encodes a protein with a putative calmodulin-binding domain (IQ domain). Calmodulin-binding proteins have been repeatedly described as regulators of agronomic traits related to organ morphology that act by modulating cell division. For example, the \u003cem\u003eSUN\u003c/em\u003e gene, a major regulator of fruit elongation in tomatoes, encodes an IQ domain-containing protein (Xiao et al. 2008; Wu et al. 2011). 
The Arabidopsis \u003cem\u003eIQD5\u003c/em\u003e gene, a regulator of leaf pavement cell morphogenesis, also encodes an IQ domain-containing protein (Mitra et al. 2019). In rice, \u003cem\u003eGSE5\u003c/em\u003e, which encodes a calmodulin-binding protein with an IQ domain, acts as a negative regulator of grain width and weight (Liu et al. 2017), and its homolog GW5L has also been shown to influence grain size (Tian et al. 2019), whereas \u003cem\u003eIQD14\u003c/em\u003e regulates grain width and length (Yang et al. 2020). The involvement of IQ domain-containing proteins in shaping organ architecture, together with the specific examples of related rice genes affecting grain size, suggests that \u003cem\u003eIQD19\u003c/em\u003e is the actual genetic factor contributing to grain length variation in this region. The expression pattern of \u003cem\u003eIQD19\u003c/em\u003e, which is very similar to that of \u003cem\u003eGSE5\u003c/em\u003e during rice grain maturation, reinforces this hypothesis. Interestingly, the transcription of \u003cem\u003eGSE5\u003c/em\u003e is also affected by a structural variation in the promoter region, leading to lower \u003cem\u003eGSE5\u003c/em\u003e expression and wider grains (Duan et al. 2017). Transposable element insertions in upstream gene regions can have diverse regulatory consequences. In most cases, TE insertions close to genes are heavily methylated and this methylation can influence the expression of genes located nearby (Hollister and Gaut 2009). However, cases where the insertion influences the expression of downstream genes without affecting methylation also exist, as has been shown for the maize \u003cem\u003estiff1\u003c/em\u003e gene (Zhang et al. 2020). In such cases, transposable elements disrupt, displace or modify existing cis-regulatory elements or alter the chromatin state of the region, thereby reshaping the transcriptional profiles of adjacent genes (Hirsch and Springer 2017). 
Our results showed lower \u003cem\u003eIQD19\u003c/em\u003e expression in the accessions carrying the Tork promoter insertion, and the analysis of the genome assemblies revealed that it is the only variant in the immediate upstream region of \u003cem\u003eIQD19\u003c/em\u003e. Although the Tork insertion is heavily methylated, the methylation seems to be confined to the element boundaries without affecting the region more proximal to the transcription start site, suggesting that the repressive effect of the LTR-retrotransposon insertion is not due to an epigenetic change of the \u003cem\u003eIQD19\u003c/em\u003e promoter. The multiDAP-seq data show that the region upstream of the LTR-retrotransposon insertion contains TFBS that may regulate \u003cem\u003eIQD19\u003c/em\u003e expression. The distance between these TFBS and the transcription start site of \u003cem\u003eIQD19\u003c/em\u003e is greatly increased in the accessions containing the insertion, suggesting that, as described for the maize \u003cem\u003estiff1\u003c/em\u003e gene (Zhang et al. 2020), the alteration of the TFBS landscape could be the mechanism explaining the repression of \u003cem\u003eIQD19\u003c/em\u003e. Collectively, these results suggest that the insertion of an LTR-retrotransposon is the actual cause of the \u003cem\u003eIQD19\u003c/em\u003e downregulation, and that \u003cem\u003eIQD19\u003c/em\u003e is a novel, negative grain length regulator. 
This novel region, whose impact on grain length is as strong as that of GS3, the most important grain length QTL, represents a promising target for rice breeding.\u003c/p\u003e"},{"header":"Abbreviations","content":"\u003cdiv class=\"DefinitionList\"\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eGWAS\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eGenome-wide association study\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eLD\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eLinkage disequilibrium\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eLTR-RT\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eLong terminal repeat retrotransposons\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eONT\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eOxford nanopore technology\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003ePCR\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003ePolymerase chain reaction\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003ePLACE\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003ePlant cis-acting regulatory DNA elements\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eQTLs\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eQuantitative trait loci\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eSNP\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eSingle nucleotide 
polymorphism\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eSRA\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eSequence read archive\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eSV\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eStructural variant\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eTE\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eTransposable element\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eTSD\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eTarget site duplication\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eRAPDB\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eRice annotation project database\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eTIP\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eTransposon insertion polymorphism.\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003c/div\u003e"},{"header":"Declarations","content":"\u003cp\u003eEthics approval and consent to participate\u003c/p\u003e\n\u003cp\u003eNot applicable\u003c/p\u003e\n\u003cp\u003eConsent for publication\u003c/p\u003e\n\u003cp\u003eNot applicable\u003c/p\u003e\n\u003cp\u003eAvailability of data and materials\u003c/p\u003e\n\u003cp\u003eRaw genome sequencing data have been deposited in the Sequence Read Archive (SRA) under BioProject PRJNA1387227 (private link for reviewers: https://dataview.ncbi.nlm.nih.gov/object/PRJNA1387227?reviewer=cqpic98g2lvl0hkmudvpflintr). 
Genome assemblies are available under NCBI BioProject PRJNA1387416. In addition, genome assemblies, gene and LTR-RT annotations, along with the TIP/SNP genotype data used for the GWAS are available at Zenodo (private link for reviewers: https://zenodo.org/records/18221427?preview=1\u0026amp;token=eyJhbGciOiJIUzUxMiJ9.eyJpZCI6Ijc1ZDA2YmIwLWFkNWEtNDFhMy05NjBkLTI4YWM0ZDVmYTNiNSIsImRhdGEiOnt9LCJyYW5kb20iOiI3OTQxNzA2NTM5NWE5ZDUxYjRjOTlkNjE0OGZkNTE4YyJ9.38Li7l32RxpaqffZ-8rWTz_dn_IpAOJOu3YtlumCqB0LECjziE3qnE374AI60baypd5hHeR5I704pdngcI0jxA ). Raw PopoolationTE2 predictions are available at Zenodo (https://zenodo.org/records/4058696).\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eAll materials currently under restricted access will be made publicly available upon acceptance of the manuscript.\u003c/p\u003e\n\u003cp\u003eCompeting interests\u003c/p\u003e\n\u003cp\u003eThe authors declare that they have no competing interests.\u003c/p\u003e\n\u003cp\u003eFunding\u003c/p\u003e\n\u003cp\u003eThe work done at CRAG was funded by grant PID2022-143167NB-I00, funded by MICIU/AEI/ 10.13039/501100011033 and by \u0026ldquo;ERDF/EU\u0026rdquo; and grant CEX2019-000902-S funded by MICIU/AEI /10.13039/501100011033. NMD is funded by MCIU/AEI /10.13039/501100011033 and by \u0026ldquo;ESF Investing in your future\u0026rdquo; (reference PRE2020-095111). 
RC is a Ram\u0026oacute;n y Cajal contract holder, funded by MICIU/AEI/ https://doi.org/10.13039/501100011033 and by FSE+ (reference RYC2022-037459-I).\u003c/p\u003e\n\u003cp\u003eAuthors\u0026apos; contributions\u003c/p\u003e\n\u003cp\u003eNM: Formal analysis, Methodology, Investigation, Writing; JC: Conceptualization, Writing, Supervision, Funding acquisition; RC: Conceptualization, Formal analysis, Writing, Supervision\u003c/p\u003e\n\u003cp\u003eAcknowledgements\u003c/p\u003e\n\u003cp\u003eNot applicable\u003c/p\u003e\n\u003cp\u003eAuthors\u0026apos; information\u003c/p\u003e\n\u003cp\u003eNoemia Morales-D\u0026iacute;az: https://orcid.org/0000-0001-5258-486X\u003c/p\u003e\n\u003cp\u003eJosep M. Casacuberta: \u0026nbsp;https://orcid.org/0000-0002-5609-4152\u003c/p\u003e\n\u003cp\u003eRa\u0026uacute;l Castanera: https://orcid.org/0000-0002-3772-7727\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003e3,000 rice genomes project (2014) The 3,000 rice genomes project. Gigascience 3:7. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1186/2047-217X-3-7\u003c/span\u003e\u003cspan address=\"10.1186/2047-217X-3-7\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAlonge M, Lebeigle L, Kirsche M, Jenike K, Ou S, Aganezov S, Wang X, Lippman ZB, Schatz MC, Soyk S (2022) Automated assembly scaffolding using RagTag elevates a new tomato system for high-throughput genome editing. Genome Biol 23(1):258. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1186/s13059-022-02823-7\u003c/span\u003e\u003cspan address=\"10.1186/s13059-022-02823-7\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBaumgart LA, Greenblum SI, Morales-Cruz A, Wang P, Zhang Y, Yang L, Chen C, Dilworth DJ, Garretson AC, Grosjean N, He G, Savage E, Yoshinaga Y, Blaby IK, Daum CG, O\u0026rsquo;Malley RC (2025) Recruitment, rewiring and deep conservation in flowering plant gene regulation. Nat Plants 11(8):1514\u0026ndash;1527. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41477-025-02047-0\u003c/span\u003e\u003cspan address=\"10.1038/s41477-025-02047-0\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCarpentier M-C, Manfroi E, Wei F-J, Wu H-P, Lasserre E, Llauro C, Debladis E, Akakpo R, Hsing Y-I, Panaud O (2019) Retrotranspositional landscape of Asian rice revealed by 3000 genomes. Nat Commun 10(1):24. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41467-018-07974-5\u003c/span\u003e\u003cspan address=\"10.1038/s41467-018-07974-5\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCastanera R, Morales-D\u0026iacute;az N, Gupta S, Purugganan M, Casacuberta JM (2023) Transposons are important contributors to gene expression variability under selection in rice populations. eLife 12. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.7554/eLife.86324\u003c/span\u003e\u003cspan address=\"10.7554/eLife.86324\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCastanera R, Vendrell-Mir P, Bardil A, Carpentier M-C, Panaud O, Casacuberta JM (2021) Amplification dynamics of miniature inverted-repeat transposable elements and their impact on rice trait variability. Plant J 107(1):118\u0026ndash;135. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1111/tpj.15277\u003c/span\u003e\u003cspan address=\"10.1111/tpj.15277\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCaye K, Jumentier B, Lepeule J, Fran\u0026ccedil;ois O (2019) LFMM 2: Fast and Accurate Inference of Gene-Environment Associations in Genome-Wide Studies. Mol Biol Evol 36(4):852\u0026ndash;860. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/molbev/msz008\u003c/span\u003e\u003cspan address=\"10.1093/molbev/msz008\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDom\u0026iacute;nguez M, Dugas E, Benchouaia M, Leduque B, Jim\u0026eacute;nez-G\u0026oacute;mez JM, Colot V, Quadrana L (2020) The impact of transposable elements on tomato diversity. Nat Commun 11(1):4058. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41467-020-17874-2\u003c/span\u003e\u003cspan address=\"10.1038/s41467-020-17874-2\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDong Z, Jin S, Hao Y, Zhao T, Shang H, Zhang Z, Fang L, Zheng Z, Li J (2025) Transposon dynamics drive genome evolution and regulate genetic mechanisms of agronomic traits in cotton. Plants 14(16). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3390/plants14162509\u003c/span\u003e\u003cspan address=\"10.3390/plants14162509\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDuan P, Xu J, Zeng D, Zhang B, Geng M, Zhang G, Huang K, Huang L, Xu R, Ge S, Qian Q, Li Y (2017) Natural variation in the promoter of GSE5 contributes to grain size diversity in rice. Mol Plant 10(5):685\u0026ndash;694. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.molp.2017.03.009\u003c/span\u003e\u003cspan address=\"10.1016/j.molp.2017.03.009\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eEscol\u0026agrave; G, Gonz\u0026aacute;lez-Miguel VM, Campo S, Catala-Forner M, Domingo C, Marqu\u0026eacute;s L, San Segundo B (2023) Development and Genome-Wide Analysis of a Blast-Resistant japonica Rice Variety. Plants 12(20). 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3390/plants12203536\u003c/span\u003e\u003cspan address=\"10.3390/plants12203536\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFan C, Xing Y, Mao H, Lu T, Han B, Xu C, Li X, Zhang Q (2006) GS3, a major QTL for grain length and weight and minor QTL for grain width and thickness in rice, encodes a putative transmembrane protein. Theor Appl Genet 112(6):1164\u0026ndash;1171. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s00122-006-0218-1\u003c/span\u003e\u003cspan address=\"10.1007/s00122-006-0218-1\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGalindo-Gonz\u0026aacute;lez L, Mhiri C, Deyholos MK, Grandbastien M-A (2017) LTR-retrotransposons in plants: Engines of evolution. Gene 626:14\u0026ndash;25. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.gene.2017.04.051\u003c/span\u003e\u003cspan address=\"10.1016/j.gene.2017.04.051\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHeller D, Vingron M (2021) SVIM-asm: structural variant detection from haploid and diploid genome assemblies. Bioinformatics 36(22\u0026ndash;23):5519\u0026ndash;5521. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/bioinformatics/btaa1034\u003c/span\u003e\u003cspan address=\"10.1093/bioinformatics/btaa1034\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHirai Y, Keisuke S, Hamagami K (2012) Evaluation of an analytical method to identify determinants of rice yield components and protein content. Comput Electron Agric 83:77\u0026ndash;84. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.compag.2012.02.001\u003c/span\u003e\u003cspan address=\"10.1016/j.compag.2012.02.001\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHirsch CD, Springer NM (2017) Transposable element influences on gene expression in plants. Biochim Biophys Acta Gene Regul Mech 1860(1):157\u0026ndash;165. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.bbagrm.2016.05.010\u003c/span\u003e\u003cspan address=\"10.1016/j.bbagrm.2016.05.010\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHollister JD, Gaut BS (2009) Epigenetic silencing of transposable elements: a trade-off between reduced transposition and deleterious effects on neighboring gene expression. Genome Res 19(8):1419\u0026ndash;1428. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1101/gr.091678.109\u003c/span\u003e\u003cspan address=\"10.1101/gr.091678.109\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eIkehashi H (2009) Why are there indica type and japonica type in rice? 
\u0026mdash; history of the studies and a view for origin of two types. Rice Sci 16(1):1\u0026ndash;13. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/S1672-6308(08)60050-5\u003c/span\u003e\u003cspan address=\"10.1016/S1672-6308(08)60050-5\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eInglis PW, Pappas M, de CR, Resende LV, Grattapaglia D (2018) Fast and inexpensive protocols for consistent extraction of high quality DNA and RNA from challenging plant and fungal samples for high-throughput SNP genotyping and sequencing applications. PLoS ONE 13(10):e0206085. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1371/journal.pone.0206085\u003c/span\u003e\u003cspan address=\"10.1371/journal.pone.0206085\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eJurka J (2005) Copi2: Identification of a new Copia-type LTR retrotransposon from rice. Repbase Rep 5(10)\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKawahara Y, de la Bastide M, Hamilton JP, Kanamori H, McCombie WR, Ouyang S, Schwartz DC, Tanaka T, Wu J, Zhou S, Childs KL, Davidson RM, Lin H, Quesada-Ocampo L, Vaillancourt B, Sakai H, Lee SS, Kim J, Numa H, Itoh T, Buell CR, Matsumoto T (2013) Improvement of the \u003cem\u003eOryza sativa\u003c/em\u003e Nipponbare reference genome using next generation sequence and optical map data. Rice (N Y) 6(1):4. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1186/1939-8433-6-4\u003c/span\u003e\u003cspan address=\"10.1186/1939-8433-6-4\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKofler R, G\u0026oacute;mez-S\u0026aacute;nchez D, Schl\u0026ouml;tterer C (2016) PoPoolationTE2: Comparative Population Genomics of Transposable Elements Using Pool-Seq. Mol Biol Evol 33(10):2759\u0026ndash;2764. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/molbev/msw137\u003c/span\u003e\u003cspan address=\"10.1093/molbev/msw137\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKolmogorov M, Yuan J, Lin Y, Pevzner PA (2019) Assembly of long, error-prone reads using repeat graphs. Nat Biotechnol 37(5):540\u0026ndash;546. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41587-019-0072-8\u003c/span\u003e\u003cspan address=\"10.1038/s41587-019-0072-8\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLiu J, Chen J, Zheng X, Wu F, Lin Q, Heng Y, Tian P, Cheng Z, Yu X, Zhou K, Zhang X, Guo X, Wang J, Wang H, Wan J (2017) GW5 acts in the brassinosteroid signalling pathway to regulate grain width and weight in rice. Nat Plants 3:17043. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/nplants.2017.43\u003c/span\u003e\u003cspan address=\"10.1038/nplants.2017.43\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLiu Q, Han R, Wu K, Zhang J, Ye Y, Wang S, Chen J, Pan Y, Li Q, Xu X, Zhou J, Tao D, Wu Y, Fu X (2018) G-protein βγ subunits determine grain size through interaction with MADS-domain transcription factors in rice. Nat Commun 9(1):852. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41467-018-03047-9\u003c/span\u003e\u003cspan address=\"10.1038/s41467-018-03047-9\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLi D, Nanseki T (eds) (2021) Empirical analyses on rice yield determinants of smart farming in Japan. Springer Singapore, Singapore\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLi H (2018) Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34(18):3094\u0026ndash;3100. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/bioinformatics/bty191\u003c/span\u003e\u003cspan address=\"10.1093/bioinformatics/bty191\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLi X, Dai X, He H, Lv Y, Yang L, He W, Liu C, Wei H, Liu X, Yuan Q, Wang X, Wang T, Zhang B, Zhang H, Chen W, Leng Y, Yu X, Qian H, Zhang B, Guo M, Zhang Z, Shi C, Zhang Q, Cui Y, Xu Q, Cao X, Chen D, Zhou Y, Qian Q, Shang L (2024) A pan-TE map highlights transposable elements underlying domestication and agronomic traits in Asian rice. Natl Sci Rev 11(6):nwae188. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/nsr/nwae188\u003c/span\u003e\u003cspan address=\"10.1093/nsr/nwae188\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMansueto L, Fuentes RR, Borja FN, Detras J, Abriol-Santos JM, Chebotarov D, Sanciangco M, Palis K, Copetti D, Poliakov A, Dubchak I, Solovyev V, Wing RA, Hamilton RS, Mauleon R, McNally KL, Alexandrov N (2017) Rice SNP-seek database update: new SNPs, indels, and queries. Nucleic Acids Res 45(D1):D1075\u0026ndash;D1081. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/nar/gkw1135\u003c/span\u003e\u003cspan address=\"10.1093/nar/gkw1135\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMao H, Sun S, Yao J, Wang C, Yu S, Xu C, Li X, Zhang Q (2010) Linking differential domain functions of the GS3 protein to natural variation of grain size in rice. Proc Natl Acad Sci USA 107(45):19579\u0026ndash;19584. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1073/pnas.1014419107\u003c/span\u003e\u003cspan address=\"10.1073/pnas.1014419107\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMitra D, Klemm S, Kumari P, Quegwer J, M\u0026ouml;ller B, Poeschl Y, Pflug P, Stamm G, Abel S, B\u0026uuml;rstenbinder K (2019) Microtubule-associated protein IQ67 DOMAIN5 regulates morphogenesis of leaf pavement cells in Arabidopsis thaliana. J Exp Bot 70(2):529\u0026ndash;543. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/jxb/ery395\u003c/span\u003e\u003cspan address=\"10.1093/jxb/ery395\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eOki K, Fujisawa Y, Kato H, Iwasaki Y (2005) Study of the constitutively active form of the alpha subunit of rice heterotrimeric G proteins. Plant Cell Physiol 46(2):381\u0026ndash;386. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/pcp/pci036\u003c/span\u003e\u003cspan address=\"10.1093/pcp/pci036\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eOki K, Kitagawa K, Fujisawa Y, Kato H, Iwasaki Y (2009) Function of alpha subunit of heterotrimeric G protein in brassinosteroid response of rice plants. Plant Signal Behav 4(2):126\u0026ndash;128. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.4161/psb.4.2.7627\u003c/span\u003e\u003cspan address=\"10.4161/psb.4.2.7627\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eQin P, Lu H, Du H, Wang H, Chen W, Chen Z, He Q, Ou S, Zhang H, Li X, Li X, Li Y, Liao Y, Gao Q, Tu B, Yuan H, Ma B, Wang Y, Qian Y, Fan S, Li W, Wang J, He M, Yin J, Li T, Jiang N, Chen X, Liang C, Li S (2021) Pan-genome analysis of 33 genetically diverse rice accessions reveals hidden genomic variations. Cell 184(13):3542\u0026ndash;3558e16. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.cell.2021.04.046\u003c/span\u003e\u003cspan address=\"10.1016/j.cell.2021.04.046\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRedo\u0026ntilde;a ED, Mackill DJ (1998) Quantitative trait locus analysis for rice panicle and grain characteristics. Theor Appl Genet 96(6\u0026ndash;7):957\u0026ndash;963. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s001220050826\u003c/span\u003e\u003cspan address=\"10.1007/s001220050826\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTao Y, Miao J, Wang J, Li W, Xu Y, Wang F, Jiang Y, Chen Z, Fan F, Xu M, Zhou Y, Liang G, Yang J (2020) RGG1, involved in the cytokinin regulatory pathway, controls grain size in rice. Rice (N Y) 13(1):76. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1186/s12284-020-00436-x\u003c/span\u003e\u003cspan address=\"10.1186/s12284-020-00436-x\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eThomson MJ, Tai TH, McClung AM, Lai XH, Hinga ME, Lobos KB, Xu Y, Martinez CP, McCouch SR (2003) Mapping quantitative trait loci for yield, yield components and morphological traits in an advanced backcross population between Oryza rufipogon and the Oryza sativa cultivar Jefferson. Theor Appl Genet 107(3):479\u0026ndash;493. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s00122-003-1270-8\u003c/span\u003e\u003cspan address=\"10.1007/s00122-003-1270-8\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTian P, Liu J, Mou C, Shi C, Zhang H, Zhao Z, Lin Q, Wang J, Wang J, Zhang X, Guo X, Cheng Z, Zhu S, Ren Y, Lei C, Wang H, Wan J (2019) GW5-Like, a homolog of GW5, negatively regulates grain width, weight and salt resistance in rice. J Integr Plant Biol 61(11):1171\u0026ndash;1185. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1111/jipb.12745\u003c/span\u003e\u003cspan address=\"10.1111/jipb.12745\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eVonholdt BM, Takuno S, Gaut BS (2012) Recent retrotransposon insertions are methylated and phylogenetically clustered in japonica rice (Oryza sativa spp. japonica). Mol Biol Evol 29(10):3193\u0026ndash;3203. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/molbev/mss129\u003c/span\u003e\u003cspan address=\"10.1093/molbev/mss129\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWalker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, Cuomo CA, Zeng Q, Wortman J, Young SK, Earl AM (2014) Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS ONE 9(11):e112963. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1371/journal.pone.0112963\u003c/span\u003e\u003cspan address=\"10.1371/journal.pone.0112963\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWang W, Mauleon R, Hu Z, Chebotarov D, Tai S, Wu Z, Li M, Zheng T, Fuentes RR, Zhang F, Mansueto L, Copetti D, Sanciangco M, Palis KC, Xu J, Sun C, Fu B, Zhang H, Gao Y, Zhao X, Shen F, Cui X, Yu H, Li Z, Chen M, Detras J, Zhou Y, Zhang X, Zhao Y, Kudrna D, Wang C, Li R, Jia B, Lu J, He X, Dong Z, Xu J, Li Y, Wang M, Shi J, Li J, Zhang D, Lee S, Hu W, Poliakov A, Dubchak I, Ulat VJ, Borja FN, Mendoza JR, Ali J, Li J, Gao Q, Niu Y, Yue Z, Naredo MEB, Talag J, Wang X, Li J, Fang X, Yin Y, Glaszmann J-C, Zhang J, Li J, Hamilton RS, Wing RA, Ruan J, Zhang G, Wei C, Alexandrov N, McNally KL, Li Z, Leung H (2018) Genomic variation in 3,010 diverse accessions of Asian cultivated rice. Nature 557(7703):43\u0026ndash;49. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41586-018-0063-9\u003c/span\u003e\u003cspan address=\"10.1038/s41586-018-0063-9\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWang X, Weigel D, Smith LM (2013) Transposon variants and their effects on gene expression in Arabidopsis. PLoS Genet 9(2):e1003255. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1371/journal.pgen.1003255\u003c/span\u003e\u003cspan address=\"10.1371/journal.pgen.1003255\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWan XY, Wan JM, Jiang L, Wang JK, Zhai HQ, Weng JF, Wang HL, Lei CL, Wang JL, Zhang X, Cheng ZJ, Guo XP (2006) QTL analysis for rice grain length and fine mapping of an identified QTL with stable and major effects. Theor Appl Genet 112(7):1258\u0026ndash;1270. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s00122-006-0227-0\u003c/span\u003e\u003cspan address=\"10.1007/s00122-006-0227-0\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWan XY, Wan JM, Weng JF, Jiang L, Bi JC, Wang CM, Zhai HQ (2005) Stability of QTLs for rice grain dimension and endosperm chalkiness characteristics across eight environments. Theor Appl Genet 110(7):1334\u0026ndash;1346. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s00122-005-1976-x\u003c/span\u003e\u003cspan address=\"10.1007/s00122-005-1976-x\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWu S, Xiao H, Cabrera A, Meulia T, van der Knaap E (2011) SUN regulates vegetative and reproductive organ shape by changing cell division patterns. Plant Physiol 157(3):1175\u0026ndash;1186. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1104/pp.111.181065\u003c/span\u003e\u003cspan address=\"10.1104/pp.111.181065\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eXiao H, Jiang N, Schaffner E, Stockinger EJ, van der Knaap E (2008) A retrotransposon-mediated gene duplication underlies morphological variation of tomato fruit. Science 319(5869):1527\u0026ndash;1530. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1126/science.1153040\u003c/span\u003e\u003cspan address=\"10.1126/science.1153040\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYang B, Wendrich JR, De Rybel B, Weijers D, Xue H-W (2020) Rice microtubule-associated protein IQ67-DOMAIN14 regulates grain shape by modulating microtubule cytoskeleton dynamics. Plant Biotechnol J 18(5):1141\u0026ndash;1152. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1111/pbi.13279\u003c/span\u003e\u003cspan address=\"10.1111/pbi.13279\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYu Z, Chen Y, Zhou Y, Zhang Y, Li M, Ouyang Y, Chebotarov D, Mauleon R, Zhao H, Xie W, McNally KL, Wing RA, Guo W, Zhang J (2023) Rice Gene Index: A comprehensive pan-genome database for comparative and functional genomics of Asian rice. Mol Plant 16(5):798\u0026ndash;801. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.molp.2023.03.012\u003c/span\u003e\u003cspan address=\"10.1016/j.molp.2023.03.012\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang T, Wang Z, Liu Q, Zhao D (2025) Genetic Improvement of rice Grain size Using the CRISPR/Cas9 System. Rice (N Y) 18(1):3. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1186/s12284-025-00758-8\u003c/span\u003e\u003cspan address=\"10.1186/s12284-025-00758-8\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang Z, Zhang X, Lin Z, Wang J, Liu H, Zhou L, Zhong S, Li Y, Zhu C, Lai J, Li X, Yu J, Lin Z (2020) A Large Transposon Insertion in the stiff1 Promoter Increases Stalk Strength in Maize. Plant Cell 32(1):152\u0026ndash;165. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1105/tpc.19.00486\u003c/span\u003e\u003cspan address=\"10.1105/tpc.19.00486\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhan P, Ma S, Xiao Z, Li F, Wei X, Lin S, Wang X, Ji Z, Fu Y, Pan J, Zhou M, Liu Y, Chang Z, Li L, Bu S, Liu Z, Zhu H, Liu G, Zhang G, Wang S (2022) Natural variations in grain length 10 (GL10) regulate rice grain size. J Genet Genomics 49(5):405\u0026ndash;413. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.jgg.2022.01.008\u003c/span\u003e\u003cspan address=\"10.1016/j.jgg.2022.01.008\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":true,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"rice","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"rice","sideBox":"Learn more about [Rice](http://thericejournal.springeropen.com)","snPcode":"12284","submissionUrl":"https://submission.nature.com/new-submission/12284/3","title":"Rice","twitterHandle":"@SpringerOpen","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"em","reportingPortfolio":"BMC/SO AJ","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"Genome Wide Association Study (GWAS), Transposable Elements, Grain length","lastPublishedDoi":"10.21203/rs.3.rs-9057165/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-9057165/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eTransposable elements (TEs) constitute a major source of genetic variability in plants by generating insertion polymorphisms that contribute to genome evolution. LTR-retrotransposons are the most abundant plant TEs, but their association with traits has not been comprehensively studied. In this study, we take advantage of the large available genomic resources of rice to characterize LTR-retrotransposon polymorphisms in indica subspecies and uncover their relationship with trait variability. Using genome-wide association studies based on TE-polymorphisms (TIP-GWAS), we identified a non-reference LTR-retrotransposon insertion strongly associated with grain length in a region without previously known QTLs. The insertion is present in the upstream region of the \u003cem\u003eIQD19\u003c/em\u003e gene, which belongs to the calmodulin-binding domain family proteins. The insertion reduces \u003cem\u003eIQD19\u003c/em\u003e gene expression through a DNA-methylation-independent mechanism, likely by interfering with transcription factor regulation. 
In this study, we identify a potential novel grain length regulator in a region that can be targeted to design genetic markers for rice breeding. We also provide high-quality genome assemblies for two indica related accessions for which the TE is polymorphic. Finally, our results underscore the importance of incorporating TE polymorphisms to the pool of genetic variants used for rice association mapping.\u003c/p\u003e","manuscriptTitle":"Population-level analysis of rice transposon polymorphisms reveals an LTR-retrotransposon insertion linked to grain length","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2026-04-01 09:22:50","doi":"10.21203/rs.3.rs-9057165/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"editorInvitedReview","content":"","date":"2026-05-13T11:03:23+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"231262260385156164359955229562091303303","date":"2026-05-01T03:55:50+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2026-04-22T14:46:04+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"304951132136821011924128562430064132976","date":"2026-03-18T01:20:02+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2026-03-12T06:19:03+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2026-03-09T04:14:59+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2026-03-09T04:13:59+00:00","index":"","fulltext":""},{"type":"submitted","content":"Rice","date":"2026-03-07T09:14:34+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"rice","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"rice","sideBox":"Learn more about [Rice](http://thericejournal.springeropen.com)","snPcode":"12284","submissionUrl":"https://submission.nature.com/new-submission/12284/3","title":"Rice","twitterHandle":"@SpringerOpen","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"em","reportingPortfolio":"BMC/SO AJ","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"968e05e4-57d8-48da-b24b-23c3538cc7b0","owner":[],"postedDate":"April 1st, 2026","published":true,"recentEditorialEvents":[{"type":"editorInvitedReview","content":"","date":"2026-05-13T11:03:23+00:00","index":24,"fulltext":""},{"type":"reviewerAgreed","content":"231262260385156164359955229562091303303","date":"2026-05-01T03:55:50+00:00","index":23,"fulltext":""}],"rejectedJournal":[],"revision":"","amendment":"","status":"under-review","subjectAreas":[],"tags":[],"updatedAt":"2026-04-01T09:22:50+00:00","versionOfRecord":[],"versionCreatedAt":"2026-04-01 09:22:50","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-9057165","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-9057165","identity":"rs-9057165","version":["v1"]},"buildId":"XKTyCvWXoU3ODBz1xrDgd","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}