Exploring Alu-Driven DNA Transductions in the Primate Genomes

Research Square | Research Article

Reza Halabian, Jessica M. Storer, Savannah J. Hoyt, Gabrielle A. Hartley, and 3 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-4595082/v1
This work is licensed under a CC BY 4.0 License
Status: Under Revision
Version 1

Abstract

Long terminal repeats (LTRs) and non-LTRs retrotransposons, aka retroelements, collectively occupy a substantial part of the human genome. Certain non-LTR retroelements, such as L1 and SVA, have the potential for DNA transduction, which involves the concurrent mobilization of flanking non-transposon DNA during retrotransposition. These events can be detected by computational approaches.
Although Alu elements are the most abundant short interspersed elements (SINEs) still active in the genomes of humans and other primates, the rate of Alu-mediated transduction remains unexplored. To address this gap, we used an in-house program to probe for Alu-related transductions in the human genome. We analyzed 118,489 full-length AluY subfamily members annotated within the first complete human reference genome, T2T-CHM13. For comparative insights, we extended our exploration to two non-human primate genomes, the chimpanzee and the rhesus monkey. After manual curation, we could not confirm any transductions mediated by Alu elements, whose source genes, unlike those of L1 or SVA, are transcribed by RNA polymerase III. This implies that such events are infrequent or possibly absent not only in the human but also in the chimpanzee and rhesus monkey genomes. Although we identified loci in which the 3’ Target Site Duplication (TSD) was located distantly from the retrotransposed AluY, a transduction hallmark, our study could not find further support for such events. These instances can be explained by the incorporation of other nucleotides into the poly(A) tails in conjunction with polymerase slippage.

Keywords: DNA transduction, Alu, transposed elements, retrosequences, human genome, human genomics, primate genomes

1. Introduction

Discovered by Barbara McClintock in the 1940s, mobile elements [1], often termed transposable or transposed elements (TEs), are present in most, if not all, eukaryotic genomes [2]. In the human genome, discernible TEs contribute about 46%, or over 1.4 Gb, of the sequence [3]. Based on the transposition mode, TEs are divided into Type I elements, which are mobilized via a “copy-and-paste” mechanism, and Type II elements, which move in a “cut-and-paste” manner [4].
Notably, not all copy-and-paste TEs can give rise to new copies; instead, only a limited number of master, source, or founder elements spread their copies within genomes [5]. Therefore, most TEs are not transposable but only transposed elements [6]. Whether a TE is transposable presumably depends on retention of intact internal open reading frames and on expression, particularly in the germ line. While bona fide source genes have been identified for L1 elements [7, 8], there is only sparse information on Alu source genes [9–12]. Autonomous elements, such as L1s, encode most of the gene products required for their activity, while non-autonomous elements, such as SVAs and Alus, rely on the machinery of autonomous elements to proliferate within genomes [13–15]. TE families are subject to regular biological processes, including birth and death; i.e., a specific family or subfamily of a TE appears in the genome and goes extinct after a period of activity. Consequently, only a small number of TE families are active within a given evolutionary period. For instance, only three non-LTR retroelements currently seem to be active in humans: one autonomous family (L1) and two non-autonomous families (Alu and SVA) [16]. Older elements, such as mammalian-wide interspersed repeats (MIRs), are still present and discernible, but their source genes are inactive [17, 18]. Interestingly, all these TEs belong to Type I and, consequently, Alus and SVAs most likely employ the L1 molecular machinery for their retrotransposition. Although there are numerous sequences related to Type II transposons in the human genome, there is no evidence of their recent activity in primates [19]. At least by their abundance, TEs leave a significant legacy by influencing the evolution of genomes, including genome structure, gene evolution, and gene expression [20].
The jumping of TEs seems to be random, although there are some preferences concerning the genomic context into which individual elements insert [21–24]. Nevertheless, in most cases, insertion of a TE into a new location has either a neutral or a negative effect on the host genome, in agreement with the neutral theory of evolution [25]. In fact, the initial observations of the impact of retroposons on the human genome were made because of the disease phenotypes they caused [26, 27]. Early examples in which a TE transduced a piece of non-TE DNA from its original (source) locus to the new integration site (Fig. 1) can be found in the literature [28–31]. In 1999, John Moran and colleagues demonstrated that active L1 elements are capable of DNA transduction in vitro [32]. Following these observations, several large-scale computational studies indicated the extent of the phenomenon at the genomic level [33–35]. While the potential evolutionary consequences of this process were noted [36], none of these studies confirmed exon shuffling caused by DNA transduction at fixed loci in the human genome. Unexpectedly, such confirmation came from studies of another active primate transposon, the SVA element [37]. Thus far, there have been no systematic studies of Alu-driven DNA transduction. However, Kojima reported recent Alu monomer activity, and one of these events was accompanied by a short DNA transduction [39]. Recently, Hoyt et al. reported that Alu-mediated DNA transduction was involved in the origin of AluSx-WaluSat, one of the recent composite elements in the human genome [3]. Consequently, we set out to determine the scale of Alu-driven DNA transduction in primates, with a special focus on the human genome. Despite over a million copies of Alu elements in the human genome, we could not confirm any evidence supporting the involvement of Alus in the transduction process.
However, it must be stressed that our investigation concentrated on AluYs and that we applied very strict criteria throughout the analysis.

2. Materials and methods

2.1 Genome assemblies and annotations

We obtained the human reference genome (chm13v2.0.fa, accessed on August 15th, 2022) from the GitHub page of the T2T consortium (https://github.com/marbl/CHM13). The RepeatMasker (CHM13v2.0_RM-2022MAR23.out) and segmental duplication (T2T-CHM13v2.SDs.bed) annotations were downloaded from the same repository on the same date. The chimpanzee genome, panTro6 (https://hgdownload.soe.ucsc.edu/goldenPath/panTro6/bigZips/panTro6.fa.gz), and its RepeatMasker annotation (https://hgdownload.soe.ucsc.edu/goldenPath/panTro6/bigZips/panTro6.fa.out.gz) were obtained on December 12th, 2022. Finally, we accessed two files for the rhesus monkey (rheMac10) on December 22nd, 2022: the reference genome (https://hgdownload.soe.ucsc.edu/goldenPath/rheMac10/bigZips/rheMac10.fa.gz) and the RepeatMasker annotation (https://hgdownload.soe.ucsc.edu/goldenPath/rheMac10/bigZips/rheMac10.fa.out.gz).

2.2 Transduction identification and validation

Because Alus are transcribed differently from L1s (by RNA polymerase III), existing tools for profiling L1-mediated 3' transductions seem inefficient for Alus, as they often yield many false positives. Specifically, these tools lack a built-in verification step, requiring users to validate transductions with additional strategies. Hence, we developed a computational pipeline based on a set of custom-built Python scripts to detect and confirm Alu transduction events more accurately (Fig. 2). This method considers the transcriptional termination signal for pol III (a stretch of at least four thymine residues) during the validation step [38].
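The pol III termination check mentioned above can be illustrated with a minimal sketch. It assumes the terminator is modelled simply as the first run of four or more T residues on the sense strand of the source flank; the function names and the `segment_end` parameter are hypothetical and are not taken from the authors' scripts.

```python
import re
from typing import Optional

# Assumption: a pol III terminator is any run of >= 4 consecutive T residues.
POL3_TERMINATOR = re.compile(r"T{4,}")


def first_terminator(downstream: str, min_offset: int = 0) -> Optional[int]:
    """Return the 0-based offset of the first TTTT run found at or after
    min_offset in the downstream flank, or None if there is none."""
    match = POL3_TERMINATOR.search(downstream.upper(), min_offset)
    return match.start() if match else None


def terminator_follows_segment(source_flank: str, segment_end: int) -> bool:
    """True if a terminator starts at or downstream of segment_end, i.e. after
    the part of the source flank that aligns to the putative transduced DNA."""
    return first_terminator(source_flank, segment_end) is not None
```

Here `source_flank` would be the sequence immediately 3' of a candidate source element and `segment_end` the end coordinate, within that flank, of the aligned transduced segment.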
In addition, the method identifies the sequence and coordinates of target site duplications (TSDs) more precisely by constructing k-mers from segments around each Alu sequence.

Our approach begins by extracting full-length (FL) AluY subfamily members from RepeatMasker annotations. FL AluYs are defined as elements that start within the first four nucleotides of the consensus sequence and extend to, or beyond, position 267 of the consensus sequence [3]. The script then filters out FL AluYs that overlap segmental duplications, provided such an annotation exists. For each FL AluY, the script searches a region spanning 100 bp upstream and 4500 bp downstream to locate TSDs. This involves generating overlapping k-mers (5–45 bp in length) and allowing a single mismatch to determine the FL AluY boundaries. The process also evaluates the size and position of poly(A) tracts. An AluY is marked as an unverified potential transduction if the gap between the end of the Alu and the start of the poly(A) tract is over 70 nucleotides long. Any Alu whose putative transduced segment contains another Alu, SVA, or L1 element is excluded to avoid confounding conclusions downstream. The final step is transduction validation, which applies the following key criteria: (a) each transduced segment is aligned to a database of potential source elements using pblat [40] to identify the source element, and (b) the transcription termination motif (at least four Ts) must be situated downstream of the transduced DNA relative to the source element. This database comprises sequences spanning 4500 bp downstream of each FL AluY subfamily member.
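The k-mer based TSD search described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes that the 5' TSD copy ends exactly at the annotated Alu start, so every probe is a suffix of the upstream flank, and it reports the longest probe that recurs downstream with at most one mismatch. The names `find_tsd` and `hamming_at_most` are hypothetical.

```python
from typing import Optional, Tuple


def hamming_at_most(a: str, b: str, limit: int) -> bool:
    """True if two equal-length strings differ at no more than `limit` positions."""
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > limit:
                return False
    return True


def find_tsd(upstream: str, downstream: str,
             k_min: int = 5, k_max: int = 45,
             max_mismatch: int = 1) -> Optional[Tuple[str, int]]:
    """Return (probe, offset): the longest suffix of the upstream flank
    (k_min..k_max bp) that occurs in the downstream flank with at most
    max_mismatch mismatches, and the 0-based offset of its leftmost match.
    Returns None if no such k-mer exists."""
    upstream, downstream = upstream.upper(), downstream.upper()
    # Try the longest k-mers first so the most specific TSD wins.
    for k in range(min(k_max, len(upstream)), k_min - 1, -1):
        probe = upstream[-k:]
        for offset in range(len(downstream) - k + 1):
            if hamming_at_most(probe, downstream[offset:offset + k], max_mismatch):
                return probe, offset
    return None
```

With the window sizes given in the text, `upstream` would be the 100 bp preceding the element and `downstream` the 4500 bp following it. Short probes with one permitted mismatch match frequently by chance in a window of that size, which is one reason a separate validation step is needed.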
Alignment criteria with pblat include concordant orientation between hit and subject, a minimum of 90% identity between query (offspring) and subject (source), alignment of at least 30% of the query (offspring), and an alignment start position in the query (offspring) within 20 nucleotides of the subject (source) start position. Additionally, both offspring (query) and source (subject) must belong to the same subfamily. To address ambiguities in parent-offspring relationships, particularly in the absence of segmental duplication annotations, we extended our analysis to include 1 kb of upstream and downstream sequence for each parent and offspring identified in the previous step. Subsequently, using YASS (command-line version) [41], each progenitor and its corresponding offspring were aligned against each other, and dot-plots were generated with default settings except for an adjusted E-value of 1e-3. Sequences exhibiting duplicated patterns between offspring and parents were excluded. Finally, the results were subjected to manual curation to generate a bona fide Alu transduction catalog.

3. Results and Discussion

Many experiments have been conducted to study features of Alu element insertions across different genomes. Kojima’s investigation of Alu monomers in the human genome revealed eight recent insertions of monomer units, including one originating from another retrotransposed monomer with transduction of a short 3’ flanking sequence [39]. Beyond that, Hoyt et al. identified a new repetitive element, termed WaluSat, located within the short arms of acrocentric chromosomes of the first complete human genome (T2T-CHM13) [3]. Interestingly, in some cases, these elements were immediately preceded by AluSx3, with both flanked by target site duplications. The authors suggested that a transduction process may have contributed to forming these chimeric elements (AluSx3+WaluSat).
Given the absence of prior studies examining genome-wide Alu-mediated transductions, in contrast to the extensive analyses of similar events caused by L1s and SVAs, these observations motivated us to investigate the first complete human reference genome (T2T-CHM13) [42, 43] to discover such Alu-mediated events and estimate their frequency. Specifically, we focused on the most recent Alu family, AluY, known to harbor active subfamilies capable of ongoing transpositional activity [44], because this ensured high confidence in detecting target site duplications, a prerequisite for the identification of true transductions. The initial count of AluY subfamily members in the RepeatMasker output (CHM13v2.0_RM-2022MAR23.out) provided by the T2T consortium was 166,483. However, as described in the Methods section, we extracted only the 128,695 full-length AluYs from the RepeatMasker output. Some of these full-length elements are expected to retain their transcriptional and retrotransposition capabilities [3]. To ensure the unambiguous assignment of a source element to each offspring, we excluded the full-length AluYs located within segmental duplication regions [3, 45]. Consequently, 118,489 full-length AluYs (Figure 3 and Table 1) served as input for our pipeline to detect transduction events mediated by these sequences. While allowing for one mismatch, we found and confirmed TSDs for 118,157 of the analyzed AluYs (Table 1). The length of TSDs ranged from 5 to 45 nucleotides, with a median of 11 bp (Table 1).

Table 1. Comparative summary of AluY elements and associated target site duplications (TSDs)

Species (reference genome) | Total AluYs in the RepeatMasker annotation | Full-length AluYs | Confirmed TSDs | Median TSD length (bp)
Human (T2T-CHM13) | 166,483 | 118,489* | 118,157 | 11
Chimpanzee (panTro6) | 131,610 | 106,386 | 105,973** | 11
Rhesus monkey (rheMac10) | 234,579 | 191,128 | 190,265** | 13

* FL AluYs located within segmental duplications are excluded.
** TSDs composed of homopolymers are excluded. Additionally, Alus with unknown downstream sequences are filtered out, owing to incomplete genomic sequencing in panTro6 and rheMac10.

Of the 118,489 FL AluYs, 1118 (~0.94%) exhibited a sign of potential DNA transduction, namely an additional sequence located between the end of the Alu, as annotated by RepeatMasker, and the 3’ TSD. Given that our transduction discovery was based on identifying a 3’ TSD located far away from the end of an Alu, we used Karlin-Altschul statistics [46, 47], specifically estimating the E- and P-values of the TSDs, to assess whether these 1118 elements represent true transduction events. It is important to note that this step was an extra measure not included in our primary pipeline. The estimated E- and P-values associated with these TSDs were exceptionally high, suggesting that the detected 3' TSDs, situated distantly from the Alu element ends, could be coincidental or random occurrences rather than products of the transposition mechanism.
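To illustrate why short, distant TSD matches carry little statistical weight, the Karlin-Altschul E- and P-values can be sketched as below, using E = K·m·n·exp(−λS) and P = 1 − exp(−E). The scoring scheme (+1 match, −3 mismatch) and the accompanying λ and K values are assumptions chosen for illustration (values commonly tabulated for ungapped nucleotide comparison under that scheme); the preprint does not state which parameters were used, and the function names are hypothetical.

```python
import math

# Assumed parameters for an ungapped +1/-3 nucleotide scoring scheme.
# These are illustrative and may differ from those used in the study.
LAMBDA = 1.374
K = 0.711


def tsd_score(length: int, mismatches: int = 0,
              match: int = 1, mismatch: int = -3) -> int:
    """Raw alignment score of a TSD pair of the given length."""
    return (length - mismatches) * match + mismatches * mismatch


def karlin_altschul_evalue(score: float, query_len: int, search_len: int,
                           lam: float = LAMBDA, k: float = K) -> float:
    """Expected number of chance matches scoring >= score: E = K*m*n*exp(-lambda*S)."""
    return k * query_len * search_len * math.exp(-lam * score)


def pvalue_from_evalue(evalue: float) -> float:
    """Probability of at least one chance match: P = 1 - exp(-E)."""
    return -math.expm1(-evalue)
```

For example, `karlin_altschul_evalue(tsd_score(11, mismatches=1), 11, 4500)` estimates the number of chance hits for an 11 bp TSD with one mismatch in a 4500 bp downstream window; allowing a mismatch lowers the score and therefore raises the E-value relative to a perfect match of the same length.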
However, to test whether the observed transduction signatures are merely artifacts of the shortness and low complexity of TSDs (a frequent feature in the genome), we traced the origins of the additional DNA sequences in the 1118 Alu elements by aligning them to the T2T-CHM13 human reference genome and checking for the presence of an Alu transcription termination signal (at least four Ts) after these sequences. This validation step was therefore crucial for confirming the initially identified list of transductions and excluding false signals. While potential sources were identified for 24 of the 1118 AluYs (Table 2), manual examination failed to verify any instances of transduction. Instead, the segments initially identified as transduced appear to be parts of poly(A) tails that have accumulated mutations, resulting in the formation of microsatellite-like sequences (Table 2). This observation is consistent with prior studies highlighting the contribution of Alus to the generation of satellite-like repeats [15, 48-50]. Moreover, this finding is consistent with studies suggesting that slippage by the L1 ORF2 polymerase during insertion can lead to the expansion of the A-rich tail [51, 52], a phenomenon that may explain the observed distance of the 3’ TSDs from the end of these 1118 AluYs. The poly(A) tail length variations and sequence heterogeneities suggest the dynamic nature of these regions [49], which adds a layer of complexity to the analysis of Alu elements.

Table 2. Summary of potential AluY transductions and source elements within the T2T-CHM13 genome ¹
Offspring / Strand | Subfamily | Transduction length | TSD sequence ² | Transduction coordinate (relative to the source)
chr1:191553811-191554066/- | AluY | 76 | TATTTRTGA | chr10:36394757-36394824
chr2:34400942-34401222/- | AluYk3 | 90 | AAATGAATCACATC | chr4:14793755-14793835
chr2:38486281-38486586/+ | AluY | 81 | GCTTAAACARA | chr20:10196593-10196682
chr2:38693213-38693496/- | AluY | 96 | AGAAATTTCCACTTTCT | chr3:34334398-34334492
chr2:106821418-106821701/- | AluY | 84 | AWTAAAATCAGCAAGCT | chrY:17686862-17686943
chr2:220606423-220606732/- | AluY | 162 | AAGAAATGCAGAGCCTG | chr15:27226259-27226380
chr2:234786716-234787003/- | AluY | 148 | AAAGAAAAATGGATCA | chr14:73568464-73568570
chr4:118025027-118025320/- | AluY | 106 | GAAAACAGYAACT | chr3:144088538-144088646
chr6:30920269-30920579/+ | AluYh3 | 76 | GAAAATRTT | chr8:30049556-30049630
chr6:53949956-53950260/- | AluY | 84 | ARAAAGCCCTATACT | chr3:136247709-136247785
chr6:121569044-121569334/+ | AluY | 73 | AWATCCATAGATC | chr7:100591995-100592064
chr8:120513727-120514018/+ | AluY | 135 | AGAAAATGYTGCTCCA | chr11:33498353-33498454
chr8:140576303-140576590/+ | AluY | 124 | AGAAATACAKAAAAAA | chr3:52060842-52060933
chr9:32786267-32786577/+ | AluY | 88 | AAAGAAARAAAGAAAAGAAAGA | chr2:234786608-234786700
chr10:8025040-8025358/+ | AluY | 106 | AAAGAAAKAAGG | chr4:135974140-135974249
chr12:55019214-55019515/- | AluY | 74 | ARAAAAGATGA | chrX:149270515-149270587
chr12:107816096-107816401/+ | AluY | 123 | GAAAACTGTTCAAAGGC | chr18:63988234-63988339
chr16:23725161-23725443/+ | AluY | 72 | AAMTAAAAAAGCTCCCACA | chr8:23620977-23621049
chr17:79884367-79884646/+ | AluYk4 | 86 | AGTGTACCRTG | chr20:5713053-5713122
chr18:63987918-63988232/+ | AluY | 108 | AGAAATGCAAATGCT | chr12:107816419-107816524
chr18:68672331-68672617/+ | AluY | 119 | TAAAATACTAGAAACT | chr13:61183495-61183612
chr19:48016097-48016379/- | AluY | 98 | AAAAAAATAAAATAAAAAATAAAMTA | chrY:17686854-17686950
chr19:59227951-59228233/- | AluY | 73 | TAACCAGCARCCTC | chrY:8313141-8313212
chr20:5713123-5713404/- | AluYk2 | 99 | TAAAAACAGATMT | chr17:79884663-79884734

Table 2 (continued). Sequence details ³ (one sequence per offspring, in the row order above)
taaaataaataaaaataaaaatacgataaaataaaataaaataaaataaaataaaataaaataaaataaaataaaa taaaataaaataaaataaaataaaataaaataaataaatacataaatacataaataaataaataaataaataaataaataaataaataaa aaagaaaagaaaagaaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaaagaaaga gaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagagagaaagagagaaagagagaaagagagaaagaaagaaagaaagaat ataaaataaaataaaataaaataaaataaaataaaataaaataaaataaaataaaataaataaaataaaataaaataaaataac aaaaagaaaagaaaagaaaagaaaagaaaagaaaagaaaagaaaagaaaagaaaagaaaagaaagaaagaaaaagaaagaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagacagacagacagacagacagacagagagagagagagaaag gaaagagagagagaaagaaagaaagagaaagagagagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagagagagagaaagaaagaaggaaagaaagaaa gagaaaaggaaaaaagaaaagaaaaaaagaaagaaagaaaaaagaaagaaagaaagaaagaaagaaagaaaagagaaagaaaggaagaaagaaagaaagaaagaaa gagaaagaaggaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaaggaaggaaggaagg gaaaaagaaagaaagaaagaaggaaggaaggaaggaaggaaggaaggaaggaaggaaggaaggaaggaaggaaggaaggaaagc taacaacataacataacataacataacataacataacataacataacataacataacataaataaaataaaat aaagaaagaaagaaagaaagaaagaaagaaagaaagaaagagagagagagagagagagagagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaaga gtcgaaaagaagaaaagaaagaaagaaagaaagaaagaaagaaagagagagagagagagagagagagagagagagagggagggagggagggagggagggagggagggagggaaagaaaagaaag ggaaagaaagagagagaaagaaagaagaaagaaagaaagaaagaaagaaagaagaaagagaaagtaagaaagaaagaaagaaagaaag gaaagaaagaaagaaagaaagagagagagagagagagagagagagagagagagagagagagagagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaaga gaaagaaagaaagaaagaaagaaagaaagaaagaaagagaaagaaagaaagaaagaggaaggaaggaaggaagg gagagagagagagagagaaagaaagaaagaaagagagagagaaagaaagaaagaaagagagagagagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaag aaaataaaattaaattaaaataaaataaaataaaataaaataaaataaaataaaataaaataaaaaataaat taaaataaaataaaataaaataaataaaataaaataaataaaataaaataaaataaataaaataaaataaaataataaaataaaat 
taaagaaagaaagaaagagagagagagagagagagagagagagagagagagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagg taataaaataaaattaaaataaaattaaaataaaataaaataaaataaaataataaaataaaataaaaaataaaataaaataaaataaaataaataaaataaaataaaataaaataaat aaaataaaataaaataaaataaaataaaataaaataaaataaaataaaataaaataaaataaaataaataaaataaaataaaataaaataaaataaaa aataaaataaaataaaataaaataaaataaaataaataaaataaaataaaatataaaataaaataaaataaaa aataaaataaaataaaataaaataaataataaaataaaataataaaataaaataaaataaaataaaataaaataaaatataaaataaaataaaataaaa 1 Note: we could not confirm any of the transductions listed in this table upon manual inspection. 2 IUPAC nucleotide codes are employed in instances of mismatch within TSD pairs. 3 Sequences initially identified as transductions, later confirmed as poly(A) tail components While our primary focus was the human genome, we extended our scope to non-human primate genomes for a more comprehensive assessment of Alu -mediated transductions. Our goal was to investigate whether the infrequency of Alu -mediated transductions was unique to humans or a broader phenomenon across other primates in general. Therefore, we analyzed 106,386 Alu Y elements in the chimpanzee genome, panTro6, and 191,128 in the rhesus monkey genome, rheMac10, (Table 1 and Figure 1). It is important to highlight that these genomes were not complete telomere-to-telomere assemblies. Consequently, we were unable to examine 162 and 85 FL Alu Y elements in the chimpanzee and rhesus monkey genomes, respectively, due to gaps in the sequences downstream of these elements. Initially, approximately 1.3% of the Alu Y elements in the panTro6 genome and 1.1% in the rheMac10 genome appeared to display a transduction signature. However, similar to our findings in the human genome, closer inspection revealed that these signatures only consisted of heterogeneous poly(A) tails of varying lengths (Supplementary Tables 1 and 2). 
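The observation that the candidate segments are A-rich, microsatellite-like tails rather than unique flanking DNA can be illustrated with a rough screening heuristic. The study relied on manual curation; the sketch below is only an assumed pre-filter, and its thresholds (`min_a`, `min_periodicity`) and function names are hypothetical.

```python
from typing import Tuple


def a_fraction(seq: str) -> float:
    """Fraction of adenines in the sequence (0.0 for an empty string)."""
    return seq.upper().count("A") / len(seq) if seq else 0.0


def best_period(seq: str, p_min: int = 1, p_max: int = 6) -> Tuple[int, float]:
    """Return (period, fraction): the period p in p_min..p_max for which the
    largest fraction of positions i satisfies seq[i] == seq[i - p]."""
    seq = seq.upper()
    best = (0, 0.0)
    for p in range(p_min, p_max + 1):
        if len(seq) <= p:
            break
        same = sum(1 for i in range(p, len(seq)) if seq[i] == seq[i - p])
        frac = same / (len(seq) - p)
        if frac > best[1]:
            best = (p, frac)
    return best


def looks_like_degenerate_polya(seq: str, min_a: float = 0.6,
                                min_periodicity: float = 0.8) -> bool:
    """Flag segments that are both A-rich and strongly periodic, as expected
    for a mutated poly(A) tail that has become microsatellite-like."""
    _, frac = best_period(seq)
    return a_fraction(seq) >= min_a and frac >= min_periodicity
```

A segment flagged by such a filter would still need inspection, since the heuristic cannot distinguish a degenerate tail from a genuine A-rich microsatellite that happened to be transduced.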
Unlike L1s and SVAs, which are transcribed by RNA polymerase II, Alus are predominantly transcribed by RNA polymerase III and thus use a distinct transcription termination mechanism. In L1s and SVAs, transcription termination is mediated by a canonical polyadenylation signal (AATAAA) or a variant near the 3' end of the element; this signal is occasionally bypassed, resulting in the incorporation of downstream DNA and hence 3’ transduction [3, 34, 45]. In contrast, Alu transcription terminates at a stretch of at least four thymine bases, oligo(T) [38]. Usually, Alu transcripts extend into downstream sequences until they encounter such a termination signal, which can result in the inclusion of additional non-Alu sequences within their transcripts [15, 53]. However, our study finds that Alu-mediated transductions are uncommon in the human genome, unlike transductions mediated by L1 and SVA elements. This finding suggests that the disparity in transduction frequency between Alus and L1s or SVAs can be attributed to differences in post-transcriptional processes, particularly in the integration of the reverse transcript into the genome. It seems that Alu transcripts carrying additional non-Alu sequences produced by RNA polymerase III are not optimal substrates for the L1 machinery, which impairs their amplification. This is consistent with evidence indicating that Alu transcripts with extended non-Alu sequences are inefficient templates for retroposition [15, 54]. It is also possible that transcriptionally and retropositionally active Alus (Alu master genes) have a strong termination signal adjacent to the element, so that their transcripts do not contain an extra DNA segment.
Although we do not know which elements in the human genome are master genes, 88 full-length young Alu sequences in the chm13v2.0 reference genome are immediately followed by at least four Ts (Makalowski and Halabian, unpublished data).

In conclusion, our research suggests that Alu-mediated transductions in the genomes of human, chimpanzee, and rhesus monkey are extremely rare. This is a notable deviation from the relatively frequent transductions observed for L1 and SVA elements. It is important to acknowledge that our analysis was centered on the youngest Alu elements (AluYs), which are presumed to be transcribed by RNA polymerase III. Moreover, we applied strict criteria to our analyses to eliminate potential false positives. Thus, the existence of rare Alu transductions arising from co-transcription within RNA polymerase II transcripts, or derived from older Alu subfamilies, i.e., AluS and AluJ elements, cannot be excluded. In summary, although Alus outnumber L1s and SVAs, transductions mediated by Alu elements seem to be uncommon, if they occur at all. Moreover, since we analyzed three different primate genomes from a broad phylogenetic distribution, we predict that our findings will likely apply to the primate order. However, biology is full of exceptions and surprises; therefore, we cannot rule out the possibility that rare Alu-mediated cases might be discovered in the future.

Declarations

Author Contribution

W.M., R.O. and J.B. initiated the study and drafted the research strategy. R.H. developed the software toolbox, analyzed the genomes for the Alu-driven transductions, generated figures and tables, and drafted the manuscript. J.M., S.H. and G.H. provided unpublished TE annotations of the genomes analyzed. R.H., J.S., J.B., and W.M. critically reviewed and discussed the results. R.H., J.B., and W.M. wrote the final version of the manuscript. All authors reviewed the manuscript.
Data availability The codes used for this paper have been deposited on GitHub and can be accessed through https://github.com/IOB-Muenster/Transduction-Tracker-Verifier . References McClintock, B., The origin and behavior of mutable loci in maize. Proc Natl Acad Sci U S A 1950, 36 (6), 344-55. https://doi.org/10.1073/pnas.36.6.344. Makałowski, W.; Gotea, V.; Pande, A.; Makałowska, I., Transposable Elements: Classification, Identification, and Their Use As a Tool For Comparative Genomics. Methods Mol Biol 2019, 1910 , 177-207. https://doi.org/10.1007/978-1-4939-9074-0_6. Hoyt, S. J.; Storer, J. M.; Hartley, G. A.; Grady, P. G. S.; Gershman, A.; de Lima, L. G.; Limouse, C.; Halabian, R.; Wojenski, L.; Rodriguez, M.; Altemose, N.; Rhie, A.; Core, L. J.; Gerton, J. L.; Makalowski, W.; Olson, D.; Rosen, J.; Smit, A. F. A.; Straight, A. F.; Vollger, M. R.; Wheeler, T. J.; Schatz, M. C.; Eichler, E. E.; Phillippy, A. M.; Timp, W.; Miga, K. H.; O'Neill, R. J., From telomere to telomere: The transcriptional and epigenetic state of human repeat elements. Science 2022, 376 (6588), eabk3112. https://doi.org/10.1126/science.abk3112. Finnegan, D. J., Eukaryotic transposable elements and genome evolution. Trends Genet 1989, 5 (4), 103-7. https://doi.org/10.1016/0168-9525(89)90039-5. Deininger, P. L.; Batzer, M. A.; Hutchison, C. A., 3rd; Edgell, M. H., Master genes in mammalian repetitive DNA amplification. Trends Genet 1992, 8 (9), 307-11. https://doi.org/10.1016/0168-9525(92)90262-3. Brosius, J., The persistent contributions of RNA to eukaryotic gen(om)e architecture and cellular function. Cold Spring Harb Perspect Biol 2014, 6 (12), a016089. https://doi.org/10.1101/cshperspect.a016089. Brouha, B.; Schustak, J.; Badge, R. M.; Lutz-Prigge, S.; Farley, A. H.; Moran, J. V.; Kazazian, H. H., Jr., Hot L1s account for the bulk of retrotransposition in the human population. Proc Natl Acad Sci U S A 2003, 100 (9), 5280-5. https://doi.org/10.1073/pnas.0831042100. Scott, E. 
C.; Gardner, E. J.; Masood, A.; Chuang, N. T.; Vertino, P. M.; Devine, S. E., A hot L1 retrotransposon evades somatic repression and initiates human colorectal cancer. Genome Res 2016, 26 (6), 745-55. https://doi.org/10.1101/gr.201814.115. Shen, M. R.; Batzer, M. A.; Deininger, P. L., Evolution of the master Alu gene(s). J Mol Evol 1991, 33 (4), 311-20. https://doi.org/10.1007/bf02102862. Alemán, C.; Roy-Engel, A. M.; Shaikh, T. H.; Deininger, P. L., Cis-acting influences on Alu RNA levels. Nucleic Acids Res 2000, 28 (23), 4755-61. https://doi.org/10.1093/nar/28.23.4755. Cordaux, R.; Hedges, D. J.; Batzer, M. A., Retrotransposition of Alu elements: how many sources? Trends Genet 2004, 20 (10), 464-7. https://doi.org/10.1016/j.tig.2004.07.012. Tang, W.; Liang, P., Alu master copies serve as the drivers of differential SINE transposition in recent primate genomes. Anal Biochem 2020, 606 , 113825. https://doi.org/10.1016/j.ab.2020.113825. Kazazian, H. H., Jr., Mobile elements: drivers of genome evolution. Science 2004, 303 (5664), 1626-32. https://doi.org/10.1126/science.1089670. Hancks, D. C.; Kazazian, H. H., Jr., SVA retrotransposons: Evolution and genetic instability. Semin Cancer Biol 2010, 20 (4), 234-45. https://doi.org/10.1016/j.semcancer.2010.04.001. Deininger, P., Alu elements: know the SINEs. Genome Biology 2011, 12 (12), 236. https://doi.org/10.1186/gb-2011-12-12-236. Mills, R. E.; Bennett, E. A.; Iskow, R. C.; Devine, S. E., Which transposable elements are active in the human genome? Trends Genet 2007, 23 (4), 183-91. https://doi.org/10.1016/j.tig.2007.02.006. Smit, A. F. A.; Riggs, A. D., MIRs are classic, tRNA-derived SINEs that amplified before the mammalian radiation. Nucleic Acids Res 1995, 23 (1), 98-102. https://doi.org/10.1093/nar/23.1.98. Jurka, J.; Zietkiewicz, E.; Labuda, D., Ubiquitous mammalian-wide interspersed repeats (MIRs) are molecular fossils from the mesozoic era. Nucleic Acids Res 1995, 23 (1), 170-5. 
https://doi.org/10.1093/nar/23.1.170. Pace, J. K., 2nd; Feschotte, C., The evolutionary history of human DNA transposons: evidence for intense activity in the primate lineage. Genome Res 2007, 17 (4), 422-32. https://doi.org/10.1101/gr.5826307. Makałowski, W., Genomic scrap yard: how genomes utilize all that junk. Gene 2000, 259 (1-2), 61-7. https://doi.org/10.1016/s0378-1119(00)00436-4. Bourque, G.; Burns, K. H.; Gehring, M.; Gorbunova, V.; Seluanov, A.; Hammell, M.; Imbeault, M.; Izsvák, Z.; Levin, H. L.; Macfarlan, T. S.; Mager, D. L.; Feschotte, C., Ten things you should know about transposable elements. Genome Biology 2018, 19 (1), 199. https://doi.org/10.1186/s13059-018-1577-z. Capy, P.; Van-Hua, A. l., Transposable elements and genome evolution . ISTE Ltd / John Wiley and Sons Inc: Hoboken, 2023. Korenberg, J. R.; Rykowski, M. C., Human genome organization: Alu, lines, and the molecular structure of metaphase chromosome bands. Cell 1988, 53 (3), 391-400. https://doi.org/10.1016/0092-8674(88)90159-6. Ovchinnikov, I.; Troxel, A. B.; Swergold, G. D., Genomic characterization of recent human LINE-1 insertions: evidence supporting random insertion. Genome Res 2001, 11 (12), 2050-8. https://doi.org/10.1101/gr.194701. Kimura, M., The neutral theory of molecular evolution . Cambridge University Press: Cambridge Cambridgeshire ; New York, 1983; p xv, 367 p. Kazazian, H. H., Jr.; Wong, C.; Youssoufian, H.; Scott, A. F.; Phillips, D. G.; Antonarakis, S. E., Haemophilia A resulting from de novo insertion of L1 sequences represents a novel mechanism for mutation in man. Nature 1988, 332 (6160), 164-6. https://doi.org/10.1038/332164a0. Mitchell, G. A.; Labuda, D.; Fontaine, G.; Saudubray, J. M.; Bonnefont, J. P.; Lyonnet, S.; Brody, L. C.; Steel, G.; Obie, C.; Valle, D., Splice-mediated insertion of an Alu sequence inactivates ornithine delta-aminotransferase: a role for Alu elements in human mutation. Proc Natl Acad Sci U S A 1991, 88 (3), 815-9. 
https://doi.org/10.1073/pnas.88.3.815.
Miki, Y.; Nishisho, I.; Horii, A.; Miyoshi, Y.; Utsunomiya, J.; Kinzler, K. W.; Vogelstein, B.; Nakamura, Y., Disruption of the APC gene by a retrotransposal insertion of L1 sequence in a colon cancer. Cancer Res 1992, 52 (3), 643-5. https://www.ncbi.nlm.nih.gov/pubmed/1310068.
Holmes, S. E.; Dombroski, B. A.; Krebs, C. M.; Boehm, C. D.; Kazazian, H. H., Jr., A new retrotransposable human L1 element from the LRE2 locus on chromosome 1q produces a chimaeric insertion. Nat Genet 1994, 7 (2), 143-8. https://doi.org/10.1038/ng0694-143.
McNaughton, J. C.; Hughes, G.; Jones, W. A.; Stockwell, P. A.; Klamut, H. J.; Petersen, G. B., The evolution of an intron: analysis of a long, deletion-prone intron in the human dystrophin gene. Genomics 1997, 40 (2), 294-304. https://doi.org/10.1006/geno.1996.4543.
Rozmahel, R.; Heng, H. H.; Duncan, A. M.; Shi, X. M.; Rommens, J. M.; Tsui, L. C., Amplification of CFTR exon 9 sequences to multiple locations in the human genome. Genomics 1997, 45 (3), 554-61. https://doi.org/10.1006/geno.1997.4968.
Moran, J. V.; DeBerardinis, R. J.; Kazazian, H. H., Jr., Exon shuffling by L1 retrotransposition. Science 1999, 283 (5407), 1530-4. https://doi.org/10.1126/science.283.5407.1530.
Goodier, J. L.; Ostertag, E. M.; Kazazian, H. H., Jr., Transduction of 3'-flanking sequences is common in L1 retrotransposition. Hum Mol Genet 2000, 9 (4), 653-7. https://doi.org/10.1093/hmg/9.4.653.
Pickeral, O. K.; Makalowski, W.; Boguski, M. S.; Boeke, J. D., Frequent human genomic DNA transduction driven by LINE-1 retrotransposition. Genome Res 2000, 10 (4), 411-5. https://doi.org/10.1101/gr.10.4.411.
Szak, S. T.; Pickeral, O. K.; Makalowski, W.; Boguski, M. S.; Landsman, D.; Boeke, J. D., Molecular archeology of L1 insertions in the human genome. Genome Biol 2002, 3 (10), research0052. https://doi.org/10.1186/gb-2002-3-10-research0052.
Eickbush, T., Exon shuffling in retrospect. Science 1999, 283 (5407), 1465;1467.
https://doi.org/10.1126/science.283.5407.1465.
Xing, J.; Wang, H.; Belancio, V. P.; Cordaux, R.; Deininger, P. L.; Batzer, M. A., Emergence of primate genes by retrotransposon-mediated sequence transduction. Proc Natl Acad Sci U S A 2006, 103 (47), 17608-13. https://doi.org/10.1073/pnas.0603224103.
Bogenhagen, D. F.; Brown, D. D., Nucleotide sequences in Xenopus 5S DNA required for transcription termination. Cell 1981, 24 (1), 261-70. https://doi.org/10.1016/0092-8674(81)90522-5.
Kojima, K. K., Alu monomer revisited: recent generation of Alu monomers. Mol Biol Evol 2011, 28 (1), 13-5. https://doi.org/10.1093/molbev/msq218.
Wang, M.; Kong, L., pblat: a multithread blat algorithm speeding up aligning sequences to genomes. BMC Bioinformatics 2019, 20 (1), 28. https://doi.org/10.1186/s12859-019-2597-8.
Noé, L.; Kucherov, G., YASS: enhancing the sensitivity of DNA similarity search. Nucleic Acids Res 2005, 33 (suppl_2), W540-W543. https://doi.org/10.1093/nar/gki478.
Nurk, S.; Koren, S.; Rhie, A.; Rautiainen, M.; Bzikadze, A. V.; Mikheenko, A.; Vollger, M. R.; Altemose, N.; Uralsky, L.; Gershman, A.; Aganezov, S.; Hoyt, S. J.; Diekhans, M.; Logsdon, G. A.; Alonge, M.; Antonarakis, S. E.; Borchers, M.; Bouffard, G. G.; Brooks, S. Y.; Caldas, G. V.; Chen, N.-C.; Cheng, H.; Chin, C.-S.; Chow, W.; de Lima, L. G.; Dishuck, P. C.; Durbin, R.; Dvorkina, T.; Fiddes, I. T.; Formenti, G.; Fulton, R. S.; Fungtammasan, A.; Garrison, E.; Grady, P. G. S.; Graves-Lindsay, T. A.; Hall, I. M.; Hansen, N. F.; Hartley, G. A.; Haukness, M.; Howe, K.; Hunkapiller, M. W.; Jain, C.; Jain, M.; Jarvis, E. D.; Kerpedjiev, P.; Kirsche, M.; Kolmogorov, M.; Korlach, J.; Kremitzki, M.; Li, H.; Maduro, V. V.; Marschall, T.; McCartney, A. M.; McDaniel, J.; Miller, D. E.; Mullikin, J. C.; Myers, E. W.; Olson, N. D.; Paten, B.; Peluso, P.; Pevzner, P. A.; Porubsky, D.; Potapova, T.; Rogaev, E. I.; Rosenfeld, J. A.; Salzberg, S. L.; Schneider, V. A.; Sedlazeck, F. J.; Shafin, K.; Shew, C.
J.; Shumate, A.; Sims, Y.; Smit, A. F. A.; Soto, D. C.; Sović, I.; Storer, J. M.; Streets, A.; Sullivan, B. A.; Thibaud-Nissen, F.; Torrance, J.; Wagner, J.; Walenz, B. P.; Wenger, A.; Wood, J. M. D.; Xiao, C.; Yan, S. M.; Young, A. C.; Zarate, S.; Surti, U.; McCoy, R. C.; Dennis, M. Y.; Alexandrov, I. A.; Gerton, J. L.; O’Neill, R. J.; Timp, W.; Zook, J. M.; Schatz, M. C.; Eichler, E. E.; Miga, K. H.; Phillippy, A. M., The complete sequence of a human genome. Science 2022, 376 (6588), 44-53. https://doi.org/10.1126/science.abj6987.
Rhie, A.; Nurk, S.; Cechova, M.; Hoyt, S. J.; Taylor, D. J.; Altemose, N.; Hook, P. W.; Koren, S.; Rautiainen, M.; Alexandrov, I. A.; Allen, J.; Asri, M.; Bzikadze, A. V.; Chen, N.-C.; Chin, C.-S.; Diekhans, M.; Flicek, P.; Formenti, G.; Fungtammasan, A.; Garcia Giron, C.; Garrison, E.; Gershman, A.; Gerton, J. L.; Grady, P. G. S.; Guarracino, A.; Haggerty, L.; Halabian, R.; Hansen, N. F.; Harris, R.; Hartley, G. A.; Harvey, W. T.; Haukness, M.; Heinz, J.; Hourlier, T.; Hubley, R. M.; Hunt, S. E.; Hwang, S.; Jain, M.; Kesharwani, R. K.; Lewis, A. P.; Li, H.; Logsdon, G. A.; Lucas, J. K.; Makalowski, W.; Markovic, C.; Martin, F. J.; McCartney, A. M.; McCoy, R. C.; McDaniel, J.; McNulty, B. M.; Medvedev, P.; Mikheenko, A.; Munson, K. M.; Murphy, T. D.; Olsen, H. E.; Olson, N. D.; Paulin, L. F.; Porubsky, D.; Potapova, T.; Ryabov, F.; Salzberg, S. L.; Sauria, M. E. G.; Sedlazeck, F. J.; Shafin, K.; Shepelev, V. A.; Shumate, A.; Storer, J. M.; Surapaneni, L.; Taravella Oill, A. M.; Thibaud-Nissen, F.; Timp, W.; Tomaszkiewicz, M.; Vollger, M. R.; Walenz, B. P.; Watwood, A. C.; Weissensteiner, M. H.; Wenger, A. M.; Wilson, M. A.; Zarate, S.; Zhu, Y.; Zook, J. M.; Eichler, E. E.; O’Neill, R. J.; Schatz, M. C.; Miga, K. H.; Makova, K. D.; Phillippy, A. M., The complete sequence of a human Y chromosome. Nature 2023. https://doi.org/10.1038/s41586-023-06457-y.
Bennett, E. A.; Keller, H.; Mills, R. E.; Schmidt, S.; Moran, J.
V.; Weichenrieder, O.; Devine, S. E., Active Alu retrotransposons in the human genome. Genome Res 2008, 18 (12), 1875-83. https://doi.org/10.1101/gr.081737.108.
Halabian, R.; Makałowski, W., A Map of 3’ DNA Transduction Variants Mediated by Non-LTR Retroelements on 3202 Human Genomes. Biology 2022, 11 (7), 1032. https://www.mdpi.com/2079-7737/11/7/1032.
Reich, J. G.; Drabsch, H.; Däumler, A., On the statistical assessment of similarities in DNA sequences. Nucleic Acids Res 1984, 12 (13), 5529-43. https://doi.org/10.1093/nar/12.13.5529.
Altschul, S. F.; Erickson, B. W., Significance of nucleotide sequence alignments: a method for random sequence permutation that preserves dinucleotide and codon usage. Mol Biol Evol 1985, 2 (6), 526-38. https://doi.org/10.1093/oxfordjournals.molbev.a040370.
Arcot, S. S.; Wang, Z.; Weber, J. L.; Deininger, P. L.; Batzer, M. A., Alu Repeats: A Source for the Genesis of Primate Microsatellites. Genomics 1995, 29 (1), 136-144. https://doi.org/10.1006/geno.1995.1224.
Roy-Engel, A. M.; Salem, A. H.; Oyeniran, O. O.; Deininger, L.; Hedges, D. J.; Kilroy, G. E.; Batzer, M. A.; Deininger, P. L., Active Alu element "A-tails": size does matter. Genome Res 2002, 12 (9), 1333-44. https://doi.org/10.1101/gr.384802.
Jurka, J.; Gentles, A. J., Origin and diversification of minisatellites derived from human Alu sequences. Gene 2006, 365, 21-26. https://doi.org/10.1016/j.gene.2005.09.029.
Dewannieux, M.; Heidmann, T., Role of poly(A) tail length in Alu retrotransposition. Genomics 2005, 86 (3), 378-381. https://doi.org/10.1016/j.ygeno.2005.05.009.
Wagstaff, B. J.; Hedges, D. J.; Derbes, R. S.; Campos Sanchez, R.; Chiaromonte, F.; Makova, K. D.; Roy-Engel, A. M., Rescuing Alu: recovery of new inserts shows LINE-1 preserves Alu activity through A-tail expansion. PLoS Genet 2012, 8 (8), e1002842. https://doi.org/10.1371/journal.pgen.1002842.
Cordaux, R.; Batzer, M.
A., The impact of retrotransposons on human genome evolution. Nature Reviews Genetics 2009, 10 (10), 691-703. https://doi.org/10.1038/nrg2640.
Comeaux, M. S.; Roy-Engel, A. M.; Hedges, D. J.; Deininger, P. L., Diverse cis factors controlling Alu retrotransposition: what causes Alu elements to die? Genome Res 2009, 19 (4), 545-55. https://doi.org/10.1101/gr.089789.108.
Additional Declarations
No competing interests reported.
Supplementary Files
Supplementarytables.xlsx
Status: Under Revision
Version 1 posted
Editorial decision: Revision requested 10 Oct, 2024
Reviews received at journal 04 Aug, 2024
Reviews received at journal 30 Jul, 2024
Reviewers agreed at journal 27 Jul, 2024
Reviewers agreed at journal 26 Jul, 2024
Reviewers invited by journal 25 Jul, 2024
Editor assigned by journal 20 Jun, 2024
Submission checks completed at journal 18 Jun, 2024
First submitted to journal 17 Jun, 2024
{"props":{"pageProps":{"initialData":{"identity":"rs-4595082","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":320226632,"identity":"f98b1bb4-e316-42a1-8c2d-cdb6b8fda5d5","order_by":0,"name":"Reza Halabian","email":"","orcid":"","institution":"University of Münster","correspondingAuthor":false,"prefix":"","firstName":"Reza","middleName":"","lastName":"Halabian","suffix":""},{"id":320226633,"identity":"53d2c0cf-ed95-4c76-bfe6-33734d2a5489","order_by":1,"name":"Jessica M. Storer","email":"","orcid":"","institution":"University of Connecticut","correspondingAuthor":false,"prefix":"","firstName":"Jessica","middleName":"M.","lastName":"Storer","suffix":""},{"id":320226635,"identity":"870e61d2-e03e-48cc-8747-fc2cb1c6ce7a","order_by":2,"name":"Savannah J. Hoyt","email":"","orcid":"","institution":"University of Connecticut","correspondingAuthor":false,"prefix":"","firstName":"Savannah","middleName":"J.","lastName":"Hoyt","suffix":""},{"id":320226636,"identity":"cf0f5e55-b804-4f61-9c1f-1021844022a8","order_by":3,"name":"Gabrielle A. 
Hartley","email":"","orcid":"","institution":"University of Connecticut","correspondingAuthor":false,"prefix":"","firstName":"Gabrielle","middleName":"A.","lastName":"Hartley","suffix":""},{"id":320226638,"identity":"c7fc1b46-9c14-40f5-9977-1934e6f9dfe5","order_by":4,"name":"Jürgen Brosius","email":"","orcid":"","institution":"Sichuan University","correspondingAuthor":false,"prefix":"","firstName":"Jürgen","middleName":"","lastName":"Brosius","suffix":""},{"id":320226640,"identity":"d749d8a1-5f23-47c4-8751-ca41479f6ffb","order_by":5,"name":"Rachel J. O’Neill","email":"","orcid":"","institution":"University of Connecticut","correspondingAuthor":false,"prefix":"","firstName":"Rachel","middleName":"J.","lastName":"O’Neill","suffix":""},{"id":320226641,"identity":"b7cf5aee-3886-4aea-8a7f-7cd37ab674a7","order_by":6,"name":"Wojciech Makalowski","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA80lEQVRIie3RMWsCMRTA8XeId8s7up6g9CsIQqfSfJULhXwGB4cDQZf0G9yHOOhQuqUEziV4a444uOh0g5M4iDQFwWK52NEh//GFH49HAHy++ywFEM8And8zvE3YheA/iE3IqwUuMlxkm+KgKvoRQfcdxytCorc1NGMHUYLVXBv6OYXQoNpSjothkKt28qRTpnFnaCEf9iaeyRQTBp145ib1cbe0xG6JT5Lg49aSk5sY1OJMMhnwJLQkayfE3mL66nX0Q+q8lJQrBl952U56c87qpnwZFJUIdTORJJqXwbqZtBP7BenfmXABgOjGu8/n8/m+AfTjW13EPNtjAAAAAElFTkSuQmCC","orcid":"","institution":"University of Münster","correspondingAuthor":true,"prefix":"","firstName":"Wojciech","middleName":"","lastName":"Makalowski","suffix":""}],"badges":[],"createdAt":"2024-06-17 15:26:06","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-4595082/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-4595082/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":59627628,"identity":"6d0b7cfa-75c9-4c4a-9c02-7cb6fdcf4ad0","added_by":"auto","created_at":"2024-07-04 04:14:14","extension":"jpg","order_by":1,"title":"Figure 
1","display":"","copyAsset":false,"role":"figure","size":137681,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eTE-driven DNA transduction. \u0026nbsp;\u003c/strong\u003eThe cartoon illustrates a transduction process induced by a non-LTR retroelement. In the case of \u003cem\u003eAlu\u003c/em\u003es, the transcription termination motif is a poly T sequence with a minimum length of four nucleotides [38]. Hypothetically, if this polyT is located far from the source \u003cem\u003eAlu\u003c/em\u003e, the resulting transcript comprises the original \u003cem\u003eAlu\u003c/em\u003e and non-\u003cem\u003eAlu\u003c/em\u003esequence. This chimeric transcript has the potential to be inserted in a new genomic locus, giving rise to transduction.\u003c/p\u003e","description":"","filename":"1.jpg","url":"https://assets-eu.researchsquare.com/files/rs-4595082/v1/b14e8d5ec9f4a0d4321cebfc.jpg"},{"id":59627625,"identity":"a2a9cbf0-4f7f-4bf7-ba06-24d30f74cf16","added_by":"auto","created_at":"2024-07-04 04:14:14","extension":"jpg","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":126240,"visible":true,"origin":"","legend":"\u003cp\u003eSchematic overview of the pipeline used in this study to identify and verify AluY-mediated transductions.\u003c/p\u003e","description":"","filename":"2.jpg","url":"https://assets-eu.researchsquare.com/files/rs-4595082/v1/7b8d2bfb05701ec7c7dfd74b.jpg"},{"id":59627626,"identity":"863d4fe0-c4ea-4b70-a025-8dfd357b3a64","added_by":"auto","created_at":"2024-07-04 04:14:14","extension":"jpg","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":135599,"visible":true,"origin":"","legend":"\u003cp\u003eCount of analyzed full-length elements per each \u003cem\u003eAlu\u003c/em\u003eY subfamily 
category.\u003c/p\u003e","description":"","filename":"3.jpg","url":"https://assets-eu.researchsquare.com/files/rs-4595082/v1/69cce3cc0a63e82effc2f9ff.jpg"},{"id":59628345,"identity":"95891ef6-2a0c-4e85-beb2-199bda8c353c","added_by":"auto","created_at":"2024-07-04 04:30:15","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1030768,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-4595082/v1/8c6945b4-676a-4142-acf0-7e91bfac13fa.pdf"},{"id":59628031,"identity":"43d9f259-0e6c-4f95-8a7b-64afd5e1b09c","added_by":"auto","created_at":"2024-07-04 04:22:14","extension":"xlsx","order_by":2,"title":"","display":"","copyAsset":false,"role":"supplement","size":19134,"visible":true,"origin":"","legend":"","description":"","filename":"Supplementarytables.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-4595082/v1/3163b60bc44bf6935de872c5.xlsx"}],"financialInterests":"No competing interests reported.","formattedTitle":"Exploring Alu-Driven DNA Transductions in the Primate Genomes","fulltext":[{"header":"1. Introduction","content":"\u003cp\u003eDiscovered by Barbara McClintock in the 1940s, mobile elements [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e], often termed transposable or transposed elements (TEs), are present in most, if not all eukaryotic genomes [\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e]. In the human genome, the discernible TEs contribute to about 46% or over 1.4 Gb of the sequence [\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e]. 
Based on the transposition mode, TEs have been divided into Type I elements that are mobilized via a \u0026ldquo;copy-and-paste\u0026rdquo; mechanism and Type II elements that move via a \u0026ldquo;cut-and-paste\u0026rdquo; manner [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e]. Notably, not all copy-and-paste TEs can give rise to new copies, instead, there are only a limited number of master, source, or founder elements that spread their copies within genomes [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e]. Therefore, most TEs are not transposable but instead only transposed elements [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. Whether a TE is transposable presumably depends on retention of intact internal open reading frames and expression, particularly in the germ line. While for the L1 elements, \u003cem\u003ebona fide\u003c/em\u003e source genes had been identified [\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e, \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e], there is only sparse information on \u003cem\u003eAlu\u003c/em\u003e source genes [\u003cspan additionalcitationids=\"CR10 CR11\" citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e]. Autonomous elements, such as L1s, encode most of the gene products that are required for their activity, while non-autonomous elements, such as SVAs and \u003cem\u003eAlu\u003c/em\u003es rely on the machinery of autonomous elements to proliferate within genomes [\u003cspan additionalcitationids=\"CR14\" citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e]. 
TE-families are subject to regular biological processes, including birth and death, i.e., a specific family or a subfamily of a TE appears in the genome and goes extinct after a period of activity. Consequently, only a small number of TE families are active within a certain evolutionary period. For instance, currently, only three non-LTR retroelements seem to be active in humans. These include one autonomous family (L1) and two non-autonomous families (\u003cem\u003eAlu\u003c/em\u003e and SVA) [\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e] while older elements, such as mammalian-wide repeats (MIR), are still present and discernible, but their source genes are inactive [\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e, \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e]. Interestingly, all these TEs belong to type I and consequently, \u003cem\u003eAlu\u003c/em\u003es and SVAs most likely employ the L1 molecular machinery for their retrotransposition. Although there are numerous sequences related to Type II transposons in the human genome, there is no evidence of their recent activity in primates [\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eAt least by their abundance, TEs leave a significant legacy behind by influencing the evolution of genomes, including genome structure, gene evolution, and gene expression [\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e]. Jumping of TEs seems to be random, although there are some preferences concerning the genomic context into which the individual elements insert [\u003cspan additionalcitationids=\"CR22 CR23\" citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e]. 
Nevertheless, in most cases, insertion of a TE into a new location has either a neutral or negative effect on the host genome in agreement with the neutral theory of evolution [\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e]. In fact, initial observations of the impact of retroposons on the human genome were made because of the disease phenotypes they caused [\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e, \u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e]. Early examples where a TE could transduce a piece of DNA unrelated to the TE\u0026rsquo;s original (source) locus to the new integration site (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e) can be found in the literature [\u003cspan additionalcitationids=\"CR29 CR30\" citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e]. In 1999 John Moran and colleagues demonstrated that active L1 elements are capable of DNA transductions \u003cem\u003ein vitro\u003c/em\u003e [\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e]. Following these observations, several large-scale computational studies indicated the extent of the phenomenon at the genomic level [\u003cspan additionalcitationids=\"CR34\" citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e]. While the potential evolutionary consequences of this process were noted [\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e], none of the studies confirmed exon shuffling caused by DNA transductions at fixed loci in the human genome. 
Unexpectedly, such a confirmation came from the studies of another active primate transposon, namely the SVA element [\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e].\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThus far, there are no systematic studies of \u003cem\u003eAlu\u003c/em\u003e-driven DNA transductions. However, Kojima reported recent \u003cem\u003eAlu\u003c/em\u003e monomer activity and one of these events was accompanied by a short DNA transduction [\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e]. Recently, Hoyt et al. reported that \u003cem\u003eAlu\u003c/em\u003e-mediated DNA transduction occurred in the origin of AluSx-WaluSat, one of the recent composite elements in the human genome [\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e]. Consequently, we were interested in finding out the scale of \u003cem\u003eAlu\u003c/em\u003e-driven DNA transductions in primates with a special focus on the human genome. Interestingly, despite over a million copies of \u003cem\u003eAlu\u003c/em\u003e elements in the human genome, we could not confirm any evidence supporting the involvement of \u003cem\u003eAlu\u003c/em\u003es in the transduction process. However, it must be stressed that our investigation concentrated on \u003cem\u003eAlu\u003c/em\u003eYs, and we have applied very strict criteria throughout the analysis.\u003c/p\u003e"},{"header":"2. 
Materials and methods","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003e2.1 Genome assemblies and annotations\u003c/h2\u003e \u003cp\u003eWe obtained the human reference genome (chm13v2.0.fa, accessed on August 15th, 2022) from the GitHub page of the T2T consortium (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://github.com/marbl/CHM13\u003c/span\u003e\u003cspan address=\"https://github.com/marbl/CHM13\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003cspan type=\"Underline\" class=\"Underline\" name=\"Emphasis\"\u003e).\u003c/span\u003e Moreover, the RepeatMasker (CHM13v2.0_RM-2022MAR23.out) and segmental duplication (T2T-CHM13v2.SDs.bed) annotations were downloaded from the same repository on the same date. The genome of the chimpanzee, panTro6, (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://hgdownload.soe.ucsc.edu/goldenPath/panTro6/bigZips/panTro6.fa.gz\u003c/span\u003e\u003cspan address=\"https://hgdownload.soe.ucsc.edu/goldenPath/panTro6/bigZips/panTro6.fa.gz\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003cspan type=\"Underline\" class=\"Underline\" name=\"Emphasis\"\u003e)\u003c/span\u003e and the RepeatMasker annotation (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://hgdownload.soe.ucsc.edu/goldenPath/panTro6/bigZips/panTro6.fa.out.gz\u003c/span\u003e\u003cspan address=\"https://hgdownload.soe.ucsc.edu/goldenPath/panTro6/bigZips/panTro6.fa.out.gz\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003cspan type=\"Underline\" class=\"Underline\" name=\"Emphasis\"\u003e)\u003c/span\u003e were obtained on 12.12.2022. 
Finally, we accessed two files related to the rhesus monkey (rheMac10) on 22.12.2022 from the following repositories: the reference genome (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://hgdownload.soe.ucsc.edu/goldenPath/rheMac10/bigZips/rheMac10.fa.gz\u003c/span\u003e\u003cspan address=\"https://hgdownload.soe.ucsc.edu/goldenPath/rheMac10/bigZips/rheMac10.fa.gz\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003cspan type=\"Underline\" class=\"Underline\" name=\"Emphasis\"\u003e)\u003c/span\u003e and the RepeatMasker annotation (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://hgdownload.soe.ucsc.edu/goldenPath/rheMac10/bigZips/rheMac10.fa.out.gz\u003c/span\u003e\u003cspan address=\"https://hgdownload.soe.ucsc.edu/goldenPath/rheMac10/bigZips/rheMac10.fa.out.gz\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003cspan type=\"Underline\" class=\"Underline\" name=\"Emphasis\"\u003e).\u003c/span\u003e\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec4\" class=\"Section2\"\u003e \u003ch2\u003e2.2 Transduction identification and validation\u003c/h2\u003e \u003cp\u003eDue to their different mode of transcription (RNA polymerase III), the existing tools for profiling 3' transductions mediated by L1s seem inefficient for \u003cem\u003eAlu\u003c/em\u003es, as they often yield many false positives. Specifically, these tools lack a built-in verification step, requiring users to validate transductions by employing additional strategies. Hence, we developed a computational pipeline using a set of custom-built Python scripts to detect and confirm \u003cem\u003eAlu\u003c/em\u003e transduction events more accurately (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e). 
This method considers the transcriptional termination signal for pol III (a stretch of at least four thymine residues) during the validation step [\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e]. In addition, the method identifies the sequence and coordinates of target site duplications (TSDs) more precisely by constructing k-mers from segments around each \u003cem\u003eAlu\u003c/em\u003e sequence.\u003c/p\u003e \u003cp\u003eOur approach begins by extracting full-length (FL) \u003cem\u003eAlu\u003c/em\u003eY subfamily members from RepeatMasker annotations. FL \u003cem\u003eAlu\u003c/em\u003eYs are characterized as elements starting within four nucleotides of the consensus sequence and extending to, or beyond, 267 nucleotides (relative to the consensus sequence) in length [\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e]. The script then filters out FL \u003cem\u003eAlu\u003c/em\u003eYs overlapping with segmental duplicates, provided such annotation exists.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eFor each FL \u003cem\u003eAlu\u003c/em\u003eY, our script searches a region spanning 100 bp upstream and 4500 bp downstream to locate TSDs. This involves generating overlapping k-mers (5\u0026ndash;45 base pairs in length) and allowing a single mismatch to determine the FL \u003cem\u003eAlu\u003c/em\u003eY boundaries. The process includes evaluating the size and position of poly(A) tracts. An \u003cem\u003eAlu\u003c/em\u003eY is marked as an unverified potential transduction if the gap between the end of \u003cem\u003eAlu\u003c/em\u003e and the start of the poly(A) tract is over 70 nucleotides long. 
Any \u003cem\u003eAlu\u003c/em\u003es with a transduction segment comprising another \u003cem\u003eAlu\u003c/em\u003e, SVA, or L1 element is excluded to avoid any confounding conclusions downstream.\u003c/p\u003e \u003cp\u003eThe final step involves transduction validation, applying the following key criteria: (a) aligning each transduced segment to a database of potential source elements using pblat [\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e] to identify the source element, and (b) ensuring that the transcription termination motif (at least four Ts) is situated downstream of the transduced DNA relative to the source element. This database comprises sequences spanning 4500 bp downstream of each FL \u003cem\u003eAlu\u003c/em\u003eY subfamily member. Alignment criteria with pblat include concordant orientation between hit and subject, a minimum 90% identity between query (offspring) and subject (source), alignment of at least 30% of the query (offspring), the alignment start position of the query (offspring) being within 20 nucleotides of the subject (source) start position. Additionally, both offspring (query) and source (subject) must belong to the same subfamily.\u003c/p\u003e \u003cp\u003eTo address ambiguities in parent-offspring relationships, particularly in the absence of segmental duplication annotations, we extended our analysis to include 1kb of upstream and downstream sequences of each parent and offspring sequences identified during the previous step. Subsequently, using YASS (command line version) [\u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e], the sequence of each progenitor and its corresponding offspring were aligned against each other, and dot-plots were generated with default settings except for an adjusted E-value of 1e-3. Sequences exhibiting duplicated patterns between offspring and parents were excluded. 
Ultimately, the final results were subjected to manual curation to generate a \u003cem\u003ebona fide Alu\u003c/em\u003e transduction catalog.\u003c/p\u003e \u003c/div\u003e"},{"header":"3. Results and Discussion","content":"\u003cp\u003eMany experiments have been conducted to study features of \u003cem\u003eAlu\u003c/em\u003e element insertions across different genomes. Kojima\u0026rsquo;s investigation on \u003cem\u003eAlu\u003c/em\u003e monomers in the human genome revealed eight recent insertions of monomer units, including one originating from another retrotransposed monomer with transduction of a short 3\u0026rsquo; flanking sequence [39]. Beyond that, Hoyt et al. identified a new repetitive element termed WaluSat located within the short arms of acrocentric chromosomes of the first complete human genome (T2T-CHM13) [3]. Interestingly, in some cases, these elements were immediately preceded by \u003cem\u003eAlu\u003c/em\u003eSx3, both flanked by target site duplications. The authors suggested the possibility of a transduction process contributing to forming these chimeric elements (\u003cem\u003eAlu\u003c/em\u003eSx3+WaluSat). Given the absence of prior studies examining genome-wide \u003cem\u003eAlu\u003c/em\u003e-mediated transductions, in contrast to the extensive analyses of similar events caused by L1s and SVAs, these observations motivated us to investigate the first complete human reference genome (T2T-CHM13) [42, 43] to discover and estimate the frequency of such occurrences mediated by \u003cem\u003eAlu\u003c/em\u003es. 
Specifically, we focused on the most recent \u003cem\u003eAlu\u003c/em\u003e family, \u003cem\u003eAlu\u003c/em\u003eYs, known to harbor active subfamilies capable of ongoing transpositional activity [44], because focusing on these young elements ensured high confidence in detecting target site duplications, a prerequisite for the identification of true transductions.\u003c/p\u003e\n\u003cp\u003eThe initial count of \u003cem\u003eAlu\u003c/em\u003eY subfamily members from the RepeatMasker output (CHM13v2.0_RM-2022MAR23.out) provided by the T2T consortium was 166,483. However, as described in the Methods section, we extracted only 128,695 full-length \u003cem\u003eAlu\u003c/em\u003eYs from the RepeatMasker output. Some of these full-length elements are expected to retain their transcriptional and retrotransposition capabilities\u0026nbsp;[\u003ca href=\"#_ENREF_3\" title=\"Hoyt, 2022 #56\"\u003e3\u003c/a\u003e]. To ensure the unambiguous assignment of the source element to each offspring, we excluded the full-length \u003cem\u003eAlu\u003c/em\u003eYs that were located within segmental duplication regions [3, 45]. Consequently, our analysis comprised 118,489 full-length \u003cem\u003eAlu\u003c/em\u003eYs (Figure 3 and Table 1) as input for our pipeline to detect transduction events mediated by these sequences. Allowing for one mismatch, we confirmed TSDs for 118,157 of the analyzed \u003cem\u003eAlu\u003c/em\u003eYs (Table 1). 
The length of TSDs ranged from 5 to 45 nucleotides, with a median of 11 bp (Table 1).\u003c/p\u003e\n\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\" width=\"600\"\u003e\n \u003cthead\u003e\n \u003ctr\u003e\n \u003ctd width=\"100%\" colspan=\"5\"\u003e\n \u003cp\u003e\u003cstrong\u003eTable 1.\u003c/strong\u003e Comparative summary of \u003cem\u003eAlu\u003c/em\u003eY elements and associated target site duplications (TSDs)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/thead\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd width=\"20%\" valign=\"top\"\u003e\n \u003cp\u003eSpecies (reference genome)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"20%\"\u003e\n \u003cp\u003eTotal count of \u003cem\u003eAlu\u003c/em\u003eYs within the RepeatMasker annotation\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"20%\"\u003e\n \u003cp\u003eFull-length \u003cem\u003eAlu\u003c/em\u003eYs count\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"20%\"\u003e\n \u003cp\u003eCount of confirmed TSD\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"20%\"\u003e\n \u003cp\u003eMedian length of TSDs (bp)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"20%\"\u003e\n \u003cp\u003eHuman (T2T-CHM13)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"20%\"\u003e\n \u003cp\u003e166,483\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"20%\"\u003e\n \u003cp\u003e118,489*\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"20%\"\u003e\n \u003cp\u003e118,157\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"20%\"\u003e\n \u003cp\u003e11\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"20%\"\u003e\n \u003cp\u003eChimpanzee (panTro6)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"20%\"\u003e\n \u003cp\u003e131,610\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"20%\"\u003e\n \u003cp\u003e106,386\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"20%\"\u003e\n \u003cp\u003e105,973 
**\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"20%\"\u003e\n \u003cp\u003e11\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"20%\"\u003e\n \u003cp\u003eRhesus monkey (rheMac10)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"20%\"\u003e\n \u003cp\u003e234,579\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"20%\"\u003e\n \u003cp\u003e191,128\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"20%\"\u003e\n \u003cp\u003e190,265 **\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"20%\"\u003e\n \u003cp\u003e13\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n \u003cthead\u003e\n \u003ctr\u003e\n \u003ctd width=\"100%\" colspan=\"5\"\u003e\n \u003cp\u003e* FL \u003cem\u003eAlu\u003c/em\u003eYs located within segmental duplications are excluded.\u003c/p\u003e\n \u003cp\u003e** TSDs composed of homopolymers are excluded. Additionally, \u003cem\u003eAlu\u003c/em\u003es with unknown downstream sequences are filtered out, owing to incomplete genomic sequencing in panTro6 and rheMac10.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/thead\u003e\n\u003c/table\u003e\n\u003cp\u003eOf the 118,489 FL \u003cem\u003eAlu\u003c/em\u003eYs, 1118 (~0.94%) exhibited a sign of potential DNA transduction. This was indicated by the presence of an additional sequence located between the end of the \u003cem\u003eAlu\u003c/em\u003e, as annotated by RepeatMasker, and the 3\u0026rsquo; TSD. Given that our transduction discovery was based on identifying a 3\u0026rsquo; TSD located far away from the end of an \u003cem\u003eAlu\u003c/em\u003e, we used Karlin-Altschul statistics [46, 47], specifically estimating the \u003cem\u003eE\u003c/em\u003e- and \u003cem\u003eP\u003c/em\u003e-values of their TSDs, to assess whether these 1118 elements represent true transduction events. It is important to note that this step was an extra measure not included in our primary pipeline. 
The results of this analysis revealed that the estimated \u003cem\u003eE\u003c/em\u003e- and \u003cem\u003eP\u003c/em\u003e-values associated with these TSDs were exceptionally high, suggesting that the detected 3\u0026rsquo; TSDs, situated distantly from the \u003cem\u003eAlu\u003c/em\u003e element ends, could be coincidental or random occurrences rather than products of the transposition mechanism. However, to test whether the observed transduction signatures are merely artifacts of the shortness and low complexity of TSDs (a frequent feature in the genome), we traced the origins of these additional DNA sequences in the 1118 \u003cem\u003eAlu\u003c/em\u003e elements by aligning them to the T2T-CHM13 human reference genome and checking for the presence of an \u003cem\u003eAlu\u003c/em\u003e transcription termination signal (at least four Ts) after these sequences. This validation step was therefore crucial for testing the initially identified list of candidate transductions and excluding false signals.\u003c/p\u003e\n\u003cp\u003eWhile potential sources were identified for 24 of the 1118 \u003cem\u003eAlu\u003c/em\u003eYs (Table 2), manual examination failed to verify any instances of transductions. Instead, the putatively transduced segments identified in the previous step appear to be parts of poly(A) tails that have accumulated mutations, resulting in the formation of microsatellite-like sequences (Table 2). This observation is consistent with prior studies highlighting the contribution of \u003cem\u003eAlu\u003c/em\u003es to the generation of satellite-like repeats [15, 48-50]. Moreover, this finding is consistent with studies suggesting that slippage by the L1 ORF2 polymerase during insertion can lead to the expansion of the A-rich tail [51, 52], a phenomenon that could explain the observed distance of the 3\u0026rsquo; TSDs from the end of these 1118 \u003cem\u003eAlu\u003c/em\u003eYs. 
The poly(A) tail length variations and sequence heterogeneities suggest the dynamic nature of these regions [49], which adds a layer of complexity to the analysis of \u003cem\u003eAlu\u003c/em\u003e elements.\u003c/p\u003e\n\u003cp\u003eTable 2. Summary of potential \u003cem\u003eAlu\u003c/em\u003eY transductions and source elements within the T2T-CHM13 genome\u003csup\u003e1\u003c/sup\u003e\u003c/p\u003e\n\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\" width=\"606\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd width=\"28.54785478547855%\" valign=\"top\"\u003e\n \u003cp\u003eOffspring / Strand\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"11.221122112211221%\"\u003e\n \u003cp\u003eSubfamily\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"12.706270627062706%\"\u003e\n \u003cp\u003eTransduction length (bp)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"22.607260726072607%\"\u003e\n \u003cp\u003eTSD sequence \u003csup\u003e2\u003c/sup\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"24.917491749174918%\"\u003e\n \u003cp\u003eTransduction coordinate\u0026nbsp;\u003c/p\u003e\n \u003cp\u003e(relative to the source)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"28.54785478547855%\"\u003e\n \u003cp\u003echr1:191553811-191554066/-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"11.221122112211221%\"\u003e\n \u003cp\u003e\u003cem\u003eAlu\u003c/em\u003eY\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"12.706270627062706%\"\u003e\n \u003cp\u003e76\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"22.607260726072607%\"\u003e\n \u003cp\u003eTATTTRTGA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"24.917491749174918%\"\u003e\n \u003cp\u003echr10:36394757-36394824\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"28.54785478547855%\"\u003e\n \u003cp\u003echr2:34400942-34401222/-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd 
width=\"11.221122112211221%\"\u003e\n \u003cp\u003e\u003cem\u003eAlu\u003c/em\u003eYk3\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"12.706270627062706%\"\u003e\n \u003cp\u003e90\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"22.607260726072607%\"\u003e\n \u003cp\u003eAAATGAATCACATC\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"24.917491749174918%\"\u003e\n \u003cp\u003echr4:14793755-14793835\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"28.54785478547855%\"\u003e\n \u003cp\u003echr2:38486281-38486586/+\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"11.221122112211221%\"\u003e\n \u003cp\u003e\u003cem\u003eAlu\u003c/em\u003eY\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"12.706270627062706%\"\u003e\n \u003cp\u003e81\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"22.607260726072607%\"\u003e\n \u003cp\u003eGCTTAAACARA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"24.917491749174918%\"\u003e\n \u003cp\u003echr20:10196593-10196682\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"28.54785478547855%\"\u003e\n \u003cp\u003echr2:38693213-38693496/-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"11.221122112211221%\"\u003e\n \u003cp\u003e\u003cem\u003eAlu\u003c/em\u003eY\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"12.706270627062706%\"\u003e\n \u003cp\u003e96\u003c/p\u003e\n 
\u003c/td\u003e\n \u003ctd width=\"22.607260726072607%\"\u003e\n \u003cp\u003eAGAAATTTCCACTTTCT\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"24.917491749174918%\"\u003e\n \u003cp\u003echr3:34334398-34334492\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"28.54785478547855%\"\u003e\n \u003cp\u003echr2:106821418-106821701/-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"11.221122112211221%\"\u003e\n \u003cp\u003e\u003cem\u003eAlu\u003c/em\u003eY\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"12.706270627062706%\"\u003e\n \u003cp\u003e84\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"22.607260726072607%\"\u003e\n \u003cp\u003eAWTAAAATCAGCAAGCT\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"24.917491749174918%\"\u003e\n \u003cp\u003echrY:17686862-17686943\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"28.54785478547855%\"\u003e\n \u003cp\u003echr2:220606423-220606732/-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"11.221122112211221%\"\u003e\n \u003cp\u003e\u003cem\u003eAlu\u003c/em\u003eY\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"12.706270627062706%\"\u003e\n \u003cp\u003e162\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"22.607260726072607%\"\u003e\n \u003cp\u003eAAGAAATGCAGAGCCTG\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"24.917491749174918%\"\u003e\n \u003cp\u003echr15:27226259-27226380\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"28.54785478547855%\"\u003e\n \u003cp\u003echr2:234786716-234787003/-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"11.221122112211221%\"\u003e\n \u003cp\u003e\u003cem\u003eAlu\u003c/em\u003eY\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"12.706270627062706%\"\u003e\n \u003cp\u003e148\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"22.607260726072607%\"\u003e\n \u003cp\u003eAAAGAAAAATGGATCA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd 
width=\"24.917491749174918%\"\u003e\n \u003cp\u003echr14:73568464-73568570\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"28.54785478547855%\"\u003e\n \u003cp\u003echr4:118025027-118025320/-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"11.221122112211221%\"\u003e\n \u003cp\u003e\u003cem\u003eAlu\u003c/em\u003eY\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"12.706270627062706%\"\u003e\n \u003cp\u003e106\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"22.607260726072607%\"\u003e\n \u003cp\u003eGAAAACAGYAACT\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"24.917491749174918%\"\u003e\n \u003cp\u003echr3:144088538-144088646\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"28.54785478547855%\"\u003e\n \u003cp\u003echr6:30920269-30920579/+\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"11.221122112211221%\"\u003e\n \u003cp\u003e\u003cem\u003eAlu\u003c/em\u003eYh3\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"12.706270627062706%\"\u003e\n \u003cp\u003e76\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"22.607260726072607%\"\u003e\n \u003cp\u003eGAAAATRTT\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"24.917491749174918%\"\u003e\n \u003cp\u003echr8:30049556-30049630\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"28.54785478547855%\"\u003e\n \u003cp\u003echr6:53949956-53950260/-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"11.221122112211221%\"\u003e\n \u003cp\u003e\u003cem\u003eAlu\u003c/em\u003eY\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"12.706270627062706%\"\u003e\n \u003cp\u003e84\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"22.607260726072607%\"\u003e\n \u003cp\u003eARAAAGCCCTATACT\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"24.917491749174918%\"\u003e\n \u003cp\u003echr3:136247709-136247785\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd 
width=\"28.54785478547855%\"\u003e\n \u003cp\u003echr6:121569044-121569334/+\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"11.221122112211221%\"\u003e\n \u003cp\u003e\u003cem\u003eAlu\u003c/em\u003eY\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"12.706270627062706%\"\u003e\n \u003cp\u003e73\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"22.607260726072607%\"\u003e\n \u003cp\u003eAWATCCATAGATC\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"24.917491749174918%\"\u003e\n \u003cp\u003echr7:100591995-100592064\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"28.54785478547855%\"\u003e\n \u003cp\u003echr8:120513727-120514018/+\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"11.221122112211221%\"\u003e\n \u003cp\u003e\u003cem\u003eAlu\u003c/em\u003eY\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"12.706270627062706%\"\u003e\n \u003cp\u003e135\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"22.607260726072607%\"\u003e\n \u003cp\u003eAGAAAATGYTGCTCCA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"24.917491749174918%\"\u003e\n \u003cp\u003echr11:33498353-33498454\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"28.54785478547855%\"\u003e\n \u003cp\u003echr8:140576303-140576590/+\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"11.221122112211221%\"\u003e\n \u003cp\u003e\u003cem\u003eAlu\u003c/em\u003eY\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"12.706270627062706%\"\u003e\n \u003cp\u003e124\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"22.607260726072607%\"\u003e\n \u003cp\u003eAGAAATACAKAAAAAA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"24.917491749174918%\"\u003e\n \u003cp\u003echr3:52060842-52060933\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"28.54785478547855%\"\u003e\n \u003cp\u003echr9:32786267-32786577/+\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"11.221122112211221%\"\u003e\n 
\u003cp\u003e\u003cem\u003eAlu\u003c/em\u003eY\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"12.706270627062706%\"\u003e\n \u003cp\u003e88\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"22.607260726072607%\"\u003e\n \u003cp\u003eAAAGAAARAAAGAAAAGAAAGA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"24.917491749174918%\"\u003e\n \u003cp\u003echr2:234786608-234786700\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"28.54785478547855%\"\u003e\n \u003cp\u003echr10:8025040-8025358/+\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"11.221122112211221%\"\u003e\n \u003cp\u003e\u003cem\u003eAlu\u003c/em\u003eY\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"12.706270627062706%\"\u003e\n \u003cp\u003e106\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"22.607260726072607%\"\u003e\n \u003cp\u003eAAAGAAAKAAGG\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"24.917491749174918%\"\u003e\n \u003cp\u003echr4:135974140-135974249\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"28.54785478547855%\"\u003e\n \u003cp\u003echr12:55019214-55019515/-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"11.221122112211221%\"\u003e\n \u003cp\u003e\u003cem\u003eAlu\u003c/em\u003eY\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"12.706270627062706%\"\u003e\n \u003cp\u003e74\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"22.607260726072607%\"\u003e\n \u003cp\u003eARAAAAGATGA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"24.917491749174918%\"\u003e\n \u003cp\u003echrX:149270515-149270587\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"28.54785478547855%\"\u003e\n \u003cp\u003echr12:107816096-107816401/+\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"11.221122112211221%\"\u003e\n \u003cp\u003e\u003cem\u003eAlu\u003c/em\u003eY\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"12.706270627062706%\"\u003e\n \u003cp\u003e123\u003c/p\u003e\n 
\u003c/td\u003e\n \u003ctd width=\"22.607260726072607%\"\u003e\n \u003cp\u003eGAAAACTGTTCAAAGGC\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"24.917491749174918%\"\u003e\n \u003cp\u003echr18:63988234-63988339\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"28.54785478547855%\"\u003e\n \u003cp\u003echr16:23725161-23725443/+\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"11.221122112211221%\"\u003e\n \u003cp\u003e\u003cem\u003eAlu\u003c/em\u003eY\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"12.706270627062706%\"\u003e\n \u003cp\u003e72\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"22.607260726072607%\"\u003e\n \u003cp\u003eAAMTAAAAAAGCTCCCACA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"24.917491749174918%\"\u003e\n \u003cp\u003echr8:23620977-23621049\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"28.54785478547855%\"\u003e\n \u003cp\u003echr17:79884367-79884646/+\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"11.221122112211221%\"\u003e\n \u003cp\u003e\u003cem\u003eAlu\u003c/em\u003eYk4\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"12.706270627062706%\"\u003e\n \u003cp\u003e86\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"22.607260726072607%\"\u003e\n \u003cp\u003eAGTGTACCRTG\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"24.917491749174918%\"\u003e\n \u003cp\u003echr20:5713053-5713122\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"28.54785478547855%\"\u003e\n \u003cp\u003echr18:63987918-63988232/+\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"11.221122112211221%\"\u003e\n \u003cp\u003e\u003cem\u003eAlu\u003c/em\u003eY\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"12.706270627062706%\"\u003e\n \u003cp\u003e108\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"22.607260726072607%\"\u003e\n \u003cp\u003eAGAAATGCAAATGCT\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"24.917491749174918%\"\u003e\n 
\u003cp\u003echr12:107816419-107816524\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"28.54785478547855%\"\u003e\n \u003cp\u003echr18:68672331-68672617/+\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"11.221122112211221%\"\u003e\n \u003cp\u003e\u003cem\u003eAlu\u003c/em\u003eY\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"12.706270627062706%\"\u003e\n \u003cp\u003e119\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"22.607260726072607%\"\u003e\n \u003cp\u003eTAAAATACTAGAAACT\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"24.917491749174918%\"\u003e\n \u003cp\u003echr13:61183495-61183612\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"28.54785478547855%\"\u003e\n \u003cp\u003echr19:48016097-48016379/-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"11.221122112211221%\"\u003e\n \u003cp\u003e\u003cem\u003eAlu\u003c/em\u003eY\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"12.706270627062706%\"\u003e\n \u003cp\u003e98\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"22.607260726072607%\"\u003e\n \u003cp\u003eAAAAAAATAAAATAAAAAATAAAMTA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"24.917491749174918%\"\u003e\n \u003cp\u003echrY:17686854-17686950\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"28.54785478547855%\"\u003e\n \u003cp\u003echr19:59227951-59228233/-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"11.221122112211221%\"\u003e\n \u003cp\u003e\u003cem\u003eAlu\u003c/em\u003eY\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"12.706270627062706%\"\u003e\n \u003cp\u003e73\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"22.607260726072607%\"\u003e\n \u003cp\u003eTAACCAGCARCCTC\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"24.917491749174918%\"\u003e\n \u003cp\u003echrY:8313141-8313212\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"28.54785478547855%\"\u003e\n 
\u003cp\u003echr20:5713123-5713404/-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"11.221122112211221%\"\u003e\n \u003cp\u003e\u003cem\u003eAlu\u003c/em\u003eYk2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"12.706270627062706%\"\u003e\n \u003cp\u003e99\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"22.607260726072607%\"\u003e\n \u003cp\u003eTAAAAACAGATMT\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"24.917491749174918%\"\u003e\n \u003cp\u003echr17:79884663-79884734\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\n\u003cp\u003e\u003cbr\u003e\u003c/p\u003e\n\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\" width=\"613\"\u003e\n \u003cthead\u003e\n \u003ctr\u003e\n \u003ctd width=\"100%\"\u003e\n \u003cp\u003eTable 2 (continued)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/thead\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd width=\"100%\"\u003e\n \u003cp\u003eSequence details \u003csup\u003e3\u003c/sup\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"100%\"\u003e\n \u003cp\u003etaaaataaataaaaataaaaatacgataaaataaaataaaataaaataaaataaaataaaataaaataaaataaaa\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"100%\"\u003e\n \u003cp\u003etaaaataaaataaaataaaataaaataaaataaataaatacataaatacataaataaataaataaataaataaataaataaataaataaa\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"100%\"\u003e\n \u003cp\u003eaaagaaaagaaaagaaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaaagaaaga\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"100%\"\u003e\n \u003cp\u003egaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagagagaaagagagaaagagagaaagagagaaagaaagaaagaaagaat\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"100%\"\u003e\n 
\u003cp\u003eataaaataaaataaaataaaataaaataaaataaaataaaataaaataaaataaaataaataaaataaaataaaataaaataac\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"100%\"\u003e\n \u003cp\u003eaaaaagaaaagaaaagaaaagaaaagaaaagaaaagaaaagaaaagaaaagaaaagaaaagaaagaaagaaaaagaaagaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagacagacagacagacagacagacagagagagagagagaaag\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"100%\"\u003e\n \u003cp\u003egaaagagagagagaaagaaagaaagagaaagagagagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagagagagagaaagaaagaaggaaagaaagaaa\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"100%\"\u003e\n \u003cp\u003egagaaaaggaaaaaagaaaagaaaaaaagaaagaaagaaaaaagaaagaaagaaagaaagaaagaaagaaaagagaaagaaaggaagaaagaaagaaagaaagaaa\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"100%\"\u003e\n \u003cp\u003egagaaagaaggaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaaggaaggaaggaagg\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"100%\"\u003e\n \u003cp\u003egaaaaagaaagaaagaaagaaggaaggaaggaaggaaggaaggaaggaaggaaggaaggaaggaaggaaggaaggaaggaaagc\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"100%\"\u003e\n \u003cp\u003etaacaacataacataacataacataacataacataacataacataacataacataacataaataaaataaaat\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"100%\"\u003e\n \u003cp\u003eaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagagagagagagagagagagagagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaaga\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"100%\"\u003e\n \u003cp\u003egtcgaaaagaagaaaagaaagaaagaaagaaagaaagaaagaaagagagagagagagagagagagagagagagagagggagggagggagggagggagggagggagggagggaaagaaaagaaag\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n 
\u003ctd width=\"100%\"\u003e\n \u003cp\u003eggaaagaaagagagagaaagaaagaagaaagaaagaaagaaagaaagaaagaagaaagagaaagtaagaaagaaagaaagaaagaaag\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"100%\"\u003e\n \u003cp\u003egaaagaaagaaagaaagaaagagagagagagagagagagagagagagagagagagagagagagagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaaga\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"100%\"\u003e\n \u003cp\u003egaaagaaagaaagaaagaaagaaagaaagaaagaaagagaaagaaagaaagaaagaggaaggaaggaaggaagg\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"100%\"\u003e\n \u003cp\u003egagagagagagagagagaaagaaagaaagaaagagagagagaaagaaagaaagaaagagagagagagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaag\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"100%\"\u003e\n \u003cp\u003eaaaataaaattaaattaaaataaaataaaataaaataaaataaaataaaataaaataaaataaaaaataaat\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"100%\"\u003e\n \u003cp\u003etaaaataaaataaaataaaataaataaaataaaataaataaaataaaataaaataaataaaataaaataaaataataaaataaaat\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"100%\"\u003e\n \u003cp\u003etaaagaaagaaagaaagagagagagagagagagagagagagagagagagagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagaaagg\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"100%\"\u003e\n \u003cp\u003etaataaaataaaattaaaataaaattaaaataaaataaaataaaataaaataataaaataaaataaaaaataaaataaaataaaataaaataaataaaataaaataaaataaaataaat\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"100%\"\u003e\n \u003cp\u003eaaaataaaataaaataaaataaaataaaataaaataaaataaaataaaataaaataaaataaaataaataaaataaaataaaataaaataaaataaaa\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"100%\"\u003e\n 
\u003cp\u003eaataaaataaaataaaataaaataaaataaaataaataaaataaaataaaatataaaataaaataaaataaaa\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"100%\"\u003e\n \u003cp\u003eaataaaataaaataaaataaaataaataataaaataaaataataaaataaaataaaataaaataaaataaaataaaatataaaataaaataaaataaaa\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n \u003cthead\u003e\n \u003ctr\u003e\n \u003ctd width=\"100%\"\u003e\n \u003cp\u003e\u003csup\u003e1\u003c/sup\u003e Note: we could not confirm any of the transductions listed in this table upon manual inspection.\u003c/p\u003e\n \u003cp\u003e\u003csup\u003e2\u003c/sup\u003e IUPAC nucleotide codes denote mismatched positions within TSD pairs.\u003c/p\u003e\n \u003cp\u003e\u003csup\u003e3\u003c/sup\u003e Sequences initially identified as transductions, later confirmed as poly(A) tail components.\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/thead\u003e\n\u003c/table\u003e\n\u003cp\u003e\u003cbr\u003e\u003c/p\u003e\n\u003cp\u003eWhile our primary focus was the human genome, we extended our scope to non-human primate genomes for a more comprehensive assessment of \u003cem\u003eAlu\u003c/em\u003e-mediated transductions. Our goal was to investigate whether the infrequency of \u003cem\u003eAlu\u003c/em\u003e-mediated transductions was unique to humans or a broader phenomenon across primates. Therefore, we analyzed 106,386 \u003cem\u003eAlu\u003c/em\u003eY elements in the chimpanzee genome, panTro6, and 191,128 in the rhesus monkey genome, rheMac10 (Table 1 and Figure 1). It is important to highlight that these genomes are not complete telomere-to-telomere assemblies. Consequently, we were unable to examine 162 and 85 FL \u003cem\u003eAlu\u003c/em\u003eY elements in the chimpanzee and rhesus monkey genomes, respectively, due to gaps in the sequences downstream of these elements. 
Initially, approximately 1.3% of the \u003cem\u003eAlu\u003c/em\u003eY elements in the panTro6 genome and 1.1% in the rheMac10 genome appeared to display a transduction signature. However, similar to our findings in the human genome, closer inspection revealed that these signatures consisted only of heterogeneous poly(A) tails of varying lengths (Supplementary Tables 1 and 2).\u003c/p\u003e\n\u003cp\u003eUnlike L1s and SVAs, which are transcribed by RNA polymerase II, \u003cem\u003eAlu\u003c/em\u003es are predominantly transcribed by RNA polymerase III and thus use a distinct transcription termination mechanism. In L1s and SVAs, transcription termination is mediated by a canonical polyadenylation signal (AATAAA) or a variant near the 3\u0026rsquo; end of the element, which is occasionally bypassed, so that downstream DNA is incorporated into the transcript, leading to 3\u0026rsquo; transduction [3, 34, 45]. In contrast, \u003cem\u003eAlu\u003c/em\u003e elements terminate transcription at a stretch of at least four thymine bases, oligo(T) [38]. Usually, \u003cem\u003eAlu\u003c/em\u003e transcripts extend into downstream sequences until they encounter a termination signal, which can result in the inclusion of additional non-\u003cem\u003eAlu\u003c/em\u003e sequences within their transcripts [15, 53]. However, our study finds that \u003cem\u003eAlu\u003c/em\u003e-mediated transductions are uncommon in the human genome, unlike transductions mediated by L1 and SVA elements. This finding suggests that the disparity in transduction frequency between \u003cem\u003eAlu\u003c/em\u003es and L1s or SVAs may be attributable to differences in post-transcriptional processes, particularly in the integration of the reverse transcript into the genome. 
It seems that RNA polymerase III-derived \u003cem\u003eAlu\u003c/em\u003e transcripts carrying additional non-\u003cem\u003eAlu\u003c/em\u003e sequences are suboptimal substrates for the L1 machinery, which limits their amplification. This is consistent with evidence indicating that \u003cem\u003eAlu\u003c/em\u003e transcripts with extended non-\u003cem\u003eAlu\u003c/em\u003e sequences are inefficient templates for retroposition [15, 54]. It is also possible that transcriptionally and retropositionally active \u003cem\u003eAlu\u003c/em\u003es (\u003cem\u003eAlu\u003c/em\u003e master genes) have a strong termination signal adjacent to the element, so that their transcripts do not contain extra downstream sequence. Although it is not known which elements in the human genome are master genes, 88 full-length young \u003cem\u003eAlu\u003c/em\u003e sequences in the chm13v2.0 reference genome are immediately followed by at least four Ts (Makalowski and Halabian, unpublished data).\u003c/p\u003e\n\u003cp\u003eIn conclusion, our research suggests that \u003cem\u003eAlu\u003c/em\u003e-mediated transductions in the genomes of human, chimpanzee, and rhesus monkey are extremely rare. This contrasts with the relatively frequent transductions mediated by L1 and SVA elements. It is important to acknowledge that our analysis was centered on the youngest \u003cem\u003eAlu\u003c/em\u003e elements (\u003cem\u003eAlu\u003c/em\u003eYs), which are presumed to be transcribed by RNA polymerase III. Moreover, we applied strict criteria to our analyses to eliminate potential false positives. Thus, we cannot exclude the existence of rare \u003cem\u003eAlu\u003c/em\u003e transductions that arise by co-transcription within RNA polymerase II transcripts or that derive from older \u003cem\u003eAlu\u003c/em\u003e subfamilies, i.e., \u003cem\u003eAlu\u003c/em\u003eS and \u003cem\u003eAlu\u003c/em\u003eJ elements. 
In summary, although \u003cem\u003eAlu\u003c/em\u003es outnumber L1s and SVAs, transductions mediated by \u003cem\u003eAlu\u003c/em\u003e elements appear to be rare, if they occur at all. Moreover, since we analyzed three different primate genomes from a broad phylogenetic distribution, we predict our findings will likely apply to the primate order. However, biology is full of exceptions and surprises; therefore, we cannot rule out the possibility that rare \u003cem\u003eAlu\u003c/em\u003e-mediated transductions might be discovered in the future.\u003c/p\u003e"},{"header":"Declarations","content":"\u003ch2\u003eAuthor Contribution\u003c/h2\u003e\u003cp\u003eW.M., R.O. and J.B. initiated the study and drafted the research strategy. R.H. developed the software toolbox, analyzed the genomes for the Alu-driven transductions, generated figures and tables, and drafted the manuscript. J.M., S.H. and G.H. provided unpublished TE annotations of the genomes analyzed. R.H., J.S., J.B., and W.M. critically reviewed and discussed the results. R.H., J.B. and W.M. wrote the final version of the manuscript. All authors reviewed the manuscript.\u003c/p\u003e\u003ch2\u003eData availability\u003c/h2\u003e \u003cp\u003eThe code used for this paper has been deposited on GitHub and can be accessed through \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://github.com/IOB-Muenster/Transduction-Tracker-Verifier\u003c/span\u003e\u003cspan address=\"https://github.com/IOB-Muenster/Transduction-Tracker-Verifier\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eMcClintock, B., The origin and behavior of mutable loci in maize. \u003cem\u003eProc Natl Acad Sci U S A \u003c/em\u003e\u003cstrong\u003e1950,\u003c/strong\u003e \u003cem\u003e36\u003c/em\u003e (6), 344-55. 
https://doi.org/10.1073/pnas.36.6.344.\u003c/li\u003e\n\u003cli\u003eMakałowski, W.; Gotea, V.; Pande, A.; Makałowska, I., Transposable Elements: Classification, Identification, and Their Use As a Tool For Comparative Genomics. \u003cem\u003eMethods Mol Biol \u003c/em\u003e\u003cstrong\u003e2019,\u003c/strong\u003e \u003cem\u003e1910\u003c/em\u003e, 177-207. https://doi.org/10.1007/978-1-4939-9074-0_6.\u003c/li\u003e\n\u003cli\u003eHoyt, S. J.; Storer, J. M.; Hartley, G. A.; Grady, P. G. S.; Gershman, A.; de Lima, L. G.; Limouse, C.; Halabian, R.; Wojenski, L.; Rodriguez, M.; Altemose, N.; Rhie, A.; Core, L. J.; Gerton, J. L.; Makalowski, W.; Olson, D.; Rosen, J.; Smit, A. F. A.; Straight, A. F.; Vollger, M. R.; Wheeler, T. J.; Schatz, M. C.; Eichler, E. E.; Phillippy, A. M.; Timp, W.; Miga, K. H.; O\u0026apos;Neill, R. J., From telomere to telomere: The transcriptional and epigenetic state of human repeat elements. \u003cem\u003eScience \u003c/em\u003e\u003cstrong\u003e2022,\u003c/strong\u003e \u003cem\u003e376\u003c/em\u003e (6588), eabk3112. https://doi.org/10.1126/science.abk3112.\u003c/li\u003e\n\u003cli\u003eFinnegan, D. J., Eukaryotic transposable elements and genome evolution. \u003cem\u003eTrends Genet \u003c/em\u003e\u003cstrong\u003e1989,\u003c/strong\u003e \u003cem\u003e5\u003c/em\u003e (4), 103-7. https://doi.org/10.1016/0168-9525(89)90039-5.\u003c/li\u003e\n\u003cli\u003eDeininger, P. L.; Batzer, M. A.; Hutchison, C. A., 3rd; Edgell, M. H., Master genes in mammalian repetitive DNA amplification. \u003cem\u003eTrends Genet \u003c/em\u003e\u003cstrong\u003e1992,\u003c/strong\u003e \u003cem\u003e8\u003c/em\u003e (9), 307-11. https://doi.org/10.1016/0168-9525(92)90262-3.\u003c/li\u003e\n\u003cli\u003eBrosius, J., The persistent contributions of RNA to eukaryotic gen(om)e architecture and cellular function. \u003cem\u003eCold Spring Harb Perspect Biol \u003c/em\u003e\u003cstrong\u003e2014,\u003c/strong\u003e \u003cem\u003e6\u003c/em\u003e (12), a016089. 
https://doi.org/10.1101/cshperspect.a016089.\u003c/li\u003e\n\u003cli\u003eBrouha, B.; Schustak, J.; Badge, R. M.; Lutz-Prigge, S.; Farley, A. H.; Moran, J. V.; Kazazian, H. H., Jr., Hot L1s account for the bulk of retrotransposition in the human population. \u003cem\u003eProc Natl Acad Sci U S A \u003c/em\u003e\u003cstrong\u003e2003,\u003c/strong\u003e \u003cem\u003e100\u003c/em\u003e (9), 5280-5. https://doi.org/10.1073/pnas.0831042100.\u003c/li\u003e\n\u003cli\u003eScott, E. C.; Gardner, E. J.; Masood, A.; Chuang, N. T.; Vertino, P. M.; Devine, S. E., A hot L1 retrotransposon evades somatic repression and initiates human colorectal cancer. \u003cem\u003eGenome Res \u003c/em\u003e\u003cstrong\u003e2016,\u003c/strong\u003e \u003cem\u003e26\u003c/em\u003e (6), 745-55. https://doi.org/10.1101/gr.201814.115.\u003c/li\u003e\n\u003cli\u003eShen, M. R.; Batzer, M. A.; Deininger, P. L., Evolution of the master Alu gene(s). \u003cem\u003eJ Mol Evol \u003c/em\u003e\u003cstrong\u003e1991,\u003c/strong\u003e \u003cem\u003e33\u003c/em\u003e (4), 311-20. https://doi.org/10.1007/bf02102862.\u003c/li\u003e\n\u003cli\u003eAlem\u0026aacute;n, C.; Roy-Engel, A. M.; Shaikh, T. H.; Deininger, P. L., Cis-acting influences on Alu RNA levels. \u003cem\u003eNucleic Acids Res \u003c/em\u003e\u003cstrong\u003e2000,\u003c/strong\u003e \u003cem\u003e28\u003c/em\u003e (23), 4755-61. https://doi.org/10.1093/nar/28.23.4755.\u003c/li\u003e\n\u003cli\u003eCordaux, R.; Hedges, D. J.; Batzer, M. A., Retrotransposition of Alu elements: how many sources? \u003cem\u003eTrends Genet \u003c/em\u003e\u003cstrong\u003e2004,\u003c/strong\u003e \u003cem\u003e20\u003c/em\u003e (10), 464-7. https://doi.org/10.1016/j.tig.2004.07.012.\u003c/li\u003e\n\u003cli\u003eTang, W.; Liang, P., Alu master copies serve as the drivers of differential SINE transposition in recent primate genomes. \u003cem\u003eAnal Biochem \u003c/em\u003e\u003cstrong\u003e2020,\u003c/strong\u003e \u003cem\u003e606\u003c/em\u003e, 113825. 
https://doi.org/10.1016/j.ab.2020.113825.\u003c/li\u003e\n\u003cli\u003eKazazian, H. H., Jr., Mobile elements: drivers of genome evolution. \u003cem\u003eScience \u003c/em\u003e\u003cstrong\u003e2004,\u003c/strong\u003e \u003cem\u003e303\u003c/em\u003e (5664), 1626-32. https://doi.org/10.1126/science.1089670.\u003c/li\u003e\n\u003cli\u003eHancks, D. C.; Kazazian, H. H., Jr., SVA retrotransposons: Evolution and genetic instability. \u003cem\u003eSemin Cancer Biol \u003c/em\u003e\u003cstrong\u003e2010,\u003c/strong\u003e \u003cem\u003e20\u003c/em\u003e (4), 234-45. https://doi.org/10.1016/j.semcancer.2010.04.001.\u003c/li\u003e\n\u003cli\u003eDeininger, P., Alu elements: know the SINEs. \u003cem\u003eGenome Biology \u003c/em\u003e\u003cstrong\u003e2011,\u003c/strong\u003e \u003cem\u003e12\u003c/em\u003e (12), 236. https://doi.org/10.1186/gb-2011-12-12-236.\u003c/li\u003e\n\u003cli\u003eMills, R. E.; Bennett, E. A.; Iskow, R. C.; Devine, S. E., Which transposable elements are active in the human genome? \u003cem\u003eTrends Genet \u003c/em\u003e\u003cstrong\u003e2007,\u003c/strong\u003e \u003cem\u003e23\u003c/em\u003e (4), 183-91. https://doi.org/10.1016/j.tig.2007.02.006.\u003c/li\u003e\n\u003cli\u003eSmit, A. F. A.; Riggs, A. D., MIRs are classic, tRNA-derived SINEs that amplified before the mammalian radiation. \u003cem\u003eNucleic Acids Res \u003c/em\u003e\u003cstrong\u003e1995,\u003c/strong\u003e \u003cem\u003e23\u003c/em\u003e (1), 98-102. https://doi.org/10.1093/nar/23.1.98.\u003c/li\u003e\n\u003cli\u003eJurka, J.; Zietkiewicz, E.; Labuda, D., Ubiquitous mammalian-wide interspersed repeats (MIRs) are molecular fossils from the mesozoic era. \u003cem\u003eNucleic Acids Res \u003c/em\u003e\u003cstrong\u003e1995,\u003c/strong\u003e \u003cem\u003e23\u003c/em\u003e (1), 170-5. https://doi.org/10.1093/nar/23.1.170.\u003c/li\u003e\n\u003cli\u003ePace, J. 
K., 2nd; Feschotte, C., The evolutionary history of human DNA transposons: evidence for intense activity in the primate lineage. \u003cem\u003eGenome Res \u003c/em\u003e\u003cstrong\u003e2007,\u003c/strong\u003e \u003cem\u003e17\u003c/em\u003e (4), 422-32. https://doi.org/10.1101/gr.5826307.\u003c/li\u003e\n\u003cli\u003eMakałowski, W., Genomic scrap yard: how genomes utilize all that junk. \u003cem\u003eGene \u003c/em\u003e\u003cstrong\u003e2000,\u003c/strong\u003e \u003cem\u003e259\u003c/em\u003e (1-2), 61-7. https://doi.org/10.1016/s0378-1119(00)00436-4.\u003c/li\u003e\n\u003cli\u003eBourque, G.; Burns, K. H.; Gehring, M.; Gorbunova, V.; Seluanov, A.; Hammell, M.; Imbeault, M.; Izsv\u0026aacute;k, Z.; Levin, H. L.; Macfarlan, T. S.; Mager, D. L.; Feschotte, C., Ten things you should know about transposable elements. \u003cem\u003eGenome Biology \u003c/em\u003e\u003cstrong\u003e2018,\u003c/strong\u003e \u003cem\u003e19\u003c/em\u003e (1), 199. https://doi.org/10.1186/s13059-018-1577-z.\u003c/li\u003e\n\u003cli\u003eCapy, P.; Van-Hua, A. l., \u003cem\u003eTransposable elements and genome evolution\u003c/em\u003e. ISTE Ltd / John Wiley and Sons Inc: Hoboken, 2023.\u003c/li\u003e\n\u003cli\u003eKorenberg, J. R.; Rykowski, M. C., Human genome organization: Alu, lines, and the molecular structure of metaphase chromosome bands. \u003cem\u003eCell \u003c/em\u003e\u003cstrong\u003e1988,\u003c/strong\u003e \u003cem\u003e53\u003c/em\u003e (3), 391-400. https://doi.org/10.1016/0092-8674(88)90159-6.\u003c/li\u003e\n\u003cli\u003eOvchinnikov, I.; Troxel, A. B.; Swergold, G. D., Genomic characterization of recent human LINE-1 insertions: evidence supporting random insertion. \u003cem\u003eGenome Res \u003c/em\u003e\u003cstrong\u003e2001,\u003c/strong\u003e \u003cem\u003e11\u003c/em\u003e (12), 2050-8. https://doi.org/10.1101/gr.194701.\u003c/li\u003e\n\u003cli\u003eKimura, M., \u003cem\u003eThe neutral theory of molecular evolution\u003c/em\u003e. 
Cambridge University Press: Cambridge Cambridgeshire ; New York, 1983; p xv, 367 p.\u003c/li\u003e\n\u003cli\u003eKazazian, H. H., Jr.; Wong, C.; Youssoufian, H.; Scott, A. F.; Phillips, D. G.; Antonarakis, S. E., Haemophilia A resulting from de novo insertion of L1 sequences represents a novel mechanism for mutation in man. \u003cem\u003eNature \u003c/em\u003e\u003cstrong\u003e1988,\u003c/strong\u003e \u003cem\u003e332\u003c/em\u003e (6160), 164-6. https://doi.org/10.1038/332164a0.\u003c/li\u003e\n\u003cli\u003eMitchell, G. A.; Labuda, D.; Fontaine, G.; Saudubray, J. M.; Bonnefont, J. P.; Lyonnet, S.; Brody, L. C.; Steel, G.; Obie, C.; Valle, D., Splice-mediated insertion of an Alu sequence inactivates ornithine delta-aminotransferase: a role for Alu elements in human mutation. \u003cem\u003eProc Natl Acad Sci U S A \u003c/em\u003e\u003cstrong\u003e1991,\u003c/strong\u003e \u003cem\u003e88\u003c/em\u003e (3), 815-9. https://doi.org/10.1073/pnas.88.3.815.\u003c/li\u003e\n\u003cli\u003eMiki, Y.; Nishisho, I.; Horii, A.; Miyoshi, Y.; Utsunomiya, J.; Kinzler, K. W.; Vogelstein, B.; Nakamura, Y., Disruption of the APC gene by a retrotransposal insertion of L1 sequence in a colon cancer. \u003cem\u003eCancer Res \u003c/em\u003e\u003cstrong\u003e1992,\u003c/strong\u003e \u003cem\u003e52\u003c/em\u003e (3), 643-5. https://www.ncbi.nlm.nih.gov/pubmed/1310068.\u003c/li\u003e\n\u003cli\u003eHolmes, S. E.; Dombroski, B. A.; Krebs, C. M.; Boehm, C. D.; Kazazian, H. H., Jr., A new retrotransposable human L1 element from the LRE2 locus on chromosome 1q produces a chimaeric insertion. \u003cem\u003eNat Genet \u003c/em\u003e\u003cstrong\u003e1994,\u003c/strong\u003e \u003cem\u003e7\u003c/em\u003e (2), 143-8. https://doi.org/10.1038/ng0694-143.\u003c/li\u003e\n\u003cli\u003eMcNaughton, J. C.; Hughes, G.; Jones, W. A.; Stockwell, P. A.; Klamut, H. J.; Petersen, G. B., The evolution of an intron: analysis of a long, deletion-prone intron in the human dystrophin gene. 
\u003cem\u003eGenomics \u003c/em\u003e\u003cstrong\u003e1997,\u003c/strong\u003e \u003cem\u003e40\u003c/em\u003e (2), 294-304. https://doi.org/10.1006/geno.1996.4543.\u003c/li\u003e\n\u003cli\u003eRozmahel, R.; Heng, H. H.; Duncan, A. M.; Shi, X. M.; Rommens, J. M.; Tsui, L. C., Amplification of CFTR exon 9 sequences to multiple locations in the human genome. \u003cem\u003eGenomics \u003c/em\u003e\u003cstrong\u003e1997,\u003c/strong\u003e \u003cem\u003e45\u003c/em\u003e (3), 554-61. https://doi.org/10.1006/geno.1997.4968.\u003c/li\u003e\n\u003cli\u003eMoran, J. V.; DeBerardinis, R. J.; Kazazian, H. H., Jr., Exon shuffling by L1 retrotransposition. \u003cem\u003eScience \u003c/em\u003e\u003cstrong\u003e1999,\u003c/strong\u003e \u003cem\u003e283\u003c/em\u003e (5407), 1530-4. https://doi.org/10.1126/science.283.5407.1530.\u003c/li\u003e\n\u003cli\u003eGoodier, J. L.; Ostertag, E. M.; Kazazian, H. H., Jr., Transduction of 3\u0026apos;-flanking sequences is common in L1 retrotransposition. \u003cem\u003eHum Mol Genet \u003c/em\u003e\u003cstrong\u003e2000,\u003c/strong\u003e \u003cem\u003e9\u003c/em\u003e (4), 653-7. https://doi.org/10.1093/hmg/9.4.653.\u003c/li\u003e\n\u003cli\u003ePickeral, O. K.; Makalowski, W.; Boguski, M. S.; Boeke, J. D., Frequent human genomic DNA transduction driven by LINE-1 retrotransposition. \u003cem\u003eGenome Res \u003c/em\u003e\u003cstrong\u003e2000,\u003c/strong\u003e \u003cem\u003e10\u003c/em\u003e (4), 411-5. https://doi.org/10.1101/gr.10.4.411.\u003c/li\u003e\n\u003cli\u003eSzak, S. T.; Pickeral, O. K.; Makalowski, W.; Boguski, M. S.; Landsman, D.; Boeke, J. D., Molecular archeology of L1 insertions in the human genome. \u003cem\u003eGenome Biol \u003c/em\u003e\u003cstrong\u003e2002,\u003c/strong\u003e \u003cem\u003e3\u003c/em\u003e (10), research0052. https://doi.org/10.1186/gb-2002-3-10-research0052.\u003c/li\u003e\n\u003cli\u003eEickbush, T., Exon shuffling in retrospect. 
\u003cem\u003eScience \u003c/em\u003e\u003cstrong\u003e1999,\u003c/strong\u003e \u003cem\u003e283\u003c/em\u003e (5407), 1465;1467. https://doi.org/10.1126/science.283.5407.1465.\u003c/li\u003e\n\u003cli\u003eXing, J.; Wang, H.; Belancio, V. P.; Cordaux, R.; Deininger, P. L.; Batzer, M. A., Emergence of primate genes by retrotransposon-mediated sequence transduction. \u003cem\u003eProc Natl Acad Sci U S A \u003c/em\u003e\u003cstrong\u003e2006,\u003c/strong\u003e \u003cem\u003e103\u003c/em\u003e (47), 17608-13. https://doi.org/10.1073/pnas.0603224103.\u003c/li\u003e\n\u003cli\u003eBogenhagen, D. F.; Brown, D. D., Nucleotide sequences in Xenopus 5S DNA required for transcription termination. \u003cem\u003eCell \u003c/em\u003e\u003cstrong\u003e1981,\u003c/strong\u003e \u003cem\u003e24\u003c/em\u003e (1), 261-70. https://doi.org/10.1016/0092-8674(81)90522-5.\u003c/li\u003e\n\u003cli\u003eKojima, K. K., Alu monomer revisited: recent generation of Alu monomers. \u003cem\u003eMol Biol Evol \u003c/em\u003e\u003cstrong\u003e2011,\u003c/strong\u003e \u003cem\u003e28\u003c/em\u003e (1), 13-5. https://doi.org/10.1093/molbev/msq218.\u003c/li\u003e\n\u003cli\u003eWang, M.; Kong, L., pblat: a multithread blat algorithm speeding up aligning sequences to genomes. \u003cem\u003eBMC Bioinformatics \u003c/em\u003e\u003cstrong\u003e2019,\u003c/strong\u003e \u003cem\u003e20\u003c/em\u003e (1), 28. https://doi.org/10.1186/s12859-019-2597-8.\u003c/li\u003e\n\u003cli\u003eNo\u0026eacute;, L.; Kucherov, G., YASS: enhancing the sensitivity of DNA similarity search. \u003cem\u003eNucleic Acids Res \u003c/em\u003e\u003cstrong\u003e2005,\u003c/strong\u003e \u003cem\u003e33\u003c/em\u003e (suppl_2), W540-W543. https://doi.org/10.1093/nar/gki478.\u003c/li\u003e\n\u003cli\u003eNurk, S.; Koren, S.; Rhie, A.; Rautiainen, M.; Bzikadze, A. V.; Mikheenko, A.; Vollger, M. R.; Altemose, N.; Uralsky, L.; Gershman, A.; Aganezov, S.; Hoyt, S. J.; Diekhans, M.; Logsdon, G. A.; Alonge, M.; Antonarakis, S. 
E.; Borchers, M.; Bouffard, G. G.; Brooks, S. Y.; Caldas, G. V.; Chen, N.-C.; Cheng, H.; Chin, C.-S.; Chow, W.; de Lima, L. G.; Dishuck, P. C.; Durbin, R.; Dvorkina, T.; Fiddes, I. T.; Formenti, G.; Fulton, R. S.; Fungtammasan, A.; Garrison, E.; Grady, P. G. S.; Graves-Lindsay, T. A.; Hall, I. M.; Hansen, N. F.; Hartley, G. A.; Haukness, M.; Howe, K.; Hunkapiller, M. W.; Jain, C.; Jain, M.; Jarvis, E. D.; Kerpedjiev, P.; Kirsche, M.; Kolmogorov, M.; Korlach, J.; Kremitzki, M.; Li, H.; Maduro, V. V.; Marschall, T.; McCartney, A. M.; McDaniel, J.; Miller, D. E.; Mullikin, J. C.; Myers, E. W.; Olson, N. D.; Paten, B.; Peluso, P.; Pevzner, P. A.; Porubsky, D.; Potapova, T.; Rogaev, E. I.; Rosenfeld, J. A.; Salzberg, S. L.; Schneider, V. A.; Sedlazeck, F. J.; Shafin, K.; Shew, C. J.; Shumate, A.; Sims, Y.; Smit, A. F. A.; Soto, D. C.; Sović, I.; Storer, J. M.; Streets, A.; Sullivan, B. A.; Thibaud-Nissen, F.; Torrance, J.; Wagner, J.; Walenz, B. P.; Wenger, A.; Wood, J. M. D.; Xiao, C.; Yan, S. M.; Young, A. C.; Zarate, S.; Surti, U.; McCoy, R. C.; Dennis, M. Y.; Alexandrov, I. A.; Gerton, J. L.; O\u0026rsquo;Neill, R. J.; Timp, W.; Zook, J. M.; Schatz, M. C.; Eichler, E. E.; Miga, K. H.; Phillippy, A. M., The complete sequence of a human genome. \u003cem\u003eScience \u003c/em\u003e\u003cstrong\u003e2022,\u003c/strong\u003e \u003cem\u003e376\u003c/em\u003e (6588), 44-53. https://doi.org/10.1126/science.abj6987.\u003c/li\u003e\n\u003cli\u003eRhie, A.; Nurk, S.; Cechova, M.; Hoyt, S. J.; Taylor, D. J.; Altemose, N.; Hook, P. W.; Koren, S.; Rautiainen, M.; Alexandrov, I. A.; Allen, J.; Asri, M.; Bzikadze, A. V.; Chen, N.-C.; Chin, C.-S.; Diekhans, M.; Flicek, P.; Formenti, G.; Fungtammasan, A.; Garcia Giron, C.; Garrison, E.; Gershman, A.; Gerton, J. L.; Grady, P. G. S.; Guarracino, A.; Haggerty, L.; Halabian, R.; Hansen, N. F.; Harris, R.; Hartley, G. A.; Harvey, W. T.; Haukness, M.; Heinz, J.; Hourlier, T.; Hubley, R. M.; Hunt, S. 
E.; Hwang, S.; Jain, M.; Kesharwani, R. K.; Lewis, A. P.; Li, H.; Logsdon, G. A.; Lucas, J. K.; Makalowski, W.; Markovic, C.; Martin, F. J.; Mc Cartney, A. M.; McCoy, R. C.; McDaniel, J.; McNulty, B. M.; Medvedev, P.; Mikheenko, A.; Munson, K. M.; Murphy, T. D.; Olsen, H. E.; Olson, N. D.; Paulin, L. F.; Porubsky, D.; Potapova, T.; Ryabov, F.; Salzberg, S. L.; Sauria, M. E. G.; Sedlazeck, F. J.; Shafin, K.; Shepelev, V. A.; Shumate, A.; Storer, J. M.; Surapaneni, L.; Taravella Oill, A. M.; Thibaud-Nissen, F.; Timp, W.; Tomaszkiewicz, M.; Vollger, M. R.; Walenz, B. P.; Watwood, A. C.; Weissensteiner, M. H.; Wenger, A. M.; Wilson, M. A.; Zarate, S.; Zhu, Y.; Zook, J. M.; Eichler, E. E.; O\u0026rsquo;Neill, R. J.; Schatz, M. C.; Miga, K. H.; Makova, K. D.; Phillippy, A. M., The complete sequence of a human Y chromosome. \u003cem\u003eNature \u003c/em\u003e\u003cstrong\u003e2023\u003c/strong\u003e. https://doi.org/10.1038/s41586-023-06457-y.\u003c/li\u003e\n\u003cli\u003eBennett, E. A.; Keller, H.; Mills, R. E.; Schmidt, S.; Moran, J. V.; Weichenrieder, O.; Devine, S. E., Active Alu retrotransposons in the human genome. \u003cem\u003eGenome Res \u003c/em\u003e\u003cstrong\u003e2008,\u003c/strong\u003e \u003cem\u003e18\u003c/em\u003e (12), 1875-83. https://doi.org/10.1101/gr.081737.108.\u003c/li\u003e\n\u003cli\u003eHalabian, R.; Makałowski, W., A Map of 3\u0026rsquo; DNA Transduction Variants Mediated by Non-LTR Retroelements on 3202 Human Genomes. \u003cem\u003eBiology \u003c/em\u003e\u003cstrong\u003e2022,\u003c/strong\u003e \u003cem\u003e11\u003c/em\u003e (7), 1032. https://www.mdpi.com/2079-7737/11/7/1032.\u003c/li\u003e\n\u003cli\u003eReich, J. G.; Drabsch, H.; D\u0026auml;umler, A., On the statistical assessment of similarities in DNA sequences. \u003cem\u003eNucleic Acids Res \u003c/em\u003e\u003cstrong\u003e1984,\u003c/strong\u003e \u003cem\u003e12\u003c/em\u003e (13), 5529-43. https://doi.org/10.1093/nar/12.13.5529.\u003c/li\u003e\n\u003cli\u003eAltschul, S. 
F.; Erickson, B. W., Significance of nucleotide sequence alignments: a method for random sequence permutation that preserves dinucleotide and codon usage. \u003cem\u003eMol Biol Evol \u003c/em\u003e\u003cstrong\u003e1985,\u003c/strong\u003e \u003cem\u003e2\u003c/em\u003e (6), 526-38. https://doi.org/10.1093/oxfordjournals.molbev.a040370.\u003c/li\u003e\n\u003cli\u003eArcot, S. S.; Wang, Z.; Weber, J. L.; Deininger, P. L.; Batzer, M. A., Alu Repeats: A Source for the Genesis of Primate Microsatellites. \u003cem\u003eGenomics \u003c/em\u003e\u003cstrong\u003e1995,\u003c/strong\u003e \u003cem\u003e29\u003c/em\u003e (1), 136-144. https://doi.org/10.1006/geno.1995.1224.\u003c/li\u003e\n\u003cli\u003eRoy-Engel, A. M.; Salem, A. H.; Oyeniran, O. O.; Deininger, L.; Hedges, D. J.; Kilroy, G. E.; Batzer, M. A.; Deininger, P. L., Active Alu element \u0026quot;A-tails\u0026quot;: size does matter. \u003cem\u003eGenome Res \u003c/em\u003e\u003cstrong\u003e2002,\u003c/strong\u003e \u003cem\u003e12\u003c/em\u003e (9), 1333-44. https://doi.org/10.1101/gr.384802.\u003c/li\u003e\n\u003cli\u003eJurka, J.; Gentles, A. J., Origin and diversification of minisatellites derived from human Alu sequences. \u003cem\u003eGene \u003c/em\u003e\u003cstrong\u003e2006,\u003c/strong\u003e \u003cem\u003e365\u003c/em\u003e, 21-26. https://doi.org/10.1016/j.gene.2005.09.029.\u003c/li\u003e\n\u003cli\u003eDewannieux, M.; Heidmann, T., Role of poly(A) tail length in Alu retrotransposition. \u003cem\u003eGenomics \u003c/em\u003e\u003cstrong\u003e2005,\u003c/strong\u003e \u003cem\u003e86\u003c/em\u003e (3), 378-381. https://doi.org/10.1016/j.ygeno.2005.05.009.\u003c/li\u003e\n\u003cli\u003eWagstaff, B. J.; Hedges, D. J.; Derbes, R. S.; Campos Sanchez, R.; Chiaromonte, F.; Makova, K. D.; Roy-Engel, A. M., Rescuing Alu: recovery of new inserts shows LINE-1 preserves Alu activity through A-tail expansion. 
\u003cem\u003ePLoS Genet \u003c/em\u003e\u003cstrong\u003e2012,\u003c/strong\u003e \u003cem\u003e8\u003c/em\u003e (8), e1002842. https://doi.org/10.1371/journal.pgen.1002842.\u003c/li\u003e\n\u003cli\u003eCordaux, R.; Batzer, M. A., The impact of retrotransposons on human genome evolution. \u003cem\u003eNature Reviews Genetics \u003c/em\u003e\u003cstrong\u003e2009,\u003c/strong\u003e \u003cem\u003e10\u003c/em\u003e (10), 691-703. https://doi.org/10.1038/nrg2640.\u003c/li\u003e\n\u003cli\u003eComeaux, M. S.; Roy-Engel, A. M.; Hedges, D. J.; Deininger, P. L., Diverse cis factors controlling Alu retrotransposition: what causes Alu elements to die? \u003cem\u003eGenome Res \u003c/em\u003e\u003cstrong\u003e2009,\u003c/strong\u003e \u003cem\u003e19\u003c/em\u003e (4), 545-55. https://doi.org/10.1101/gr.089789.108.\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"genome-biology","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"gbio","sideBox":"Learn more about [Genome Biology](https://genomebiology.biomedcentral.com/)","snPcode":"13059","submissionUrl":"https://submission.springernature.com/new-submission/13059/3","title":"Genome Biology","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"BMC/SO AJ","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"DNA transduction, Alu, transposed elements, retrosequences, human genome, human genomics, primate 
genomes","lastPublishedDoi":"10.21203/rs.3.rs-4595082/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-4595082/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eLong terminal repeats (LTRs) and non-LTRs retrotransposons, aka retroelements, collectively occupy a substantial part of the human genome. Certain non-LTR retroelements, such as L1 and SVA, have the potential for DNA transduction, which involves the concurrent mobilization of flanking non-transposon DNA during retrotransposition. These events can be detected by computational approaches. Despite being the most abundant short interspersed sequences (SINEs) that are still active within the genomes of humans and other primates, the transduction rate caused by \u003cem\u003eAlu\u003c/em\u003e sequences remains unexplored. Therefore, we conducted an analysis to address this research gap and utilized an in-house program to probe for the presence of \u003cem\u003eAlu\u003c/em\u003e-related transductions in the human genome. We analyzed 118,489 full-length \u003cem\u003eAlu\u003c/em\u003eY subfamilies annotated within the first complete human reference genome, T2T-CHM13. For comparative insights, we extended our exploration to two non-human primate genomes, the chimpanzee and the rhesus monkey. After manual curation, our findings did not confirm any \u003cem\u003eAlu\u003c/em\u003e-mediated transductions, whose source genes are, unlike L1 or SVA, transcribed by RNA polymerase III, implying that they are infrequent or possibly absent not only in the human but also in chimpanzee and rhesus monkey genomes. Although we identified loci in which the 3\u0026rsquo; Target Site Duplication (TSD) was located distantly from the retrotransposed \u003cem\u003eAlu\u003c/em\u003eYs, a transduction hallmark, our study could not find further support for such events. 
The observation of these instances can be explained by the incorporation of other nucleotides into the poly(A) tails in conjunction with polymerase slippage.\u003c/p\u003e","manuscriptTitle":"Exploring Alu-Driven DNA Transductions in the Primate Genomes","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2024-07-04 04:14:09","doi":"10.21203/rs.3.rs-4595082/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"Revision requested","date":"2024-10-10T18:13:41+00:00","index":"","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2024-08-05T01:33:57+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2024-07-30T20:00:42+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"331375766063410529146910074371543542054","date":"2024-07-27T17:30:09+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"120661847562225665328898450025764979910","date":"2024-07-26T19:18:40+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2024-07-25T14:42:43+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2024-06-20T11:41:27+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2024-06-18T08:28:17+00:00","index":"","fulltext":""},{"type":"submitted","content":"Genome Biology","date":"2024-06-17T15:24:52+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"genome-biology","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"gbio","sideBox":"Learn more about [Genome Biology](https://genomebiology.biomedcentral.com/)","snPcode":"13059","submissionUrl":"https://submission.springernature.com/new-submission/13059/3","title":"Genome Biology","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"BMC/SO 
AJ","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"2465aa3a-21f9-485d-9954-c1576c0fb3a3","owner":[],"postedDate":"July 4th, 2024","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"in-revision","subjectAreas":[],"tags":[],"updatedAt":"2024-10-10T18:23:20+00:00","versionOfRecord":[],"versionCreatedAt":"2024-07-04 04:14:09","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-4595082","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-4595082","identity":"rs-4595082","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
