Comparative analyses uncover a link between mRNA splicing, stability, and RNA covalent modifications in flowering plants

preprint OA: closed
Full text JSON View at publisher
Full text 163,143 characters · extracted from preprint-html · click to expand
Comparative analyses uncover a link between mRNA splicing, stability, and RNA covalent modifications in flowering plants | Research Square window.SnipcartSettings = { analytics: { enabled: false } }; (function() { var accessVector = localStorage.getItem('access_vector') || ''; window.dataLayer = window.dataLayer || []; if (accessVector) { window.dataLayer.push({ user: { profile: { profileInfo: { snid: accessVector } } } }); } })(); (function(w,d,s,l,i){w[l]=w[l]||[];w[l].push({'gtm.start':new Date().getTime(),event:'gtm.js'});var f=d.getElementsByTagName(s)[0],j=d.createElement(s),dl=l!='dataLayer'?'&l='+l:'';j.async=true;j.src='https://www.googletagmanager.com/gtm.js?id='+i+dl;f.parentNode.insertBefore(j,f);})(window,document,'script','dataLayer','GTM-K279D39R'); Browse Preprints In Review Journals COVID-19 Preprints AJE Video Bytes Research Tools Research Promotion AJE Professional Editing AJE Rubriq About Preprint Platform In Review Editorial Policies Our Team Advisory Board Help Center Sign In Submit a Preprint Cite Share Download PDF Research Article Comparative analyses uncover a link between mRNA splicing, stability, and RNA covalent modifications in flowering plants Kyle Palos, Anna C. Nelson Dittrich, Eric H. Lyons, Brian D. Gregory, and 1 more This is a preprint; it has not been peer reviewed by a journal. https://doi.org/ 10.21203/rs.3.rs-4466769/v1 This work is licensed under a CC BY 4.0 License Status: Published Journal Publication published 12 Aug, 2024 Read the published version in BMC Plant Biology → Version 1 posted 11 You are reading this latest preprint version Abstract Background In recent years, covalent modifications on RNA nucleotides have emerged as pivotal moieties influencing the structure, function, and regulatory processes of RNA Polymerase II transcripts such as mRNAs and lncRNAs. However, our understanding of their biological roles and whether these roles are conserved across eukaryotes remains limited. Results In this study, we leveraged standard RNA-sequencing data to identify and characterize RNA modifications that introduce base-pairing errors into cDNA reads. Our investigation incorporated data from three Poaceae ( Zea mays , Sorghum bicolor , and Setaria italica ), as well as publicly available data from a range of stress and genetic contexts in Sorghum and Arabidopsis thaliana . We uncovered a strong enrichment of RNA covalent modifications (RCMs) deposited on a conserved core set of nuclear RNAs involved in photosynthesis and translation across these species. However, the cohort of modified transcripts changed based on environmental context and developmental program, a pattern that was also conserved across flowering plants. We determined that RCMs can partly explain accession-level differences in drought tolerance in Sorghum, with stress-associated genes receiving a higher level of RCMs in a drought tolerant accession. To address function, we determined that RCMs are significantly enriched near exon junctions within coding regions, suggesting an association with splicing. Intriguingly, we found that these base-pair disrupting RCMs are associated with stable mRNAs, are highly correlated with protein abundance, and thus likely associated with facilitating translation. Conclusions Our data point to a conserved role for RCMs in mRNA stability and translation across the flowering plant lineage. RNA covalent modifications RNA Comparative transcriptomics RNA decay Angiosperms Figures Figure 1 Figure 2 Figure 3 Figure 4 Background There are over 100 RNA covalent modifications (RCMs) that are deposited on all classes of RNAs at various stages of their lifecycle [ 1 ]. RCMs, collectively referred to as the “epitranscriptome” [ 2 , 3 ] are known to influence RNA stability, splicing, structure, intra- and intermolecular interactions, and translation [ 4 – 9 ]. While most RCMs have been observed on ribosomal and transfer RNAs, they have also been observed on messenger RNAs (mRNAs) and long non-coding RNAs (lncRNAs), albeit at lower levels [ 4 , 8 , 10 ]. Although pseudouridine is the most abundant modification among all classes of RNA [ 10 ], N 6 -methyladenosine (m 6 A) is the most abundant internal mRNA RCM targeting approximately 30% of all mRNAs and is present at ~ 0.4% of all adenosine nucleotides in cellular RNAs [ 11 , 12 ]. Additional internal RCMs, such as pseudouridine, N 1 -methyladenosine (m 1 A), and 5-methylcytidine (m 5 C), have been observed on mRNAs in a number of eukaryotic systems, although their functional significance on these RNA molecules is not always clear [ 13 – 15 ]. Much of what we do know about the epitranscriptome is drawn from studies on m 6 A in mammalian and plant systems, but can likely be applied to many of the other modification classes. With the notable exception of pseudouridine [ 16 ], RCMs are deposited through the enzymatic activity of highly conserved proteins called writers (e.g., methyltransferases), “interpreted”, or bound, by RNA binding proteins called readers, and removed by eraser enzymes (e.g., demethylases) [ 5 , 17 – 19 ]. Arabidopsis and mammals have a similar m 6 A frequency across the transcriptome (~ 1 m 6 A per 2000 nucleotides), which is targeted to a similar consensus motif RRACH (R = G or A; H = U, A, or C), and has a similar 3’ mRNA bias around the stop codon [ 11 , 20 ]. In plants and mammals, pseudouridine has been found to localize primarily to the CDS and 5’ UTR of mRNAs, prefers the first U of a triplet codon (e.g., UUC/UCU/UUU), and is overrepresented on transcripts targeted to the chloroplast or mitochondria [ 21 ]. Motif preferences for m5C can vary depending on the methyltransferase involved, but in humans are generally found on C/G rich regions of mRNAs [ 22 ]. While less is known about binding preferences of the other predominant RCMs, such as m 1 A and m 3 C, aberrant addition (or removal) has been observed as a hallmark for tumorigenesis [ 23 , 24 ]. Importantly, loss of many of the writers, readers, and erasers for these RCMs is lethal or causes severe developmental defects in all tested systems [ 14 , 25 – 28 ]. RCMs function through a variety of molecular mechanisms to regulate mRNA stability, structure, and splicing. RCMs induce changes in base-pairing properties and higher order structure, ostensibly allowing for increased flexibility in interactors, translational capacity, and structure associated stabilization [ 29 ]. There is a close association between splicing and m6A deposition [ 20 , 30 ], with recent studies revealing that m 6 A is specifically excluded from splice junctions due to physical occlusion by the exon junction complex, a process that ultimately impacts cytoplasmic mRNA decay [ 31 , 32 ]. Pseudouridine has also been implicated in mRNA splicing, stability, and translational efficiency [ 21 , 33 , 34 ], as well as the stability and maturation of lncRNAs such as the telomerase RNA [ 35 , 36 ]. Other modifications, such as m 1 A, m 3 C, and m 5 C are also believed to be important for mRNA stability, mostly due to phenotypes associated with loss of function mutants in their associated writer proteins [ 23 , 33 , 37 , 38 ]. Thus, RCMs are both a ubiquitous and critical aspect of an RNA’s lifecycle, but specific mechanistic details for many of them are still lacking. Given the high frequency with which essential RNAs such as rRNAs, tRNAs and snoRNAs are modified, it has historically been difficult to use genetic approaches to monitor the RCM status of mRNAs and lncRNAs. As a result, most studies have relied either on transcriptome-wide antibody-based (m 6 A), or modification-specific chemical-genomic approaches (e.g., bisulfite sequencing for m 5 C; [ 7 , 10 , 39 , 40 ]. An alternative sequencing-based approach relies on the propensity for RCMs that occur at the interface between the canonical base-pairing edge (i.e., the Watson-Crick-Franklin, or WCF base-pairing edge) to be misidentified by reverse transcriptases during the cDNA synthesis step [ 41 , 42 ]. This misidentification results in non-random misincorporations or deletions at modified residues. Multiple algorithms, including HAMR and ModTect [ 15 , 42 ], have been developed to infer modification status, and class, based on these “sequencing errors”. While this approach cannot detect RCMs outside of the WCF base-pairing edge, over 30 modification types can be identified, including m 1 A, pseudouridine, m 3 C, and m 5 C [ 15 , 42 – 44 ]. Importantly, where there is overlap (such as for m 5 C, m 3 C, and m 1 A), the antibody, chemical-genomic, and sequencing error-based approaches are largely in agreement [ 14 , 15 ]. Inferring modification status via HAMR or ModTect allows both for the repurposing of existing RNA-seq data, of which there are petabases available in NCBI’s SRA [ 45 ], and the side-by-side calculation of modification and transcript abundance. Here we utilized both HAMR and ModTect to analyze new and publicly available RNA-seq in Arabidopsis and three agronomically important and closely related grasses: Sorghum bicolor , Zea mays , and Setaria italica . We used these diverse stress, tissue-atlas, and developmental datasets in an attempt to better understand what some of these less characterized RCMs might be doing in plants and how they might be conserved, both functionally and in terms of their target genes. We present an in-depth comparative assessment of RCM dynamics across these plant species and uncover new insights into RNA splicing, RNA stability, and plant responses to stress. We demonstrate a level of conservation of target genes previously unseen for RCMs, and reveal that RCMs may participate in population-level variation in stress responses. Mechanistically, we identify a link between splicing and RCMs that is present in all examined systems. Finally, through the analysis of RNA decay pathways in Arabidopsis, we establish that RCMs are associated with unusually stable mRNAs, an aspect that may facilitate their continued translation, as we observe a link between modification state and protein abundance. Results Identification of Post-transcriptional RNA modifications in three Poaceae To determine the degree to which the epitranscriptome is conserved at a gene level, we performed poly-A enriched RNA-sequencing on soil grown seedlings of Zea mays , Sorghum bicolor , and Setaria italica (Maize, Sorghum, and Setaria, respectively, going forward) two weeks after germination. Paired-end, 150 base-pair RNA-sequencing was performed on aboveground tissue for two biological replicates (~ 20 million reads per replicate; Fig. 1 A). Following read mapping, modified sites were inferred using HAMR and ModTect (see Methods ). Both of these algorithms take advantage of the propensity for certain reverse transcriptases to misinterpret ribonucleotides that are modified along the canonical base-pairing edge, and as a result, incorporate non-reference nucleotides (i.e., SNPs) in the synthesized cDNA. This misincorporation is non-random, and both HAMR and ModTect use the resulting population of SNPs to infer modification class. RNA-seq data were fed into both algorithms and sites that were predicted by both algorithms, and both replicates were retained for subsequent analyses as high confidence Post-transcriptional RNA Modifications (RCMs). From this approach, 5,434-7,020 unique RCMs were identified in 1,944-2,542 transcripts in the three species (Fig. 1 B, C, Additional File 1 ). These modifications represent the seven major classes detected by HAMR, with the m1A|m1I|ms2i6A class being the most common in all three species (approximately 33% of all identified RCMs; Fig. S1 A ). Additionally, we find that mRNAs marked by RCMs are present at a substantially higher abundance relative to non-modified mRNAs (modified median TPM = 78.5, n = 2521 mRNAs, not modified median TPM = 10.2, n = 16,537, p < 2.2e-16, Fig. S1 E ). Enriched gene ontology terms of modified RNAs in each Poaceae demonstrates significant over-representation in photosynthesis and cytoplasmic translation pathways (Fig. 1 D). These data suggest that RCMs are targeted to conserved cohorts of photosynthesis and translation-associated RNAs in these grass seedlings. While HAMR and ModTect are able to account for the non-random nature of plastid RNA editing, there is a possibility that the identified RCMs might be enriched on plastid-localized transcripts and thus be an artifact of plastid RNA editing. While modifications were only predicted on nuclear encoded transcripts, many organellar transcripts are encoded within the nucleus [ 46 ]. We assessed this possibility in Maize, using sequence homology of all annotated plastid proteins to the nuclear encoded proteome, to determine if organellar transcripts that have been integrated into the nuclear genome are enriched for potential artifactual RCMs. We find no over-representation of organellar to nuclear integrated RNAs marked with RCMs (24/2,249 maize RCMA marked RNAs are plastid RNA nuclear hits, p = .875 Hypergeometric test for over-representation, Fig. S1 B ). These findings suggest that RCMs are not artifacts of plastid RNA-editing, but reflect events on bona-fide nuclear encoded RNAs that are involved in photosynthetic pathways. To determine if the conservation of RCM deposition goes beyond functional pathway and extends to the genes themselves, we next examined if RCMs are found on transcripts derived from orthologous genes using both sequence homology and synteny. Of the 1,912 possible modified orthologs (the smallest observed number of modified genes in Setaria), we observed 1,074 modified orthologs in one or both of the other species, and 474 orthologs that were modified in all three. This is significantly more orthologs than expected by chance, accounting for the number of annotated and expressed orthologs (p < 2.2e-16 multi-set hypergeometric test, Fig. S1 C ). We then assessed whether orthologous mRNAs are marked by a similar number of RCMs across species, allowing us to understand whether the RCM density of the targeted RNA is a conserved feature. Indeed, for all three combinations of comparisons between two sampled species, there is a significant positive relationship between the number of RCMs deposited on orthologous mRNAs (Sorghum:Maize r = .39, Sorghum:Setaria r = .43, Setaria:Maize: r = .32, p < 2.2e-16, Fig. S1 D ). These findings further demonstrate that the targeting and density of RCM deposition are conserved features in the sampled Grasses. RCMs are dynamically deposited on mRNAs based on tissue and abiotic stress Our previous analyses were limited to a single developmental time point across three species. Previous reports [ 14 , 28 , 47 ] suggest that RCMs may play important roles in post-transcriptional RNA regulation and thus would be dynamically deposited across development or environmental changes. Therefore, we chose to examine RCMs across diverse tissues and environmental contexts using RNA-seq from publicly available datasets [ 48 – 50 ]. We focused our efforts on Sorghum, first identifying RCMs in a large-scale tissue expression atlas by McCormick and co-authors [ 49 ] containing 137 sequencing samples across 48 tissues/stages/conditions (see Methods ). From these data we identified 266,710 modifications on 6,805 unique transcripts, representing 19.3% of the expressed (TPM > = 1) Sorghum transcriptome. To determine whether the same repertoire of transcripts are being targeted with RCMs as our seedling data from Fig. 1 , or if our data were biased towards identifying a distinct subset of RNAs, we compared the modified RNAs in each dataset. We found 1,813 of the 2,542 modified Sorghum seedling mRNAs are also modified in the McCormick et al., tissue expression atlas. This overlap is substantially more than expected by chance (p < 2.2e-16, Fig. S2 A ) and likely reflects similarity in molecular processes within tissues examined. We next aimed to characterize the tissue specificity of RCMs in the McCormick Sorghum tissue expression atlas. Using a modified calculation for Tau, a value typically used to calculate tissue specificity based on RNA abundance [ 51 ], we observed that most transcripts were modified in a very context-specific manner ( Fig. S2 B ). The transcripts with the lowest Tau value, and therefore modified under the broadest context, were mostly associated with core cellular processes such as translation, whereas those transcripts modified in the most specific context were associated with mRNA maturation and abiotic stress responses ( Fig. S2 C, D ). We also clustered transcripts based on the tissue type in which they were modified, the number of modifications identified, and the average number of modifications per transcript (Fig. 2 A). In our analysis we observed a strong clustering based on tissue similarity, with seed and roots being notable exceptions. We find a strong bias for RCMs in this dataset towards root samples where both the number of RCMs and number of mRNA targets is substantially higher than other tissues (Fig. 2 A and Fig. S2 E ). In contrast, seeds displayed the fewest number of modified transcripts, but the average number of modifications per transcript was very similar to that seen in leaf and root tissues where the number of modified sites and transcripts were much higher (Fig. 2 A). We observed a large number of transcripts that are modified in all tissues (n = 851; Fig. 2 B and Fig. S3 A ), highlighting the existence of a core repertoire of RCM-targeted mRNAs. As expected, based on their presence across a broad tissue and developmental context, these mRNAs are enriched for terms associated with cytoplasmic mRNA translation ( Fig. S3 B ). However, the majority of transcripts were more restricted in terms of the tissue or developmental context under which they were modified and were enriched for more specialized GO terms. For instance, root-specific modified mRNAs were enriched for rhizosphere-associated terms [ 52 ] such as oxidation management, generic methylation, and aromatic compound biosynthesis ( Fig. S3 C ). Leaf-specific modified mRNAs were enriched for photosynthetic terms, whereas seed-specific mRNAs were enriched for lipid storage, ABA response, and cold acclimation terms ( Fig. S3 D, E ). Importantly, the observed increase in context-specific modifications was not simply due to differences in the most abundant transcripts in each tissue. Indeed, we observed a low (although positive, r = .34) correlation between RNA abundance and modification levels (Fig. 2 C; top ) in the Sorghum tissue atlas. Thus, although there is a core set of modified Sorghum transcripts, most are targeted for RCMs in a context or developmentally-specific manner. The conservation of RCMs on orthologous genes in our grass seedling data suggest that the developmental and tissue contexts under which these marks are deposited might also be conserved. To address this contextual conservation, we examined RCMs in two publicly available Arabidopsis thaliana tissue atlases [ 48 , 50 ]. Both atlases examined similar developmental stages, but did so under slightly different conditions (e.g., constant light and sterile MS media for Mergner et al. [ 48 ] vs cycling light and soil for Klepikova et al. [ 50 ]). Importantly, while the Klepikova tissue atlas is primarily used by the community to examine transcript abundance in the Arabidopsis EFP browser [ 53 ], the work by Mergner et al. performed a paired assessment of transcript and protein abundance, which we used to examine the relationship between modifications and translation (see below). Due to differences in experimental design, we analyzed each of these atlases separately. A full breakdown of examined tissues, as well as total number of modifications identified, can be found in Additional File 2 . Like Sorghum, clustering of Arabidopsis tissues by RCM density placed similar tissues together (Mergner et al: Fig. S4 A, B , Klepikova et al: Fig. S5A, B ). While Arabidopsis root tissues did not display a significantly elevated level of RCMs as in Sorghum, Arabidopsis seed transcripts had a reduced pool of very highly modified transcripts in both atlases. Similar to our observations in Sorghum, the seed-specific modified transcripts were enriched for lipid, nutrient, and ABA-response terms ( Fig. S6A, B, C ). Thus, as in Sorghum, the Arabidopsis epitranscriptome is diverse, highly context-specific, and appears to be associated with transcripts critical for cellular function. Given the similar patterns of RCM abundance in Sorghum and Arabidopsis, we next examined whether orthologous mRNAs between Sorghum and Arabidopsis, which last shared a common ancestor ~ 150–250 million years ago [ 54 ], are targeted by RCMs. While we identified fewer mRNA targets of RCMs in both Arabidopsis expression atlases relative to the Sorghum expression atlas (Mergner: 1,324, Klepikova: 2,495 vs 6,845 in Sorghum), the number of modified orthologous mRNAs (658 Mergner vs McCormick; 1,180 Klepikova vs McCormick) was significantly more than expected by chance for all possible mRNA combinations ( Fig. S6D, E, F , p = .001 Hypergeometric test). We conclude that this class of RCMs target an ancient conserved repertoire of translation and photosynthetic related mRNAs. Given the developmental differences in RCM deposition in both Sorghum and Arabidopsis, as well as reports associating RNA modifications with plant stress responses [ 9 , 14 , 55 – 59 ], we next determined if RCMs are associated with drought stress in Sorghum. We utilized a publicly available field-grown Sorghum transcriptome dataset that sampled well watered and water limited (i.e., drought treatment) Sorghum leaves and roots from drought tolerant and susceptible genotypes across weekly timepoints [ 60 ]. We focused on a post-flowering time point (week 10) where one genotype (BTx642) is considered drought tolerant and the other genotype (RTx430) is drought susceptible. Counterintuitively, we observed a shift towards a more negative correlation between RNA and RCM abundance during drought stress relative to the Sorghum tissue atlas (Fig. 2 C, bottom). A heatmap comparing transcript and RCM abundance between the two genotypes and treatments clustered into four groups (Fig. 2 D). One of these clusters (Cluster 3) showed similar increases in transcript abundance levels between the two genotypes under water limiting conditions but showed an increase in RCMs specifically in the drought tolerant genotype (Fig. 2 D). An examination of enriched GO terms revealed that Cluster 3 contained both heat shock proteins as well as water transport proteins (Fig. 2 E), suggesting that RCMs may be associated with the drought response in the tolerant genotype. RCMs accumulate near exon-exon junctions and are associated with splicing events Thus far we have observed an association between RCMs and plant developmental and environmental transcriptional responses. To gain functional insight into RCMs, we analyzed their accumulation and distribution across mRNA bodies using our seedling RNA-seq data for all three species. RCMs were enriched across the CDS, with a bias towards the 3’ CDS, and in the 3’ UTR of mRNAs (Fig. 3 A). Like m6A [ 20 ], we also observed that RCMs are biased towards being deposited on abnormally long exons (median length of modified exons = 361 nts, median unmodified = 147 nts, median all exons = 148 nts, Fig. 3 B, p < 2.2e-16 ). Additionally, we observed a significantly higher proportion of expressed transcripts that are multi-exonic being targeted by RCMs, relative to mono-exonic transcripts (~ 11–14% vs ~ 5–8%, Fig. 3 C, p < 7.1e-15). This was also the case for long non-coding RNAs (lncRNAs) where 4.8% of mono-exonic lncRNAs are marked by RCMs (17/355) and 8.4% of multi-exonic lncRNAs are targeted by RCMs (59/704, p = .044; Fig. S7A ). A closer examination of distinguishing CDS features uncovered a dramatic buildup of RCMs on both 5’ and 3’ edges of exon-exon junctions (EJs) relative to start and stop codons (Fig. 3 D). Thus, these data suggest that RCMs likely play a role in the regulation of RNA splicing on diverse transcript types. This observation of RCMs preferentially occurring at EJs was initially made by Vandivier et al [ 14 ] on degrading mRNAs, whereas degrading mRNAs likely make up a small proportion of our dataset. Thus, these findings suggest a steady-state RCM enrichment at mRNA EJs. A transcriptome-wide 3’ bias was observed for these EJ-enriched RCMs (Fig. 3 E ) . This terminal EJ enrichment was not due to 3’ sequencing bias that is often observed in poly-A enriched transcriptome datasets ( Fig. S7B, C, D ). Interestingly, genes that express multiple isoforms are more likely to be modified than single isoform transcripts with similar numbers of exons (Fig. 3 F, p < 2.2e-16). However, we observe no significant difference in the buildup of modified sites at alternatively spliced junctions vs canonical splice sites ( Fig. S8 , see Methods ). These data suggest that the increase in modification frequency at genes with more isoforms is likely due to the presence of more exon junctions. Thus, RCMs appear to be positively associated with splicing in plants. RCMs are positively associated with stable and translating mRNAs Because Mergner et al [ 48 ] measured RNA and protein abundances from matched Arabidopsis samples they were able to correlate RNA:protein abundances across their samples. Therefore, we examined whether mRNAs marked with RCMs displayed a higher or lower RNA:protein correlation. A difference in RNA:protein correlation could suggest a RCM function in RNA stability and/or translation efficiency. mRNAs that are not marked with RCMs across the Mergner et al atlas showed the lowest median RNA:protein correlation (n = 3,361, r = .68), while RCM marked mRNAs showed a significantly higher RNA:protein correlation (n = 332, r > = .758, p < .05, Fig. S9A ). Due to this positive correlation between transcripts harboring base-pair disrupting modifications and their translation products, it is possible that these RCMs are positively influencing translation, either by reducing structural complexity or by stabilizing these transcripts. To test for a relationship between RCMs and mRNA decay, we first examined publicly available transcriptomic data from Arabidopsis lines deficient in cytoplasmic mRNA decay pathways [ 61 ] by Sorenson and co-authors. Cytoplasmic mRNAs usually decay through three pathways: decapping (5’ -> 3’), the RNA exosome (3’ -> 5’), or an exosome independent 3’ -> 5’ decay pathway. Decapping occurs through a multi-subunit complex that is scaffolded by VARICOSE in plants and metazoans (VCS,[ 62 ]). Meanwhile, the exosome independent 3’ -> 5’ decay pathways occurs through SUPPRESSOR OF VARICOSE (SOV), which contributes to the decay of decapped RNAs and is also known as DIS3L2 in other Eukaryotes [ 63 – 65 ]. Sorenson et al took an RNA-seq approach to examine mRNA decay dynamics in four Arabidopsis genotypes that vary in their cytoplasmic mRNA decay factors (wild-type, sov knockout, vcs knockout, and sov/vcs double knockout). If RCMs are primarily marking mRNAs for degradation (as initially suggested by Vandivier et al [ 14 ]), then a buildup should be observed after transcriptional arrest in genetic backgrounds deficient for mRNA decay. Despite the well-known relationship between RCMs and housekeeping RNAs (snoRNAs and tRNAs), the majority of modified transcripts at the beginning and end of the time series were mRNAs (Fig. 4 A). As expected from arresting transcription, each genotype, with the exception of the sov/vcs double mutant, displayed a ~ 25–50% decrease in the total number of observable protein-coding transcripts eight hours post-treatment (see Methods ; Fig. S9B ). The number of observed RCMs increased in all genotypes after arresting transcription, as did the proportion of modified mRNAs (Fig. 4 B and Fig. S9C ), suggesting two possibilities: 1 ) that modification abundance increases with transcript age, or 2 ) that non-modified transcripts are degraded more quickly leading to a higher proportion of transcripts detectable over time containing RCMs. To more directly test the association of RCMs with RNA degradation, we analyzed whether the pool of transcripts that are modified at time point 0 were still detectable and modified at subsequent time points. Surprisingly, mRNAs modified at time point 0 were nearly all detectable 8 hours after transcription arrest while mRNAs that were not modified at time point 0 declined by more than 40% (Fig. 4 C). These results indicate that RCMs mark mRNAs which degrade slower than the entire mRNA pool. Indeed, mRNAs that are not targeted by RCMs show a significantly larger magnitude of TPM decrease relative to mRNAs marked with RCMs at time point 0 (Fig. 4 D). These results strongly suggest that RCM marked mRNAs degrade at a slower rate relative to mRNAs that are not marked by RCMs. Sorenson et al also modeled the initial decay rate of all mRNAs (alpha decay rate) based on read abundance after transcription arrest. Based on these values, mRNAs that are modified at time point 0 in wild-type show a significantly lower (slower) alpha decay rate relative to all other mRNAs (Fig. 4 E). Sorenson et al used the decay rates across genotypes to assign mRNAs to genotype-dependent RNA degradation pathways (Fig. 4 F x-axis groups 1–14, see Fig. 1 C from Sorenson et al). For mRNAs that could not be assigned to the VARICOSE or SOV degradation pathways, Sorenson et al hypothesized that they were likely targets of the RNA exosome (x-axis group 15; Fig. 4 F). Interestingly, modified mRNAs are significantly biased towards being assigned to this group and thus are likely targets of the RNA exosome (Fig. 4 F p < 2.2e-16, Pearson's Chi-squared test). Thus, based on these data, RCMs appear to be marks of Pol-II transcript stability, rather than marks of degradation. RCMs are not associated with Nonsense-Mediated mRNA Decay To further investigate whether RCM marked mRNAs have an association with RNA degradation pathways, we next turned our focus to the Nonsense Mediated mRNA Decay (NMD) pathway. NMD is responsible for degrading aberrant mRNAs that have a premature termination codon. This is often recognized as a termination codon upstream of the exon junction complex (EJC, [ 66 ]). Given the accumulation of RCMs at exon-exon junctions and the recent report of RNA degradation intermediates accumulating near exon-exon junctions [ 67 ], we tested whether modified mRNAs were likely targets of NMD. SUPPRESSOR OF MORPHOLOGICAL DEFECTS ON GENITALIA7 (SMG7) is a critical component of early NMD signaling in most Eukaryotes [ 68 ]. Gloggnitzer and co-authors [ 69 ] performed RNA-seq in Arabidopsis with a loss-of-function smg7 mutant in a genetic background avoiding the strong autoimmune response of knocking out NMD ( pad4; [ 69 ]). We re-processed the RNA-seq data generated by Gloggnitzer et al. and identified RCMs from their data. The up-regulated mRNAs in the smg7 genotype represent both direct and indirect targets of NMD silencing. There is strong statistical evidence that NMD targets are actually under-represented (that is, depleted) from mRNAs containing RCMs (5,275 mRNAs containing RCMs, 656 up-regulated mRNAs in smg7 , overlap = 103, p < 2.2e-16, Fig. S10A ). Furthermore, we identify no significant differences RCM distribution at exon-exon junctions between pad4 and smg7-pad4 RCMs ( Fig. S10B , p = .986), or between smg7 RCMs in mRNAs up-regulated in smg7 vs those not differentially abundant in smg7 ( Fig. S10C , p = .144). In agreement with the smg7 results which examined a single tissue, there is no significant overlap between predicted NMD targets and the RCMs identified in the Mergner et al. tissue atlas ( Fig. S10D , p = .304). Collectively, these results suggest RCMs are not associated with the NMD pathway. Discussion In this study we used a bioinformatic workflow consisting of HAMR and ModTect [ 14 , 15 ] to predict RCMs in diverse species across the flowering plant lineage in order to clarify RCM distribution and putative function on mRNAs. Of note, because these algorithms rely on a certain minimum depth of expression in order to statistically call a transcript as being modified, and we tended towards a conservative definition of when to call an RCM (present at the same site in bio-replicates based on both algorithms), making conclusions about low abundance transcripts can be problematic. Despite these caveats, we observed a substantial number of transcripts whose modification status changes along developmental or environmental gradients. In perhaps the most extensive comparative analysis to date, we demonstrated that the RCMs detected by HAMR and ModTect (e.g., those that fall along the WCF base-pairing edge) are found on a large, yet contextually discrete set of transcripts. From these diverse analyses we believe that a number of conclusions can be made that provide lessons as to the function of RCMs in eukaryotes. RCMs are not found on all expressed transcripts, nor are they always on the most abundant transcripts, suggesting a contextual specificity. Indeed, there appear to be two classes of transcripts that receive RCMs in plants. There are a core set of transcripts associated with photosynthesis and cellular metabolism that are modified in each tested tissue or developmental context. These transcripts tend to be abundantly modified and are the targets of all of the major base-pair disrupting modifications. Due to their critical functions, these genes are conserved, as is their modification status, across the plant lineage. The other class of modified transcripts receive RCMs in a more context-specific manner. These transcripts may be expressed in multiple tissues but are generally targeted by modifications in a more limited subset of conditions. Functional enrichment suggests that these transcripts are modified in response to environmental or developmental cues. We observed similar patterns of RCM deposition on all transcripts, with a bias towards longer exons and transcripts containing multiple exons. In contrast to m 6 A, these RCMs are enriched at exon splice junctions. Interestingly, while we see an increase in RCM abundance on transcripts with multiple isoforms, we see no association between RCM status and alternative splicing or intron retention, suggesting that modifications are more of a general marker of splicing rather than a driver of isoform selection. We would propose that these modifications are deposited on the base-pairing edge of the newly transcribed mRNA to reduce structure at exon junctions in order to facilitate splicing, whereas m6A is occluded from these sites due to its interference with binding of the splicing machinery [ 31 , 32 ]. This model would further suggest that, like m 6 A, these other RCMs are deposited co-transcriptionally and thus their writers should appear in complexes with transcriptional and epigenetic machinery. In support of this hypothesis, the m 5 C methyltransferases NSUN3 and DNMT2 are known to interact with hnRNPK for Pol-II recruitment and modulating chromatin state in response to stress [ 70 ]. Previous work in Arabidopsis comparing RCM abundance in degrading and total mRNA populations found evidence for RCMs as a marker for mRNA degradation [ 14 ] whereas others have suggested that loss of RCMs leads to unstable mRNAs [ 10 ]. In an attempt to resolve these conflicting models, we turned to an elegant time-series RNA-seq experiment in Arabidopsis examining the impacts of mutants in the mRNA decay pathway on RNA turnover (Sorenson et al., 2018). We used these data to assess the relationship between degradation and RCM status. To our surprise, we found that RCMs are more likely to mark extremely stable mRNAs, rather than those that are rapidly degrading. Our finding that most of the RCMs fall on transcripts associated with core cellular processes fits with the notion that these transcripts would also be more stable, a finding reported by Sorenson et al. and others [ 71 ]. Interestingly, we also observe a positive relationship between RCM abundance and protein abundance in the Arabidopsis tissue atlas, further suggesting that RCMs, as a whole, have a positive influence on mRNA stability. Importantly, we cannot definitively say that degrading RNAs are not marked, just that the RCMs we can monitor are enriched on mRNAs with long half-lives. A key difference between our work and that of Vandivier et al [ 14 ] is in the depth of sequencing, both for the grass seedling experiments that we performed and for the additional datasets we reprocessed, and so further examination of the connection between these base-pair disrupting RCMs and RNA degradation is warranted. Conclusions In this work, we used a comparative transcriptomic approach to better understand the conservation and function of a poorly studied class of RNA modifications. Grass seedlings contained similar cohorts of modified transcripts, a finding that spurred us to look in other datasets that incorporated additional stress or developmental timepoints in both Sorghum and Arabidopsis. Again, we observed a conserved, core set of modified transcripts, but these expanded datapoints allowed us to uncover context specificity to the epitranscriptome, suggesting that some, but not all, transcripts are modified at key steps in their functional lifecycle. Attempts to better understand function of these base pair disrupting modifications revealed that they are likely important for splicing, translation, and intriguingly, for RNA stability. Similarities across the Angiosperm lineage, along with supporting literature in metazoans, suggests that these modifications, like m 6 A, are a deeply conserved aspect of RNA biology for which we still know very little. Methods Plant growth: Seed from Zea mays (acc. B73), Sorghum bicolor (acc. BTx642), and Setaria italica (acc. Yugu1) were sown onto damp soil approximately 2 cm below the surface, and stratified for one week in the dark at 4°C. Seeds were then transferred to a growth cabinet with lights configured to a long-day cycle (16 hours light, 8 hours dark). All species’ seeds germinated within a day of each other and were allowed to grow for two weeks once germinated. Seedling shoot tissue was collected and instantly frozen with liquid nitrogen. RNA-sequencing library preparation: RNA from two biological replicates in each species was isolated using TriZol per the manufacturer’s instructions (ThermoFisher #15596026). Paired-end and strand specific sequencing libraries were generated for each species using the YourSeq FT Strand-Specific mRNA library prep kit (Active Motif #23001). Libraries were sequenced on a Novogene Novaseq platform. Read processing and modification identification: Raw sequencing reads were trimmed for adapters and low-quality bases using fastp (v 0.23.4, [ 72 ]) with default settings for paired end reads. Trimmed reads were then used to call HAMR modifications using the PAMLINC workflow ( https://github.com/chosenobih/pamlinc ). Briefly, this workflow maps reads to each species’ reference genome (obtained from Ensembl Plants in January 2022) using BowTie2 (v 2.2.5, [ 73 ]) and retains unique alignments that don't overlap exon-junctions or deletions. The HAMR statistical model is then run on these filtered alignment files (BAM files) to predict RCMs. In addition, a similar algorithm, ModTect (v 1.7.5, [ 15 ]), was used to call RCMs. For ModTect prediction, trimmed reads were mapped to the same Ensembl Plants genomes and associated annotation files as above using STAR (v 2.7.10b, [ 74 ]). The following non-default STAR arguments were used for mapping: --outFilterIntronMotifs RemoveNoncanonical, --alignIntronMax 10000, --outSAMstrandField intronMotif. Raw BAM outputs were then used as input for ModTect with the following arguments: –minBaseQual 30, --readlength 150, and --regionFile genes.bed which is a four-column file of all gene regions in the annotation file. For sequencing data generated in this study, RCM sites were retained that overlapped both ModTect and HAMR predictions among both sequencing replicates (Fig. 1 B). For public sequencing data, RCMs were identified using only ModTect and RCMs were retained that were common among three or more experiments (i.e., three or more distinct SRAs). RNA abundance measurement: Trimmed fastq files were used to measure RNA abundance using Salmon (v 1.10.0, [ 75 ]) in “quant” mode against a decoy aware transcriptome index. The tximport (v 1.28, [ 76 ]) R package was then used to import Salmon quant files and generate gene or transcript level transcript per million values, specifying the following tximport option: countsFromAbundance = "lengthScaledTPM". RNAs used for downstream RCM and expression analyses were those with at least one transcript per million in one experiment. For the reanalysis of Sorenson et al 2018, we used RPKM (reads per kilobase per million mapped) values that the authors provide in their supplemental data which accounts for stable transcripts containing an increasing proportion of sequencing reads. Alternative splicing: Two approaches were used to infer alternative splicing. First, genes that express multiple isoforms can be inferred from Salmon [ 75 ] using the tximport [ 76 ] R package as described above. Specifying the tximport “txOut” argument gives isoform level quantifications. Isoforms expressed above one TPM were retained. Genes expressing more than one isoform were inferred to be undergoing alternative splicing. Second, we examined statistically significant changes in splicing patterns between experimental conditions using EventPointer [ 77 ]. GO term enrichment: GO terms were downloaded for each species from Ensembl Plants. GO term enrichment was performed using the clusterProfiler (v 4.8.2, [ 78 ]) R package, specifically the “enricher” function was used, specifying: (1) the input gene set, (2) a p-value cutoff of .05, (3) Benjamini and Hochberg multiple testing correction, (4) a background gene set (usually, all expressed genes (TPM > = 1)), (5) a minimum and maximum gene-set size of 10 and 500, respectively, and (6) a q-value cutoff of .05. Fold enrichment was calculated using the following functions: parse_ratio(GeneRatio) / parse_ratio(BgRatio). Data analyses: All analyses and statistical testing was performed using the R programming language (v 4.2.0, [ 79 ]). Tests for over or underrepresentation were performed using the phyper function (i.e., a hypergeometric test) in R. Correlations between RNA abundance and RCMs were performed by collecting the RNA abundance values and RCM density values (number of RCMs per transcript) from each sequencing experiment of interest. Pearson Correlation Coefficients were then calculated from the RNA abundance and RCM density using the “cor” function in R. To measure statistical significance of mean-separation between groups, the Student’s t-test was used for two group comparisons, while one-way Anova was used for more-than-two group comparisons. All visualizations were generated in R from various packages including: ggplot2 (v 3.4.4, [ 80 ]), ggvenn (v 0.1.10, [ 81 ]), enrichPlot (v 1.20.0, [ 82 ]), complexUpset (v 1.3.3, [ 83 ]), complexHeatmap (v 2.16., [ 84 ]), and the ggtree package (v 3.8.2, [ 85 ]) Declarations Ethics approval and consent to participate: Not applicable Consent for publication. All authors read and consent to this publication. Availability of data and materials: The accession numbers of RNA-sequencing data generated in this study can be found at https://www.ncbi.nlm.nih.gov/bioproject/PRJNA1116564. The scripts and intermediate files for analyses are available at: https://github.com/kylepalos/comparative-PRM-paper. Competing Interests: The authors declare no competing interests. Funding: We would like to acknowledge support from NSF IOS PGRP 2023310 (to ADLN, EHL, and BDG), NSF MCB 2120131 (to ADLN), and NSF-DBI-1743442 to EHL. Author Contributions: KP, ACND, EHL, BDG, and ADLN developed the project. ACDN grew plants, isolated RNA, and performed RNA-seq. KP analyzed data, EHL, BDG, and ADLN provided feedback on data analysis. KP and ADLN wrote the manuscript. All authors edited and approved the manuscript. Acknowledgements: We would like to thank Drs Duke Pauli (University of Arizona) and Susan Schroeder (Oklahoma University) for insightful comments during the development of the project. We would like to thank Drs. Rebecca Murphy (Centenary College of Louisiana) and Daryl Morishige for providing useful insight into the Sorghum tissue atlas. Finally, we would like to thank members of the Gregory and Nelson labs for constructive feedback. References Cappannini A, Ray A, Purta E, Mukherjee S, Boccaletto P, Moafinejad SN, et al. MODOMICS: a database of RNA modifications and related information. 2023 update. Nucleic Acids Res. 2024;52:D239–44. Baumer ZT, Erber L, Jolley E, Lawrence S, Lin C, Murakami S, et al. Defining the commonalities between post-transcriptional and post-translational modification communities. Trends Biochem Sci. 2024;49:185–8. Saletore Y, Meyer K, Korlach J, Vilfan ID, Jaffrey S, Mason CE. The birth of the Epitranscriptome: deciphering the function of RNA modifications. Genome Biol. 2012;13:175. Gilbert WV, Nachtergaele S. mRNA Regulation by RNA Modifications. Annu Rev Biochem. 2023;92:175–98. Lewis CJT, Pan T, Kalsotra A. RNA modifications and structures cooperate to guide RNA-protein interactions. Nat Rev Mol Cell Biol. 2017;18:202–10. Roy B. Effects of mRNA Modifications on Translation: An Overview. In: McMahon M, editor. RNA Modifications: Methods and Protocols. New York, NY: Springer US; 2021. pp. 327–56. Prall W, Ganguly DR, Gregory BD. The covalent nucleotide modifications within plant mRNAs: What we know, how we find them, and what should be done in the future. Plant Cell. 2023;35:1801–16. Sharma B, Prall W, Bhatia G, Gregory BD. The Diversity and Functions of Plant RNA Modifications: What We Know and Where We Go from Here. Annu Rev Plant Biol. 2023;74:53–85. Kramer MC, Janssen KA, Palos K, Nelson ADL, Vandivier LE, Garcia BA, et al. N6-methyladenosine and RNA secondary structure affect transcript stability and protein abundance during systemic salt stress in Arabidopsis. Plant Direct. 2020;4:e00239. Sun H, Li K, Liu C, Yi C. Regulation and functions of non-m6A mRNA modifications. Nat Rev Mol Cell Biol. 2023;24:714–31. Luo G-Z, MacQueen A, Zheng G, Duan H, Dore LC, Lu Z, et al. Unique features of the m6A methylome in Arabidopsis thaliana. Nat Commun. 2014;5:5630. Perry RP, Kelley DE, Friderici K, Rottman F. The methylated constituents of L cell messenger RNA: evidence for an unusual cluster at the 5’ terminus. Cell. 1975;4:387–94. Wiener D, Schwartz S. The epitranscriptome beyond m6A. Nat Rev Genet. 2021;22:119–31. Vandivier LE, Campos R, Kuksa PP, Silverman IM, Wang L-S, Gregory BD. Chemical Modifications Mark Alternatively Spliced and Uncapped Messenger RNAs in Arabidopsis. Plant Cell. 2015;27:3024–37. Tan K-T, Ding L-W, Wu C-S, Tenen DG, Yang H. Repurposing RNA sequencing for discovery of RNA modifications in clinical cohorts. Sci Adv [Internet]. 2021;7. http://dx.doi.org/10.1126/sciadv.abd2605 . Kannan K, Nelson ADL, Shippen DE. Dyskerin is a component of the Arabidopsis telomerase RNP required for telomere maintenance. Mol Cell Biol. 2008;28:2332–41. Yang Y, Hsu PJ, Chen Y-S, Yang Y-G. Dynamic transcriptomic m6A decoration: writers, erasers, readers and functions in RNA metabolism. Cell Res. 2018;28:616–24. Flamand MN, Tegowski M, Meyer KD. The Proteins of mRNA Modification: Writers, Readers, and Erasers. Annu Rev Biochem. 2023;92:145–73. Wilkinson E, Cui Y-H, He Y-Y. Roles of RNA Modifications in Diverse Cellular Functions. Front Cell Dev Biol. 2022;10:828683. Dominissini D, Moshitch-Moshkovitz S, Schwartz S, Salmon-Divon M, Ungar L, Osenberg S, et al. Topology of the human and mouse m6A RNA methylomes revealed by m6A-seq. Nature. 2012;485:201–6. Sun L, Xu Y, Bai S, Bai X, Zhu H, Dong H, et al. Transcriptome-wide analysis of pseudouridylation of mRNA and non-coding RNAs in Arabidopsis. J Exp Bot. 2019;70:5089–600. Li M, Tao Z, Zhao Y, Li L, Zheng J, Li Z, et al. 5-methylcytosine RNA methyltransferases and their potential roles in cancer. J Transl Med. 2022;20:214. Chen X, Li A, Sun B-F, Yang Y, Han Y-N, Yuan X, et al. 5-methylcytosine promotes pathogenesis of bladder cancer through stabilizing mRNAs. Nat Cell Biol. 2019;21:978–90. Barbieri I, Kouzarides T. Role of RNA modifications in cancer. Nat Rev Cancer. 2020;20:303–22. Zhong S, Li H, Bodi Z, Button J, Vespa L, Herzog M, et al. MTA is an Arabidopsis messenger RNA adenosine methylase and interacts with a homolog of a sex-specific splicing factor. Plant Cell. 2008;20:1278–88. Yang W, Meng J, Liu J, Ding B, Tan T, Wei Q, et al. The N1-Methyladenosine Methylome of Petunia mRNA. Plant Physiol. 2020;183:1710–24. Cui X, Liang Z, Shen L, Zhang Q, Bao S, Geng Y, et al. 5-Methylcytosine RNA Methylation in Arabidopsis Thaliana. Mol Plant. 2017;10:1387–99. David R, Burgess A, Parker B, Li J, Pulsford K, Sibbritt T, et al. Transcriptome-Wide Mapping of RNA 5-Methylcytosine in Arabidopsis mRNAs and Noncoding RNAs. Plant Cell. 2017;29:445–60. Decatur WA, Fournier MJ. rRNA modifications and ribosome function. Trends Biochem Sci. 2002;27:344–51. Zhou KI, Shi H, Lyu R, Wylder AC, Matuszek Ż, Pan JN, et al. Regulation of Co-transcriptional Pre-mRNA Splicing by m6A through the Low-Complexity Protein hnRNPG. Mol Cell. 2019;76:70–e819. Uzonyi A, Dierks D, Nir R, Kwon OS, Toth U, Barbosa I, et al. Exclusion of m6A from splice-site proximal regions by the exon junction complex dictates m6A topologies and mRNA stability. Mol Cell. 2023;83:237–e517. Yang X, Triboulet R, Liu Q, Sendinc E, Gregory RI. Exon junction complex shapes the m6A epitranscriptome. Nat Commun. 2022;13:7904. Karikó K, Muramatsu H, Welsh FA, Ludwig J, Kato H, Akira S, et al. Incorporation of pseudouridine into mRNA yields superior nonimmunogenic vector with increased translational capacity and biological stability. Mol Ther. 2008;16:1833–40. Eyler DE, Franco MK, Batool Z, Wu MZ, Dubuke ML, Dobosz-Bartoszek M, et al. Pseudouridinylation of mRNA coding sequences alters translation. Proc Natl Acad Sci U S A. 2019;116:23068–74. Zhang Q, Kim N-K, Feigon J. Architecture of human telomerase RNA. Proc Natl Acad Sci U S A. 2011;108:20325–32. Kim N-K, Theimer CA, Mitchell JR, Collins K, Feigon J. Effect of pseudouridylation on the structure and activity of the catalytically essential P6.1 hairpin in human telomerase RNA. Nucleic Acids Res. 2010;38:6746–56. Anderson BR, Muramatsu H, Jha BK, Silverman RH, Weissman D, Karikó K. Nucleoside modifications in RNA limit activation of 2’-5'-oligoadenylate synthetase and increase resistance to cleavage by RNase L. Nucleic Acids Res. 2011;39:9329–38. Yang Y, Wang L, Han X, Yang W-L, Zhang M, Ma H-L, et al. RNA 5-Methylcytosine Facilitates the Maternal-to-Zygotic Transition by Preventing Maternal mRNA Decay. Mol Cell. 2019;75:1188–e20211. Zhang Y, Lu L, Li X. Detection technologies for RNA modifications. Exp Mol Med. 2022;54:1601–16. Schaefer M, Pollex T, Hanna K, Lyko F. RNA cytosine methylation analysis by bisulfite sequencing. Nucleic Acids Res. 2009;37:e12. Woodson SA, Muller JG, Burrows CJ, Rokita SE. A primer extension assay for modification of guanine by Ni(II) complexes. Nucleic Acids Res. 1993;21:5524–5. Ryvkin P, Leung YY, Silverman IM, Childress M, Valladares O, Dragomir I, et al. HAMR: high-throughput annotation of modified ribonucleotides. RNA. 2013;19:1684–92. Todkari IA, Chandrasekaran AR, Punnoose JA, Mao S, Haruehanroengra P, Beckles C, et al. Resolving altered base-pairing of RNA modifications with DNA nanoswitches. Nucleic Acids Res. 2023;51:11291–7. Chiu NHL, Simpson JH, Wang H, Tannous BA. A theoretical perspective of the physical properties of different RNA modifications with respect to RNA duplexes. BBA Adv. 2021;1:100025. Katz K, Shutov O, Lapoint R, Kimelman M, Brister JR, O’Sullivan C. The Sequence Read Archive: a decade more of explosive growth. Nucleic Acids Res. 2022;50:D387–90. Sloan DB, Warren JM, Williams AM, Wu Z, Abdel-Ghany SE, Chicco AJ, et al. Cytonuclear integration and co-evolution. Nat Rev Genet. 2018;19:635–48. Wang Y, Li S, Zhao Y, You C, Le B, Gong Z, et al. NAD + -capped RNAs are widespread in the Arabidopsis transcriptome and can probably be translated. Proc Natl Acad Sci U S A. 2019;116:12094–102. Mergner J, Frejno M, List M, Papacek M, Chen X, Chaudhary A, et al. Mass-spectrometry-based draft of the Arabidopsis proteome. Nature. 2020;579:409–14. McCormick RF, Truong SK, Sreedasyam A, Jenkins J, Shu S, Sims D, et al. The Sorghum bicolor reference genome: improved assembly, gene annotations, a transcriptome atlas, and signatures of genome organization. Plant J. 2018;93:338–54. Klepikova AV, Kasianov AS, Gerasimov ES, Logacheva MD, Penin AA. A high resolution map of the Arabidopsis thaliana developmental transcriptome based on RNA-seq profiling. Plant J. 2016;88:1058–70. Kryuchkova-Mostacci N, Robinson-Rechavi M. A benchmark of gene expression tissue-specificity metrics. Brief Bioinform. 2017;18:205–14. Matilla MA, Espinosa-Urgel M, Rodríguez-Herva JJ, Ramos JL, Ramos-González MI. Genomic analysis reveals the major driving forces of bacterial life in the rhizosphere. Genome Biol. 2007;8:R179. Sullivan A, Purohit PK, Freese NH, Pasha A, Esteban E, Waese J, et al. An eFP-Seq Browser for visualizing and exploring RNA sequencing data. Plant J. 2019;100:641–54. Murat F, Armero A, Pont C, Klopp C, Salse J. Reconstructing the genome of the most recent common ancestor of flowering plants. Nat Genet. 2017;49:490–6. Sharma B, Govindan G, Li Y, Sunkar R, Gregory BD. RNA N6-Methyladenosine Affects Copper-Induced Oxidative Stress Response in Arabidopsis thaliana. Noncoding RNA [Internet]. 2024;10. http://dx.doi.org/10.3390/ncrna10010008 . Prall W, Sheikh AH, Bazin J, Bigeard J, Almeida-Trapp M, Crespi M, et al. Pathogen-induced m6A dynamics affect plant immunity. Plant Cell. 2023;35:4155–72. Govindan G, Sharma B, Li Y-F, Armstrong CD, Merum P, Rohila JS, et al. mRNA N6 -methyladenosine is critical for cold tolerance in Arabidopsis. Plant J. 2022;111:1052–68. Cheng Q, Wang P, Wu G, Wang Y, Tan J, Li C, et al. Coordination of m6A mRNA methylation and gene transcriptome in rice response to cadmium stress. Rice. 2021;14:62. Hu J, Cai J, Park SJ, Lee K, Li Y, Chen Y, et al. N6 -Methyladenosine mRNA methylation is important for salt stress tolerance in Arabidopsis. Plant J. 2021;106:1759–75. Varoquaux N, Cole B, Gao C, Pierroz G, Baker CR, Patel D, et al. Transcriptomic analysis of field-droughted sorghum from seedling to maturity reveals biotic and metabolic responses. Proc Natl Acad Sci U S A. 2019;116:27124–32. Sorenson RS, Deshotel MJ, Johnson K, Adler FR, Sieburth LE. Arabidopsis mRNA decay landscape arises from specialized RNA decay substrates, decapping-mediated feedback, and redundancy. Proc Natl Acad Sci U S A. 2018;115:E1485–94. Deyholos MK, Cavaness GF, Hall B, King E, Punwani J, Van Norman J, et al. VARICOSE, a WD-domain protein, is required for leaf blade development. Development. 2003;130:6577–88. Zhang W, Murphy C, Sieburth LE. Conserved RNaseII domain protein functions in cytoplasmic mRNA decay and suppresses Arabidopsis decapping mutant phenotypes. Proc Natl Acad Sci U S A. 2010;107:15981–5. Lubas M, Damgaard CK, Tomecki R, Cysewski D, Jensen TH, Dziembowski A. Exonuclease hDIS3L2 specifies an exosome-independent 3’-5' degradation pathway of human cytoplasmic mRNA. EMBO J. 2013;32:1855–68. Malecki M, Viegas SC, Carneiro T, Golik P, Dressaire C, Ferreira MG, et al. The exoribonuclease Dis3L2 defines a novel eukaryotic RNA degradation pathway. EMBO J. 2013;32:1842–54. Hug N, Longman D, Cáceres JF. Mechanism and regulation of the nonsense-mediated decay pathway. Nucleic Acids Res. 2016;44:1483–95. Lee W-C, Hou B-H, Hou C-Y, Tsao S-M, Kao P, Chen H-M. Widespread Exon Junction Complex Footprints in the RNA Degradome Mark mRNA Degradation before Steady State Translation. Plant Cell. 2020;32:904–22. Unterholzner L, Izaurralde E. SMG7 acts as a molecular link between mRNA surveillance and mRNA decay. Mol Cell. 2004;16:587–96. Gloggnitzer J, Akimcheva S, Srinivasan A, Kusenda B, Riehs N, Stampfl H, et al. Nonsense-mediated mRNA decay modulates immune receptor levels to regulate plant antibacterial defense. Cell Host Microbe. 2014;16:376–90. Cheng JX, Chen L, Li Y, Cloe A, Yue M, Wei J, et al. RNA cytosine methylation and methyltransferases mediate chromatin organization and 5-azacytidine response and resistance in leukaemia. Nat Commun. 2018;9:1163. Li Y, Yi Y, Lv J, Gao X, Yu Y, Babu SS, et al. Low RNA stability signifies increased post-transcriptional regulation of cell identity genes. Nucleic Acids Res. 2023;51:6020–38. Chen S, Zhou Y, Chen Y, Gu J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics. 2018;34:i884–90. Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9:357–9. Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29:15–21. Patro R, Duggal G, Love MI, Irizarry RA, Kingsford C. Salmon provides fast and bias-aware quantification of transcript expression. Nat Methods. 2017;14:417–9. Soneson C, Love MI, Robinson MD. Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences. F1000Res. 2015;4:1521. Ferrer-Bonsoms JA, Gimeno M, Olaverri D, Sacristan P, Lobato C, Castilla C, et al. EventPointer 3.0: flexible and accurate splicing analysis that includes studying the differential usage of protein-domains. NAR Genom Bioinform. 2022;4:lqac067. Yu G, Wang L-G, Han Y, He Q-Y. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS. 2012;16:284–7. Team RC. Others. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. http://www R-project org/ [Internet]. 2016; https://cir.nii.ac.jp/crid/1574231874043578752 . Wickham H. ggplot2: Elegant Graphics for Data Analysis. Springer; 2016. Yan L, ggvenn. Draw Venn Diagram by ggplot2. R Package Version. 2021;19. Yu G. Enrichplot: visualization of functional enrichment result. R package version. 2021;1. Krassowski M. ComplexUpset. ComplexUpset. 2020. Gu Z, Eils R, Schlesner M. Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinformatics. 2016;32:2847–9. Yu G. Using ggtree to Visualize Data on Tree-Like Structures. Curr Protoc Bioinf. 2020;69:e96. FactR [Internet]. [cited 2024 Apr 24]. https://fursham-h.github.io/factR/ . Additional Declarations No competing interests reported. Supplementary Files AdditionalFile1.xlsx Additional File 1: Overview of RCMs identified in all three grass seedlings. AdditionalFile2.xlsx Additional File 2: Summary of RNA-seq datasets used and the number of RCMs identified across the tissue expression atlases. Supplementfinal.pdf SupplementalMaterials.docx Cite Share Download PDF Status: Published Journal Publication published 12 Aug, 2024 Read the published version in BMC Plant Biology → Version 1 posted Editorial decision: Revision requested 16 Jul, 2024 Reviews received at journal 15 Jul, 2024 Reviews received at journal 10 Jul, 2024 Reviewers agreed at journal 28 Jun, 2024 Reviewers agreed at journal 17 Jun, 2024 Reviewers agreed at journal 02 Jun, 2024 Reviewers invited by journal 30 May, 2024 Editor invited by journal 29 May, 2024 Submission checks completed at journal 29 May, 2024 Editor assigned by journal 29 May, 2024 First submitted to journal 23 May, 2024 You are reading this latest preprint version Research Square lets you share your work early, gain feedback from the community, and start making changes to your manuscript prior to peer review in a journal. As a division of Research Square Company, we’re committed to making research communication faster, fairer, and more useful. We do this by developing innovative software and high quality services for the global research community. Our growing team is made up of researchers and industry professionals working together to solve the most critical problems facing scientific publishing. Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-4466769","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":309859609,"identity":"b2efee81-a1dd-4d2f-8bc5-19f1fd77b0c5","order_by":0,"name":"Kyle Palos","email":"","orcid":"","institution":"The Boyce Thompson Institute","correspondingAuthor":false,"prefix":"","firstName":"Kyle","middleName":"","lastName":"Palos","suffix":""},{"id":309859610,"identity":"a6eba1bc-a510-4bbc-84be-277af3a2f42b","order_by":1,"name":"Anna C. Nelson Dittrich","email":"","orcid":"","institution":"The Boyce Thompson Institute","correspondingAuthor":false,"prefix":"","firstName":"Anna","middleName":"C. Nelson","lastName":"Dittrich","suffix":""},{"id":309859611,"identity":"0a8bd7d2-c63a-41e7-9502-1ab8bc526672","order_by":2,"name":"Eric H. Lyons","email":"","orcid":"","institution":"University of Arizona","correspondingAuthor":false,"prefix":"","firstName":"Eric","middleName":"H.","lastName":"Lyons","suffix":""},{"id":309859612,"identity":"2045ae1f-e1ea-4022-9cf3-a7083b0ecafc","order_by":3,"name":"Brian D. Gregory","email":"","orcid":"","institution":"University of Pennsylvania","correspondingAuthor":false,"prefix":"","firstName":"Brian","middleName":"D.","lastName":"Gregory","suffix":""},{"id":309859613,"identity":"79be9eb4-9913-4f81-8b9f-92dd4ae766d3","order_by":4,"name":"Andrew D. L. Nelson","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAAxUlEQVRIiWNgGAWjYHCCxMc/KsAMxgMPgEQDEVoeGzOcgbAOJBCnhfGZNGMbKVr4ZzcnSBfOs5Mzn918AKjFRnbDAQJaJO4cSzCeuS3ZWAbIAGpJMyaoheFGTkIC77YDiTMkcgyAWg4nEtQifyP/wwHeOSAtQEYCw3/CWgxuJCQ28zaAbQF5/wBhLYY3EpIZZxxLNpaQSAM6zCDZeCYhLXI3EtJ/fKixk5OQSH744EOFnWwfIS3o7iRN+SgYBaNgFIwCHAAA6atLdDWL/iQAAAAASUVORK5CYII=","orcid":"","institution":"The Boyce Thompson Institute","correspondingAuthor":true,"prefix":"","firstName":"Andrew","middleName":"D. L.","lastName":"Nelson","suffix":""}],"badges":[],"createdAt":"2024-05-23 12:14:45","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-4466769/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-4466769/v1","draftVersion":[],"editorialEvents":[{"content":"https://doi.org/10.1186/s12870-024-05486-7","type":"published","date":"2024-08-12T15:58:07+00:00"}],"editorialNote":"","failedWorkflow":false,"files":[{"id":58215570,"identity":"eeec4e98-26d3-46a4-968f-126501022721","added_by":"auto","created_at":"2024-06-12 14:35:21","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":212151,"visible":true,"origin":"","legend":"\u003cp\u003eRCM identification in three model Grasses\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e(A) \u003c/strong\u003eSchematic of experimental design. Left and middle: Two biological replicates of grass seedlings from each species at a similar developmental timepoint were used to generate paired-end RNA-seq libraries. Right: Depiction of RCMs causing errors in reverse transcriptase nucleotide incorporation. Different modifications will result in a non-random pattern of SNPs or deletions at the modified site in the resulting cDNA. ML algorithms such as HAMR and ModTect can infer the modification class based on these errors.\u003cstrong\u003e (B) \u003c/strong\u003eVenn diagram showing the overlap of modified sites in the Sorghum transcriptome between biological replicates and two RCM detection algorithms. The union of all four categories (n = 7020) was kept for downstream analyses. \u003cstrong\u003e(C) \u003c/strong\u003eSummarizing the number of modified sites and number of modified transcripts in all three sampled Grasses. \u003cstrong\u003e(D) \u003c/strong\u003eEnriched GO terms of the modified RNAs in \u003cstrong\u003e(C)\u003c/strong\u003e.\u003c/p\u003e","description":"","filename":"Fig1.png","url":"https://assets-eu.researchsquare.com/files/rs-4466769/v1/ee42311266f4d911e9ff8d1a.png"},{"id":58215016,"identity":"2ce969d3-5551-4a9e-beda-9f3b63b5f9f0","added_by":"auto","created_at":"2024-06-12 14:27:21","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":174878,"visible":true,"origin":"","legend":"\u003cp\u003eRCM dynamics across development and during stress\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e(A)\u003c/strong\u003e Summary of RCM findings from re-analysis of Sorghum tissue expression atlas [49]. Hierarchical clustering using the number of RCMs per transcripts as input. For ease of viewing, branches were collapsed based on tissue and modification similarity. See \u003cstrong\u003eFig. S2E\u003c/strong\u003e for un-collapsed tree. Tree tip labels denote the broad tissue category with the number after the tissue representing the number of SRAs used to summarize that broad tissue.\u003cstrong\u003e Modified sites:\u003c/strong\u003e Boxplots showing the total number of RCMs per SRA in each broad tissue category. \u003cstrong\u003eModified RNAs:\u003c/strong\u003e The total number of modified RNAs (genes) per SRA in each tissue category. \u003cstrong\u003eModifications per RNA:\u003c/strong\u003e The RCM density per RNA per SRA in each broad tissue category. \u003cstrong\u003e(B)\u003c/strong\u003e Upset plot quantifying the shared modified RNAs across the broad tissue categories in \u003cstrong\u003e(A)\u003c/strong\u003e. \u003cstrong\u003e(C)\u003c/strong\u003e RCM-RNA abundance correlation distribution of all modified RNAs across the McCormick et al tissue expression atlas [49] and the Varoquaux et al [60] Sorghum drought experiment. The dashed red line shows the mean Pearson Correlation Coefficient of each experiment (r = .34 Tissue, r = .18 Drought). \u003cstrong\u003e(D)\u003c/strong\u003eHeatmap of RNA abundance and RCM changes in Varoquaux et al re-analysis. Each row is a modified RNA that has been filtered for RNA abundance variability (see methods). Rows were clustered based on their abundance and RCM density values. The heatmap was grouped into four distinct clusters using k-means. SC = drought susceptible control, SD = drought susceptible treatment, TC = drought tolerant control, TD = drought tolerant treatment. At the time point analyzed, the drought susceptible genotype is RTx430 and the tolerant genotype is BTx642. \u003cstrong\u003e(E)\u003c/strong\u003e Enriched GO terms of the modified RNAs in each of the four clusters in \u003cstrong\u003e(D)\u003c/strong\u003e.\u003c/p\u003e","description":"","filename":"Fig2.png","url":"https://assets-eu.researchsquare.com/files/rs-4466769/v1/f85890fdf47633f9dfb99146.png"},{"id":58215019,"identity":"3ccbf134-3008-4aa0-b75d-ba2de84419bd","added_by":"auto","created_at":"2024-06-12 14:27:21","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":124355,"visible":true,"origin":"","legend":"\u003cp\u003eRCMs accumulate at exon-exon junctions and are associated with splicing\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e(A)\u003c/strong\u003e Metagene plot showing the genic distribution of grass RCMs identified in \u003cstrong\u003eFig. 1\u003c/strong\u003e. Ten nucleotide windows are plotted where the signal is the sum of RCMs in that window and plotted as density. \u003cstrong\u003e(B) \u003c/strong\u003eDensity distribution of exon lengths for RCM marked exons vs exons not marked by RCMs. (Student’s t-test, p \u0026lt; 2.2e-16). \u003cstrong\u003e(C) \u003c/strong\u003eProportion of expressed transcripts that receive RCMs in grass seedlings between mono-exonic and multi-exonic transcripts. Multi-exonic transcripts are significantly more likely to be modified than mono-exonic (Chi-squared test, all p-values \u0026lt; 7.1e-15). \u003cstrong\u003e(D)\u003c/strong\u003e Density of Sorghum seedling RCMs at start codons, stop codons, and 5’/3 exon-exon junctions. Density curves are plotted over a density histogram, both using a two-nucleotide window. The dotted line represents the first nucleotide of the start and stop codon, and the first nucleotide of the intron for 5’/3’ exon junction panels. \u003cstrong\u003e(E)\u003c/strong\u003e Histogram quantifying the number of Sorghum seedling RCMs accumulating at terminal exon-exon junctions vs the first exon-exon junction. \u003cstrong\u003e(F)\u003c/strong\u003e Box Plot and dot plot overlad quantifying the proportion of RNAs that are marked by RCMs or not in each SRAs in the McCormick et al Sorghum tissue expression atlas. The x-axis splits the data by the number of isoforms a gene is expressing. RCM proportion increases as the number of isoforms expressed increases (p \u0026lt; 2.2e-16 one-way ANOVA).\u003c/p\u003e","description":"","filename":"Fig3.png","url":"https://assets-eu.researchsquare.com/files/rs-4466769/v1/c24c8707666c430aaa867240.png"},{"id":58215023,"identity":"2b4b245d-8c1d-48b9-911f-9a97cfb4b1ea","added_by":"auto","created_at":"2024-06-12 14:27:21","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":138124,"visible":true,"origin":"","legend":"\u003cp\u003eRCMs mark mRNAs that degrade slowly\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e(A\u003c/strong\u003e) Reanalysis of Sorenson et al 2018 Arabidopsis dataset. Distribution of Arabidopsis gene types that receive RCMs at time point 0 and 8 hours after arresting transcription. tRNA = transfer RNA, sn/snoRNA = small nuclear/small nucleolar RNA, rRNA = ribosomal RNA, mRNA = messenger RNA, lncRNA = long noncoding RNA. \u003cstrong\u003e(B) \u003c/strong\u003eNumber of Arabidopsis RCMs predicted at each time point after arresting transcription and in each RNA degradation genotype. \u003cstrong\u003e(C)\u003c/strong\u003eChange in number of detectable RNAs over time after arresting transcription. Solid lines represent mRNAs that have a RCM at time point 0 while dashed lines represent mRNAs that have no RCMs at any time point. Expressed genes are those that have a normalized expression value \u0026gt;= 1 (see methods).\u003cstrong\u003e (D) \u003c/strong\u003eComparing RNA abundance change after transcription arrest at each time point. Red boxplots (left grouping) are mRNAs with an RCM at time point 0, blue boxplots (right group) are mRNAs that have no RCMs at any time point. Comparisons at time points 7, 15, and 30, p \u0026gt; .05, all time points starting at 60 minutes and afterwards, p \u0026lt; .001 by one-way ANOVA. \u003cstrong\u003e(E)\u003c/strong\u003e Boxplot showing the distribution of ɑ-decay rates between RCM marked mRNAs at time point 0 and mRNAs that have no RCMs at any time point.\u003cstrong\u003e(F)\u003c/strong\u003e Modified figure from Sorenson et al 2018 \u003cstrong\u003eFig. 1C\u003c/strong\u003e showing the number of mRNAs that belong to different decay categories. Categories 1-14 are inconsequential for our conclusions. Group 15 represents genotype independent decay, (decay rates are not affected by the genotypes in the Sorenson et al study). This category may represent targets of the exosome. * P-value between modified/never modified for Group 15 calculated by Chi-squared test.\u003c/p\u003e","description":"","filename":"Fig4.png","url":"https://assets-eu.researchsquare.com/files/rs-4466769/v1/936bab7bd2ffc6cefdc97a3e.png"},{"id":63071352,"identity":"6ea7c389-8839-49aa-bb3d-f9bf24623cc9","added_by":"auto","created_at":"2024-08-22 20:06:42","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1428829,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-4466769/v1/8015645c-6bef-4a89-bd1a-fe02e237eb02.pdf"},{"id":58215021,"identity":"bb725f5b-7c93-4167-8bf2-6e8892fa46a0","added_by":"auto","created_at":"2024-06-12 14:27:21","extension":"xlsx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":2876967,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eAdditional File 1:\u003c/strong\u003e Overview of RCMs identified in all three grass seedlings.\u003c/p\u003e","description":"","filename":"AdditionalFile1.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-4466769/v1/4f1f25852fc7dcface18302a.xlsx"},{"id":58215018,"identity":"789913fc-1762-4a88-b510-ba13e808cc9a","added_by":"auto","created_at":"2024-06-12 14:27:21","extension":"xlsx","order_by":2,"title":"","display":"","copyAsset":false,"role":"supplement","size":27208,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eAdditional File 2:\u003c/strong\u003e Summary of RNA-seq datasets used and the number of RCMs identified across the tissue expression atlases.\u003c/p\u003e","description":"","filename":"AdditionalFile2.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-4466769/v1/862850c0b48fdb9d4934fa6c.xlsx"},{"id":58215022,"identity":"4f77a102-f9b1-44f4-ae80-24feba2534ac","added_by":"auto","created_at":"2024-06-12 14:27:21","extension":"pdf","order_by":3,"title":"","display":"","copyAsset":false,"role":"supplement","size":1493920,"visible":true,"origin":"","legend":"","description":"","filename":"Supplementfinal.pdf","url":"https://assets-eu.researchsquare.com/files/rs-4466769/v1/db7025dfe04d424b5f65949f.pdf"},{"id":58215020,"identity":"a8cb5cca-03c0-4326-a4e7-52a51b484350","added_by":"auto","created_at":"2024-06-12 14:27:21","extension":"docx","order_by":4,"title":"","display":"","copyAsset":false,"role":"supplement","size":17401,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementalMaterials.docx","url":"https://assets-eu.researchsquare.com/files/rs-4466769/v1/40cdc57fddb298044bd3806c.docx"}],"financialInterests":"No competing interests reported.","formattedTitle":"Comparative analyses uncover a link between mRNA splicing, stability, and RNA covalent modifications in flowering plants","fulltext":[{"header":"Background","content":"\u003cp\u003eThere are over 100 RNA covalent modifications (RCMs) that are deposited on all classes of RNAs at various stages of their lifecycle [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e]. RCMs, collectively referred to as the \u0026ldquo;epitranscriptome\u0026rdquo; [\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e, \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e] are known to influence RNA stability, splicing, structure, intra- and intermolecular interactions, and translation [\u003cspan additionalcitationids=\"CR5 CR6 CR7 CR8\" citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e]. While most RCMs have been observed on ribosomal and transfer RNAs, they have also been observed on messenger RNAs (mRNAs) and long non-coding RNAs (lncRNAs), albeit at lower levels [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e, \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e, \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e]. Although pseudouridine is the most abundant modification among all classes of RNA [\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e], \u003cem\u003eN\u003c/em\u003e\u003csup\u003e6\u003c/sup\u003e-methyladenosine (m\u003csup\u003e6\u003c/sup\u003eA) is the most abundant internal \u003cem\u003emRNA\u003c/em\u003e RCM targeting approximately 30% of all mRNAs and is present at ~\u0026thinsp;0.4% of all adenosine nucleotides in cellular RNAs [\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e, \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e]. Additional internal RCMs, such as pseudouridine, \u003cem\u003eN\u003c/em\u003e\u003csup\u003e1\u003c/sup\u003e-methyladenosine (m\u003csup\u003e1\u003c/sup\u003eA), and 5-methylcytidine (m\u003csup\u003e5\u003c/sup\u003eC), have been observed on mRNAs in a number of eukaryotic systems, although their functional significance on these RNA molecules is not always clear [\u003cspan additionalcitationids=\"CR14\" citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eMuch of what we do know about the epitranscriptome is drawn from studies on m\u003csup\u003e6\u003c/sup\u003eA in mammalian and plant systems, but can likely be applied to many of the other modification classes. With the notable exception of pseudouridine [\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e], RCMs are deposited through the enzymatic activity of highly conserved proteins called writers (e.g., methyltransferases), \u0026ldquo;interpreted\u0026rdquo;, or bound, by RNA binding proteins called readers, and removed by eraser enzymes (e.g., demethylases) [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e, \u003cspan additionalcitationids=\"CR18\" citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e]. Arabidopsis and mammals have a similar m\u003csup\u003e6\u003c/sup\u003eA frequency across the transcriptome (~\u0026thinsp;1 m\u003csup\u003e6\u003c/sup\u003eA per 2000 nucleotides), which is targeted to a similar consensus motif RRACH (R\u0026thinsp;=\u0026thinsp;G or A; H\u0026thinsp;=\u0026thinsp;U, A, or C), and has a similar 3\u0026rsquo; mRNA bias around the stop codon [\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e, \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e]. In plants and mammals, pseudouridine has been found to localize primarily to the CDS and 5\u0026rsquo; UTR of mRNAs, prefers the first U of a triplet codon (e.g., UUC/UCU/UUU), and is overrepresented on transcripts targeted to the chloroplast or mitochondria [\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e]. Motif preferences for m5C can vary depending on the methyltransferase involved, but in humans are generally found on C/G rich regions of mRNAs [\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e]. While less is known about binding preferences of the other predominant RCMs, such as m\u003csup\u003e1\u003c/sup\u003eA and m\u003csup\u003e3\u003c/sup\u003eC, aberrant addition (or removal) has been observed as a hallmark for tumorigenesis [\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e, \u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e]. Importantly, loss of many of the writers, readers, and erasers for these RCMs is lethal or causes severe developmental defects in all tested systems [\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e, \u003cspan additionalcitationids=\"CR26 CR27\" citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eRCMs function through a variety of molecular mechanisms to regulate mRNA stability, structure, and splicing. RCMs induce changes in base-pairing properties and higher order structure, ostensibly allowing for increased flexibility in interactors, translational capacity, and structure associated stabilization [\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e]. There is a close association between splicing and m6A deposition [\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e, \u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e], with recent studies revealing that m\u003csup\u003e6\u003c/sup\u003eA is specifically excluded from splice junctions due to physical occlusion by the exon junction complex, a process that ultimately impacts cytoplasmic mRNA decay [\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e, \u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e]. Pseudouridine has also been implicated in mRNA splicing, stability, and translational efficiency [\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e, \u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e, \u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e], as well as the stability and maturation of lncRNAs such as the telomerase RNA [\u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e, \u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e]. Other modifications, such as m\u003csup\u003e1\u003c/sup\u003eA, m\u003csup\u003e3\u003c/sup\u003eC, and m\u003csup\u003e5\u003c/sup\u003eC are also believed to be important for mRNA stability, mostly due to phenotypes associated with loss of function mutants in their associated writer proteins [\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e, \u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e, \u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e, \u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e]. Thus, RCMs are both a ubiquitous and critical aspect of an RNA\u0026rsquo;s lifecycle, but specific mechanistic details for many of them are still lacking.\u003c/p\u003e \u003cp\u003eGiven the high frequency with which essential RNAs such as rRNAs, tRNAs and snoRNAs are modified, it has historically been difficult to use genetic approaches to monitor the RCM status of mRNAs and lncRNAs. As a result, most studies have relied either on transcriptome-wide antibody-based (m\u003csup\u003e6\u003c/sup\u003eA), or modification-specific chemical-genomic approaches (e.g., bisulfite sequencing for m\u003csup\u003e5\u003c/sup\u003eC; [\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e, \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e, \u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e, \u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e]. An alternative sequencing-based approach relies on the propensity for RCMs that occur at the interface between the canonical base-pairing edge (i.e., the Watson-Crick-Franklin, or WCF base-pairing edge) to be misidentified by reverse transcriptases during the cDNA synthesis step [\u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e, \u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e42\u003c/span\u003e]. This misidentification results in non-random misincorporations or deletions at modified residues. Multiple algorithms, including HAMR and ModTect [\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e, \u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e42\u003c/span\u003e], have been developed to infer modification status, and class, based on these \u0026ldquo;sequencing errors\u0026rdquo;. While this approach cannot detect RCMs outside of the WCF base-pairing edge, over 30 modification types can be identified, including m\u003csup\u003e1\u003c/sup\u003eA, pseudouridine, m\u003csup\u003e3\u003c/sup\u003eC, and m\u003csup\u003e5\u003c/sup\u003eC [\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e, \u003cspan additionalcitationids=\"CR43\" citationid=\"CR42\" class=\"CitationRef\"\u003e42\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e]. Importantly, where there is overlap (such as for m\u003csup\u003e5\u003c/sup\u003eC, m\u003csup\u003e3\u003c/sup\u003eC, and m\u003csup\u003e1\u003c/sup\u003eA), the antibody, chemical-genomic, and sequencing error-based approaches are largely in agreement [\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e, \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e]. Inferring modification status via HAMR or ModTect allows both for the repurposing of existing RNA-seq data, of which there are petabases available in NCBI\u0026rsquo;s SRA [\u003cspan citationid=\"CR45\" class=\"CitationRef\"\u003e45\u003c/span\u003e], and the side-by-side calculation of modification and transcript abundance.\u003c/p\u003e \u003cp\u003eHere we utilized both HAMR and ModTect to analyze new and publicly available RNA-seq in Arabidopsis and three agronomically important and closely related grasses: \u003cem\u003eSorghum bicolor\u003c/em\u003e, \u003cem\u003eZea mays\u003c/em\u003e, and \u003cem\u003eSetaria italica\u003c/em\u003e. We used these diverse stress, tissue-atlas, and developmental datasets in an attempt to better understand what some of these less characterized RCMs might be doing in plants and how they might be conserved, both functionally and in terms of their target genes. We present an in-depth comparative assessment of RCM dynamics across these plant species and uncover new insights into RNA splicing, RNA stability, and plant responses to stress. We demonstrate a level of conservation of target genes previously unseen for RCMs, and reveal that RCMs may participate in population-level variation in stress responses. Mechanistically, we identify a link between splicing and RCMs that is present in all examined systems. Finally, through the analysis of RNA decay pathways in Arabidopsis, we establish that RCMs are associated with unusually stable mRNAs, an aspect that may facilitate their continued translation, as we observe a link between modification state and protein abundance.\u003c/p\u003e"},{"header":"Results","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003eIdentification of Post-transcriptional RNA modifications in three Poaceae\u003c/h2\u003e \u003cp\u003eTo determine the degree to which the epitranscriptome is conserved at a gene level, we performed poly-A enriched RNA-sequencing on soil grown seedlings of \u003cem\u003eZea mays\u003c/em\u003e, \u003cem\u003eSorghum bicolor\u003c/em\u003e, and \u003cem\u003eSetaria italica\u003c/em\u003e (Maize, Sorghum, and Setaria, respectively, going forward) two weeks after germination. Paired-end, 150 base-pair RNA-sequencing was performed on aboveground tissue for two biological replicates (~\u0026thinsp;20\u0026nbsp;million reads per replicate; Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e1\u003c/span\u003eA). Following read mapping, modified sites were inferred using HAMR and ModTect (see \u003cb\u003eMethods\u003c/b\u003e). Both of these algorithms take advantage of the propensity for certain reverse transcriptases to misinterpret ribonucleotides that are modified along the canonical base-pairing edge, and as a result, incorporate non-reference nucleotides (i.e., SNPs) in the synthesized cDNA. This misincorporation is non-random, and both HAMR and ModTect use the resulting population of SNPs to infer modification class. RNA-seq data were fed into both algorithms and sites that were predicted by both algorithms, and both replicates were retained for subsequent analyses as high confidence Post-transcriptional RNA Modifications (RCMs). From this approach, 5,434-7,020 unique RCMs were identified in 1,944-2,542 transcripts in the three species (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e1\u003c/span\u003eB, C, \u003cb\u003eAdditional File 1\u003c/b\u003e). These modifications represent the seven major classes detected by HAMR, with the m1A|m1I|ms2i6A class being the most common in all three species (approximately 33% of all identified RCMs; \u003cb\u003eFig. \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003eA\u003c/b\u003e). Additionally, we find that mRNAs marked by RCMs are present at a substantially higher abundance relative to non-modified mRNAs (modified median TPM\u0026thinsp;=\u0026thinsp;78.5, n\u0026thinsp;=\u0026thinsp;2521 mRNAs, not modified median TPM\u0026thinsp;=\u0026thinsp;10.2, n\u0026thinsp;=\u0026thinsp;16,537, p\u0026thinsp;\u0026lt;\u0026thinsp;2.2e-16, \u003cb\u003eFig. \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003eE\u003c/b\u003e). Enriched gene ontology terms of modified RNAs in each Poaceae demonstrates significant over-representation in photosynthesis and cytoplasmic translation pathways (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e1\u003c/span\u003eD). These data suggest that RCMs are targeted to conserved cohorts of photosynthesis and translation-associated RNAs in these grass seedlings.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eWhile HAMR and ModTect are able to account for the non-random nature of plastid RNA editing, there is a possibility that the identified RCMs might be enriched on plastid-localized transcripts and thus be an artifact of plastid RNA editing. While modifications were only predicted on nuclear encoded transcripts, many organellar transcripts are encoded within the nucleus [\u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e46\u003c/span\u003e]. We assessed this possibility in Maize, using sequence homology of all annotated plastid proteins to the nuclear encoded proteome, to determine if organellar transcripts that have been integrated into the nuclear genome are enriched for potential artifactual RCMs. We find no over-representation of organellar to nuclear integrated RNAs marked with RCMs (24/2,249 maize RCMA marked RNAs are plastid RNA nuclear hits, p\u0026thinsp;=\u0026thinsp;.875 Hypergeometric test for over-representation, \u003cb\u003eFig. \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003eB\u003c/b\u003e). These findings suggest that RCMs are not artifacts of plastid RNA-editing, but reflect events on bona-fide nuclear encoded RNAs that are involved in photosynthetic pathways.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eTo determine if the conservation of RCM deposition goes beyond functional pathway and extends to the genes themselves, we next examined if RCMs are found on transcripts derived from orthologous genes using both sequence homology and synteny. Of the 1,912 possible modified orthologs (the smallest observed number of modified genes in Setaria), we observed 1,074 modified orthologs in one or both of the other species, and 474 orthologs that were modified in all three. This is significantly more orthologs than expected by chance, accounting for the number of annotated and expressed orthologs (p\u0026thinsp;\u0026lt;\u0026thinsp;2.2e-16 multi-set hypergeometric test, \u003cb\u003eFig. \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003eC\u003c/b\u003e). We then assessed whether orthologous mRNAs are marked by a similar number of RCMs across species, allowing us to understand whether the RCM density of the targeted RNA is a conserved feature. Indeed, for all three combinations of comparisons between two sampled species, there is a significant positive relationship between the number of RCMs deposited on orthologous mRNAs (Sorghum:Maize r\u0026thinsp;=\u0026thinsp;.39, Sorghum:Setaria r\u0026thinsp;=\u0026thinsp;.43, Setaria:Maize: r\u0026thinsp;=\u0026thinsp;.32, p\u0026thinsp;\u0026lt;\u0026thinsp;2.2e-16, \u003cb\u003eFig. \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003eD\u003c/b\u003e). These findings further demonstrate that the targeting and density of RCM deposition are conserved features in the sampled Grasses.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec4\" class=\"Section2\"\u003e \u003ch2\u003eRCMs are dynamically deposited on mRNAs based on tissue and abiotic stress\u003c/h2\u003e \u003cp\u003eOur previous analyses were limited to a single developmental time point across three species. Previous reports [\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e, \u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e, \u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e47\u003c/span\u003e] suggest that RCMs may play important roles in post-transcriptional RNA regulation and thus would be dynamically deposited across development or environmental changes. Therefore, we chose to examine RCMs across diverse tissues and environmental contexts using RNA-seq from publicly available datasets [\u003cspan additionalcitationids=\"CR49\" citationid=\"CR48\" class=\"CitationRef\"\u003e48\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e50\u003c/span\u003e]. We focused our efforts on Sorghum, first identifying RCMs in a large-scale tissue expression atlas by McCormick and co-authors [\u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e49\u003c/span\u003e] containing 137 sequencing samples across 48 tissues/stages/conditions (see \u003cb\u003eMethods\u003c/b\u003e). From these data we identified 266,710 modifications on 6,805 unique transcripts, representing 19.3% of the expressed (TPM\u0026thinsp;\u0026gt;\u0026thinsp;=\u0026thinsp;1) Sorghum transcriptome. To determine whether the same repertoire of transcripts are being targeted with RCMs as our seedling data from Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e1\u003c/span\u003e, or if our data were biased towards identifying a distinct subset of RNAs, we compared the modified RNAs in each dataset. We found 1,813 of the 2,542 modified Sorghum seedling mRNAs are also modified in the McCormick et al., tissue expression atlas. This overlap is substantially more than expected by chance (p\u0026thinsp;\u0026lt;\u0026thinsp;2.2e-16, \u003cb\u003eFig. \u003cspan refid=\"MOESM2\" class=\"InternalRef\"\u003eS2\u003c/span\u003eA\u003c/b\u003e) and likely reflects similarity in molecular processes within tissues examined.\u003c/p\u003e \u003cp\u003eWe next aimed to characterize the tissue specificity of RCMs in the McCormick Sorghum tissue expression atlas. Using a modified calculation for Tau, a value typically used to calculate tissue specificity based on RNA abundance [\u003cspan citationid=\"CR51\" class=\"CitationRef\"\u003e51\u003c/span\u003e], we observed that most transcripts were modified in a very context-specific manner (\u003cb\u003eFig. \u003cspan refid=\"MOESM2\" class=\"InternalRef\"\u003eS2\u003c/span\u003eB\u003c/b\u003e). The transcripts with the lowest Tau value, and therefore modified under the broadest context, were mostly associated with core cellular processes such as translation, whereas those transcripts modified in the most specific context were associated with mRNA maturation and abiotic stress responses (\u003cb\u003eFig. \u003cspan refid=\"MOESM2\" class=\"InternalRef\"\u003eS2\u003c/span\u003eC, D\u003c/b\u003e). We also clustered transcripts based on the tissue type in which they were modified, the number of modifications identified, and the average number of modifications per transcript (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e2\u003c/span\u003eA). In our analysis we observed a strong clustering based on tissue similarity, with seed and roots being notable exceptions. We find a strong bias for RCMs in this dataset towards root samples where both the number of RCMs and number of mRNA targets is substantially higher than other tissues (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e2\u003c/span\u003eA \u003cb\u003eand Fig. \u003cspan refid=\"MOESM2\" class=\"InternalRef\"\u003eS2\u003c/span\u003eE\u003c/b\u003e). In contrast, seeds displayed the fewest number of modified transcripts, but the average number of modifications per transcript was very similar to that seen in leaf and root tissues where the number of modified sites and transcripts were much higher (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e2\u003c/span\u003eA). We observed a large number of transcripts that are modified in all tissues (n\u0026thinsp;=\u0026thinsp;851; Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e2\u003c/span\u003eB and \u003cb\u003eFig. \u003cspan refid=\"MOESM3\" class=\"InternalRef\"\u003eS3\u003c/span\u003eA\u003c/b\u003e), highlighting the existence of a core repertoire of RCM-targeted mRNAs. As expected, based on their presence across a broad tissue and developmental context, these mRNAs are enriched for terms associated with cytoplasmic mRNA translation (\u003cb\u003eFig. \u003cspan refid=\"MOESM3\" class=\"InternalRef\"\u003eS3\u003c/span\u003eB\u003c/b\u003e). However, the majority of transcripts were more restricted in terms of the tissue or developmental context under which they were modified and were enriched for more specialized GO terms. For instance, root-specific modified mRNAs were enriched for rhizosphere-associated terms [\u003cspan citationid=\"CR52\" class=\"CitationRef\"\u003e52\u003c/span\u003e] such as oxidation management, generic methylation, and aromatic compound biosynthesis (\u003cb\u003eFig. \u003cspan refid=\"MOESM3\" class=\"InternalRef\"\u003eS3\u003c/span\u003eC\u003c/b\u003e). Leaf-specific modified mRNAs were enriched for photosynthetic terms, whereas seed-specific mRNAs were enriched for lipid storage, ABA response, and cold acclimation terms (\u003cb\u003eFig. \u003cspan refid=\"MOESM3\" class=\"InternalRef\"\u003eS3\u003c/span\u003eD, E\u003c/b\u003e). Importantly, the observed increase in context-specific modifications was not simply due to differences in the most abundant transcripts in each tissue. Indeed, we observed a low (although positive, r\u0026thinsp;=\u0026thinsp;.34) correlation between RNA abundance and modification levels (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e2\u003c/span\u003eC; \u003cb\u003etop\u003c/b\u003e) in the Sorghum tissue atlas. Thus, although there is a core set of modified Sorghum transcripts, most are targeted for RCMs in a context or developmentally-specific manner.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe conservation of RCMs on orthologous genes in our grass seedling data suggest that the developmental and tissue contexts under which these marks are deposited might also be conserved. To address this contextual conservation, we examined RCMs in two publicly available \u003cem\u003eArabidopsis thaliana\u003c/em\u003e tissue atlases [\u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e48\u003c/span\u003e, \u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e50\u003c/span\u003e]. Both atlases examined similar developmental stages, but did so under slightly different conditions (e.g., constant light and sterile MS media for Mergner et al. [\u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e48\u003c/span\u003e] vs cycling light and soil for Klepikova et al. [\u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e50\u003c/span\u003e]). Importantly, while the Klepikova tissue atlas is primarily used by the community to examine transcript abundance in the Arabidopsis EFP browser [\u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e53\u003c/span\u003e], the work by Mergner et al. performed a paired assessment of transcript and protein abundance, which we used to examine the relationship between modifications and translation (see below). Due to differences in experimental design, we analyzed each of these atlases separately. A full breakdown of examined tissues, as well as total number of modifications identified, can be found in \u003cb\u003eAdditional File 2\u003c/b\u003e. Like Sorghum, clustering of Arabidopsis tissues by RCM density placed similar tissues together (Mergner et al: \u003cb\u003eFig. \u003cspan refid=\"MOESM4\" class=\"InternalRef\"\u003eS4\u003c/span\u003eA, B\u003c/b\u003e, Klepikova et al: \u003cb\u003eFig. S5A, B\u003c/b\u003e). While Arabidopsis root tissues did not display a significantly elevated level of RCMs as in Sorghum, Arabidopsis seed transcripts had a reduced pool of very highly modified transcripts in both atlases. Similar to our observations in Sorghum, the seed-specific modified transcripts were enriched for lipid, nutrient, and ABA-response terms (\u003cb\u003eFig. S6A, B, C\u003c/b\u003e). Thus, as in Sorghum, the Arabidopsis epitranscriptome is diverse, highly context-specific, and appears to be associated with transcripts critical for cellular function.\u003c/p\u003e \u003cp\u003eGiven the similar patterns of RCM abundance in Sorghum and Arabidopsis, we next examined whether orthologous mRNAs between Sorghum and Arabidopsis, which last shared a common ancestor\u0026thinsp;~\u0026thinsp;150\u0026ndash;250\u0026nbsp;million years ago [\u003cspan citationid=\"CR54\" class=\"CitationRef\"\u003e54\u003c/span\u003e], are targeted by RCMs. While we identified fewer mRNA targets of RCMs in both Arabidopsis expression atlases relative to the Sorghum expression atlas (Mergner: 1,324, Klepikova: 2,495 vs 6,845 in Sorghum), the number of modified orthologous mRNAs (658 Mergner vs McCormick; 1,180 Klepikova vs McCormick) was significantly more than expected by chance for all possible mRNA combinations (\u003cb\u003eFig. S6D, E, F\u003c/b\u003e, p\u0026thinsp;=\u0026thinsp;.001 Hypergeometric test). We conclude that this class of RCMs target an ancient conserved repertoire of translation and photosynthetic related mRNAs.\u003c/p\u003e \u003cp\u003eGiven the developmental differences in RCM deposition in both Sorghum and Arabidopsis, as well as reports associating RNA modifications with plant stress responses [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e, \u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e, \u003cspan additionalcitationids=\"CR56 CR57 CR58\" citationid=\"CR55\" class=\"CitationRef\"\u003e55\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR59\" class=\"CitationRef\"\u003e59\u003c/span\u003e], we next determined if RCMs are associated with drought stress in Sorghum. We utilized a publicly available field-grown Sorghum transcriptome dataset that sampled well watered and water limited (i.e., drought treatment) Sorghum leaves and roots from drought tolerant and susceptible genotypes across weekly timepoints [\u003cspan citationid=\"CR60\" class=\"CitationRef\"\u003e60\u003c/span\u003e]. We focused on a post-flowering time point (week 10) where one genotype (BTx642) is considered drought tolerant and the other genotype (RTx430) is drought susceptible. Counterintuitively, we observed a shift towards a more negative correlation between RNA and RCM abundance during drought stress relative to the Sorghum tissue atlas (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e2\u003c/span\u003eC, bottom). A heatmap comparing transcript and RCM abundance between the two genotypes and treatments clustered into four groups (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e2\u003c/span\u003eD). One of these clusters (Cluster 3) showed similar increases in transcript abundance levels between the two genotypes under water limiting conditions but showed an increase in RCMs specifically in the drought tolerant genotype (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e2\u003c/span\u003eD). An examination of enriched GO terms revealed that Cluster 3 contained both heat shock proteins as well as water transport proteins (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e2\u003c/span\u003eE), suggesting that RCMs may be associated with the drought response in the tolerant genotype.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec5\" class=\"Section2\"\u003e \u003ch2\u003eRCMs accumulate near exon-exon junctions and are associated with splicing events\u003c/h2\u003e \u003cp\u003eThus far we have observed an association between RCMs and plant developmental and environmental transcriptional responses. To gain functional insight into RCMs, we analyzed their accumulation and distribution across mRNA bodies using our seedling RNA-seq data for all three species. RCMs were enriched across the CDS, with a bias towards the 3\u0026rsquo; CDS, and in the 3\u0026rsquo; UTR of mRNAs (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e3\u003c/span\u003eA). Like m6A [\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e], we also observed that RCMs are biased towards being deposited on abnormally long exons (median length of modified exons\u0026thinsp;=\u0026thinsp;361 nts, median unmodified\u0026thinsp;=\u0026thinsp;147 nts, median all exons\u0026thinsp;=\u0026thinsp;148 nts, Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e3\u003c/span\u003eB, p\u0026thinsp;\u003cb\u003e\u0026lt;\u0026thinsp;2.2e-16\u003c/b\u003e). Additionally, we observed a significantly higher proportion of expressed transcripts that are multi-exonic being targeted by RCMs, relative to mono-exonic transcripts (~\u0026thinsp;11\u0026ndash;14% vs\u0026thinsp;~\u0026thinsp;5\u0026ndash;8%, Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e3\u003c/span\u003eC, p\u0026thinsp;\u0026lt;\u0026thinsp;7.1e-15). This was also the case for long non-coding RNAs (lncRNAs) where 4.8% of mono-exonic lncRNAs are marked by RCMs (17/355) and 8.4% of multi-exonic lncRNAs are targeted by RCMs (59/704, p\u0026thinsp;=\u0026thinsp;.044; \u003cb\u003eFig. S7A\u003c/b\u003e). A closer examination of distinguishing CDS features uncovered a dramatic buildup of RCMs on both 5\u0026rsquo; and 3\u0026rsquo; edges of exon-exon junctions (EJs) relative to start and stop codons (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e3\u003c/span\u003eD). Thus, these data suggest that RCMs likely play a role in the regulation of RNA splicing on diverse transcript types.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThis observation of RCMs preferentially occurring at EJs was initially made by Vandivier et al [\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e] on degrading mRNAs, whereas degrading mRNAs likely make up a small proportion of our dataset. Thus, these findings suggest a steady-state RCM enrichment at mRNA EJs. A transcriptome-wide 3\u0026rsquo; bias was observed for these EJ-enriched RCMs (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e3\u003c/span\u003eE\u003cb\u003e)\u003c/b\u003e. This terminal EJ enrichment was not due to 3\u0026rsquo; sequencing bias that is often observed in poly-A enriched transcriptome datasets (\u003cb\u003eFig. S7B, C, D\u003c/b\u003e). Interestingly, genes that express multiple isoforms are more likely to be modified than single isoform transcripts with similar numbers of exons (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e3\u003c/span\u003eF, p\u0026thinsp;\u0026lt;\u0026thinsp;2.2e-16). However, we observe no significant difference in the buildup of modified sites at alternatively spliced junctions vs canonical splice sites (\u003cb\u003eFig. S8\u003c/b\u003e, see \u003cb\u003eMethods\u003c/b\u003e). These data suggest that the increase in modification frequency at genes with more isoforms is likely due to the presence of more exon junctions. Thus, RCMs appear to be positively associated with splicing in plants.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec6\" class=\"Section2\"\u003e \u003ch2\u003eRCMs are positively associated with stable and translating mRNAs\u003c/h2\u003e \u003cp\u003eBecause Mergner et al [\u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e48\u003c/span\u003e] measured RNA and protein abundances from matched Arabidopsis samples they were able to correlate RNA:protein abundances across their samples. Therefore, we examined whether mRNAs marked with RCMs displayed a higher or lower RNA:protein correlation. A difference in RNA:protein correlation could suggest a RCM function in RNA stability and/or translation efficiency. mRNAs that are not marked with RCMs across the Mergner et al atlas showed the lowest median RNA:protein correlation (n\u0026thinsp;=\u0026thinsp;3,361, r\u0026thinsp;=\u0026thinsp;.68), while RCM marked mRNAs showed a significantly higher RNA:protein correlation (n\u0026thinsp;=\u0026thinsp;332, r\u0026thinsp;\u0026gt;\u0026thinsp;=\u0026thinsp;.758, p\u0026thinsp;\u0026lt;\u0026thinsp;.05, \u003cb\u003eFig. S9A\u003c/b\u003e). Due to this positive correlation between transcripts harboring base-pair disrupting modifications and their translation products, it is possible that these RCMs are positively influencing translation, either by reducing structural complexity or by stabilizing these transcripts.\u003c/p\u003e \u003cp\u003eTo test for a relationship between RCMs and mRNA decay, we first examined publicly available transcriptomic data from Arabidopsis lines deficient in cytoplasmic mRNA decay pathways [\u003cspan citationid=\"CR61\" class=\"CitationRef\"\u003e61\u003c/span\u003e] by Sorenson and co-authors. Cytoplasmic mRNAs usually decay through three pathways: decapping (5\u0026rsquo; -\u0026gt; 3\u0026rsquo;), the RNA exosome (3\u0026rsquo; -\u0026gt; 5\u0026rsquo;), or an exosome independent 3\u0026rsquo; -\u0026gt; 5\u0026rsquo; decay pathway. Decapping occurs through a multi-subunit complex that is scaffolded by VARICOSE in plants and metazoans (VCS,[\u003cspan citationid=\"CR62\" class=\"CitationRef\"\u003e62\u003c/span\u003e]). Meanwhile, the exosome independent 3\u0026rsquo; -\u0026gt; 5\u0026rsquo; decay pathways occurs through SUPPRESSOR OF VARICOSE (SOV), which contributes to the decay of decapped RNAs and is also known as DIS3L2 in other Eukaryotes [\u003cspan additionalcitationids=\"CR64\" citationid=\"CR63\" class=\"CitationRef\"\u003e63\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR65\" class=\"CitationRef\"\u003e65\u003c/span\u003e]. Sorenson et al took an RNA-seq approach to examine mRNA decay dynamics in four Arabidopsis genotypes that vary in their cytoplasmic mRNA decay factors (wild-type, \u003cem\u003esov\u003c/em\u003e knockout, \u003cem\u003evcs\u003c/em\u003e knockout, and \u003cem\u003esov/vcs\u003c/em\u003e double knockout). If RCMs are primarily marking mRNAs for degradation (as initially suggested by Vandivier et al [\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e]), then a buildup should be observed after transcriptional arrest in genetic backgrounds deficient for mRNA decay.\u003c/p\u003e \u003cp\u003eDespite the well-known relationship between RCMs and housekeeping RNAs (snoRNAs and tRNAs), the majority of modified transcripts at the beginning and end of the time series were mRNAs (Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e4\u003c/span\u003eA). As expected from arresting transcription, each genotype, with the exception of the sov/vcs double mutant, displayed a\u0026thinsp;~\u0026thinsp;25\u0026ndash;50% decrease in the total number of observable protein-coding transcripts eight hours post-treatment (see \u003cb\u003eMethods\u003c/b\u003e; \u003cb\u003eFig. S9B\u003c/b\u003e). The number of observed RCMs increased in all genotypes after arresting transcription, as did the proportion of modified mRNAs (Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e4\u003c/span\u003eB \u003cb\u003eand Fig. S9C\u003c/b\u003e), suggesting two possibilities: \u003cb\u003e1\u003c/b\u003e) that modification abundance increases with transcript age, or \u003cb\u003e2\u003c/b\u003e) that non-modified transcripts are degraded more quickly leading to a higher proportion of transcripts detectable over time containing RCMs.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eTo more directly test the association of RCMs with RNA degradation, we analyzed whether the pool of transcripts that are modified at time point 0 were still detectable and modified at subsequent time points. Surprisingly, mRNAs modified at time point 0 were nearly all detectable 8 hours after transcription arrest while mRNAs that were not modified at time point 0 declined by more than 40% (Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e4\u003c/span\u003eC). These results indicate that RCMs mark mRNAs which degrade slower than the entire mRNA pool. Indeed, mRNAs that are not targeted by RCMs show a significantly larger magnitude of TPM decrease relative to mRNAs marked with RCMs at time point 0 (Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e4\u003c/span\u003eD). These results strongly suggest that RCM marked mRNAs degrade at a slower rate relative to mRNAs that are not marked by RCMs.\u003c/p\u003e \u003cp\u003eSorenson et al also modeled the initial decay rate of all mRNAs (alpha decay rate) based on read abundance after transcription arrest. Based on these values, mRNAs that are modified at time point 0 in wild-type show a significantly lower (slower) alpha decay rate relative to all other mRNAs (Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e4\u003c/span\u003eE). Sorenson et al used the decay rates across genotypes to assign mRNAs to genotype-dependent RNA degradation pathways (Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e4\u003c/span\u003eF x-axis groups 1\u0026ndash;14, see Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e1\u003c/span\u003eC from Sorenson et al). For mRNAs that could not be assigned to the VARICOSE or SOV degradation pathways, Sorenson et al hypothesized that they were likely targets of the RNA exosome (x-axis group 15; Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e4\u003c/span\u003eF). Interestingly, modified mRNAs are significantly biased towards being assigned to this group and thus are likely targets of the RNA exosome (Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e4\u003c/span\u003eF p\u0026thinsp;\u0026lt;\u0026thinsp;2.2e-16, Pearson's Chi-squared test). Thus, based on these data, RCMs appear to be marks of Pol-II transcript stability, rather than marks of degradation.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec7\" class=\"Section2\"\u003e \u003ch2\u003eRCMs are not associated with Nonsense-Mediated mRNA Decay\u003c/h2\u003e \u003cp\u003eTo further investigate whether RCM marked mRNAs have an association with RNA degradation pathways, we next turned our focus to the Nonsense Mediated mRNA Decay (NMD) pathway. NMD is responsible for degrading aberrant mRNAs that have a premature termination codon. This is often recognized as a termination codon upstream of the exon junction complex (EJC, [\u003cspan citationid=\"CR66\" class=\"CitationRef\"\u003e66\u003c/span\u003e]). Given the accumulation of RCMs at exon-exon junctions and the recent report of RNA degradation intermediates accumulating near exon-exon junctions [\u003cspan citationid=\"CR67\" class=\"CitationRef\"\u003e67\u003c/span\u003e], we tested whether modified mRNAs were likely targets of NMD. SUPPRESSOR OF MORPHOLOGICAL DEFECTS ON GENITALIA7 (SMG7) is a critical component of early NMD signaling in most Eukaryotes [\u003cspan citationid=\"CR68\" class=\"CitationRef\"\u003e68\u003c/span\u003e]. Gloggnitzer and co-authors [\u003cspan citationid=\"CR69\" class=\"CitationRef\"\u003e69\u003c/span\u003e] performed RNA-seq in Arabidopsis with a loss-of-function \u003cem\u003esmg7\u003c/em\u003e mutant in a genetic background avoiding the strong autoimmune response of knocking out NMD (\u003cem\u003epad4;\u003c/em\u003e [\u003cspan citationid=\"CR69\" class=\"CitationRef\"\u003e69\u003c/span\u003e]). We re-processed the RNA-seq data generated by Gloggnitzer et al. and identified RCMs from their data. The up-regulated mRNAs in the \u003cem\u003esmg7\u003c/em\u003e genotype represent both direct and indirect targets of NMD silencing. There is strong statistical evidence that NMD targets are actually under-represented (that is, depleted) from mRNAs containing RCMs (5,275 mRNAs containing RCMs, 656 up-regulated mRNAs in \u003cem\u003esmg7\u003c/em\u003e, overlap\u0026thinsp;=\u0026thinsp;103, p\u0026thinsp;\u0026lt;\u0026thinsp;2.2e-16, \u003cb\u003eFig. S10A\u003c/b\u003e). Furthermore, we identify no significant differences RCM distribution at exon-exon junctions between \u003cem\u003epad4\u003c/em\u003e and \u003cem\u003esmg7-pad4\u003c/em\u003e RCMs (\u003cb\u003eFig. S10B\u003c/b\u003e, p\u0026thinsp;=\u0026thinsp;.986), or between \u003cem\u003esmg7\u003c/em\u003e RCMs in mRNAs up-regulated in \u003cem\u003esmg7\u003c/em\u003e vs those not differentially abundant in \u003cem\u003esmg7\u003c/em\u003e (\u003cb\u003eFig. S10C\u003c/b\u003e, p\u0026thinsp;=\u0026thinsp;.144). In agreement with the \u003cem\u003esmg7\u003c/em\u003e results which examined a single tissue, there is no significant overlap between predicted NMD targets and the RCMs identified in the Mergner et al. tissue atlas (\u003cb\u003eFig. S10D\u003c/b\u003e, p\u0026thinsp;=\u0026thinsp;.304). Collectively, these results suggest RCMs are not associated with the NMD pathway.\u003c/p\u003e \u003c/div\u003e"},{"header":"Discussion","content":"\u003cp\u003eIn this study we used a bioinformatic workflow consisting of HAMR and ModTect [\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e, \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e] to predict RCMs in diverse species across the flowering plant lineage in order to clarify RCM distribution and putative function on mRNAs. Of note, because these algorithms rely on a certain minimum depth of expression in order to statistically call a transcript as being modified, and we tended towards a conservative definition of when to call an RCM (present at the same site in bio-replicates based on both algorithms), making conclusions about low abundance transcripts can be problematic. Despite these caveats, we observed a substantial number of transcripts whose modification status changes along developmental or environmental gradients. In perhaps the most extensive comparative analysis to date, we demonstrated that the RCMs detected by HAMR and ModTect (e.g., those that fall along the WCF base-pairing edge) are found on a large, yet contextually discrete set of transcripts. From these diverse analyses we believe that a number of conclusions can be made that provide lessons as to the function of RCMs in eukaryotes.\u003c/p\u003e \u003cp\u003eRCMs are not found on all expressed transcripts, nor are they always on the most abundant transcripts, suggesting a contextual specificity. Indeed, there appear to be two classes of transcripts that receive RCMs in plants. There are a core set of transcripts associated with photosynthesis and cellular metabolism that are modified in each tested tissue or developmental context. These transcripts tend to be abundantly modified and are the targets of all of the major base-pair disrupting modifications. Due to their critical functions, these genes are conserved, as is their modification status, across the plant lineage. The other class of modified transcripts receive RCMs in a more context-specific manner. These transcripts may be expressed in multiple tissues but are generally targeted by modifications in a more limited subset of conditions. Functional enrichment suggests that these transcripts are modified in response to environmental or developmental cues.\u003c/p\u003e \u003cp\u003eWe observed similar patterns of RCM deposition on all transcripts, with a bias towards longer exons and transcripts containing multiple exons. In contrast to m\u003csup\u003e6\u003c/sup\u003eA, these RCMs are enriched at exon splice junctions. Interestingly, while we see an increase in RCM abundance on transcripts with multiple isoforms, we see no association between RCM status and alternative splicing or intron retention, suggesting that modifications are more of a general marker of splicing rather than a driver of isoform selection. We would propose that these modifications are deposited on the base-pairing edge of the newly transcribed mRNA to reduce structure at exon junctions in order to facilitate splicing, whereas m6A is occluded from these sites due to its interference with binding of the splicing machinery [\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e, \u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e]. This model would further suggest that, like m\u003csup\u003e6\u003c/sup\u003eA, these other RCMs are deposited co-transcriptionally and thus their writers should appear in complexes with transcriptional and epigenetic machinery. In support of this hypothesis, the m\u003csup\u003e5\u003c/sup\u003eC methyltransferases NSUN3 and DNMT2 are known to interact with hnRNPK for Pol-II recruitment and modulating chromatin state in response to stress [\u003cspan citationid=\"CR70\" class=\"CitationRef\"\u003e70\u003c/span\u003e].\u003c/p\u003e \u003cp\u003ePrevious work in Arabidopsis comparing RCM abundance in degrading and total mRNA populations found evidence for RCMs as a marker for mRNA degradation [\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e] whereas others have suggested that loss of RCMs leads to unstable mRNAs [\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e]. In an attempt to resolve these conflicting models, we turned to an elegant time-series RNA-seq experiment in Arabidopsis examining the impacts of mutants in the mRNA decay pathway on RNA turnover (Sorenson et al., 2018). We used these data to assess the relationship between degradation and RCM status. To our surprise, we found that RCMs are more likely to mark extremely stable mRNAs, rather than those that are rapidly degrading. Our finding that most of the RCMs fall on transcripts associated with core cellular processes fits with the notion that these transcripts would also be more stable, a finding reported by Sorenson et al. and others [\u003cspan citationid=\"CR71\" class=\"CitationRef\"\u003e71\u003c/span\u003e]. Interestingly, we also observe a positive relationship between RCM abundance and protein abundance in the Arabidopsis tissue atlas, further suggesting that RCMs, as a whole, have a positive influence on mRNA stability. Importantly, we cannot definitively say that degrading RNAs are not marked, just that the RCMs we can monitor are enriched on mRNAs with long half-lives. A key difference between our work and that of Vandivier et al [\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e] is in the depth of sequencing, both for the grass seedling experiments that we performed and for the additional datasets we reprocessed, and so further examination of the connection between these base-pair disrupting RCMs and RNA degradation is warranted.\u003c/p\u003e"},{"header":"Conclusions","content":"\u003cp\u003eIn this work, we used a comparative transcriptomic approach to better understand the conservation and function of a poorly studied class of RNA modifications. Grass seedlings contained similar cohorts of modified transcripts, a finding that spurred us to look in other datasets that incorporated additional stress or developmental timepoints in both Sorghum and Arabidopsis. Again, we observed a conserved, core set of modified transcripts, but these expanded datapoints allowed us to uncover context specificity to the epitranscriptome, suggesting that some, but not all, transcripts are modified at key steps in their functional lifecycle. Attempts to better understand function of these base pair disrupting modifications revealed that they are likely important for splicing, translation, and intriguingly, for RNA stability. Similarities across the Angiosperm lineage, along with supporting literature in metazoans, suggests that these modifications, like m\u003csup\u003e6\u003c/sup\u003eA, are a deeply conserved aspect of RNA biology for which we still know very little.\u003c/p\u003e"},{"header":"Methods","content":"\u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003ePlant growth:\u003c/h2\u003e \u003cp\u003eSeed from \u003cem\u003eZea mays\u003c/em\u003e (acc. B73), \u003cem\u003eSorghum bicolor\u003c/em\u003e (acc. BTx642), and \u003cem\u003eSetaria italica\u003c/em\u003e (acc. Yugu1) were sown onto damp soil approximately 2 cm below the surface, and stratified for one week in the dark at 4\u0026deg;C. Seeds were then transferred to a growth cabinet with lights configured to a long-day cycle (16 hours light, 8 hours dark). All species\u0026rsquo; seeds germinated within a day of each other and were allowed to grow for two weeks once germinated. Seedling shoot tissue was collected and instantly frozen with liquid nitrogen.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec12\" class=\"Section2\"\u003e \u003ch2\u003eRNA-sequencing library preparation:\u003c/h2\u003e \u003cp\u003eRNA from two biological replicates in each species was isolated using TriZol per the manufacturer\u0026rsquo;s instructions (ThermoFisher #15596026). Paired-end and strand specific sequencing libraries were generated for each species using the YourSeq FT Strand-Specific mRNA library prep kit (Active Motif #23001). Libraries were sequenced on a Novogene Novaseq platform.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec13\" class=\"Section2\"\u003e \u003ch2\u003eRead processing and modification identification:\u003c/h2\u003e \u003cp\u003eRaw sequencing reads were trimmed for adapters and low-quality bases using fastp (v 0.23.4, [\u003cspan citationid=\"CR72\" class=\"CitationRef\"\u003e72\u003c/span\u003e]) with default settings for paired end reads. Trimmed reads were then used to call HAMR modifications using the PAMLINC workflow (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://github.com/chosenobih/pamlinc\u003c/span\u003e\u003cspan address=\"https://github.com/chosenobih/pamlinc\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e). Briefly, this workflow maps reads to each species\u0026rsquo; reference genome (obtained from Ensembl Plants in January 2022) using BowTie2 (v 2.2.5, [\u003cspan citationid=\"CR73\" class=\"CitationRef\"\u003e73\u003c/span\u003e]) and retains unique alignments that don't overlap exon-junctions or deletions. The HAMR statistical model is then run on these filtered alignment files (BAM files) to predict RCMs.\u003c/p\u003e \u003cp\u003eIn addition, a similar algorithm, ModTect (v 1.7.5, [\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e]), was used to call RCMs. For ModTect prediction, trimmed reads were mapped to the same Ensembl Plants genomes and associated annotation files as above using STAR (v 2.7.10b, [\u003cspan citationid=\"CR74\" class=\"CitationRef\"\u003e74\u003c/span\u003e]). The following non-default STAR arguments were used for mapping: --outFilterIntronMotifs RemoveNoncanonical, --alignIntronMax 10000, --outSAMstrandField intronMotif. Raw BAM outputs were then used as input for ModTect with the following arguments: \u0026ndash;minBaseQual 30, --readlength 150, and --regionFile genes.bed which is a four-column file of all gene regions in the annotation file.\u003c/p\u003e \u003cp\u003eFor sequencing data generated in this study, RCM sites were retained that overlapped both ModTect and HAMR predictions among both sequencing replicates (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e1\u003c/span\u003eB). For public sequencing data, RCMs were identified using only ModTect and RCMs were retained that were common among three or more experiments (i.e., three or more distinct SRAs).\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec14\" class=\"Section2\"\u003e \u003ch2\u003eRNA abundance measurement:\u003c/h2\u003e \u003cp\u003eTrimmed fastq files were used to measure RNA abundance using Salmon (v 1.10.0, [\u003cspan citationid=\"CR75\" class=\"CitationRef\"\u003e75\u003c/span\u003e]) in \u0026ldquo;quant\u0026rdquo; mode against a decoy aware transcriptome index. The tximport (v 1.28, [\u003cspan citationid=\"CR76\" class=\"CitationRef\"\u003e76\u003c/span\u003e]) R package was then used to import Salmon quant files and generate gene or transcript level transcript per million values, specifying the following tximport option: countsFromAbundance = \"lengthScaledTPM\". RNAs used for downstream RCM and expression analyses were those with at least one transcript per million in one experiment. For the reanalysis of Sorenson et al 2018, we used RPKM (reads per kilobase per million mapped) values that the authors provide in their supplemental data which accounts for stable transcripts containing an increasing proportion of sequencing reads.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec15\" class=\"Section2\"\u003e \u003ch2\u003eAlternative splicing:\u003c/h2\u003e \u003cp\u003eTwo approaches were used to infer alternative splicing. First, genes that express multiple isoforms can be inferred from Salmon [\u003cspan citationid=\"CR75\" class=\"CitationRef\"\u003e75\u003c/span\u003e] using the tximport [\u003cspan citationid=\"CR76\" class=\"CitationRef\"\u003e76\u003c/span\u003e] R package as described above. Specifying the tximport \u0026ldquo;txOut\u0026rdquo; argument gives isoform level quantifications. Isoforms expressed above one TPM were retained. Genes expressing more than one isoform were inferred to be undergoing alternative splicing. Second, we examined statistically significant changes in splicing patterns between experimental conditions using EventPointer [\u003cspan citationid=\"CR77\" class=\"CitationRef\"\u003e77\u003c/span\u003e].\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec16\" class=\"Section2\"\u003e \u003ch2\u003eGO term enrichment:\u003c/h2\u003e \u003cp\u003eGO terms were downloaded for each species from Ensembl Plants. GO term enrichment was performed using the clusterProfiler (v 4.8.2, [\u003cspan citationid=\"CR78\" class=\"CitationRef\"\u003e78\u003c/span\u003e]) R package, specifically the \u0026ldquo;enricher\u0026rdquo; function was used, specifying: (1) the input gene set, (2) a p-value cutoff of .05, (3) Benjamini and Hochberg multiple testing correction, (4) a background gene set (usually, all expressed genes (TPM\u0026thinsp;\u0026gt;\u0026thinsp;=\u0026thinsp;1)), (5) a minimum and maximum gene-set size of 10 and 500, respectively, and (6) a q-value cutoff of .05. Fold enrichment was calculated using the following functions: parse_ratio(GeneRatio) / parse_ratio(BgRatio).\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec17\" class=\"Section2\"\u003e \u003ch2\u003eData analyses:\u003c/h2\u003e \u003cp\u003eAll analyses and statistical testing was performed using the R programming language (v 4.2.0, [\u003cspan citationid=\"CR79\" class=\"CitationRef\"\u003e79\u003c/span\u003e]). Tests for over or underrepresentation were performed using the phyper function (i.e., a hypergeometric test) in R. Correlations between RNA abundance and RCMs were performed by collecting the RNA abundance values and RCM density values (number of RCMs per transcript) from each sequencing experiment of interest. Pearson Correlation Coefficients were then calculated from the RNA abundance and RCM density using the \u0026ldquo;cor\u0026rdquo; function in R. To measure statistical significance of mean-separation between groups, the Student\u0026rsquo;s t-test was used for two group comparisons, while one-way Anova was used for more-than-two group comparisons.\u003c/p\u003e \u003cp\u003eAll visualizations were generated in R from various packages including: ggplot2 (v 3.4.4, [\u003cspan citationid=\"CR80\" class=\"CitationRef\"\u003e80\u003c/span\u003e]), ggvenn (v 0.1.10, [\u003cspan citationid=\"CR81\" class=\"CitationRef\"\u003e81\u003c/span\u003e]), enrichPlot (v 1.20.0, [\u003cspan citationid=\"CR82\" class=\"CitationRef\"\u003e82\u003c/span\u003e]), complexUpset (v 1.3.3, [\u003cspan citationid=\"CR83\" class=\"CitationRef\"\u003e83\u003c/span\u003e]), complexHeatmap (v 2.16., [\u003cspan citationid=\"CR84\" class=\"CitationRef\"\u003e84\u003c/span\u003e]), and the ggtree package (v 3.8.2, [\u003cspan citationid=\"CR85\" class=\"CitationRef\"\u003e85\u003c/span\u003e])\u003c/p\u003e \u003c/div\u003e"},{"header":"Declarations","content":"\u003ch2\u003eEthics approval and consent to participate:\u003c/h2\u003e\n\u003cp\u003eNot applicable\u003c/p\u003e\n\u003ch2\u003eConsent for publication.\u003c/h2\u003e\n\u003cp\u003eAll authors read and consent to this publication.\u003c/p\u003e\n\u003ch2\u003eAvailability of data and materials:\u003c/h2\u003e\n\u003cp\u003eThe accession numbers of RNA-sequencing data generated in this study can be found at https://www.ncbi.nlm.nih.gov/bioproject/PRJNA1116564. The scripts and intermediate files for analyses are available at: https://github.com/kylepalos/comparative-PRM-paper.\u0026nbsp;\u003c/p\u003e\n\u003ch2\u003eCompeting Interests:\u003c/h2\u003e\n\u003cp\u003eThe authors declare no competing interests.\u003c/p\u003e\n\u003ch2\u003eFunding:\u003c/h2\u003e\n\u003cp\u003eWe would like to acknowledge support from NSF IOS PGRP 2023310 (to ADLN, EHL, and BDG), NSF MCB 2120131 (to ADLN), and NSF-DBI-1743442 to EHL.\u003c/p\u003e\n\u003ch2\u003eAuthor Contributions:\u003c/h2\u003e\n\u003cp\u003eKP, ACND, EHL, BDG, and ADLN developed the project. ACDN grew plants, isolated RNA, and performed RNA-seq. KP analyzed data, EHL, BDG, and ADLN provided feedback on data analysis. KP and ADLN wrote the manuscript. All authors edited and approved the manuscript.\u0026nbsp;\u003c/p\u003e\n\u003ch2\u003eAcknowledgements:\u003c/h2\u003e\n\u003cp\u003eWe would like to thank Drs Duke Pauli (University of Arizona) and Susan Schroeder (Oklahoma University) for insightful comments during the development of the project. We would like to thank Drs. Rebecca Murphy (Centenary College of Louisiana) and Daryl Morishige for providing useful insight into the Sorghum tissue atlas. Finally, we would like to thank members of the Gregory and Nelson labs for constructive feedback.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eCappannini A, Ray A, Purta E, Mukherjee S, Boccaletto P, Moafinejad SN, et al. MODOMICS: a database of RNA modifications and related information. 2023 update. Nucleic Acids Res. 2024;52:D239\u0026ndash;44.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBaumer ZT, Erber L, Jolley E, Lawrence S, Lin C, Murakami S, et al. Defining the commonalities between post-transcriptional and post-translational modification communities. Trends Biochem Sci. 2024;49:185\u0026ndash;8.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSaletore Y, Meyer K, Korlach J, Vilfan ID, Jaffrey S, Mason CE. The birth of the Epitranscriptome: deciphering the function of RNA modifications. Genome Biol. 2012;13:175.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGilbert WV, Nachtergaele S. mRNA Regulation by RNA Modifications. Annu Rev Biochem. 2023;92:175\u0026ndash;98.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLewis CJT, Pan T, Kalsotra A. RNA modifications and structures cooperate to guide RNA-protein interactions. Nat Rev Mol Cell Biol. 2017;18:202\u0026ndash;10.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRoy B. Effects of mRNA Modifications on Translation: An Overview. In: McMahon M, editor. RNA Modifications: Methods and Protocols. New York, NY: Springer US; 2021. pp. 327\u0026ndash;56.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePrall W, Ganguly DR, Gregory BD. The covalent nucleotide modifications within plant mRNAs: What we know, how we find them, and what should be done in the future. Plant Cell. 2023;35:1801\u0026ndash;16.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSharma B, Prall W, Bhatia G, Gregory BD. The Diversity and Functions of Plant RNA Modifications: What We Know and Where We Go from Here. Annu Rev Plant Biol. 2023;74:53\u0026ndash;85.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKramer MC, Janssen KA, Palos K, Nelson ADL, Vandivier LE, Garcia BA, et al. N6-methyladenosine and RNA secondary structure affect transcript stability and protein abundance during systemic salt stress in Arabidopsis. Plant Direct. 2020;4:e00239.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSun H, Li K, Liu C, Yi C. Regulation and functions of non-m6A mRNA modifications. Nat Rev Mol Cell Biol. 2023;24:714\u0026ndash;31.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLuo G-Z, MacQueen A, Zheng G, Duan H, Dore LC, Lu Z, et al. Unique features of the m6A methylome in Arabidopsis thaliana. Nat Commun. 2014;5:5630.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePerry RP, Kelley DE, Friderici K, Rottman F. The methylated constituents of L cell messenger RNA: evidence for an unusual cluster at the 5\u0026rsquo; terminus. Cell. 1975;4:387\u0026ndash;94.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWiener D, Schwartz S. The epitranscriptome beyond m6A. Nat Rev Genet. 2021;22:119\u0026ndash;31.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eVandivier LE, Campos R, Kuksa PP, Silverman IM, Wang L-S, Gregory BD. Chemical Modifications Mark Alternatively Spliced and Uncapped Messenger RNAs in Arabidopsis. Plant Cell. 2015;27:3024\u0026ndash;37.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTan K-T, Ding L-W, Wu C-S, Tenen DG, Yang H. Repurposing RNA sequencing for discovery of RNA modifications in clinical cohorts. Sci Adv [Internet]. 2021;7. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://dx.doi.org/10.1126/sciadv.abd2605\u003c/span\u003e\u003cspan address=\"10.1126/sciadv.abd2605\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKannan K, Nelson ADL, Shippen DE. Dyskerin is a component of the Arabidopsis telomerase RNP required for telomere maintenance. Mol Cell Biol. 2008;28:2332\u0026ndash;41.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYang Y, Hsu PJ, Chen Y-S, Yang Y-G. Dynamic transcriptomic m6A decoration: writers, erasers, readers and functions in RNA metabolism. Cell Res. 2018;28:616\u0026ndash;24.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFlamand MN, Tegowski M, Meyer KD. The Proteins of mRNA Modification: Writers, Readers, and Erasers. Annu Rev Biochem. 2023;92:145\u0026ndash;73.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWilkinson E, Cui Y-H, He Y-Y. Roles of RNA Modifications in Diverse Cellular Functions. Front Cell Dev Biol. 2022;10:828683.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDominissini D, Moshitch-Moshkovitz S, Schwartz S, Salmon-Divon M, Ungar L, Osenberg S, et al. Topology of the human and mouse m6A RNA methylomes revealed by m6A-seq. Nature. 2012;485:201\u0026ndash;6.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSun L, Xu Y, Bai S, Bai X, Zhu H, Dong H, et al. Transcriptome-wide analysis of pseudouridylation of mRNA and non-coding RNAs in Arabidopsis. J Exp Bot. 2019;70:5089\u0026ndash;600.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLi M, Tao Z, Zhao Y, Li L, Zheng J, Li Z, et al. 5-methylcytosine RNA methyltransferases and their potential roles in cancer. J Transl Med. 2022;20:214.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChen X, Li A, Sun B-F, Yang Y, Han Y-N, Yuan X, et al. 5-methylcytosine promotes pathogenesis of bladder cancer through stabilizing mRNAs. Nat Cell Biol. 2019;21:978\u0026ndash;90.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBarbieri I, Kouzarides T. Role of RNA modifications in cancer. Nat Rev Cancer. 2020;20:303\u0026ndash;22.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhong S, Li H, Bodi Z, Button J, Vespa L, Herzog M, et al. MTA is an Arabidopsis messenger RNA adenosine methylase and interacts with a homolog of a sex-specific splicing factor. Plant Cell. 2008;20:1278\u0026ndash;88.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYang W, Meng J, Liu J, Ding B, Tan T, Wei Q, et al. The N1-Methyladenosine Methylome of Petunia mRNA. Plant Physiol. 2020;183:1710\u0026ndash;24.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCui X, Liang Z, Shen L, Zhang Q, Bao S, Geng Y, et al. 5-Methylcytosine RNA Methylation in Arabidopsis Thaliana. Mol Plant. 2017;10:1387\u0026ndash;99.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDavid R, Burgess A, Parker B, Li J, Pulsford K, Sibbritt T, et al. Transcriptome-Wide Mapping of RNA 5-Methylcytosine in Arabidopsis mRNAs and Noncoding RNAs. Plant Cell. 2017;29:445\u0026ndash;60.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDecatur WA, Fournier MJ. rRNA modifications and ribosome function. Trends Biochem Sci. 2002;27:344\u0026ndash;51.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhou KI, Shi H, Lyu R, Wylder AC, Matuszek Ż, Pan JN, et al. Regulation of Co-transcriptional Pre-mRNA Splicing by m6A through the Low-Complexity Protein hnRNPG. Mol Cell. 2019;76:70\u0026ndash;e819.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eUzonyi A, Dierks D, Nir R, Kwon OS, Toth U, Barbosa I, et al. Exclusion of m6A from splice-site proximal regions by the exon junction complex dictates m6A topologies and mRNA stability. Mol Cell. 2023;83:237\u0026ndash;e517.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYang X, Triboulet R, Liu Q, Sendinc E, Gregory RI. Exon junction complex shapes the m6A epitranscriptome. Nat Commun. 2022;13:7904.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKarik\u0026oacute; K, Muramatsu H, Welsh FA, Ludwig J, Kato H, Akira S, et al. Incorporation of pseudouridine into mRNA yields superior nonimmunogenic vector with increased translational capacity and biological stability. Mol Ther. 2008;16:1833\u0026ndash;40.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eEyler DE, Franco MK, Batool Z, Wu MZ, Dubuke ML, Dobosz-Bartoszek M, et al. Pseudouridinylation of mRNA coding sequences alters translation. Proc Natl Acad Sci U S A. 2019;116:23068\u0026ndash;74.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang Q, Kim N-K, Feigon J. Architecture of human telomerase RNA. Proc Natl Acad Sci U S A. 2011;108:20325\u0026ndash;32.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKim N-K, Theimer CA, Mitchell JR, Collins K, Feigon J. Effect of pseudouridylation on the structure and activity of the catalytically essential P6.1 hairpin in human telomerase RNA. Nucleic Acids Res. 2010;38:6746\u0026ndash;56.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAnderson BR, Muramatsu H, Jha BK, Silverman RH, Weissman D, Karik\u0026oacute; K. Nucleoside modifications in RNA limit activation of 2\u0026rsquo;-5'-oligoadenylate synthetase and increase resistance to cleavage by RNase L. Nucleic Acids Res. 2011;39:9329\u0026ndash;38.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYang Y, Wang L, Han X, Yang W-L, Zhang M, Ma H-L, et al. RNA 5-Methylcytosine Facilitates the Maternal-to-Zygotic Transition by Preventing Maternal mRNA Decay. Mol Cell. 2019;75:1188\u0026ndash;e20211.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang Y, Lu L, Li X. Detection technologies for RNA modifications. Exp Mol Med. 2022;54:1601\u0026ndash;16.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSchaefer M, Pollex T, Hanna K, Lyko F. RNA cytosine methylation analysis by bisulfite sequencing. Nucleic Acids Res. 2009;37:e12.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWoodson SA, Muller JG, Burrows CJ, Rokita SE. A primer extension assay for modification of guanine by Ni(II) complexes. Nucleic Acids Res. 1993;21:5524\u0026ndash;5.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRyvkin P, Leung YY, Silverman IM, Childress M, Valladares O, Dragomir I, et al. HAMR: high-throughput annotation of modified ribonucleotides. RNA. 2013;19:1684\u0026ndash;92.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTodkari IA, Chandrasekaran AR, Punnoose JA, Mao S, Haruehanroengra P, Beckles C, et al. Resolving altered base-pairing of RNA modifications with DNA nanoswitches. Nucleic Acids Res. 2023;51:11291\u0026ndash;7.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChiu NHL, Simpson JH, Wang H, Tannous BA. A theoretical perspective of the physical properties of different RNA modifications with respect to RNA duplexes. BBA Adv. 2021;1:100025.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKatz K, Shutov O, Lapoint R, Kimelman M, Brister JR, O\u0026rsquo;Sullivan C. The Sequence Read Archive: a decade more of explosive growth. Nucleic Acids Res. 2022;50:D387\u0026ndash;90.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSloan DB, Warren JM, Williams AM, Wu Z, Abdel-Ghany SE, Chicco AJ, et al. Cytonuclear integration and co-evolution. Nat Rev Genet. 2018;19:635\u0026ndash;48.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWang Y, Li S, Zhao Y, You C, Le B, Gong Z, et al. NAD\u003csup\u003e+\u003c/sup\u003e-capped RNAs are widespread in the \u003cem\u003eArabidopsis\u003c/em\u003e transcriptome and can probably be translated. Proc Natl Acad Sci U S A. 2019;116:12094\u0026ndash;102.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMergner J, Frejno M, List M, Papacek M, Chen X, Chaudhary A, et al. Mass-spectrometry-based draft of the Arabidopsis proteome. Nature. 2020;579:409\u0026ndash;14.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMcCormick RF, Truong SK, Sreedasyam A, Jenkins J, Shu S, Sims D, et al. The Sorghum bicolor reference genome: improved assembly, gene annotations, a transcriptome atlas, and signatures of genome organization. Plant J. 2018;93:338\u0026ndash;54.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKlepikova AV, Kasianov AS, Gerasimov ES, Logacheva MD, Penin AA. A high resolution map of the Arabidopsis thaliana developmental transcriptome based on RNA-seq profiling. Plant J. 2016;88:1058\u0026ndash;70.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKryuchkova-Mostacci N, Robinson-Rechavi M. A benchmark of gene expression tissue-specificity metrics. Brief Bioinform. 2017;18:205\u0026ndash;14.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMatilla MA, Espinosa-Urgel M, Rodr\u0026iacute;guez-Herva JJ, Ramos JL, Ramos-Gonz\u0026aacute;lez MI. Genomic analysis reveals the major driving forces of bacterial life in the rhizosphere. Genome Biol. 2007;8:R179.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSullivan A, Purohit PK, Freese NH, Pasha A, Esteban E, Waese J, et al. An eFP-Seq Browser for visualizing and exploring RNA sequencing data. Plant J. 2019;100:641\u0026ndash;54.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMurat F, Armero A, Pont C, Klopp C, Salse J. Reconstructing the genome of the most recent common ancestor of flowering plants. Nat Genet. 2017;49:490\u0026ndash;6.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSharma B, Govindan G, Li Y, Sunkar R, Gregory BD. RNA N6-Methyladenosine Affects Copper-Induced Oxidative Stress Response in Arabidopsis thaliana. Noncoding RNA [Internet]. 2024;10. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://dx.doi.org/10.3390/ncrna10010008\u003c/span\u003e\u003cspan address=\"10.3390/ncrna10010008\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePrall W, Sheikh AH, Bazin J, Bigeard J, Almeida-Trapp M, Crespi M, et al. Pathogen-induced m6A dynamics affect plant immunity. Plant Cell. 2023;35:4155\u0026ndash;72.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGovindan G, Sharma B, Li Y-F, Armstrong CD, Merum P, Rohila JS, et al. mRNA N6 -methyladenosine is critical for cold tolerance in Arabidopsis. Plant J. 2022;111:1052\u0026ndash;68.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCheng Q, Wang P, Wu G, Wang Y, Tan J, Li C, et al. Coordination of m6A mRNA methylation and gene transcriptome in rice response to cadmium stress. Rice. 2021;14:62.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHu J, Cai J, Park SJ, Lee K, Li Y, Chen Y, et al. N6 -Methyladenosine mRNA methylation is important for salt stress tolerance in Arabidopsis. Plant J. 2021;106:1759\u0026ndash;75.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eVaroquaux N, Cole B, Gao C, Pierroz G, Baker CR, Patel D, et al. Transcriptomic analysis of field-droughted sorghum from seedling to maturity reveals biotic and metabolic responses. Proc Natl Acad Sci U S A. 2019;116:27124\u0026ndash;32.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSorenson RS, Deshotel MJ, Johnson K, Adler FR, Sieburth LE. \u003cem\u003eArabidopsis\u003c/em\u003e mRNA decay landscape arises from specialized RNA decay substrates, decapping-mediated feedback, and redundancy. Proc Natl Acad Sci U S A. 2018;115:E1485\u0026ndash;94.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDeyholos MK, Cavaness GF, Hall B, King E, Punwani J, Van Norman J, et al. VARICOSE, a WD-domain protein, is required for leaf blade development. Development. 2003;130:6577\u0026ndash;88.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang W, Murphy C, Sieburth LE. Conserved RNaseII domain protein functions in cytoplasmic mRNA decay and suppresses Arabidopsis decapping mutant phenotypes. Proc Natl Acad Sci U S A. 2010;107:15981\u0026ndash;5.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLubas M, Damgaard CK, Tomecki R, Cysewski D, Jensen TH, Dziembowski A. Exonuclease hDIS3L2 specifies an exosome-independent 3\u0026rsquo;-5' degradation pathway of human cytoplasmic mRNA. EMBO J. 2013;32:1855\u0026ndash;68.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMalecki M, Viegas SC, Carneiro T, Golik P, Dressaire C, Ferreira MG, et al. The exoribonuclease Dis3L2 defines a novel eukaryotic RNA degradation pathway. EMBO J. 2013;32:1842\u0026ndash;54.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHug N, Longman D, C\u0026aacute;ceres JF. Mechanism and regulation of the nonsense-mediated decay pathway. Nucleic Acids Res. 2016;44:1483\u0026ndash;95.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLee W-C, Hou B-H, Hou C-Y, Tsao S-M, Kao P, Chen H-M. Widespread Exon Junction Complex Footprints in the RNA Degradome Mark mRNA Degradation before Steady State Translation. Plant Cell. 2020;32:904\u0026ndash;22.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eUnterholzner L, Izaurralde E. SMG7 acts as a molecular link between mRNA surveillance and mRNA decay. Mol Cell. 2004;16:587\u0026ndash;96.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGloggnitzer J, Akimcheva S, Srinivasan A, Kusenda B, Riehs N, Stampfl H, et al. Nonsense-mediated mRNA decay modulates immune receptor levels to regulate plant antibacterial defense. Cell Host Microbe. 2014;16:376\u0026ndash;90.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCheng JX, Chen L, Li Y, Cloe A, Yue M, Wei J, et al. RNA cytosine methylation and methyltransferases mediate chromatin organization and 5-azacytidine response and resistance in leukaemia. Nat Commun. 2018;9:1163.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLi Y, Yi Y, Lv J, Gao X, Yu Y, Babu SS, et al. Low RNA stability signifies increased post-transcriptional regulation of cell identity genes. Nucleic Acids Res. 2023;51:6020\u0026ndash;38.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChen S, Zhou Y, Chen Y, Gu J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics. 2018;34:i884\u0026ndash;90.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLangmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9:357\u0026ndash;9.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29:15\u0026ndash;21.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePatro R, Duggal G, Love MI, Irizarry RA, Kingsford C. Salmon provides fast and bias-aware quantification of transcript expression. Nat Methods. 2017;14:417\u0026ndash;9.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSoneson C, Love MI, Robinson MD. Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences. F1000Res. 2015;4:1521.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFerrer-Bonsoms JA, Gimeno M, Olaverri D, Sacristan P, Lobato C, Castilla C, et al. EventPointer 3.0: flexible and accurate splicing analysis that includes studying the differential usage of protein-domains. NAR Genom Bioinform. 2022;4:lqac067.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYu G, Wang L-G, Han Y, He Q-Y. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS. 2012;16:284\u0026ndash;7.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTeam RC. Others. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. http://www R-project org/ [Internet]. 2016; \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://cir.nii.ac.jp/crid/1574231874043578752\u003c/span\u003e\u003cspan address=\"https://cir.nii.ac.jp/crid/1574231874043578752\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWickham H. ggplot2: Elegant Graphics for Data Analysis. Springer; 2016.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYan L, ggvenn. Draw Venn Diagram by ggplot2. R Package Version. 2021;19.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYu G. Enrichplot: visualization of functional enrichment result. R package version. 2021;1.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKrassowski M. ComplexUpset. ComplexUpset. 2020.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGu Z, Eils R, Schlesner M. Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinformatics. 2016;32:2847\u0026ndash;9.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYu G. Using ggtree to Visualize Data on Tree-Like Structures. Curr Protoc Bioinf. 2020;69:e96.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFactR [Internet]. [cited 2024 Apr 24]. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://fursham-h.github.io/factR/\u003c/span\u003e\u003cspan address=\"https://fursham-h.github.io/factR/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"bmc-plant-biology","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"pbio","sideBox":"Learn more about [BMC Plant Biology](http://bmcplantbiol.biomedcentral.com/)","snPcode":"","submissionUrl":"https://www.editorialmanager.com/pbio/default.aspx","title":"BMC Plant Biology","twitterHandle":"BMC_series","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"em","reportingPortfolio":"BMC Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"RNA covalent modifications, RNA, Comparative transcriptomics, RNA decay, Angiosperms","lastPublishedDoi":"10.21203/rs.3.rs-4466769/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-4466769/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003ch2\u003eBackground\u003c/h2\u003e \u003cp\u003eIn recent years, covalent modifications on RNA nucleotides have emerged as pivotal moieties influencing the structure, function, and regulatory processes of RNA Polymerase II transcripts such as mRNAs and lncRNAs. However, our understanding of their biological roles and whether these roles are conserved across eukaryotes remains limited.\u003c/p\u003e\u003ch2\u003eResults\u003c/h2\u003e \u003cp\u003eIn this study, we leveraged standard RNA-sequencing data to identify and characterize RNA modifications that introduce base-pairing errors into cDNA reads. Our investigation incorporated data from three Poaceae (\u003cem\u003eZea mays\u003c/em\u003e, \u003cem\u003eSorghum bicolor\u003c/em\u003e, and \u003cem\u003eSetaria italica\u003c/em\u003e), as well as publicly available data from a range of stress and genetic contexts in Sorghum and \u003cem\u003eArabidopsis thaliana\u003c/em\u003e. We uncovered a strong enrichment of RNA covalent modifications (RCMs) deposited on a conserved core set of nuclear RNAs involved in photosynthesis and translation across these species. However, the cohort of modified transcripts changed based on environmental context and developmental program, a pattern that was also conserved across flowering plants. We determined that RCMs can partly explain accession-level differences in drought tolerance in Sorghum, with stress-associated genes receiving a higher level of RCMs in a drought tolerant accession. To address function, we determined that RCMs are significantly enriched near exon junctions within coding regions, suggesting an association with splicing. Intriguingly, we found that these base-pair disrupting RCMs are associated with stable mRNAs, are highly correlated with protein abundance, and thus likely associated with facilitating translation.\u003c/p\u003e\u003ch2\u003eConclusions\u003c/h2\u003e \u003cp\u003eOur data point to a conserved role for RCMs in mRNA stability and translation across the flowering plant lineage.\u003c/p\u003e","manuscriptTitle":"Comparative analyses uncover a link between mRNA splicing, stability, and RNA covalent modifications in flowering plants","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2024-06-12 14:27:16","doi":"10.21203/rs.3.rs-4466769/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"Revision requested","date":"2024-07-16T07:19:29+00:00","index":"","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2024-07-15T15:07:37+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2024-07-10T21:23:16+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"147934136793052625812888516351398047306","date":"2024-06-28T16:14:23+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"170697338083578329078832488627025019346","date":"2024-06-17T12:32:44+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"51960731604394287516452669664220240170","date":"2024-06-02T08:44:37+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2024-05-31T01:11:28+00:00","index":"","fulltext":""},{"type":"editorInvited","content":"","date":"2024-05-29T23:05:15+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2024-05-29T23:02:19+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2024-05-29T23:02:19+00:00","index":"","fulltext":""},{"type":"submitted","content":"BMC Plant Biology","date":"2024-05-23T12:13:32+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"bmc-plant-biology","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"pbio","sideBox":"Learn more about [BMC Plant Biology](http://bmcplantbiol.biomedcentral.com/)","snPcode":"","submissionUrl":"https://www.editorialmanager.com/pbio/default.aspx","title":"BMC Plant Biology","twitterHandle":"BMC_series","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"em","reportingPortfolio":"BMC Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"2da0a69b-cacd-4846-a091-35afe043ef04","owner":[],"postedDate":"June 12th, 2024","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"published-in-journal","subjectAreas":[],"tags":[],"updatedAt":"2024-08-22T19:37:22+00:00","versionOfRecord":{"articleIdentity":"rs-4466769","link":"https://doi.org/10.1186/s12870-024-05486-7","journal":{"identity":"bmc-plant-biology","isVorOnly":false,"title":"BMC Plant Biology"},"publishedOn":"2024-08-12 15:58:07","publishedOnDateReadable":"August 12th, 2024"},"versionCreatedAt":"2024-06-12 14:27:16","video":"","vorDoi":"10.1186/s12870-024-05486-7","vorDoiUrl":"https://doi.org/10.1186/s12870-024-05486-7","workflowStages":[]},"version":"v1","identity":"rs-4466769","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-4466769","identity":"rs-4466769","version":["v1"]},"buildId":"qtupq5eGEP_6zYnWcrvyt","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}

Text is read by the "Ask this paper" AI Q&A widget below. Extraction quality varies by source — PMC NXML preserves structure cleanly, OA-HTML may include some navigation residue, and OA-PDF can have broken hyphenation. The publisher copy (via DOI) is the canonical version.

My notes (saved in your browser only)

Ask this paper AI returns verbatim quotes from the full text · source: preprint-html

Answers must be backed by verbatim quotes from this paper's full text. Hallucinated quotes are dropped automatically; if no verbatim passage answers the question, we say so. How this works

Citation neighborhood (no data yet)

We don't have any in-corpus citations linked to this paper yet. This is a recent paper (2024) — citers typically take a year or two to land, and the OpenAlex reference graph may still be filling in.

Source provenance

europepmc
last seen: 2026-05-20T01:45:00.602351+00:00