The genomes of flowering plants reveal contrasting evolutionary paths in monocots and eudicots

Research Article

Alexander M.C. Bowles, Jordi Paps

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-7464600/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted (Version 1)

Abstract

Flowering plants (angiosperms) emerged over 150 million years ago 1–4, leading to the origin of two major groups, the monocots and eudicots. Accompanying this rapid species diversification was a period of dynamic genome evolution, as evidenced by their conflicting evolutionary relationships 5, 6. However, the genomic trends governing the evolution of flowering plants remain poorly understood. Here, starting with 1,181 genomes, we have selected and analysed 273 archaeplastid genomes, to produce a novel, robustly supported angiosperm phylogeny.
With this phylogeny, our analyses identify unprecedented rates of gene loss and duplication. The origin of monocots was accompanied by a period of reductive genome evolution, while the first eudicot genomes experienced modest rates of gene duplication. Lost genes in the first monocots support the morphological simplification of the cotyledon, leaf venation patterning and root system architecture. In contrast, genome expansion in eudicots was associated with floral development and plant reproduction. Individual orders were characterised by pervasive gene loss, coupled with modest gene duplication. This suggests that angiosperms reached a core genomic diversity early in their evolutionary history, corresponding to their high floral diversity. This work highlights the importance of loss as well as gain of function in the diversification of the most speciose plant group.

Main

Flowering plants (angiosperms) are the foundation of most of Earth's terrestrial ecosystems and the basis of human agricultural success 7, 8. They are of significant socioeconomic importance, with major crops and ornamental plants found within monocots (e.g. wheat, rice, orchids), eudicots (e.g. legumes, potatoes, sunflowers, roses), and other angiosperms (e.g. magnolias, avocado, pepper, water lilies). As the most successful lineage of land plants, they exhibit a remarkable diversity of form, function, and ecology 9, 10. Their origin and radiation have shaped climates 11, 12 and the structure of the biosphere 13, and have driven the evolution of other organisms (such as fungi, insects, and birds 9) with which they have also co-evolved. Terrestrial biodiversity is therefore inherently intertwined with angiosperm diversification, highlighting their critical role in promoting ecosystem complexity.
The improving picture of flowering plant interrelationships has highlighted the diversification of the main lineages of angiosperms, namely the ANA grade, Ceratophyllales, Magnoliids, Monocotyledons, Chloranthales, and Eudicotyledons 6, 14. The two largest groups, the monocots and eudicots, comprise over 95% of flowering plant diversity 15, but differ markedly in their morphology, ecology and life histories. However, the genomic basis of these differences and their evolution is not fully understood. The dramatic increase in whole genome sampling offers a timely opportunity to understand this iconic episode of plant evolution. Pinpointing how these two lineages evolved to be so different requires a detailed reconstruction of the genomic changes underpinning flowering plant evolution. Here we produce a statistically supported angiosperm tree of life using an extensive sampling of complete genomes 16–18, with a phylogenetically broad outgroup. With this tree, we then investigate the role of gene duplication and loss in the origin of distinct angiosperm groups. Our analyses reveal that the evolution of monocots was driven by major reductions in their gene complement, while the emergence of eudicots was underpinned by gene duplication.

Analysing flowering plant genome evolution

Defining the genomic changes accompanying flowering plant evolution is key to unravelling the molecular basis of their biological innovations. We curated a dataset of 1,181 flowering plant whole genomes comprising 50 out of 64 angiosperm orders, representing 98% of flowering plant species (Data S1) 15. Because genome quality is variable and some groups are overrepresented in databases 18, up to five genomes per order were selected based on protein-coding gene completeness and phylogenetic distribution to produce a high-quality and taxonomically balanced dataset. Genomes of non-flowering land plants and other archaeplastids (red, green, and streptophyte algae) were added as outgroup taxa.
The final dataset consists of 273 genomes (Data S2), including a broad representation of algae (36), bryophytes (14), lycophytes (3), ferns (7), gymnosperms (11) and angiosperms (202). To the best of our knowledge, this constitutes the largest, most taxonomically comprehensive dataset of angiosperm genomes to date. The 273 archaeplastid genomes, altogether containing 10.2 million proteins, were clustered into 223,333 orthogroups, sets of evolutionarily related genes which may include orthologs and paralogs.

Although resolution is improving, the topology of the flowering plant phylogeny remains a long-standing question in evolutionary biology 2, 5, 6, 14. To infer the evolutionary relationships amongst these 273 species, the genome dataset was queried to identify orthogroups with a 60% taxonomic occupancy, resulting in 765 orthogroups being selected. These genes were analysed phylogenetically using site-heterogeneous models to resolve flowering plant relationships (Fig. 1). Our analysis recovers the well-established placement of Amborellales and Nymphaeales as successive sister groups to the mesangiosperms. Other relationships have historically been more uncertain (e.g. the placement of Ceratophyllales). Similar to Zuntini et al. 6, we recover Ceratophyllales as sister to all other mesangiosperms. Akin to the topology of the 1KP project 5, our analyses recover monocots as sister to all remaining mesangiosperms, as well as Magnoliids and Chloranthales as the closest relatives of eudicots. Statistical testing rejected alternative flowering plant topologies (Fig. 1, Figure S1 & S2) 5, 6, 14, giving further support to our phylogeny. As such, we use this species tree to understand the evolution of angiosperm gene content. Additionally, within these major groups, order-level relationships are largely consistent with published phylogenies 5, 6, 14.
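The occupancy filter described above can be sketched as follows. This is a minimal illustration, assuming a gene-count table keyed by orthogroup and species. The function names `occupancy` and `select_orthogroups`, the toy table and the species labels are hypothetical and are not taken from the authors' pipeline, which additionally removes paralogous genes (see Methods). With 273 species, a 60% threshold would correspond to presence in at least 164 species.

```python
from typing import Dict, List

# Hypothetical toy table: orthogroup -> {species: gene count}.
# Real input would resemble OrthoFinder's Orthogroups.GeneCount.tsv.
TOY_COUNTS: Dict[str, Dict[str, int]] = {
    "OG1": {"sp1": 1, "sp2": 1, "sp3": 1, "sp4": 0, "sp5": 1},  # 4 of 5 species
    "OG2": {"sp1": 1, "sp2": 0, "sp3": 0, "sp4": 0, "sp5": 1},  # 2 of 5 species
    "OG3": {"sp1": 2, "sp2": 1, "sp3": 1, "sp4": 0, "sp5": 0},  # 3 of 5 species
}


def occupancy(counts: Dict[str, int]) -> float:
    """Fraction of species with at least one gene copy in the orthogroup."""
    if not counts:
        return 0.0
    present = sum(1 for n in counts.values() if n > 0)
    return present / len(counts)


def select_orthogroups(
    table: Dict[str, Dict[str, int]], min_occupancy: float = 0.6
) -> List[str]:
    """Return orthogroup IDs whose taxonomic occupancy meets the threshold."""
    return sorted(
        og for og, counts in table.items() if occupancy(counts) >= min_occupancy
    )


if __name__ == "__main__":
    print(select_orthogroups(TOY_COUNTS))
```

In this sketch, copy number is ignored when computing occupancy; only presence or absence in each species counts towards the threshold.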
Large differences in gene content between monocots and eudicots

We determined broad-scale differences in gene content between the morphologically divergent monocots and eudicots using gene tree-species tree reconciliation (Fig. 2, Data S3 & S4, Figure S3). We investigated the duplication and loss of genes within orthogroups in all the main lineages of flowering plants using AleRax 19. These analyses revealed large-scale genome reduction at the origin of monocots, with gene loss estimated in 9,480 orthogroups. These rates of gene loss were over ten-fold greater than those of gene duplication (Fig. 2). In contrast, the origin of eudicots was accompanied by modest gene duplication (3,356 orthogroups), with rates of gene duplication about two-fold greater than those of gene loss (Fig. 2). Patterns of gene evolution across the other major groups of flowering plants also revealed high rates of gene duplication in angiosperms and core eudicots (Fig. 2). Both nodes have previously been linked to whole genome duplication (WGD) events 20. In addition, patterns of reductive evolution continued within the monocots, in the commelinids, correlating with a reduction in ancestral genome size 21.

We also investigated the role of novel core genes in the origin and early evolution of angiosperms (Data S5). These are defined as orthogroups that are present in all (or all bar one) members of a clade and absent in all outgroup taxa. A single novel core orthogroup, found across all flowering plant genomes, is involved in plant immunity and contains a NOI (RIN4-like) domain 22; this domain plays a crucial role in plant immune signalling, particularly in effector-triggered immunity, by binding to virulence factors from infecting bacteria. For monocots and eudicots, no conserved gene novelties were observed (Data S5).
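The novel core definition given above (present in all, or all bar one, members of a clade and absent in all outgroup taxa) can be written as a short predicate. This is a sketch under stated assumptions: the function name `is_novel_core`, the `max_missing` parameter and the toy species labels are hypothetical, and "outgroup" is taken here to mean every sampled species outside the clade.

```python
from typing import Dict, Set


def is_novel_core(
    presence: Dict[str, bool],
    clade: Set[str],
    outgroup: Set[str],
    max_missing: int = 1,
) -> bool:
    """True if the orthogroup is in (nearly) all clade members and no outgroup taxon."""
    # Species without an entry in `presence` are treated as lacking the orthogroup.
    missing_in_clade = sum(1 for sp in clade if not presence.get(sp, False))
    found_in_outgroup = any(presence.get(sp, False) for sp in outgroup)
    return missing_in_clade <= max_missing and not found_in_outgroup


# Hypothetical toy example: four clade members and two outgroup taxa.
CLADE = {"ang1", "ang2", "ang3", "ang4"}
OUTGROUP = {"out1", "out2"}

OG_A = {"ang1": True, "ang2": True, "ang3": True, "ang4": False}  # absent once
OG_B = {"ang1": True, "ang2": True, "ang3": True, "ang4": True, "out1": True}
OG_C = {"ang1": True, "ang2": True}  # absent from two clade members

if __name__ == "__main__":
    for name, og in (("OG_A", OG_A), ("OG_B", OG_B), ("OG_C", OG_C)):
        print(name, is_novel_core(og, CLADE, OUTGROUP))
```

Setting `max_missing=0` would give the stricter reading in which the orthogroup must be present in every member of the clade.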
These analyses agree with recent findings highlighting a period of exceptional diversification in the early evolution of angiosperms, in which 80% of extant orders originated 6, with high rates of gene tree conflict 6 and whole genome duplication 20. This contrasts with older plant lineages, which experienced two early bursts of gene novelty in land plants and multicellular streptophytes respectively (Data S5) 23, and corresponds to patterns seen in animals and fungi, whose genomes are characterised by early conserved gene novelty followed by more recent periods of dynamic genome evolution 24–27.

Reductive evolution in monocots

Functional analysis suggested that genomic loss (identified by AleRax) underpins the morphology of monocots (Fig. 3). Gene Ontology (GO) terms indicated widespread gene loss across numerous functional categories, with especially high losses in gene regulation, protein synthesis, and signal transduction. These losses likely contributed to the streamlined metabolism and distinct developmental architecture of monocots. Monocots have a single cotyledon, parallel leaf venation, complex vascular organisation, fibrous roots and trimerous flowers 28–32. Eudicots by comparison have two cotyledons, reticulate leaf venation, vascular tissue organised in rings, a taproot and tetra- to pentamerous flowers 28–32. The genetic networks underpinning the development of these innovations were queried against the list of lost genes (Fig. 3, Data S6) 30, 33–35. Several gene families lost from the first monocots have been functionally linked to single or fused cotyledon phenotypes in Arabidopsis, offering potential mechanistic insights into monocotyledony. These include CUP-SHAPED COTYLEDON (CUC), SHOOTMERISTEMLESS (STM) and members of the TIR (TOLL/INTERLEUKIN-1 RECEPTOR-LIKE) family.
Leaf morphogenesis genes were lost at every stage of leaf development, ranging from establishment to transition, through modification and senescence (Fig. 3, Data S6 & S7). This suggests that the first monocots constructed their leaves in a way distinct from eudicots. Key components of primary root development were lost in the first monocots, including members of the WOX (WUSCHEL-RELATED HOMEOBOX), TIR (as above) and EIN3-4 (ETHYLENE INSENSITIVE) orthogroups, which coincides with their fibrous root habit (Fig. 3, Data S6 & S7). Finally, fundamental regulators of floral development exhibited subfamily loss in monocots, including SEPALLATA, APETALA, AGAMOUS and PISTILLATA (Fig. 3, Data S6 & S7).

Evolution by genome expansion in eudicots

Gene duplication, as identified by AleRax, accompanied the origin of eudicots, with the largest number of duplicated genes being related to gene regulation and signalling (Fig. 3). Eudicots show a radiation of floral development genes such as the SEPALLATA, APETALA, PISTILLATA and AGAMOUS gene groups (Data S8 & S9). GO terms indicated that these genes function in floral whorl development, meristem identity and the maintenance of floral organ identity, potentially linked to the first tetramerous flowers (Fig. 3, Data S8 & S9). Genes associated with vernalisation responses were also found among these expansions (FRIGIDA, VERNALISATION1-2). Further duplications were seen in the evolution of reproductive structures (anther wall development, male gamete generation), coinciding with the origin of tricolpate pollen in flowering plants 36 (Fig. 3, Data S8 & S9). Furthermore, the regulation of cell wall organisation was associated with these duplications, including the callose synthase, expansin and pectinesterase gene families (Fig. 3, Data S8 & S9). The biosynthetic pathways of several compounds expanded during this period of eudicot radiation, including beta-D-glucan and flavonol biosynthesis (Fig. 3, Data S8 & S9).
Gene diversity early in angiosperm evolution

Gene family diversification within orders pointed to a history of gene loss, compensated by modest gene duplication (Fig. 4). This suggests that flowering plant orders evolved through a dynamic process involving both the reduction and expansion of specific gene families. Additionally, novel core orthogroups were observed at the order level (Data S5), with NO APICAL MERISTEM (NAM) proteins (19) being a common observation. This protein domain controls boundary formation and lateral organ separation, vital for leaf and flower patterning 37. These could have contributed to the development of the elaborate plant architecture seen within angiosperm orders.

To understand the diversity of flowering plant genomes early in their evolutionary history, we analysed plant gene disparity, based on the shared presence and absence of genes (Fig. 4). The distances between species in this space show the dissimilarity in gene content. Non-metric multidimensional scaling revealed that angiosperms occupy a third of the theoretical gene space of plants; non-flowering land plants and archaeplastid algae occupy the other two thirds. The majority of angiosperm gene space is tightly clustered within this third, with several specialist plants occupying the extremes (e.g. Sapria himalayana, a rare holoparasite) 38. This suggests that angiosperms reached a core genomic diversity early in their evolutionary history (Fig. 4). This corresponds to their high floral morphological diversity, which had emerged by the early Cretaceous 39.

Contrasting genomic paths

Together, the emerging picture from our analyses indicates that dynamic genome evolution has shaped the origin of eudicots and monocots (Fig. 5). The first monocot genomes are characterised by gene reduction (Fig. 5). By contrast, the first eudicot genomes are characterised by a diversification of genes (Fig. 5).
These two strategies have most likely led to the dominance of eudicots and monocots over the other groups of flowering plants. Despite these initial contrasting genomic paths, taxa within these lineages have continued to diversify, with an accumulation of genetic changes that postdate the divergence of monocots and eudicots. For example, complex patterns of genome duplication are seen within the grasses and orchids 40, 41. Recent studies have used taxonomically comprehensive transcriptomic data to demonstrate high diversification rates and gene family expansion in the major plant groups 5, 42. As a consensus on flowering plant relationships is reached, the extent to which patterns of dynamic genome evolution persist in the Magnoliids, Chloranthales, Ceratophyllales and ANA grade will become clearer.

The large-scale gene loss seen in monocots (Fig. 2) as well as in flowering plant orders (Fig. 4) challenges the view of plant evolution being solely driven by WGD. Indeed, our patterns of gene loss did not appear to be associated with known whole genome duplication events (Figure S4). While clearly an important mechanism of angiosperm diversification, WGD is but one force shaping plant genomes. The significance of these patterns and processes is also being realised in other parts of the tree of life (e.g. annelids, carnivorous plants) 43, 44. During early animal evolution, the origins of Deuterostomes and Ecdysozoa were dominated by gene loss 24. Collectively, these works highlight the loss as well as the gain of genes throughout evolution.

The drivers behind the divergence of these two main groups remain unclear. Some hypotheses have suggested that adaptation to aquatic environments could explain the divergence of monocots 45. This is largely based on the life histories of the first two splitting lineages: Acorales, a group mostly found in wetlands, and Alismatales, which contains the duckweeds and seagrasses.
The change to a fibrous root system, parallel leaf venation and reorganisation of vascular tissues could potentially be beneficial for adaptation to an aquatic environment. In this context, the simplification of structures such as the taproot might have reduced energy expenditure in monocots, enabling them to allocate resources towards other survival-enhancing traits. Whatever the driver, this divergence, which occurred over 125 million years ago during the early Cretaceous 3, reflects a deeper evolutionary shift in plant architecture and functionality.

Gene family contractions in monocots likely involved not only the deletion of entire genes but also the loss of gene function through rapid mutations and pseudogenisation 46, 47. Chromosomal rearrangements and genome size variation likely contributed to their morphological divergence from eudicots 48. In addition, this study focuses solely on protein-coding genes; however, non-coding genes, regulatory regions, and epigenetic modifications most likely contributed to the diversification of plant life. The analysis presented here, which incorporates genomic data for 273 plants from across the tree of life, provides new insight into the composition of flowering plant genomes and emphasises the role of genome evolution in the radiation of angiosperms.

Methods

Compiling genomic dataset

Broad taxonomic sampling of genomic data was implemented to accurately infer the origin and diversification of angiosperm gene content (Data S1). In total, over 1,000 flowering plant genomes were identified in the literature by 31st January 2024 (Data S1). BUSCO analysis was used to assess the quality of genome annotation, using a threshold of <30% missing genes in the BUSCO Eukaryota dataset as a benchmark to accept a genome for further analysis (Data S1) 49. To improve computational tractability, a maximum of five species per flowering plant order was chosen.
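The genome selection step can be illustrated with a short sketch. It assumes each genome record carries its order and its percentage of missing BUSCO genes. The `Genome` record, the function `select_genomes` and the toy values are hypothetical. Ranking within an order by missing-gene percentage alone is a simplifying assumption, since the selection described in the main text also weighed phylogenetic distribution, which is not modelled here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class Genome(NamedTuple):
    species: str
    order: str
    busco_missing_pct: float  # percentage of BUSCO genes reported as missing


def select_genomes(
    genomes: Iterable[Genome],
    max_missing_pct: float = 30.0,
    per_order: int = 5,
) -> Dict[str, List[str]]:
    """Keep genomes with < max_missing_pct missing BUSCOs, at most per_order per order."""
    by_order: Dict[str, List[Genome]] = defaultdict(list)
    for genome in genomes:
        if genome.busco_missing_pct < max_missing_pct:
            by_order[genome.order].append(genome)
    selected: Dict[str, List[str]] = {}
    for order, candidates in by_order.items():
        # Assumption: prefer the most complete annotations; ties broken by name.
        ranked = sorted(candidates, key=lambda g: (g.busco_missing_pct, g.species))
        selected[order] = [g.species for g in ranked[:per_order]]
    return selected


# Hypothetical toy records, not values from Data S1.
TOY_GENOMES = [Genome(f"a{i}", "OrderA", float(i)) for i in range(1, 8)]
TOY_GENOMES += [
    Genome("a_bad", "OrderA", 35.0),
    Genome("b1", "OrderB", 10.0),
    Genome("b_edge", "OrderB", 30.0),  # exactly 30% is not below the threshold
]

if __name__ == "__main__":
    print(select_genomes(TOY_GENOMES))
```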
In total, 273 archaeplastid genomes were downloaded, equating to 10,227,129 predicted proteins and including 202 flowering plants and 71 archaeplastid outgroup taxa (Data S2).

Orthology inference

OrthoFinder (v.2.3.7) was used to cluster protein-coding genes into orthogroups 50, based on sequence divergence, using default settings (orthofinder -f data_folder). OrthoFinder was launched on 31st January 2024 and therefore any genomes published after this date were not included in the analysis.

Species tree analysis

Single copy orthologs were identified using a previously described Python script 51, which removes paralogous genes from orthogroups. The script enables the user to specify a minimum taxonomic occupancy for each orthogroup, set here at 60%, which selected 765 genes. Single copy orthologs were aligned using MAFFT 52 with the --auto parameter and trimmed with trimAl 53 using the -automated1 parameter. Multiple sequence alignments were concatenated using Phyutility 54 to create a supermatrix. A bootstrapped maximum likelihood phylogeny was inferred using IQ-TREE 55, using the Bayesian Information Criterion (BIC) to select the best-fitting substitution model, together with empirical profile mixture models (C10–C60). In total, 1,000 ultrafast bootstrap replicates were used.

Identifying duplicated and lost gene groups

To understand the role of gene duplication and loss in the diversification of flowering plants, gene tree-species tree reconciliation was used. Protein sequences of viable orthogroups (those with 4 or more members) were aligned with MAFFT 52 using gap threshold 0.2, due to the large size of gene families. Aligned sequences were trimmed with trimAl 53 using the -automated1 parameter. Maximum likelihood trees were inferred with IQ-TREE 55 as above. Additionally, 1,000 ultrafast bootstrap replicates were specified to estimate uncertainty, together with the option to write the bootstrap trees, with branch lengths, to a ufboot file.
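The supermatrix step described under Species tree analysis can be illustrated with a minimal sketch. The authors used Phyutility for this; the pure-Python function below is a stand-in written for illustration only. The name `concatenate`, the toy alignments and the choice to pad absent taxa with gap characters are assumptions, not the authors' implementation.

```python
from typing import Dict, List


def concatenate(alignments: List[Dict[str, str]], gap: str = "-") -> Dict[str, str]:
    """Join per-gene alignments into one supermatrix, padding absent taxa with gaps."""
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    parts: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("all sequences in an alignment must have equal length")
        width = lengths.pop()
        for taxon in taxa:
            # A taxon missing from this gene contributes a block of gaps.
            parts[taxon].append(aln.get(taxon, gap * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Hypothetical toy alignments (already aligned and trimmed).
GENE1 = {"sp1": "MKV", "sp2": "MKI", "sp3": "M-V"}
GENE2 = {"sp1": "AG", "sp3": "AC"}  # sp2 lacks this gene

if __name__ == "__main__":
    for taxon, seq in concatenate([GENE1, GENE2]).items():
        print(taxon, seq)
```

Padding with gaps keeps every row of the supermatrix the same length, which is how a gene sampled at 60% occupancy can still sit in a matrix spanning all taxa.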
These ufboot tree files were used as input to AleRax 56 to infer ancestral gene content and instances of gene duplication and loss. Results were visualised in R 57 using the packages tidyr 58 and ggplot2 59. Trees were plotted in iTOL 60. To obtain a functional description for various flowering plant gene groups, proteins were assessed using Gene Ontology 61.

Identifying gained and lost gene groups

The OrthoFinder gene count table (Orthogroups.GeneCount.tsv) was analysed to identify the origin and complete loss of orthologous groups of proteins (OGs) based upon their taxonomic occupancy (Data S4). Different sets of OGs can be analysed (initially defined in Paps and Holland 62): Ancestral (OGs present in the Last Common Ancestor of a clade); Ancestral Core (OGs present in every representative species within a clade or absent only in one genome); Novel (OGs present in the Last Common Ancestor of a clade and absent in all outgroup taxa); Novel Core (OGs present in every representative species within a clade or absent only once, and absent in all outgroup taxa); and Lost (OGs lost in the Last Common Ancestor of a clade).

Clustering of gene content

To understand genome diversity during early flowering plant evolution, we clustered the count data for the protein-coding genes. Count data were transformed to presence and absence, and NMDS coordinates were calculated in R with vegan and plotted with ggplot2 59.

Declarations

Data availability

Data will be available on FigShare.

References

1. Silvestro, D. et al. Fossil data support a pre-Cretaceous origin of flowering plants. Nat Ecol Evol 5, 449–457 (2021).
2. Li, H. T. et al. Origin of angiosperms and the puzzle of the Jurassic gap. Nat Plants 5, 461–470 (2019).
3. Jiao, Y. et al. Ancestral polyploidy in seed plants and angiosperms. Nature 473, 97–100 (2011).
4. Clark, J. W. & Donoghue, P. C. J. Uncertainty in the timing of diversification of flowering plants rests with equivocal interpretation of their fossil record. R Soc Open Sci 12 (2025).
5. Leebens-Mack, J. H. et al. One thousand plant transcriptomes and the phylogenomics of green plants. Nature 574, 679–685 (2019).
6. Zuntini, A. R. et al. Phylogenomics and the rise of the angiosperms. Nature (2024).
7. Diamond, J. Evolution, consequences and future of plant and animal domestication. Nature 418, 700–707 (2002).
8. Govaerts, R., Nic Lughadha, E., Black, N., Turner, R. & Paton, A. The World Checklist of Vascular Plants, a continuously updated resource for exploring global plant diversity. Scientific Data 8, 1–10 (2021).
9. Benton, M. J., Wilf, P. & Sauquet, H. The Angiosperm Terrestrial Revolution and the origins of modern biodiversity. New Phytologist 233, 2017–2035 (2021).
10. Moyroud, E. & Glover, B. J. The Evolution of Diverse Floral Morphologies. Current Biology 27, R941–R951 (2017).
11. Holbourn, A. E. et al. Late Miocene climate cooling and intensification of southeast Asian winter monsoon. Nat Commun 9, 1–13 (2018).
12. Gurung, K. et al. Climate windows of opportunity for plant expansion during the Phanerozoic. Nat Commun 13, 1–9 (2022).
13. Pennington, R. T., Cronk, Q. C. B. & Richardson, J. A. Introduction and synthesis: Plant phylogeny and the origin of major biomes. Philosophical Transactions of the Royal Society B: Biological Sciences 359, 1455–1464 (2004).
14. Chase, M. W. et al. An update of the Angiosperm Phylogeny Group classification for the orders and families of flowering plants: APG IV. Botanical Journal of the Linnean Society 181, 1–20 (2016).
15. Christenhusz, M. J. M. & Byng, J. W. The number of known plant species in the world and its annual increase. Phytotaxa 261, 201–217 (2016).
16. Marks, R. A., Hotaling, S., Frandsen, P. B. & VanBuren, R. Representation and participation across 20 years of plant genome sequencing. Nat Plants 7, 1571–1578 (2021).
17. Bernal-Gallardo, J. J. & de Folter, S. Plant genome information facilitates plant functional genomics. Planta 259 (2024).
18. Schwacke, R., Bolger, M. E. & Usadel, B. PubPlant – a continuously updated online resource for sequenced and published plant genomes. Front Plant Sci 16, 1603547 (2025).
19. Morel, B., Williams, T. A., Stamatakis, A. & Szöllősi, G. J. AleRax: a tool for gene and species tree co-estimation and reconciliation under a probabilistic model of gene duplication, transfer, and loss. Bioinformatics 40 (2024).
20. Clark, J. W. & Donoghue, P. C. J. Whole-Genome Duplication and Plant Macroevolution. Trends Plant Sci 23, 933–945 (2018).
21. Carta, A., Bedini, G. & Peruzzi, L. A deep dive into the ancestral chromosome number and genome size of flowering plants. New Phytologist 228, 1097–1106 (2020).
22. Afzal, A. J., Kim, J. H. & Mackey, D. The role of NOI-domain containing proteins in plant immune signaling. BMC Genomics 14, 327 (2013).
23. Bowles, A. M. C., Bechtold, U. & Paps, J. The Origin of Land Plants Is Rooted in Two Bursts of Genomic Novelty. Current Biology 30, 530–536 (2020).
24. Guijarro-Clarke, C., Holland, P. W. H. & Paps, J. Widespread patterns of gene loss in the evolution of the animal kingdom. Nat Ecol Evol 4, 519–523 (2020).
25. Ocaña-Pallarès, E. et al. Divergent genomic trajectories predate the origin of animals and fungi. Nature 609, 747–753 (2022).
26. Fernández, R. & Gabaldón, T. Gene gain and loss across the metazoan tree of life. Nat Ecol Evol 4, 524–533 (2020).
27. Paps, J., Rossi, M. E., Bowles, A. M. C. & Alvarez-Presas, M. Assembling animals: trees, genomes, cells, and contrast to plants. Front Ecol Evol 11 (2023).
28. Perico, C., Tan, S. & Langdale, J. A. Developmental regulation of leaf venation patterns: monocot versus eudicots and the role of auxin. New Phytologist 234, 783–803 (2022).
29. Sauquet, H. et al. The ancestral flower of angiosperms and its early diversification. Nat Commun 8 (2017).
30. Chandler, J. W. Cotyledon organogenesis. J Exp Bot 59, 2917–2931 (2008).
31. Burger, W. C. The Question of Cotyledon Homology in Angiosperms. The Botanical Review 64, 356–371 (1998).
32. Scarpella, E. & Meijer, A. H. Pattern formation in the vascular system of monocot and dicot plant species. New Phytologist 164, 209–242 (2004).
33. Chen, C. & Du, X. LEAFY COTYLEDONs: Connecting different stages of plant development. Front Plant Sci 13 (2022).
34. Jung, J. K. H. H. & McCouch, S. Getting to the roots of it: Genetic and hormonal control of root architecture. Front Plant Sci 4 (2013).
35. Chanderbali, A. S., Berger, B. A., Howarth, D. G., Soltis, P. S. & Soltis, D. E. Evolving ideas on the origin and evolution of flowers: New perspectives in the genomic era. Genetics 202, 1255–1265 (2016).
36. Doyle, J. A. Molecular and Fossil Evidence on the Origin of Angiosperms. Annu Rev Earth Planet Sci 40, 301–326 (2012).
37. Cheng, X. et al. NO APICAL MERISTEM (MtNAM) regulates floral organ identity and lateral organ separation in Medicago truncatula. New Phytologist 195, 71–84 (2012).
38. Cai, L. et al. Deeply Altered Genome Architecture in the Endoparasitic Flowering Plant Sapria himalayana. Current Biology 31, 1002–1011 (2021).
39. López-Martínez, A. M. et al. Angiosperm flowers reached their highest morphological diversity early in their evolutionary history. New Phytologist 241, 1348–1360 (2023).
40. Zhang, T. et al. Phylogenomic profiles of whole-genome duplications in Poaceae and landscape of differential duplicate retention and losses among major Poaceae lineages. Nat Commun 15 (2024).
41. Zhang, G. Q. et al. The Apostasia genome and the evolution of orchids. Nature 549, 379–383 (2017).
42. Landis, J. B. et al. Impact of whole-genome duplication events on diversification rates in angiosperms. Am J Bot 105, 348–363 (2018).
43. Vargas-Chávez, C. et al. An episodic burst of massive genomic rearrangements and the origin of non-marine annelids. bioRxiv 2024.05.16.594344 (2025) doi:10.1101/2024.05.16.594344.
44. Palfalvi, G. et al. Genomes of the Venus Flytrap and Close Relatives Unveil the Roots of Plant Carnivory. Current Biology 30, 2312–2320 (2020).
45. Givnish, T. J. et al. Monocot plastid phylogenomics, timeline, net rates of species diversification, the power of multi-gene analyses, and a functional model for the origin of monocots. Am J Bot (2018).
46. Albalat, R. & Cañestro, C. Evolution by gene loss. Nat Rev Genet 17, 379–391 (2016).
47. O'Malley, M. A., Wideman, J. G. & Ruiz-Trillo, I. Losing Complexity: The Role of Simplification in Macroevolution. Trends Ecol Evol 31, 608–621 (2016).
48. Leitch, I. J., Beaulieu, J. M., Chase, M. W., Leitch, A. R. & Fay, M. F. Genome Size Dynamics and Evolution in Monocots. J Bot 2010, 1–18 (2010).
49. Waterhouse, R. M. et al. BUSCO Applications from Quality Assessments to Gene Prediction and Phylogenomics. Mol Biol Evol 35, 543–548 (2018).
50. Emms, D. M. & Kelly, S. OrthoFinder: Phylogenetic orthology inference for comparative genomics. Genome Biol 20 (2019).
51. Harris, B. J., Harrison, C. J., Hetherington, A. M. & Williams, T. A. Phylogenomic Evidence for the Monophyly of Bryophytes and the Reductive Evolution of Stomata. Current Biology 30, 2001–2012 (2020).
52. Katoh, K., Misawa, K., Kuma, K. & Miyata, T. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res 30, 3059–3066 (2002).
53. Capella-Gutiérrez, S., Silla-Martínez, J. M. & Gabaldón, T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25, 1972–1973 (2009).
54. Smith, S. A. & Dunn, C. W. Phyutility: A phyloinformatics tool for trees, alignments and molecular data. Bioinformatics 24, 715–716 (2008).
55. Nguyen, L.-T., Schmidt, H. A., von Haeseler, A. & Minh, B. Q. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol Biol Evol 32, 268–274 (2015).
56. Morel, B., Williams, T. A., Stamatakis, A. & Szöllősi, G. J. AleRax: a tool for gene and species tree co-estimation and reconciliation under a probabilistic model of gene duplication, transfer, and loss. Bioinformatics 40 (2024).
57. R Core Team. R: A language and environment for statistical computing. (2014).
58. Wickham, H. & Henry, L. tidyr: Easily Tidy Data with 'spread()' and 'gather()' Functions. Preprint at https://cran.r-project.org/package=tidyr (2018).
59. Wickham, H. ggplot2: Elegant Graphics for Data Analysis. (2016).
60. Letunic, I. & Bork, P. Interactive Tree Of Life (iTOL) v4: recent updates and new developments. Nucleic Acids Res 47, W256–W259 (2019).
61. Mi, H. et al. PANTHER version 11: expanded annotation data from Gene Ontology and Reactome pathways, and data analysis tool enhancements. Nucleic Acids Res 45, D183–D189 (2017).
62. Paps, J. & Holland, P. W. H. Reconstruction of the ancestral metazoan genome reveals an increase in genomic novelty. Nat Commun 9, 1730 (2018).

Additional Declarations

The authors declare no competing interests.
© Research Square 2026 | ISSN 2693-5015 (online) {"props":{"pageProps":{"initialData":{"identity":"rs-7464600","acceptedTermsAndConditions":true,"allowDirectSubmit":true,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":505945283,"identity":"de494e83-8193-4393-a2d2-61c07cfe3e8c","order_by":0,"name":"Alexander M.C. Bowles","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAABIElEQVRIie2RMUvEMBiGv1BIl2DWyMn1L3wlcFPFv2I5KA4nFG4RHDwo5BbBNT/CQXG5sRKwS8BVuKUi1MXB4upgrh6okEPdBPtAyAfJw5uXAPT0/EVCt8gMYMttD4Cw8/kw8CrBWqFulE5hv1LowM3fK7wIbgRZJKkK581xnieM83OEVkHEZ0yiRxGGjpHYLFXMjpYaM7atn5BoC7Eumdz3xRgW10SZQyUmdMnQMLyzGLAjIBfAZOkxIsOfy06JHpvph4Kwt0lxd8g6BUZBp9yedinpSvE9LDZUYmqzE8UmcvDehebX2oqxNnTqqz+sika0i0TysLp/yV+TIefmsm5Vsns2L66Er/6Kr/HiAF0FsekjffCq/vHdnp6enn/BGxfYWUpgE9VfAAAAAElFTkSuQmCC","orcid":"","institution":"University of Oxford","correspondingAuthor":true,"prefix":"","firstName":"Alexander","middleName":"M.C.","lastName":"Bowles","suffix":""},{"id":505945284,"identity":"c3db8ef5-6803-4f06-a037-0404b2fb5108","order_by":1,"name":"Jordi Paps","email":"","orcid":"","institution":"University of Bristol","correspondingAuthor":false,"prefix":"","firstName":"Jordi","middleName":"","lastName":"Paps","suffix":""}],"badges":[],"createdAt":"2025-08-26 
15:48:52","currentVersionCode":1,"declarations":{"humanSubjects":false,"vertebrateSubjects":false,"conflictsOfInterestStatement":false,"humanSubjectEthicalGuidelines":false,"humanSubjectConsent":false,"humanSubjectClinicalTrial":false,"humanSubjectCaseReport":false,"vertebrateSubjectEthicalGuidelines":false},"doi":"10.21203/rs.3.rs-7464600/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-7464600/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":90086663,"identity":"6ce5d422-0857-4da4-a0ab-f12bd28760f5","added_by":"auto","created_at":"2025-08-28 10:13:20","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":182662,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eA.\u003c/strong\u003e Phylogenetic tree of angiosperms based on 765 genes, inferred with site heterogenous models in IQTree, with bootstraps highlighted at each node. Taxa are colour coded as follows; outgroup: no colour, ANA grade: grey, Ceratophyllales: red, Monocots: pink, Magnoliids: green, Chloranthales: yellow, Eudicots: blue. Image credits are available in Data S10. \u003cstrong\u003eB. \u003c/strong\u003eCondensed phylogenetic tree of angiosperms, using the colour coding above. \u003cstrong\u003eC. \u003c/strong\u003eResults of AU (Approximately Unbiased) testing.\u003c/p\u003e","description":"","filename":"1.png","url":"https://assets-eu.researchsquare.com/files/rs-7464600/v1/0e07d4a660211bd4e11ab123.png"},{"id":90085688,"identity":"3ec422dd-37a8-4c61-bb00-e9e855dde6d4","added_by":"auto","created_at":"2025-08-28 09:57:20","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":135331,"visible":true,"origin":"","legend":"\u003cp\u003eContrasting genomic paths of Monocots and Eudicots. 
The central tree shows angiosperm groups represented in the genomic dataset colour coded as follows: ANA grade: grey, Ceratophyllales: red, Monocots: pink, Magnoliids: green, Chloranthales: yellow, Eudicots: blue. Rates of gene duplication and loss for key nodes are highlighted.\u003c/p\u003e","description":"","filename":"2.png","url":"https://assets-eu.researchsquare.com/files/rs-7464600/v1/d03665d32af27bb6999b3df9.png"},{"id":90085910,"identity":"7ba23e22-a0e7-45dd-a09a-3623fe913b2c","added_by":"auto","created_at":"2025-08-28 10:05:20","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":181816,"visible":true,"origin":"","legend":"\u003cp\u003eThe history of angiosperms is rooted in gene loss. Functional analysis of lost genes in Monocots (pink) and duplicated genes in Eudicots (blue). Silhouettes are sourced from phylopic.\u003c/p\u003e","description":"","filename":"3.png","url":"https://assets-eu.researchsquare.com/files/rs-7464600/v1/080ab53b10870856dbe6c353.png"},{"id":90085691,"identity":"45e586e6-fa28-4c71-9b38-980754b9cbeb","added_by":"auto","created_at":"2025-08-28 09:57:20","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":97455,"visible":true,"origin":"","legend":"\u003cp\u003eGene evolution of flowering plants. \u003cstrong\u003eA.\u003c/strong\u003e shows the rates of gene duplication and loss with the origin of particular angiosperm orders. 
\u003cstrong\u003eB.\u003c/strong\u003e shows the gene space of 273 archaeplastid genome, colour coded by taxonomic group.\u003c/p\u003e","description":"","filename":"4.png","url":"https://assets-eu.researchsquare.com/files/rs-7464600/v1/e68e44a7abd7eb1530c85a46.png"},{"id":90085694,"identity":"5ccc3205-bb87-4575-ae1a-557370f329e9","added_by":"auto","created_at":"2025-08-28 09:57:20","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":97929,"visible":true,"origin":"","legend":"\u003cp\u003eSummary model for the divergence of monocots.\u003c/p\u003e","description":"","filename":"5.png","url":"https://assets-eu.researchsquare.com/files/rs-7464600/v1/8d21c38971467014d07df26f.png"},{"id":90086825,"identity":"4b7cb5b7-1255-4dc4-b579-eff9d40dfdf7","added_by":"auto","created_at":"2025-08-28 10:21:20","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1292549,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7464600/v1/f97946bf-fa67-45bb-bc2c-30cbef1c8a4f.pdf"}],"financialInterests":"The authors declare no competing interests.","formattedTitle":"\u003cp\u003eThe genomes of flowering plants reveal contrasting evolutionary paths in monocots and eudicots\u003c/p\u003e","fulltext":[{"header":"Main","content":"\u003cp\u003eFlowering plants (angiosperms) are the foundation of most of Earth’s terrestrial ecosystems and the basis of human agricultural success\u003csup\u003e\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e,\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e\u003c/sup\u003e. They are of significant socioeconomic importance, with major crops and ornamental plants found within monocots (e.g. wheat, rice, orchids), eudicots (e.g. legumes, potatoes, sunflowers, roses), and other angiosperms (e.g. 
magnolias, avocado, pepper, water lilies). As the most successful lineage of land plants, they exhibit a remarkable diversity of form, function, and ecology\u003csup\u003e\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e,\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e\u003c/sup\u003e. Their origin and radiation have shaped climates\u003csup\u003e\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e,\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e\u003c/sup\u003e, the structure of the biosphere\u003csup\u003e\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e\u003c/sup\u003e, and driven the evolution of other organisms — such as fungi, insects, and birds\u003csup\u003e\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e\u003c/sup\u003e — with whom they have also co-evolved. Terrestrial biodiversity is therefore inherently intertwined with angiosperm diversification, highlighting their critical role in promoting ecosystem complexity.\u003c/p\u003e\u003cp\u003eThe improving picture of flowering plant interrelationships has highlighted the diversification of the main lineages of angiosperms, namely the ANA grade, Ceratophyllales, Magnoliids, Monocotyledons, Chloranthales, and Eudicotyledons\u003csup\u003e\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e,\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e\u003c/sup\u003e. The two largest groups, the monocots and eudicots, comprise over 95% of flowering plant diversity\u003csup\u003e\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e\u003c/sup\u003e, but have key differences in their morphology, ecology and life histories. However, the genomic basis of these differences and their evolution is not fully understood. 
The dramatic increase of whole genome sampling offers a timely opportunity to understand this iconic episode of plant evolution. Pinpointing how these two lineages evolved to be so different requires a detailed reconstruction of the genomic changes underpinning flowering plant evolution. Here we produce a statistically supported angiosperm tree of life using an extensive sampling of complete genomes\u003csup\u003e\u003cspan additionalcitationids=\"CR17\" citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e–\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e\u003c/sup\u003e, with a phylogenetically broad outgroup. With this tree, we then investigate the role of gene duplication and loss in the origin of distinct angiosperm groups. Our analyses reveal that the evolution of monocots was driven by major reductions in their gene complement, while the emergence of eudicots was underlined by gene duplication.\u003c/p\u003e\n\u003ch3\u003eAnalysing flowering plant genome evolution\u003c/h3\u003e\n\u003cp\u003eDefining the genomic changes accompanying flowering plant evolution is key to unravelling the molecular basis of their biological innovations. We curated a dataset of 1181 flowering plant whole genomes comprising 50 out of 64 angiosperm orders representing 98% of flowering plant species (Data S1)\u003csup\u003e\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e\u003c/sup\u003e. With variable genome quality and some groups overrepresented in databases\u003csup\u003e\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e\u003c/sup\u003e, five genomes per order were selected based on protein-coding gene completeness and phylogenetic distribution to produce a high quality and taxonomically balanced dataset. Genomes for non-flowering land plants and other archaeplastids (red, green, and streptophyte algae) to use as outgroup taxa were added. 
The final dataset consists of 273 genomes (Data S2), including a broad representation of algae (36), bryophytes (14), lycophytes (3), ferns (7), gymnosperms (11) and angiosperms (202). To the best of our knowledge, this constitutes the largest, most taxonomically comprehensive dataset of angiosperm genomes to date.\u003c/p\u003e\u003cp\u003eThe 273 archaeplastid genomes, altogether containing 10.2\u0026nbsp;million proteins, were clustered into 223,333 orthogroups, sets of evolutionarily related genes which may include orthologs and paralogs. Although resolution is improving, the topology of the flowering plant phylogeny remains a long-standing question in evolutionary biology\u003csup\u003e\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e,\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e,\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e,\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e\u003c/sup\u003e. To infer the evolutionary relationships amongst these 273 species, the genome dataset was queried for orthogroups with at least 60% taxonomic occupancy, yielding 765 orthogroups. These genes were analysed under site-heterogeneous models to resolve flowering plant relationships (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e). Our analysis recovers the well-established placement of Amborellales and Nymphaeales as successive sister groups to the mesangiosperms. Other such relationships have historically been more uncertain (e.g. the placement of Ceratophyllales). Similar to Zuntini et al.\u003csup\u003e\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e\u003c/sup\u003e, we recover Ceratophyllales as sister to all other mesangiosperms. 
Akin to the topology of the 1KP project\u003csup\u003e\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e\u003c/sup\u003e, our analyses recover monocots as sister to all remaining mesangiosperms as well as Magnoliids and Chloranthales as the closest relatives of eudicots. Statistical testing rejected other alternative flowering plant topologies (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e, Figure S1 \u0026amp; S2)\u003csup\u003e\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e,\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e,\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e\u003c/sup\u003e, giving further support to our phylogeny. As such, we use this species tree to understand the evolution of angiosperm gene content. Additionally, within these major groups, order level relationships are largely consistent with published phylogenies\u003csup\u003e\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e,\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e,\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e\u003ch2\u003eLarge differences in gene content between monocots and eudicots\u003c/h2\u003e\u003cp\u003eWe determined broad-scale differences in gene content in the morphologically divergent monocots and eudicots using sophisticated gene tree-species tree reconciliation methods (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e, Data S3 \u0026amp; S4, Figure S3). We investigated the duplication and loss of genes within orthogroups in all the main lineages of flowering plants using AleRax\u003csup\u003e\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e\u003c/sup\u003e. 
These analyses revealed large scale genome reduction with the origin of monocots, with gene loss estimated in 9480 orthogroups. These rates of gene loss were over ten-fold greater compared to gene duplication (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e). In contrast, the origin of eudicots was accompanied by modest gene duplication (3356 orthogroups), with rates of gene duplication compared to gene loss estimated at about two-fold greater (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e). Patterns of gene evolution across the other major groups of flowering plants also revealed high rates of gene duplication in angiosperms and core eudicots (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e). Both nodes have been previously linked to whole genome duplication (WGD) events\u003csup\u003e\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e\u003c/sup\u003e. In addition, patterns of reductive evolution continued within the monocots, in the commelinids, correlating with a reduction in ancestral genome size\u003csup\u003e\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eWe also investigated the role of novel core genes in the origin and early evolution of angiosperms (Data S5). These are defined as orthogroups that are present in all (or all bar one) members of a clade and absent in all outgroup taxa. A single novel core orthogroup is found across all flowering plant genomes which is involved in plant immunity, containing a NOI (RIN4-like) domain\u003csup\u003e\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e\u003c/sup\u003e; this domain plays a crucial role in plant immune signalling, particularly in effector-triggered immunity, by binding to virulence factors from infecting bacteria. 
For monocots and eudicots, no conserved gene novelties were observed (Data S5). These analyses agree with recent findings highlighting a period of exceptional diversification in the early evolution of angiosperms, in which 80% of extant orders originated\u003csup\u003e\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e\u003c/sup\u003e, with high rates of gene tree conflict\u003csup\u003e\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e\u003c/sup\u003e and whole genome duplication\u003csup\u003e\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e\u003c/sup\u003e. This contrasts to older plant lineages which experienced two early bursts of gene novelty in land plants and multicellular streptophytes respectively (Data S5)\u003csup\u003e\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e\u003c/sup\u003e and corresponds to patterns seen in animals and fungi whose genomes are characterised by early conserved gene novelty, followed by more recent periods of dynamic genome evolution\u003csup\u003e\u003cspan additionalcitationids=\"CR25 CR26\" citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e–\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003c/div\u003e\n\u003ch3\u003eReductive evolution in monocots\u003c/h3\u003e\n\u003cp\u003eFunctional analysis suggested that genomic loss (identified by AleRax) underpins the morphology of monocots (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e). Gene Ontology (GO) terms suggested that this is underpinned by widespread gene loss across numerous functional categories, with especially high losses in gene regulation, protein synthesis, and signal transduction. 
These losses likely contributed to their streamlined metabolism and distinct developmental architecture.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eMonocots have a single cotyledon, parallel leaf venation, complex vascular organisation, fibrous roots and trimerous flowers\u003csup\u003e\u003cspan additionalcitationids=\"CR29 CR30 CR31\" citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e–\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e\u003c/sup\u003e. Eudicots by comparison have two cotyledons, reticulate leaf venation, vascular tissue organised in rings, a taproot and tetra- to pentamerous flowers\u003csup\u003e\u003cspan additionalcitationids=\"CR29 CR30 CR31\" citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e–\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e\u003c/sup\u003e. The genetic networks underpinning the development of these innovations were queried against the list of lost genes (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e, Data S6)\u003csup\u003e\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e,\u003cspan additionalcitationids=\"CR34\" citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e–\u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e\u003c/sup\u003e. Several gene families lost from the first monocots have been functionally shown to have single or fused cotyledon phenotypes in \u003cem\u003eArabidopsis\u003c/em\u003e, offering potential mechanistic insights into monocotyledons. These include CUP-SHAPED COTYLEDON (CUC), SHOOTMERISTEMLESS (STM) and members of the TIR (TOLL/INTERLEUKIN-1 RECEPTOR-LIKE) family. 
Leaf morphogenesis genes were lost at every stage of leaf development, ranging from establishment to transition, through modification and senescence (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e, Data S6 \u0026amp; S7). This suggests that the first monocots constructed their leaves differently from eudicots. Key components of primary root development were lost in the first monocots, including members of the WOX (WUSCHEL-RELATED HOMEOBOX), TIRs (as above) and EIN3-4 (ETHYLENE INSENSITIVE) orthogroups, which coincides with their fibrous root habit (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e, Data S6 \u0026amp; S7). Finally, fundamental regulators in floral development exhibited subfamily loss in monocots, including SEPALLATA, APETALAs, AGAMOUS and PISTILLATA (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e, Data S6 \u0026amp; S7).\u003c/p\u003e\n\u003ch3\u003eEvolution by genome expansion in eudicots\u003c/h3\u003e\n\u003cp\u003eGene duplication, as identified by AleRax, marks the origin of eudicots, with the largest number of duplicated genes related to gene regulation and signalling (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e). Eudicots show a radiation of floral development genes, including the SEPALLATA, APETALA, PISTILLATA and AGAMOUS gene groups (Data S8 \u0026amp; S9). GO terms indicated that these genes function in floral whorl development, meristem identity and the maintenance of floral organ identity, potentially linked to the first tetramerous flowers (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e, Data S8 \u0026amp; S9). Genes associated with vernalisation responses were also among these expansions (FRIGIDA, VERNALISATION1-2). 
Further duplications were seen in the evolution of reproductive structures (anther wall development, male gamete generation), coinciding with the origin of tricolpate pollen in flowering plants\u003csup\u003e\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e\u003c/sup\u003e (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e, Data S8 \u0026amp; S9). Furthermore, the regulation of cell wall organisation was associated with these duplications, including the callose synthase, expansin and pectinesterase gene families (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e, Data S8 \u0026amp; S9). The biosynthetic pathways of several compounds expanded during this period of eudicot radiation, including beta-D-glucan and flavonol biosynthesis (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e, Data S8 \u0026amp; S9).\u003c/p\u003e\n\u003ch3\u003eGene diversity early in angiosperm evolution\u003c/h3\u003e\n\u003cp\u003eGene family diversification within orders pointed to a history of gene loss, compensated by modest gene duplication (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e). This suggests that flowering plant orders evolved through a dynamic process involving both the reduction and expansion of specific gene families. Additionally, novel core orthogroups were observed at the order level (Data S5), with NO APICAL MERISTEM (NAM) proteins (19) being a common observation. This protein domain controls boundary formation and lateral organ separation, vital for leaf and flower patterning\u003csup\u003e\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e\u003c/sup\u003e. 
These could have contributed to the development of elaborate plant architecture seen within angiosperm orders.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eTo understand the diversity of flowering plant genomes early in their evolutionary history, we analysed plant gene disparity, based on the shared presence and absence of genes (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e). The distances between species in this space shows the dissimilarity in gene content. Non-metric multidimensional scaling revealed that angiosperms comprise a third of the theoretical gene space of plants; non-flowering land plants and archaeplastid algae consist of the other two thirds. The majority of angiosperm gene space is tightly clustered within this third, with several plant specialists occupying the extremes (e.g. \u003cem\u003eSapria himalayana\u003c/em\u003e, a rare holoparasite)\u003csup\u003e\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e\u003c/sup\u003e. This suggests that angiosperms reached a core genomic diversity early in their evolutionary history (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e). This corresponds to their high floral morphological diversity which had emerged by the early Cretaceous\u003csup\u003e\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\n\u003ch3\u003eContrasting genomic paths\u003c/h3\u003e\n\u003cp\u003eTogether, the emerging picture from our analyses indicates that dynamic genome evolution has shaped the origin of eudicots and monocots (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e). The first monocot genomes are characterised by gene reduction (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e). 
By contrast, the first eudicot genomes are characterised by the duplication and diversification of genes (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e). These two strategies have most likely led to the dominance of eudicots and monocots over the other groups of flowering plants. Despite these initial contrasting genomic paths, taxa within these lineages have continued to diversify, with an accumulation of genetic changes that postdate the divergence of monocots and eudicots. For example, complex patterns of genome duplication are seen within the grasses and orchids\u003csup\u003e\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e,\u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e\u003c/sup\u003e. Recent studies have used taxonomically comprehensive transcriptomic data to demonstrate high diversification rates and gene family expansion in the major plant groups\u003csup\u003e\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e,\u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e42\u003c/span\u003e\u003c/sup\u003e. As a consensus on flowering plant relationships is reached, the extent to which patterns of dynamic genome evolution persist in the Magnoliids, Chloranthales, Ceratophyllales and ANA grade will become clearer.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eThe large-scale gene loss seen in monocots (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e) as well as in flowering plant orders (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e) challenges the view that plant evolution is driven solely by WGD. Indeed, the patterns of gene loss we recover do not appear to be associated with known whole genome duplication events (Figure S4). While clearly an important mechanism of angiosperm diversification, WGD is but one force shaping plant genomes. 
The significance of these patterns and processes is also being realised in other nodes of the tree of life (e.g. annelids, carnivorous plants)\u003csup\u003e\u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e43\u003c/span\u003e,\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e\u003c/sup\u003e. During early animal evolution, the origin of Deuterostomes and Ecdysozoa is dominated by gene loss\u003csup\u003e\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e\u003c/sup\u003e. Collectively, these studies highlight the importance of gene loss as well as gene gain throughout evolution.\u003c/p\u003e\u003cp\u003eThe drivers behind the divergence of these two main groups remain unclear. Some hypotheses have suggested that adaptation to aquatic environments could explain the divergence of monocots\u003csup\u003e\u003cspan citationid=\"CR45\" class=\"CitationRef\"\u003e45\u003c/span\u003e\u003c/sup\u003e. This is largely based on the life histories of the first two splitting lineages: Acorales, a group mostly found in wetlands, and Alismatales, which contains the duckweeds and seagrasses. The change to a fibrous root system, parallel leaf venation and reorganisation of vascular tissues could potentially be beneficial for adaptations to an aquatic environment. In this context, the simplification of structures such as the taproot might have reduced energy expenditure in monocots, enabling them to allocate resources towards other survival-enhancing traits. 
Whatever the driver, this divergence, which occurred over 125\u0026nbsp;million years ago during the early Cretaceous\u003csup\u003e\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e\u003c/sup\u003e, reflects a deeper evolutionary shift in plant architecture and functionality.\u003c/p\u003e\u003cp\u003eGene family contractions in monocots likely involved not only the deletion of entire genes but also the loss of gene function through rapid mutations and pseudogenization\u003csup\u003e\u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e46\u003c/span\u003e,\u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e47\u003c/span\u003e\u003c/sup\u003e. Chromosomal rearrangements and genome size variation likely contributed to their morphological divergence from eudicots\u003csup\u003e\u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e48\u003c/span\u003e\u003c/sup\u003e. In addition, this study solely focuses on protein-coding genes; however, non-coding genes, regulatory regions, and epigenetic modifications most likely contributed to the diversification of plant life. The analysis presented here, which incorporates genomic data for 273 plants from across the tree of life, provides new insight into the composition of flowering plant genomes and emphasizes the role of genome evolution in the radiation of angiosperms.\u003c/p\u003e\u003cdiv id=\"Sec8\" class=\"Section2\"\u003e\u003cdiv id=\"Sec9\" class=\"Section3\"\u003e\u003c/div\u003e\u003c/div\u003e\n\n\u003cdiv id=\"Sec13\" class=\"Section2\"\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003c/div\u003e"},{"header":"Methods","content":"\u003ch2\u003eCompiling genomic dataset\u003c/h2\u003e\n\u003cp\u003eBroad taxonomic sampling of genomic data was implemented to accurately infer the origin and diversification of angiosperm gene content (Data S1). In total, over 1000 flowering plant genomes were identified in the literature, by 31st January 2024 (Data S1). 
BUSCO analysis was used to assess the quality of genome annotation, using a threshold of \u0026lt;\u0026thinsp;30% missing genes in the BUSCO Eukaryota dataset as a benchmark to accept a genome for further analysis (Data S1)\u003csup\u003e\u003cspan class=\"CitationRef\"\u003e49\u003c/span\u003e\u003c/sup\u003e. To improve computational tractability, a maximum of five species per flowering plant order was chosen. In total, 273 archaeplastid genomes were downloaded, equating to 10,227,129 predicted proteins, including 202 flowering plants and 71 archaeplastid outgroup taxa (Data S2).\u003c/p\u003e\n\u003ch3\u003eOrthology inference\u003c/h3\u003e\n\u003cp\u003eOrthoFinder (v.2.3.7) was used to cluster protein-coding genes into orthogroups\u003csup\u003e\u003cspan class=\"CitationRef\"\u003e50\u003c/span\u003e\u003c/sup\u003e, based on sequence divergence, using default settings (orthofinder -f data_folder). OrthoFinder was launched on 31st January 2024 and therefore any genomes published after this date were not included in the analysis.\u003c/p\u003e\n\u003ch2\u003eSpecies tree analysis\u003c/h2\u003e\n\u003cp\u003eSingle copy orthologs were identified using a previously described Python script\u003csup\u003e\u003cspan class=\"CitationRef\"\u003e51\u003c/span\u003e\u003c/sup\u003e, which removes paralogous genes from orthogroups. The script allows a minimum taxonomic occupancy to be specified for each orthogroup; this was set to 60%, which selected 765 genes. Single copy orthologs were aligned using MAFFT\u003csup\u003e\u003cspan class=\"CitationRef\"\u003e52\u003c/span\u003e\u003c/sup\u003e with the --auto parameter and trimmed with trimAl\u003csup\u003e\u003cspan class=\"CitationRef\"\u003e53\u003c/span\u003e\u003c/sup\u003e using the -automated1 parameter. Multiple sequence alignments were concatenated using Phyutility\u003csup\u003e\u003cspan class=\"CitationRef\"\u003e54\u003c/span\u003e\u003c/sup\u003e to create a supermatrix. 
A bootstrapped maximum likelihood phylogeny was inferred using IQ-TREE\u003csup\u003e\u003cspan class=\"CitationRef\"\u003e55\u003c/span\u003e\u003c/sup\u003e, using the Bayesian Information Criterion (BIC) to select the best-fitting substitution model and empirical profile mixture models (C10\u0026ndash;C60). Support was assessed with 1000 ultrafast bootstrap replicates.\u003c/p\u003e\n\u003ch2\u003eIdentifying duplicated and lost gene groups\u003c/h2\u003e\n\u003cp\u003eTo understand the role of gene duplication and loss in the diversification of flowering plants, gene tree-species tree reconciliation was used. Protein sequences of viable orthogroups (those with four or more members) were aligned with MAFFT\u003csup\u003e\u003cspan class=\"CitationRef\"\u003e52\u003c/span\u003e\u003c/sup\u003e using a gap threshold of 0.2, due to the large size of gene families. Aligned sequences were trimmed with trimAl\u003csup\u003e\u003cspan class=\"CitationRef\"\u003e53\u003c/span\u003e\u003c/sup\u003e using the -automated1 parameter. Maximum likelihood trees were inferred with IQ-TREE\u003csup\u003e\u003cspan class=\"CitationRef\"\u003e55\u003c/span\u003e\u003c/sup\u003e as above. Additionally, 1000 ultrafast bootstrap replicates were specified to estimate uncertainty, together with the option to write the bootstrap trees, with branch lengths, to a ufboot file. These ufboot tree files were used as input to AleRax\u003csup\u003e\u003cspan class=\"CitationRef\"\u003e56\u003c/span\u003e\u003c/sup\u003e to infer ancestral gene content and instances of gene duplication and loss. Results were visualised in R\u003csup\u003e\u003cspan class=\"CitationRef\"\u003e57\u003c/span\u003e\u003c/sup\u003e using the packages tidyr\u003csup\u003e\u003cspan class=\"CitationRef\"\u003e58\u003c/span\u003e\u003c/sup\u003e and ggplot2\u003csup\u003e59\u003c/sup\u003e. 
Trees were plotted in iTOL\u003csup\u003e\u003cspan class=\"CitationRef\"\u003e60\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\n\u003cp\u003eTo obtain a functional description for various flowering plant gene groups, proteins were assessed using Gene Ontology\u003csup\u003e\u003cspan class=\"CitationRef\"\u003e61\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\n\u003ch2\u003eIdentifying gained and lost gene groups\u003c/h2\u003e\n\u003cp\u003eThe OrthoFinder gene count table (Orthogroups.GeneCount.tsv) was analysed to identify the origin and complete loss of orthologous groups of proteins (OGs) based upon their taxonomic occupancy (Data S4). Different sets of OGs can be analysed (initially defined in Paps and Holland\u003csup\u003e\u003cspan class=\"CitationRef\"\u003e62\u003c/span\u003e\u003c/sup\u003e):\u003c/p\u003e\n\u003cul\u003e\n \u003cli\u003e\n \u003cp\u003eAncestral (OGs present in the Last Common Ancestor of a clade),\u003c/p\u003e\n \u003c/li\u003e\n \u003cli\u003e\n \u003cp\u003eAncestral Core (OGs present in every representative species within a clade or absent in only one genome),\u003c/p\u003e\n \u003c/li\u003e\n \u003cli\u003e\n \u003cp\u003eNovel (OGs present in the Last Common Ancestor of a clade and absent in all outgroup taxa),\u003c/p\u003e\n \u003c/li\u003e\n \u003cli\u003e\n \u003cp\u003eNovel Core (OGs present in every representative species within a clade, or absent in only one genome, and absent in all outgroup taxa),\u003c/p\u003e\n \u003c/li\u003e\n \u003cli\u003e\n \u003cp\u003eLost (OGs lost in the Last Common Ancestor of a clade).\u003c/p\u003e\n \u003c/li\u003e\n\u003c/ul\u003e\n\u003ch2\u003eClustering of gene content\u003c/h2\u003e\n\u003cp\u003eTo understand genome diversity during early flowering plant evolution, we clustered the count data for the protein-coding genes. 
Count data were transformed to presence/absence, and non-metric multidimensional scaling (NMDS) coordinates were calculated in R with the vegan package and plotted with ggplot2\u003csup\u003e\u003cspan class=\"CitationRef\"\u003e59\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eData availability\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eData will be available on FigShare\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eSilvestro, D. \u003cem\u003eet al.\u003c/em\u003e Fossil data support a pre-Cretaceous origin of flowering plants. \u003cem\u003eNat Ecol Evol\u003c/em\u003e \u003cstrong\u003e5\u003c/strong\u003e, 449\u0026ndash;457 (2021).\u003c/li\u003e\n\u003cli\u003eLi, H. T. \u003cem\u003eet al.\u003c/em\u003e Origin of angiosperms and the puzzle of the Jurassic gap. \u003cem\u003eNat Plants\u003c/em\u003e \u003cstrong\u003e5\u003c/strong\u003e, 461\u0026ndash;470 (2019).\u003c/li\u003e\n\u003cli\u003eJiao, Y. \u003cem\u003eet al.\u003c/em\u003e Ancestral polyploidy in seed plants and angiosperms. \u003cem\u003eNature\u003c/em\u003e \u003cstrong\u003e473\u003c/strong\u003e, 97\u0026ndash;100 (2011).\u003c/li\u003e\n\u003cli\u003eClark, J. W. \u0026amp; Donoghue, P. C. J. Uncertainty in the timing of diversification of flowering plants rests with equivocal interpretation of their fossil record. \u003cem\u003eR Soc Open Sci\u003c/em\u003e \u003cstrong\u003e12\u003c/strong\u003e, (2025).\u003c/li\u003e\n\u003cli\u003eLeebens-Mack, J. H. \u003cem\u003eet al.\u003c/em\u003e One thousand plant transcriptomes and the phylogenomics of green plants. \u003cem\u003eNature\u003c/em\u003e \u003cstrong\u003e574\u003c/strong\u003e, 679\u0026ndash;685 (2019).\u003c/li\u003e\n\u003cli\u003eZuntini, A. R. \u003cem\u003eet al.\u003c/em\u003e Phylogenomics and the rise of the angiosperms. \u003cem\u003eNature\u003c/em\u003e (2024).\u003c/li\u003e\n\u003cli\u003eDiamond, J. 
Evolution, consequences and future of plant and animal domestication. \u003cem\u003eNature\u003c/em\u003e \u003cstrong\u003e418\u003c/strong\u003e, 700\u0026ndash;707 (2002).\u003c/li\u003e\n\u003cli\u003eGovaerts, R., Nic Lughadha, E., Black, N., Turner, R. \u0026amp; Paton, A. The World Checklist of Vascular Plants, a continuously updated resource for exploring global plant diversity. \u003cem\u003eScientific Data 2021 8:1\u003c/em\u003e \u003cstrong\u003e8\u003c/strong\u003e, 1\u0026ndash;10 (2021).\u003c/li\u003e\n\u003cli\u003eBenton, M. J., Wilf, P. \u0026amp; Sauquet, H. The Angiosperm Terrestrial Revolution and the origins of modern biodiversity. \u003cem\u003eNew Phytologist\u003c/em\u003e \u003cstrong\u003e233\u003c/strong\u003e, 2017\u0026ndash;2035 (2021).\u003c/li\u003e\n\u003cli\u003eMoyroud, E. \u0026amp; Glover, B. J. The Evolution of Diverse Floral Morphologies. \u003cem\u003eCurrent Biology\u003c/em\u003e \u003cstrong\u003e27\u003c/strong\u003e, R941\u0026ndash;R951 (2017).\u003c/li\u003e\n\u003cli\u003eHolbourn, A. E. \u003cem\u003eet al.\u003c/em\u003e Late Miocene climate cooling and intensification of southeast Asian winter monsoon. \u003cem\u003eNat Commun\u003c/em\u003e \u003cstrong\u003e9\u003c/strong\u003e, 1\u0026ndash;13 (2018).\u003c/li\u003e\n\u003cli\u003eGurung, K. \u003cem\u003eet al.\u003c/em\u003e Climate windows of opportunity for plant expansion during the Phanerozoic. \u003cem\u003eNat Commun\u003c/em\u003e \u003cstrong\u003e13\u003c/strong\u003e, 1\u0026ndash;9 (2022).\u003c/li\u003e\n\u003cli\u003ePennington, R. T., Crook, Q. C. B. \u0026amp; Richardson, J. A. Introduction and synthesis: Plant phylogeny and the origin of major biomes. \u003cem\u003ePhilosophical Transactions of the Royal Society B: Biological Sciences\u003c/em\u003e \u003cstrong\u003e359\u003c/strong\u003e, 1455\u0026ndash;1464 (2004).\u003c/li\u003e\n\u003cli\u003eChase, M. W. 
\u003cem\u003eet al.\u003c/em\u003e An update of the Angiosperm Phylogeny Group classification for the orders and families of flowering plants: APG IV. \u003cem\u003eBotanical Journal of the Linnean Society\u003c/em\u003e \u003cstrong\u003e181\u003c/strong\u003e, 1\u0026ndash;20 (2016).\u003c/li\u003e\n\u003cli\u003eChristenhusz, M. J. M. \u0026amp; Byng, J. W. The number of known plant species in the world and its annual increase. \u003cem\u003ePhytotaxa\u003c/em\u003e \u003cstrong\u003e261\u003c/strong\u003e, 201\u0026ndash;217 (2016).\u003c/li\u003e\n\u003cli\u003eMarks, R. A., Hotaling, S., Frandsen, P. B. \u0026amp; VanBuren, R. Representation and participation across 20 years of plant genome sequencing. \u003cem\u003eNat Plants\u003c/em\u003e \u003cstrong\u003e7\u003c/strong\u003e, 1571\u0026ndash;1578 (2021).\u003c/li\u003e\n\u003cli\u003eBernal-Gallardo, J. J. \u0026amp; Folter, S. de. Plant genome information facilitates plant functional genomics. \u003cem\u003ePlanta\u003c/em\u003e \u003cstrong\u003e259\u003c/strong\u003e, (2024).\u003c/li\u003e\n\u003cli\u003eSchwacke, R., Bolger, M. E. \u0026amp; Usadel, B. PubPlant \u0026ndash; a continuously updated online resource for sequenced and published plant genomes. \u003cem\u003eFront Plant Sci\u003c/em\u003e \u003cstrong\u003e16\u003c/strong\u003e, 1603547 (2025).\u003c/li\u003e\n\u003cli\u003eMorel, B., Williams, T. A., Stamatakis, A. \u0026amp; Sz\u0026ouml;llősi, G. J. AleRax: a tool for gene and species tree co-estimation and reconciliation under a probabilistic model of gene duplication, transfer, and loss. \u003cem\u003eBioinformatics\u003c/em\u003e \u003cstrong\u003e40\u003c/strong\u003e, (2024).\u003c/li\u003e\n\u003cli\u003eClark, J. W. \u0026amp; Donoghue, P. C. J. Whole-Genome Duplication and Plant Macroevolution. \u003cem\u003eTrends Plant Sci\u003c/em\u003e \u003cstrong\u003e23\u003c/strong\u003e, 933\u0026ndash;945 (2018).\u003c/li\u003e\n\u003cli\u003eCarta, A., Bedini, G. 
\u0026amp; Peruzzi, L. A deep dive into the ancestral chromosome number and genome size of flowering plants. \u003cem\u003eNew Phytologist\u003c/em\u003e \u003cstrong\u003e228\u003c/strong\u003e, 1097\u0026ndash;1106 (2020).\u003c/li\u003e\n\u003cli\u003eAfzal, A. J., Kim, J. H. \u0026amp; Mackey, D. The role of NOI-domain containing proteins in plant immune signaling. \u003cem\u003eBMC Genomics\u003c/em\u003e \u003cstrong\u003e14\u003c/strong\u003e, 327 (2013).\u003c/li\u003e\n\u003cli\u003eBowles, A. M. C., Bechtold, U. \u0026amp; Paps, J. The Origin of Land Plants Is Rooted in Two Bursts of Genomic Novelty. \u003cem\u003eCurrent Biology\u003c/em\u003e \u003cstrong\u003e30\u003c/strong\u003e, 530\u0026ndash;536 (2020).\u003c/li\u003e\n\u003cli\u003eGuijarro-Clarke, C., Holland, P. W. H. \u0026amp; Paps, J. Widespread patterns of gene loss in the evolution of the animal kingdom. \u003cem\u003eNat Ecol Evol\u003c/em\u003e \u003cstrong\u003e4\u003c/strong\u003e, 519\u0026ndash;523 (2020).\u003c/li\u003e\n\u003cli\u003eOca\u0026ntilde;a-Pallar\u0026egrave;s, E. \u003cem\u003eet al.\u003c/em\u003e Divergent genomic trajectories predate the origin of animals and fungi. \u003cem\u003eNature\u003c/em\u003e \u003cstrong\u003e609\u003c/strong\u003e, 747\u0026ndash;753 (2022).\u003c/li\u003e\n\u003cli\u003eFern\u0026aacute;ndez, R. \u0026amp; Gabald\u0026oacute;n, T. Gene gain and loss across the metazoan tree of life. \u003cem\u003eNat Ecol Evol\u003c/em\u003e \u003cstrong\u003e4\u003c/strong\u003e, 524\u0026ndash;533 (2020).\u003c/li\u003e\n\u003cli\u003ePaps, J., Rossi, M. E., Bowles, A. M. C. \u0026amp; Alvarez-Presas, M. Assembling animals: trees, genomes, cells, and contrast to plants. \u003cem\u003eFront Ecol Evol\u003c/em\u003e \u003cstrong\u003e11\u003c/strong\u003e, (2023).\u003c/li\u003e\n\u003cli\u003ePerico, C., Tan, S. \u0026amp; Langdale, J. A. Developmental regulation of leaf venation patterns: monocot versus eudicots and the role of auxin. 
\u003cem\u003eNew Phytologist\u003c/em\u003e \u003cstrong\u003e234\u003c/strong\u003e, 783\u0026ndash;803 (2022).\u003c/li\u003e\n\u003cli\u003eSauquet, H. \u003cem\u003eet al.\u003c/em\u003e The ancestral flower of angiosperms and its early diversification. \u003cem\u003eNat Commun\u003c/em\u003e \u003cstrong\u003e8\u003c/strong\u003e, (2017).\u003c/li\u003e\n\u003cli\u003eJohn W. Chandler. Cotyledon organogenesis. \u003cem\u003eJ Exp Bot\u003c/em\u003e \u003cstrong\u003e59\u003c/strong\u003e, 2917\u0026ndash;2931 (2008).\u003c/li\u003e\n\u003cli\u003eBurger, W. C. The Question of Cotyledon Homology in Angiosperms. \u003cem\u003eThe Botanical Review\u003c/em\u003e \u003cstrong\u003e64\u003c/strong\u003e, 356\u0026ndash;371 (1998).\u003c/li\u003e\n\u003cli\u003eScarpella, E. \u0026amp; Meijer, A. H. Pattern formation in the vascular system of monocot and dicot plant species. \u003cem\u003eNew Phytologist\u003c/em\u003e \u003cstrong\u003e164\u003c/strong\u003e, 209\u0026ndash;242 (2004).\u003c/li\u003e\n\u003cli\u003eChen, C. \u0026amp; Du, X. LEAFY COTYLEDONs: Connecting different stages of plant development. \u003cem\u003eFront Plant Sci\u003c/em\u003e \u003cstrong\u003e13\u003c/strong\u003e, (2022).\u003c/li\u003e\n\u003cli\u003eJung, J. K. H. H. \u0026amp; McCouch, S. Getting to the roots of it: Genetic and hormonal control of root architecture. \u003cem\u003eFront Plant Sci\u003c/em\u003e \u003cstrong\u003e4\u003c/strong\u003e, (2013).\u003c/li\u003e\n\u003cli\u003eChanderbali, A. S., Berger, B. A., Howarth, D. G., Soltis, P. S. \u0026amp; Soltis, D. E. Evolving ideas on the origin and evolution of flowers: New perspectives in the genomic era. \u003cem\u003eGenetics\u003c/em\u003e \u003cstrong\u003e202\u003c/strong\u003e, 1255\u0026ndash;1265 (2016).\u003c/li\u003e\n\u003cli\u003eDoyle, J. A. Molecular and Fossil Evidence on the Origin of Angiosperms. 
\u003cem\u003eAnnu Rev Earth Planet Sci\u003c/em\u003e \u003cstrong\u003e40\u003c/strong\u003e, 301\u0026ndash;326 (2012).\u003c/li\u003e\n\u003cli\u003eCheng, X. \u003cem\u003eet al.\u003c/em\u003e NO APICAL MERISTEM (MtNAM) regulates floral organ identity and lateral organ separation in Medicago truncatula. \u003cem\u003eNew Phytologist\u003c/em\u003e \u003cstrong\u003e195\u003c/strong\u003e, 71\u0026ndash;84 (2012).\u003c/li\u003e\n\u003cli\u003eCai, L. \u003cem\u003eet al.\u003c/em\u003e Deeply Altered Genome Architecture in the Endoparasitic Flowering Plant Sapria himalayana. \u003cem\u003eCurrent Biology\u003c/em\u003e \u003cstrong\u003e31\u003c/strong\u003e, 1002\u0026ndash;1011 (2021).\u003c/li\u003e\n\u003cli\u003eL\u0026oacute;pez-Mart\u0026iacute;nez, A. M. \u003cem\u003eet al.\u003c/em\u003e Angiosperm flowers reached their highest morphological diversity early in their evolutionary history. \u003cem\u003eNew Phytologist\u003c/em\u003e \u003cstrong\u003e241\u003c/strong\u003e, 1348\u0026ndash;1360 (2023).\u003c/li\u003e\n\u003cli\u003eZhang, T. \u003cem\u003eet al.\u003c/em\u003e Phylogenomic profiles of whole-genome duplications in Poaceae and landscape of differential duplicate retention and losses among major Poaceae lineages. \u003cem\u003eNat Commun\u003c/em\u003e \u003cstrong\u003e15\u003c/strong\u003e, (2024).\u003c/li\u003e\n\u003cli\u003eZhang, G. Q. \u003cem\u003eet al.\u003c/em\u003e The Apostasia genome and the evolution of orchids. \u003cem\u003eNature\u003c/em\u003e \u003cstrong\u003e549\u003c/strong\u003e, 379\u0026ndash;383 (2017).\u003c/li\u003e\n\u003cli\u003eLandis, J. B. \u003cem\u003eet al.\u003c/em\u003e Impact of whole-genome duplication events on diversification rates in angiosperms. \u003cem\u003eAm J Bot\u003c/em\u003e \u003cstrong\u003e105\u003c/strong\u003e, 348\u0026ndash;363 (2018).\u003c/li\u003e\n\u003cli\u003eVargas-Ch\u0026aacute;vez, C. 
\u003cem\u003eet al.\u003c/em\u003e An episodic burst of massive genomic rearrangements and the origin of non-marine annelids. \u003cem\u003ebioRxiv\u003c/em\u003e 2024.05.16.594344 (2025) doi:10.1101/2024.05.16.594344.\u003c/li\u003e\n\u003cli\u003ePalfalvi, G. \u003cem\u003eet al.\u003c/em\u003e Genomes of the Venus Flytrap and Close Relatives Unveil the Roots of Plant Carnivory. \u003cem\u003eCurrent Biology\u003c/em\u003e \u003cstrong\u003e30\u003c/strong\u003e, 2312\u0026ndash;2320 (2020).\u003c/li\u003e\n\u003cli\u003eGivnish, T. J. \u003cem\u003eet al.\u003c/em\u003e Monocot plastid phylogenomics, timeline, net rates of species diversification, the power of multi-gene analyses, and a functional model for the origin of monocots. \u003cem\u003eAm J Bot\u003c/em\u003e (2018).\u003c/li\u003e\n\u003cli\u003eAlbalat, R. \u0026amp; Ca\u0026ntilde;estro, C. Evolution by gene loss. \u003cem\u003eNat Rev Genet\u003c/em\u003e \u003cstrong\u003e17\u003c/strong\u003e, 379\u0026ndash;391 (2016).\u003c/li\u003e\n\u003cli\u003eO\u0026rsquo;Malley, M. A., Wideman, J. G. \u0026amp; Ruiz-Trillo, I. Losing Complexity: The Role of Simplification in Macroevolution. \u003cem\u003eTrends Ecol Evol\u003c/em\u003e \u003cstrong\u003e31\u003c/strong\u003e, 608\u0026ndash;621 (2016).\u003c/li\u003e\n\u003cli\u003eLeitch, I. J., Beaulieu, J. M., Chase, M. W., Leitch, A. R. \u0026amp; Fay, M. F. Genome Size Dynamics and Evolution in Monocots. \u003cem\u003eJ Bot\u003c/em\u003e \u003cstrong\u003e2010\u003c/strong\u003e, 1\u0026ndash;18 (2010).\u003c/li\u003e\n\u003cli\u003eWaterhouse, R. M. \u003cem\u003eet al.\u003c/em\u003e BUSCO Applications from Quality Assessments to Gene Prediction and Phylogenomics. \u003cem\u003eMol Biol Evol\u003c/em\u003e \u003cstrong\u003e35\u003c/strong\u003e, 543\u0026ndash;548 (2018).\u003c/li\u003e\n\u003cli\u003eEmms, D. M. \u0026amp; Kelly, S. OrthoFinder: Phylogenetic orthology inference for comparative genomics. 
\u003cem\u003eGenome Biol\u003c/em\u003e \u003cstrong\u003e20\u003c/strong\u003e, (2019).\u003c/li\u003e\n\u003cli\u003eHarris, B. J., Harrison, C. J., Hetherington, A. M. \u0026amp; Williams, T. A. Phylogenomic Evidence for the Monophyly of Bryophytes and the Reductive Evolution of Stomata. \u003cem\u003eCurrent Biology\u003c/em\u003e \u003cstrong\u003e30\u003c/strong\u003e, 2001\u0026ndash;2012 (2020).\u003c/li\u003e\n\u003cli\u003eKatoh, K., Misawa, K., Kuma, K. \u0026amp; Miyata, T. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. \u003cem\u003eNucleic Acids Res\u003c/em\u003e \u003cstrong\u003e30\u003c/strong\u003e, 3059\u0026ndash;66 (2002).\u003c/li\u003e\n\u003cli\u003eCapella-Guti\u0026eacute;rrez, S., Silla-Mart\u0026iacute;nez, J. M. \u0026amp; Gabald\u0026oacute;n, T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. \u003cem\u003eBioinformatics\u003c/em\u003e \u003cstrong\u003e25\u003c/strong\u003e, 1972\u0026ndash;3 (2009).\u003c/li\u003e\n\u003cli\u003eSmith, S. A. \u0026amp; Dunn, C. W. Phyutility: A phyloinformatics tool for trees, alignments and molecular data. \u003cem\u003eBioinformatics\u003c/em\u003e \u003cstrong\u003e24\u003c/strong\u003e, 715\u0026ndash;716 (2008).\u003c/li\u003e\n\u003cli\u003eNguyen, L.-T., Schmidt, H. A., von Haeseler, A. \u0026amp; Minh, B. Q. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. \u003cem\u003eMol Biol Evol\u003c/em\u003e \u003cstrong\u003e32\u003c/strong\u003e, 268\u0026ndash;74 (2015).\u003c/li\u003e\n\u003cli\u003eMorel, B., Williams, T. A., Stamatakis, A. \u0026amp; Sz\u0026ouml;llősi, G. J. AleRax: a tool for gene and species tree co-estimation and reconciliation under a probabilistic model of gene duplication, transfer, and loss. \u003cem\u003eBioinformatics\u003c/em\u003e \u003cstrong\u003e40\u003c/strong\u003e, (2024).\u003c/li\u003e\n\u003cli\u003eR Core Team, A. 
A language and environment for statistical computing. (2014).\u003c/li\u003e\n\u003cli\u003eWickham, H. \u0026amp; Henry, L. tidyr: Easily Tidy Data with \u0026lsquo;spread()\u0026rsquo; and \u0026lsquo;gather()\u0026rsquo; Functions. Preprint at https://cran.r-project.org/package=tidyr (2018).\u003c/li\u003e\n\u003cli\u003eWickham, H. ggplot2: Elegant Graphics for Data Analysis. (2016).\u003c/li\u003e\n\u003cli\u003eLetunic, I. \u0026amp; Bork, P. Interactive Tree Of Life (iTOL) v4: recent updates and new developments. \u003cem\u003eNucleic Acids Res\u003c/em\u003e \u003cstrong\u003e47\u003c/strong\u003e, W256\u0026ndash;W259 (2019).\u003c/li\u003e\n\u003cli\u003eMi, H. \u003cem\u003eet al.\u003c/em\u003e PANTHER version 11: expanded annotation data from Gene Ontology and Reactome pathways, and data analysis tool enhancements. \u003cem\u003eNucleic Acids Res\u003c/em\u003e \u003cstrong\u003e45\u003c/strong\u003e, D183\u0026ndash;D189 (2017).\u003c/li\u003e\n\u003cli\u003ePaps, J. \u0026amp; Holland, P. W. H. Reconstruction of the ancestral metazoan genome reveals an increase in genomic novelty. 
\u003cem\u003eNat Commun\u003c/em\u003e \u003cstrong\u003e9\u003c/strong\u003e, 1730 (2018).\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":true,"hideJournal":true,"highlight":"","institution":"University of Oxford","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"","lastPublishedDoi":"10.21203/rs.3.rs-7464600/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-7464600/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eFlowering plants (angiosperms) emerged over 150\u0026nbsp;million years ago\u003csup\u003e\u003cspan additionalcitationids=\"CR2 CR3\" citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e\u003c/sup\u003e, leading to the origin of two major groups, the monocots and eudicots. Accompanying this rapid species diversification was a period of dynamic genome evolution, as evidenced by their conflicting evolutionary relationships\u003csup\u003e\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e,\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e\u003c/sup\u003e. 
However, the genomic trends governing the evolution of flowering plants remain poorly understood. Here, starting with 1,181 genomes, we selected and analysed 273 archaeplastid genomes to produce a novel, robustly supported angiosperm phylogeny. With this phylogeny, our analyses identify unprecedented rates of gene loss and duplication. The origin of monocots was accompanied by a period of reductive genome evolution, while the first eudicot genomes experienced modest rates of gene duplication. Genes lost in the first monocots support the morphological simplification of the cotyledon, leaf venation patterning and root system architecture. In contrast, genome expansion in eudicots was associated with floral development and plant reproduction. Individual orders were characterised by pervasive gene loss, coupled with modest gene duplication. This suggests that angiosperms reached a core genomic diversity early in their evolutionary history, corresponding to their high floral diversity. 
This work highlights the importance of loss as well as gain of function in the diversification of the most speciose plant group.\u003c/p\u003e","manuscriptTitle":"The genomes of flowering plants reveal contrasting evolutionary paths in monocots and eudicots","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-08-28 09:57:15","doi":"10.21203/rs.3.rs-7464600/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"4baaecc1-a1d6-41d0-bf78-2dc780be037d","owner":[],"postedDate":"August 28th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2025-08-28T09:57:15+00:00","versionOfRecord":[],"versionCreatedAt":"2025-08-28 09:57:15","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-7464600","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-7464600","identity":"rs-7464600","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
