Detectable episodic positive selection in the virion strand A-strain maize streak virus genes may have a role in its host adaptation

Kehinde A. Oyeniran, Mobolaji O. Tenibiaje

Research Square. Research Article. This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-4670195/v1 This work is licensed under a CC BY 4.0 License. Status: Posted. Version 1.

Abstract

Maize streak virus (MSV) has only three genes: cp encoding the coat protein, mp encoding the movement protein and rep/repA encoding two distinct replication associated proteins from an alternatively spliced transcript. These genes have roles in encapsidation, movement, replication and interactions with the external environment and are thus prone to stimuli-driven molecular adaptation.
We carried out selection analyses on publicly available, curated, recombination-free complete coding sequences of representative A-strain maize streak virus (MSV-A) cp and mp genes. We found evidence of gene-wide selection in both genes and of selection at specific codon sites within them (cp 1.23% and mp 0.99% of sites). Amino acids at the positively selected sites are 60% hydrophilic and 40% hydrophobic. We found significant evidence of positive selection at branches (cp: 0.76% and mp: 1.66%) representing the diversity of MSV-A in South Africa that is closely related to the MSV-Mat-A isolate (GenBank accession number: AF329881), which is widely disseminated and adapted to the maize plant in sub-Saharan Africa. In the mp gene, selection was significantly intensified across the overall diversity of the MSV-A sequences, including those closely related to the MSV-Mat-A isolate. These findings suggest that, although these genes are mostly under non-diversifying selection, the detectable diversifying positive selection could have a major role in MSV-A host adaptive evolution, which has over time ensured a degree of pathogenicity sufficient for onward transmission rather than killing the host.

Keywords: Maize streak virus, positive selection, coat protein, movement protein, geminiviruses

Introduction

Maize streak virus (MSV) is the type member of the genus Mastrevirus in the family Geminiviridae, a single-stranded DNA virus with great economic impact on maize cultivation in sub-Saharan Africa (Martin et al., 2008; Roumagnac et al., 2022). MSV seriously constrains maize production in the region, causing serious economic losses and low yields, especially for smallholder farmers with little or no resources and limited access to improved maize cultivars (Bediako E et al., 2017; Charles, 2014; Oppong et al., 2013).
Infected maize plants often show symptoms such as chlorotic lesions, leaf striation, yellowing, stunting, low yield and, in severe cases, death (Martin et al., 1999; Martin & Rybicki, 2002; Oyeniran et al., 2021). Of the 11 known MSV strains (A through K), only the A-strain causes severe maize streak disease (MSD), with stunting and leaf striation symptoms. MSV-A, being the major causal agent of MSD, is believed to have become adapted to the maize plant (Ketsela et al., 2022). MSV encodes three major genes in its genome. The virion sense strand carries the movement protein (mp) and coat protein (cp) genes (Muñoz-Martín et al., 2003; Owor et al., 2007), while the complementary strand encodes the replication associated protein (rep/repA) genes, which initiate and moderate replication of the virus genome (Shepherd et al., 2007). MSV genes, as important drivers of its evolution, perform vital functions that ensure its survival, spread and replication in susceptible hosts (Boulton, 2002; Davies et al., 1997). Consequently, these genes are likely targets for natural selection in the evolutionary arms race between host and pathogen (Denes et al., 2022; Wang et al., 2020). Because MSV is spread chiefly by its leafhopper vectors, the virus must also constantly cope with changing environments and their likely effects on its genes. Thus, signatures of positive selection for amino acid changes responsible for host adaptation, as in most pathogens, should be detectable in an evolutionary analytical framework (Antonides et al., 2019).

The MSV coat protein (cp) gene is a virion sense gene that is expressed from long intergenic region (LIR) transcripts. The cp gene, about 735 nucleotides (nts) long, has encapsidation functions and also plays key roles in systemic spread, especially by leafhoppers (H. Liu et al., 2001).
The movement protein (mp) gene is another virion sense gene, of about 310 nts, that is expressed alongside the cp gene from the bidirectionally transcribed LIR, with the main function of mediating viral movement within infected host cells (Boulton, 2002; Wright et al., 1997). Both virion sense genes are thus linked with MSV movement between and within hosts: the cp through spread by leafhopper vectors, and the mp through cell-to-cell movement within infected tissues. Further, because of the DNA-binding capability of the coat protein and its accompanying nuclear signal, which facilitate cell entry of partially uncoated single-stranded DNA (ssDNA) (Davies et al., 1997; Owor et al., 2007), continuous interaction of the cp gene product with constantly changing host conditions might subject it to persistent stimuli-driven molecular evolution.

The non-structural complementary sense proteins are expressed as either spliced rep or unspliced repA (Ruschhaupt et al., 2013; Shepherd et al., 2007). rep plays key roles in replication initiation, while repA acts as a regulator of host and viral gene transcription. The MSV rep, a spliced product of C1:C2, is also believed to have a role in activating the virion sense promoter, specifically the coat protein promoter (Horváth et al., 1998; Nikovics et al., 2001). repA moderates rep activities through coordinated checks and balances. This is necessary for MSV because, unlike some begomoviruses, it cannot suppress its own promoter (Nikovics et al., 2001). Regulating the expression of these proteins at different stages of infection and of the host cell cycle not only plays a pivotal role in coordinating the virus life cycle, but may also make these regulatory genes targets of selection.

Key mechanisms of evolution are natural selection and genetic change. Natural selection sits specifically at the intersection of diversifying evolution.
Natural selection is driven by competition and environmental change. It acts on genetic variation, changes the gene pool, and results in selective survival, host expansion and adaptation (Aguadé, 1999; Deom et al., 2021; Li et al., 2018). Fitness-based selection guides the course of evolution by favouring the transmission of useful traits to the next generation. Natural selection can therefore be likened to a giant sieve, driven by differential fitness, that separates undesirable traits from desirable ones, ultimately producing fitter descendants (Acosta-Leal et al., 2011; Oyeniran & Oyediran, 2024; Spielman et al., 2019).

Here, we aim to identify the occurrence of selection in the virion strand cp and mp genes of the economically important MSV-A lineages that have disseminated within sub-Saharan Africa, using publicly available sequence data. Because natural selection leaves evolutionary signatures that are detectable in sequence data, it is possible to identify the sites and branches within these genes that are evolving under selection pressure, down to the amino acid level. These could give further insights into how these genes evolve as they interact with changing host conditions.

Methods

Sequence selection and alignments

Full coding sequences (CDS) of quality MSV-A cp (n = 115) and mp (n = 115) genes were retrieved from GenBank via their accession numbers using the Linux-based efetch program (Sayers, 2010). Sequences were selected to represent their different sampling regions in sub-Saharan Africa, while also considering the availability of annotations for the gene of interest. Sequences were stripped of stop codons, aligned as translated sequences with MUSCLE (Edgar, 2004) in AliView (Larsson, 2014), and then back-translated to the corresponding codon-based multiple nucleotide sequence alignment.
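The strip, align as protein and back-translate procedure described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline (they used MUSCLE within AliView); the function names, the toy sequences and the already-aligned protein strings are hypothetical and serve only to show how codons are threaded onto an amino acid alignment so that gaps always span whole codons.

```python
from typing import Dict

STOP_CODONS = {"TAA", "TAG", "TGA"}


def strip_terminal_stop(cds: str) -> str:
    """Return an in-frame CDS without its trailing stop codon (if any)."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    if cds[-3:] in STOP_CODONS:
        cds = cds[:-3]
    return cds


def back_translate(aligned_protein: str, cds: str, gap: str = "-") -> str:
    """Thread the codons of an unaligned CDS onto its aligned protein.

    Every gap in the protein alignment becomes a three-nucleotide gap,
    so no gap can fall inside a codon.
    """
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    n_residues = sum(1 for aa in aligned_protein if aa != gap)
    if n_residues != len(codons):
        raise ValueError("protein and CDS lengths disagree")
    codon_iter = iter(codons)
    return "".join(gap * 3 if aa == gap else next(codon_iter)
                   for aa in aligned_protein)


# Hypothetical toy data: two short CDSs and their protein alignment.
raw_cds: Dict[str, str] = {"seqA": "ATGGCTAAATAA", "seqB": "ATGAAATAA"}
protein_alignment: Dict[str, str] = {"seqA": "MAK", "seqB": "M-K"}

codon_alignment = {
    name: back_translate(protein_alignment[name], strip_terminal_stop(seq))
    for name, seq in raw_cds.items()
}
for name, row in codon_alignment.items():
    print(name, row)
```

In this toy case the single amino acid gap in seqB becomes a three-nucleotide gap, which keeps every downstream codon in frame.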
This method of nucleotide sequence alignment prevents gaps from being inserted within a codon, ensuring biologically useful alignments with in-frame codons.

Recombination analyses using recombination detection program (RDP)

RDP version 5 (Martin et al., 2021) was used to detect evidence of recombination in the datasets. Detected recombinant sequences and their tracts were completely removed from the alignment. The recombinant-free dataset from this step was used for the downstream phylogeny-based selection studies.

Construction of gene trees

For each MSV gene considered, codon-based nucleotide gene trees were constructed by providing the sequence alignments to IQ-TREE v.1.6.12 (Minh et al., 2020). We used ModelFinder within IQ-TREE to choose the appropriate model of sequence evolution (Kalyaanamoorthy et al., 2017), as judged by Bayesian Information Criterion (BIC) support among 185 codon models. The best-fit substitution model was then used to infer the best gene tree using a maximum likelihood (ML) approach. For branch support, we performed 2000 replicates of the non-parametric Shimodaira-Hasegawa-like approximate likelihood ratio test (SH-aLRT) (Guindon et al., 2010; Shimodaira & Hasegawa, 1999) and 5000 ultrafast bootstrap replicates (Hoang et al., 2018). A majority-rule consensus ML tree was constructed for each gene from the 5000 bootstrap trees and further edited in FigTree v.1.4.4 (Rambaut, 2018).

Detecting selection

Selection tests were carried out using the HyPhy package via the DataMonkey Adaptive Evolution server (Kosakovsky Pond et al., 2005; Weaver et al., 2018), based on the codon-based nucleotide sequence alignments (which also account for silent substitutions at the codon level) and the majority-rule consensus ML trees.
Selection tests are based on estimates of the nonsynonymous-to-synonymous substitution rate ratio (ω = dN/dS) using codon models and Likelihood Ratio Tests (LRTs), where ω = 1 indicates neutral evolution, ω < 1 negative (purifying) selection and ω > 1 positive (diversifying) selection. The HyPhy methods first carry out a global MG94xREV fit to optimise branch lengths and nucleotide substitution parameters, which serve as initial parameter values in model fitting for the hypothesis testing process. An important advantage of modelling synonymous rate variation, that is, allowing dS to vary across sites and branches in the phylogeny, is more efficient detection of positive selection and fewer false discoveries (Weaver et al., 2018). These tests were used to detect selection signals at specific codon sites within genes, and at particular branches or lineages of the MSV gene trees.

Gene-wide

To test for positive selection anywhere on the gene trees, we used the Branch-site Unrestricted Statistical Test for Episodic Diversification with Multi-nucleotide Substitution (BUSTED-MH) algorithm (Lucaci et al., 2023; Murrell et al., 2015), which takes into account the biases introduced by multi-nucleotide substitutions and by variation in synonymous substitution rates that often lead to increased false positives. BUSTED-MH fits a codon model with three rate classes, constrained as ω1 ≤ ω2 ≤ 1 ≤ ω3, and estimates the proportion of sites that belong to each ω class. Positive selection is then tested by comparing this model fit to a null model in which positive selection is not allowed (ω3 = 1). The null hypothesis is rejected if there is evidence that at least one site on at least one branch has undergone episodic positive selection.

At sites

To determine whether positive selection was detectable at any codon site, the Mixed Effects Model of Evolution (MEME) algorithm was used (Murrell et al., 2012).
MEME tests whether individual sites have experienced episodic or diversifying positive selection on a proportion of branches using a mixed-effects maximum likelihood approach. MEME infers two ω rate classes per site while simultaneously estimating the corresponding weights (i.e. the proportion of branches evolving under each rate class). For each site, a single synonymous rate (α) and two non-synonymous rates (β− and β+) were inferred. Both β− and β+ were constrained to be less than or equal to α in the null model, while in the alternative model β+ was not constrained. Positive selection is then inferred for a site if a likelihood ratio test returns a significant β+ > α at that site.

At branches

The Adaptive Branch-Site Random Effects Likelihood (aBSREL) algorithm (Smith et al., 2015) was used to test individual MSV-A branches (lineages) and sites for selection. All branches and sites were tested. aBSREL explores both site-level and branch-level ω heterogeneity and infers the optimal number of ω classes from AICc (the small-sample AIC). The alternative model is compared to a null model that disallows positive selection in the rate classes by performing a Likelihood Ratio Test at each branch, and the branch-specific p-values are corrected for multiple testing using the Holm-Bonferroni procedure before assessing significance. Further, the RELAX algorithm (Wertheim et al., 2014) was used to detect differential selection, that is, whether selection pressure was significantly relaxed or intensified across the branches within the MSV-A clades, by setting the internal nodes and leaf nodes as test and reference branches respectively. RELAX employs a random effects branch-site model to test whether a set of test branches evolves under a different stringency of selection than a set of reference branches. It entails fitting a codon model with three ω classes to the entire phylogeny for the null model, while testing for changes in selective constraint linked with the selection intensity parameter k (≥ 0).
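To make the role of k concrete, the following minimal Python sketch assumes the parameterisation described by Wertheim et al. (2014), in which each reference ω class is raised to the power k on the test branches. The ω values below are hypothetical and are not estimates from this study.

```python
from typing import List


def relax_transform(reference_omegas: List[float], k: float) -> List[float]:
    """Apply the selection intensity parameter k as an exponent.

    Assumes omega_test = omega_reference ** k for every rate class.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    return [omega ** k for omega in reference_omegas]


def distance_from_neutral(omegas: List[float]) -> List[float]:
    """Absolute distance of each omega class from neutrality (omega = 1)."""
    return [abs(omega - 1.0) for omega in omegas]


# Hypothetical reference classes: purifying, neutral, diversifying.
reference = [0.1, 1.0, 4.0]

intensified = relax_transform(reference, k=2.0)  # classes pushed away from 1
relaxed = relax_transform(reference, k=0.5)      # classes pulled towards 1

print("reference  ", reference)
print("k = 2.0    ", intensified)
print("k = 0.5    ", relaxed)
```

Under this assumption the neutral class is unchanged for any k, while k > 1 moves both the purifying and the diversifying classes further from ω = 1 and k < 1 moves them closer to it.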
In the alternative model, the selection intensity parameter serves as an exponent on the ω classes. A significant k > 1 following a likelihood ratio test of the null against the alternative model means that selection strength was intensified along the test branches relative to the reference branches, whereas a significant k < 1 implies that selection strength was relaxed along the test branches.

Results

Individual gene tree and its evolution

The best-fit codon substitution model determined by the Bayesian Information Criterion (BIC) within the 95% confidence limit was MG + F1X4 + G4 for both cp and mp. In this nomenclature, MG stands for the codon substitution model of Muse & Gaut (1994), while GY denotes the nonsynonymous/synonymous and transition/transversion rate ratio model of Goldman & Yang (1994). The frequency type F1X4 denotes unequal nucleotide frequencies overall but equal nucleotide frequencies over the three codon positions. The rate type G4 indicates the discrete Gamma model of Yang (1994) with four rate categories. The best maximum likelihood trees were inferred from the codon-based nucleotide alignments and the substitution model. The cp gene had a total tree length (the sum of branch lengths, each representing the number of substitutions per codon site) of 1.05, and mp of 0.85. The majority-rule consensus gene trees produced by subsequent bootstrapping are presented in Fig. 1.

Figure 1: Midpoint-rooted maximum likelihood (ML) trees for MSV-A genes inferred using IQ-Tree2 (Minh et al., 2020) and visualised with FigTree (Rambaut, 2018). The bar indicates the number of nucleotide substitutions per codon site. The numbers shown are bootstrap support values from 5000 ultrafast replicates (Hoang et al., 2018). Clades with bootstrap support of less than 50 were collapsed. (A) Coat protein (cp) gene. (B) Movement protein (mp) gene.
Detecting selection

Gene-wide

The BUSTED-MH algorithm found evidence of gene-wide episodic diversifying selection in both the cp and mp genes (LRT p-value ≤ 0.05). There is therefore evidence that at least one site on at least one test branch of these genes in the MSV-A lineage has experienced diversifying selection (Table 1).

At sites

MEME found evidence of episodic positive/diversifying selection (LRT p-value ≤ 0.05) at particular sites in the genes. For the cp phylogeny, three of 244 codon sites (1.23%) showed evidence of selection (Table 2). For mp, one of 101 codon sites (0.99%) showed evidence of selection (Table 2).

Table 1: Statistical results of model fits for the Branch-site Unrestricted Statistical Test for Episodic Diversification with Multi-nucleotide Substitution (BUSTED-MH) algorithm (Lucaci et al., 2023; Murrell et al., 2015), run on the Datamonkey server (Weaver et al., 2018), showing evidence of episodic diversifying selection in the cp and mp genes (the null model of no positive selection, ω3 = 1, is rejected; LRT: p < 0.05).

Model               log L     # parameters   AICc     ω1 (Negative)   ω2 (Neutral)    ω3 (Positive)
cp (87 sequences, 244 codon sites, LRT p-value = 0.029)
Alternative model   -2652.1   195            5697.9   0.00 (96.92%)   0.00 (1.14%)    5.38 (1.93%)
Null model          -2654.9   194            5701.5   0.00 (86.85%)   0.06 (4.71%)    1.00 (8.43%)
mp (47 sequences, 101 codon sites, LRT p-value = 0.024)
Alternative model   -989.2    115            2214.2   0.34 (98.27%)   0.35 (1.71%)    1.00e+10 (0.02%)
Null model          -992.2    114            2218.6   0.00 (0.64%)    0.00 (63.92%)   1.00 (35.44%)

Table 2: Results of the MEME (Mixed Effects Model of Evolution) algorithm (Murrell et al., 2012), run on the DataMonkey server (Weaver et al., 2018), showing codon sites under positive/diversifying selection (p ≤ 0.05) for the selected MSV-A genes.

Site   α      β−     p−     β+       p+     LRT     p-value
cp (three of 244 sites; 1.23%)
19     0.00   0.00   0.00   3.17     1.00   8.78    0.01
29     0.00   0.00   0.97   68.28    0.03   7.68    0.01
38     0.00   0.00   0.93   17.46    0.07   4.81    0.04
mp (one of 101 sites; 0.99%)
9      0.00   0.00   0.98   593.09   0.02   11.95   0.00

Site numbers correspond to codon sites as numbered in the peptide alignment (Fig. ). α = synonymous substitution rate; β− = non-synonymous substitution rate for the negative/neutral evolution component; p− = proportion of the tree evolving neutrally or under negative selection; β+ = non-synonymous substitution rate for the positive/neutral evolution component; p+ = proportion of the tree evolving neutrally or under positive selection; LRT = likelihood ratio test statistic.

Different amino acids exist in positively selected sites

Following the identification of positively selected sites, the amino acids at these sites were identified. For cp, 60% of the amino acids at the positively selected sites are hydrophilic, while the remaining 40% are hydrophobic. For mp, all of the amino acids are hydrophobic. Across the two genes, hydrophilic amino acids are slightly more common than hydrophobic ones (Fig. 2).

Figure 2: Positively selected sites and the different amino acids observed at each site, as estimated by the MEME (Mixed Effects Model of Evolution) algorithm (Murrell et al., 2012) run on the DataMonkey server (Weaver et al., 2018). (A) Capsid protein (cp) gene. (B) Movement protein (mp) gene. * Nonpolar; † Polar; ‡ Charged.

At branches

The aBSREL algorithm tested each gene for branch-specific selection, searching for lineages under selection (Table 4). Selection was detected at only one of 130 branches (0.76%) in the cp phylogeny and one of 60 branches (1.66%) in the mp phylogeny (LRT p-value ≤ 0.05).
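The branch-level p-values reported for aBSREL are corrected for multiple testing with the Holm-Bonferroni procedure. As a minimal sketch of that step-down correction, the Python below adjusts a small set of raw p-values; the raw values are hypothetical and the function is an illustration, not HyPhy's implementation.

```python
from typing import List


def holm_bonferroni(p_values: List[float]) -> List[float]:
    """Return Holm-Bonferroni adjusted p-values in the input order.

    The i-th smallest raw p-value (i = 0, 1, ...) is multiplied by
    (m - i), capped at 1, and adjusted values are forced to be
    non-decreasing along the sorted order.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values for four branches.
raw = [0.0004, 0.03, 0.02, 0.5]
adjusted = holm_bonferroni(raw)
significant = [p <= 0.05 for p in adjusted]

for r, a, s in zip(raw, adjusted, significant):
    print(f"raw={r:<8} adjusted={a:.4f} significant={s}")
```

With many branches tested, only branches with very small raw p-values survive the correction, which is in line with a single branch per gene being reported as significant.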
The RELAX algorithm inferred that selection was significantly intensified across the diversity of all isolates, as well as for those closely related to the maize-adapted MSV-Mat-A isolate (GenBank accession number: AF329878) (Fig. 3).

Table 4: Results of the aBSREL (adaptive Branch-Site Random Effects Likelihood) algorithm (Smith et al., 2015), run on the DataMonkey server (Weaver et al., 2018), for the cp and mp genes.

Gene   Branch/Node      LRT test statistic   p-value (corrected for multiple testing)   ω distribution over sites
cp     AF329878_MSV-A   15.2025              0.0220                                     ω1 = 0.00 (99%); ω2 = 100000 (0.84%)
mp     Node25           15.1669              0.0024                                     ω1 = 0.00 (98%); ω2 = 460 (2.3%)

Figure 3: Omega distributions (ω = dN/dS) under the RELAX alternative model for branches under statistically significant selection within the MSV-A mp gene, as determined by the RELAX algorithm (Wertheim et al., 2014) on the DataMonkey server (Weaver et al., 2018). Selection was significantly intensified on the internal (test) branches relative to the leaf (reference) branches in the MSV-A mp gene. Table S1 shows the full RELAX model results. K: relaxation/intensification parameter; LT: likelihood ratio.

Discussion

The present study set out to detect signatures of selection within the genes of the maize-adapted A-strain MSV. We found evidence of selection at the gene level in both the cp and mp genes. The cp gene appears to be under stronger selection pressure than the mp gene, in which positive selection was detected at only one site and one node in the phylogeny. In summary, we found evidence of selection at the overall gene and codon site levels (stronger for the cp gene), and at the lineage level for both genes. Overall, the amino acids at these positively selected sites across the two genes are slightly more often hydrophilic than hydrophobic, with more than one amino acid observed per site.
Additional evidence of differential selective pressure was observed for the mp gene, in which selection was significantly intensified on the internal test nodes relative to the reference leaf nodes.

Evolution of the MSV-A genes, as inferred from their respective maximum likelihood trees, indicates that the genes have evolved in a broadly similar pattern and have diverged at fairly similar rates. Of the two, the cp appears to have evolved more and the mp less, with respective nucleotide substitution rates per codon site of 0.08 and 0.03. This difference can be attributed to how the genes carry out their biological functions. The coat protein, for instance, interacts within the host cell nucleus during genome packaging and with the external environment, such as host membrane receptors, and can also manipulate the host transcription machinery (Mostert et al., 2023; Zhou et al., 2018). It may therefore evolve more than the mp gene, whose product is mainly involved in internal cell-to-cell movement.

Detectable evidence of gene-wide diversifying positive selection implies that these genes have experienced, or are experiencing, molecular changes that are adaptive in nature. Although the effects of mutations on gene function range from neutral to deleterious, genuinely useful, adaptive changes can still arise under positive diversifying selection. Detecting statistically significant evidence of positive selection in the cp and mp genes therefore implies that they have been subject to adaptive changes at codon sites or branches in the phylogeny. The gene-wide selection analysis also serves as a benchmark for the further selection analyses.
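The gene-wide result discussed above rests on a likelihood ratio test between the BUSTED-MH alternative and null fits in Table 1. The Python sketch below recomputes the test statistic from the reported log-likelihoods. The p-value step assumes a 50:50 mixture of a point mass at zero and a chi-square distribution with two degrees of freedom; this mixing convention is an assumption made for illustration, chosen because it gives values close to the reported p-values, and HyPhy's exact computation may differ.

```python
import math


def lrt_statistic(lnl_alternative: float, lnl_null: float) -> float:
    """Likelihood ratio test statistic: 2 * (lnL_alt - lnL_null)."""
    return 2.0 * (lnl_alternative - lnl_null)


def chi2_sf_df2(x: float) -> float:
    """Upper tail probability of a chi-square with 2 degrees of freedom."""
    return math.exp(-x / 2.0) if x > 0 else 1.0


def mixture_p_value(stat: float, weight_zero: float = 0.5) -> float:
    """p-value under an ASSUMED mixture: weight_zero on a point mass at 0,
    the remainder on chi-square(2)."""
    return (1.0 - weight_zero) * chi2_sf_df2(stat)


# Log-likelihoods as reported (rounded to one decimal) in Table 1.
fits = {
    "cp": {"alt": -2652.1, "null": -2654.9},
    "mp": {"alt": -989.2, "null": -992.2},
}

results = {}
for gene, lnl in fits.items():
    stat = lrt_statistic(lnl["alt"], lnl["null"])
    results[gene] = (stat, mixture_p_value(stat))
    print(f"{gene}: LRT = {stat:.2f}, approximate p = {results[gene][1]:.3f}")
```

Because the tabulated log-likelihoods are rounded, the recomputed statistics (about 5.6 for cp and 6.0 for mp) and the resulting p-values are only approximate, but under the stated assumption they fall close to the reported 0.029 and 0.024.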
At sites, we detected episodic diversifying positive selection at individual codon sites in the cp and mp genes, with stronger effects on cp. Episodic diversifying selection was found acting on three codon sites in cp and one in mp, a greater proportion of the cp gene (1.23%) than of the mp gene (0.99%). Although these genes appear to be mostly under negative and neutral selection, the signature of positive selection observed should not be hastily dismissed as inconsequential, given the association of positive selection with host adaptation (Montoya et al., 2021; Nigam et al., 2019; Thines, 2019). Factors that determine host adaptation in MSV-A have previously been linked to the expression of the virion sense cp and mp genes, also known as pathogenicity determinants (Liu et al., 1999; Monjane et al., 2011; Wright et al., 1997). This could further explain the diversifying selection observed in certain proportions of these genes. It is probably also responsible for the regulated expression of these genes in the maize-adapted MSV-A, such that evolution favouring adaptation of these viruses to their maize host has, over time, translated into reduced symptom severity that readily fosters onward transmission with minimal harm to the host (Monjane et al., 2020). In wheat dwarf virus (WDV), another member of the genus Mastrevirus, the cp and mp genes are crucial for successful viral encapsidation and movement during infection, which may render them twice as likely to be under selection pressure (Wu et al., 2008, 2015).

Identifying the amino acids at the various positively selected sites can provide further information about the evolutionary states of these sites. For instance, different amino acids at a positively selected site may indicate a selection pressure under which multiple amino acids at that site are subject to evolutionary change.
Although this could also mean that the virus is already well suited to its current state, it is important to consider the interplay between natural selection and codon usage in viruses, which is tailored towards achieving an outcome such as host adaptation. The interplay between codon usage shared by viruses and their hosts, and its regulation for efficient translation and protein folding, is crucial to adapting to the host environment, and has been described as a mechanism of adaptation to multiple hosts and vectors in Zika virus (Butt et al., 2016). For MSV-A, it remains to be seen how these mechanisms specifically apply, as codon usage shared between MSV-A and its hosts may be responsible for the inferred low rate of positive diversifying selection. Overall, amino acid changes in viral proteins are central to how viruses adapt to their hosts, and the selective forces acting on different sites suggest that these amino acids are changing in the course of the virus's evolution, most likely for host adaptability and pathogenicity, as in other viruses with similar evolutionary patterns (Carrique et al., 2020; Mänz et al., 2016).

At branches in the cp gene, aBSREL found evidence of selection in a clade of MSV-A (GenBank accession number: AF329878) from a southern African isolate that is closely related to the MSV-A-Mat-A isolate, which has been observed to have become adapted to the maize host over time (Monjane et al., 2020). For the mp gene, evidence of statistically significant positive selection was also found for a node representing the diversity of MSV-A from the island of Reunion and South Africa (Pande et al., 2012; Peterschmitt et al., 1996).
These observed variations in selection pressure between clades or lineages show that evolution occurs under different selection pressures, and further suggest that positive selection may have a role in adaptive host evolution in A-strain MSV. On testing all the internal nodes that represent the entire sequence diversity of the mp gene, we found that selection pressure was significantly intensified for all the MSV isolates, including those MSV-A isolates that are closely related to the MSV-Mat-A isolate (GenBank accession number: AF329881), which is believed to have become adapted to the maize plant. The MSV mp is a virion sense gene that enables the virus to move from cell to cell. It is plausible that the mp gene, like the cp, is also subject to stimuli-driven diversifying selection. Our results lend credence to this and further support our hypothesis of selection-driven host adaptive evolution in this economically important maize streak virus strain.

Conclusion

Our findings show that evidence of positive selection exists in the virion strand genes of the A-strain maize streak virus, specifically in isolates closely related to the well adapted MSV-A strain that is spreading within sub-Saharan Africa. It is therefore plausible that positive selection pressure may be driving the course of host adaptive evolution in this economically important MSV strain.

Declarations

Author Contributions: Conceptualization, Kehinde Oyeniran; Formal analysis, Kehinde Oyeniran and Mobolaji Tenibiaje; Investigation, Kehinde Oyeniran and Mobolaji Tenibiaje; Visualization, Kehinde Oyeniran; Writing – original draft, Kehinde Oyeniran; Writing – review & editing, Kehinde Oyeniran and Mobolaji Tenibiaje. All authors have read and agreed to the published version of the manuscript.

Conflict of Interest: None

References

Acosta-Leal, R., Duffy, S., Xiong, Z., Hammond, R. W., & Elena, S. F. (2011).
Advances in Plant Virus Evolution: Translating Evolutionary Insights into Better Disease Management. Phytopathology , 101 (10), 1136–1148. https://doi.org/10.1094/PHYTO-01-11-0017 Aguadé, M. (1999). Positive selection drives the evolution of the Acp29AB accessory gland protein in Drosophila. Genetics , 152 (2), 543–551. https://doi.org/10.1093/GENETICS/152.2.543 Antonides, J., Mathur, S., & DeWoody, J. A. (2019). Episodic positive diversifying selection on key immune system genes in major avian lineages. Genetica , 147 (5–6), 337–350. https://doi.org/10.1007/s10709-019-00081-3 Bediako E, A.-, A, K., Puije GC, van der, KJ, T., Frimpong K, A., G, A., Kubi A, A., Lamptey J, N., A, O., B, M., I, A., & FN, T. (2017). Spatio-Temporal Variations in the Incidence and Severity of Maize Streak Disease in the Volta Region of Ghana. Journal of Plant Pathology & Microbiology , 08 (03). https://doi.org/10.4172/2157-7471.1000401 Boulton, M. I. (2002). Functions and interactions of mastrevirus gene products. Physiological and Molecular Plant Pathology , 60 (5), 243–255. https://doi.org/10.1006/pmpp.2002.0403 Butt, A. M., Nasrullah, I., Qamar, R., & Tong, Y. (2016). Evolution of Codon Usage in Zika Virus Genomes Is Host and Vector Specific. Emerging Microbes \& Infections . https://doi.org/10.1038/emi.2016.106 Carrique, L., Fan, H., Walker, A. P., Keown, J. R., Sharps, J., Staller, E., Barclay, W., Fodor, E., & Grimes, J. M. (2020). Host ANP32A Mediates the Assembly of the Influenza Virus Replicase. Nature . https://doi.org/10.1038/s41586-020-2927-z Charles, K. (2014). Maize streak virus: A review of pathogen occurrence, biology and management options for smallholder farmers. African Journal of Agricultural Research , 9 (36), 2736–2742. https://doi.org/10.5897/AJAR2014.8897 Davies, J. W., Boulton, M. I., & Liu, H. (1997). Maize streak virus coat protein binds single- and double-stranded DNA in vitro. Journal of General Virology , 78 (6), 1265–1270. 
https://doi.org/10.1099/0022-1317-78-6-1265 Denes, C. E., Cole, A. J., Nguyen Tran, M. T., Nizam Khalid, M. K., Hewitt, A. W., Hesselson, D., & Neely, G. G. (2022). The VEGAS Platform Is Unsuitable for Mammalian Directed Evolution. ACS Synthetic Biology . https://doi.org/10.1021/acssynbio.2c00460 Deom, C. M., Brewer, M. T., & Severns, P. M. (2021). Positive selection and intrinsic disorder are associated with multifunctional C4(AC4) proteins and geminivirus diversification. Scientific Reports , 11 (1), 1–11. https://doi.org/10.1038/s41598-021-90557-0 Sayers, E. (2010). The E-utilities In-Depth: Parameters, Syntax and More. 2009 May 29 [Updated 2022 Nov 30]. In: Entrez Programming Utilities Help [Internet]. Bethesda (MD): National Center for Biotechnology Information (US). Edgar, R. C. (2004). MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research , 32 (5), 1792–1797. https://doi.org/10.1093/nar/gkh340 Goldman, N., & Yang, Z. (1994). A codon-based model of nucleotide substitution for protein-coding DNA sequences. Molecular Biology and Evolution , 11 (5), 725–736. https://doi.org/10.1093/oxfordjournals.molbev.a040153 Guindon, S., Dufayard, J.-F., Lefort, V., Anisimova, M., Hordijk, W., & Gascuel, O. (2010). New Algorithms and Methods to Estimate Maximum-Likelihood Phylogenies: Assessing the Performance of PhyML 3.0. Systematic Biology , 59 (3), 307–321. https://doi.org/10.1093/sysbio/syq010 Hoang, D. T., Chernomor, O., von Haeseler, A., Minh, B. Q., & Vinh, L. S. (2018). UFBoot2: Improving the Ultrafast Bootstrap Approximation. Molecular Biology and Evolution , 35 (2), 518–522. https://doi.org/10.1093/molbev/msx281 Horváth, G. V., Pettkó-Szandtner, A., Nikovics, K., Bilgin, M., Boulton, M., Davies, J. W., Gutiérrez, C., & Dudits, D. (1998). Prediction of functional regions of the maize streak virus replication-associated proteins by protein-protein interaction analysis. Plant Molecular Biology , 38 (5), 699–712. 
https://doi.org/10.1023/A:1006076316887 Kalyaanamoorthy, S., Minh, B. Q., Wong, T. K. F., von Haeseler, A., & Jermiin, L. S. (2017). ModelFinder: fast model selection for accurate phylogenetic estimates. Nature Methods , 14 (6), 587–589. https://doi.org/10.1038/nmeth.4285 Ketsela, D., Oyeniran, K. A. , Berhanu, F., Fontenele, R. S. , Kraberger, S.,·Varsani, A. (2022). Molecular identification and phylogenetic characterization of A-strain isolates of Maize streak virus from western Ethiopia. Archives of Virology , 1–11. https://doi.org/10.1007/s00705-022-05614-4 Kosakovsky Pond, S. L., Frost, S. D. W., & Muse, S. V. (2005). HyPhy: hypothesis testing using phylogenies. Bioinformatics , 21 (5), 676–679. https://doi.org/10.1093/BIOINFORMATICS/BTI079 Larsson, A. (2014). AliView: A fast and lightweight alignment viewer and editor for large datasets. Bioinformatics , 30 (22), 3276–3278. https://doi.org/10.1093/bioinformatics/btu531 Li, X.-D., Jiang, G.-F., Yan, L.-Y., Li, R., Mu, Y., & Deng, W.-A. (2018). Positive Selection Drove the Adaptation of Mitochondrial Genes to the Demands of Flight and High-Altitude Environments in Grasshoppers. Frontiers in Genetics , 9 , 605. https://doi.org/10.3389/fgene.2018.00605 Liu, H., Andrew, A. P., Lucy, P., Davies, J. W., & Boulton, M. I. (2001). A single amino acid change in the coat protein of Maize streak virus abolishes systemic infection, but not interaction with viral DNA or movement protein. Molecular Plant Pathology , 2 (4), 223–228. https://doi.org/10.1046/j.1464-6722.2001.00068.x Liu, L., Pinner, M., Davies, J., & Stanley, J. (1999). Adaptation of the geminivirus bean yellow dwarf virus to dicotyledonous hosts involves both virion-sense and complementary-sense genes. J Gen Virol , 80 . Lucaci, A. G., Zehr, J. D., Enard, D., Thornton, J. W., & Kosakovsky Pond, S. L. (2023). Evolutionary Shortcuts via Multinucleotide Substitutions and Their Impact on Natural Selection Analyses. Molecular Biology and Evolution , 40 (7). 
https://doi.org/10.1093/MOLBEV/MSAD150 Mänz, B., Graaf, M. de, Mögling, R., Richard, M., Bestebroer, T. M., Rimmelzwaan, G. F., & M. Fouchier, R. A. (2016). Multiple Natural Substitutions in Avian Influenza a Virus PB2 Facilitate Efficient Replication in Human Cells. Journal of Virology . https://doi.org/10.1128/jvi.00130-16 Martin, D. P., Shepherd, D. N., & Rybicki, E. P. (2008). Maize Streak Virus. Encyclopedia of Virology , 263–272. https://doi.org/10.1016/B978-012374410-4.00707-X Martin, D. P., Varsani, A., Roumagnac, P., Botha, G., Maslamoney, S., Schwab, T., Kelz, Z., Kumar, V., & Murrell, B. (2021). RDP5: a computer program for analyzing recombination in, and removing signals of recombination from, nucleotide sequence datasets. Virus Evolution , 7 (1), 87. https://doi.org/10.1093/VE/VEAA087 Martin, D. P., Willment, J. a, & Rybicki, E. P. (1999). Evaluation of Maize Streak Virus Pathogenicity in Differentially Resistant Zea mays Genotypes. Phytopathology , 89 (8), 695–700. https://doi.org/10.1094/Phyto.1999.89.8.695 Martin, D., & Rybicki, E. (2002). Investigation of Maize streak virus Pathogenicity determinants using chimaeric genomes. Virology , 300 . Minh, B. Q., Schmidt, H. A., Chernomor, O., Schrempf, D., Woodhams, M. D., Von Haeseler, A., Lanfear, R., & Teeling, E. (2020). IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era. Molecular Biology and Evolution , 37 (5), 1530–1534. https://doi.org/10.1093/MOLBEV/MSAA015 Monjane, A. L., Dellicour, S., Hartnady, P., Oyeniran, K. A., Owor, B. E., Bezuidenhout, M., Linderme, D., Syed, R. A., Donaldson, L., Murray, S., Rybicki, E. P., Kvarnheden, A., Yazdkhasti, E., Lefeuvre, P., Froissart, R., Roumagnac, P., Shepherd, D. N., Harkins, G. W., Suchard, M. A., … Martin, D. P. (2020). Symptom evolution following the emergence of maize streak virus. ELife , 9 . https://doi.org/10.7554/eLife.51984 Monjane, A. L., Harkins, G. W., Martin, D. P., Lemey, P., Lefeuvre, P., Shepherd, D. 
N., Oluwafemi, S., Simuyandi, M., Zinga, I., Komba, E. K., Lakoutene, D. P., Mandakombo, N., Mboukoulida, J., Semballa, S., Tagne, A., Tiendrebeogo, F., Erdmann, J. B., van Antwerpen, T., Owor, B. E., … Varsani, A. (2011). Reconstructing the History of Maize Streak Virus Strain A Dispersal To Reveal Diversification Hot Spots and Its Origin in Southern Africa. Journal of Virology , 85 (18), 9623–9636. https://doi.org/10.1128/JVI.00640-11 Montoya, V., McLaughlin, A., Mordecai, G., Miller, R. L., & Joy, J. B. (2021). Variable Routes to Genomic and Host Adaptation Among Coronaviruses. Journal of Evolutionary Biology . https://doi.org/10.1111/jeb.13771 Mostert, I., Bester, R., Burger, J. T., & Maree, H. J. (2023). Identification of Interactions Between Proteins Encoded by Grapevine Leafroll-Associated Virus 3. Viruses . https://doi.org/10.3390/v15010208 Muñoz-Martín, A., Collin, S., Herreros, E., Mullineaux, P. M., Fernández-Lobato, M., & Fenoll, C. (2003). Regulation of MSV and WDV virion-sense promoters by WDV nonstructural proteins: a role for their retinoblastoma protein-binding motifs . https://doi.org/10.1016/S0042-6822(02)00072-7 Murrell, B., Weaver, S., Smith, M. D., Wertheim, J. O., Murrell, S., Aylward, A., Eren, K., Pollner, T., Martin, D. P., Smith, D. M., Scheffler, K., & Kosakovsky Pond, S. L. (2015). Gene-Wide Identification of Episodic Selection. Molecular Biology and Evolution , 32 (5), 1365. https://doi.org/10.1093/MOLBEV/MSV035 Murrell, B., Wertheim, J. O., Moola, S., Weighill, T., Scheffler, K., & Kosakovsky Pond, S. L. (2012). Detecting Individual Sites Subject to Episodic Diversifying Selection. PLOS Genetics , 8 (7), e1002764. https://doi.org/10.1371/JOURNAL.PGEN.1002764 Muse, S. V., & Gaut, B. S. (1994). A likelihood approach for comparing synonymous and nonsynonymous nucleotide substitution rates, with application to the chloroplast genome. Molecular Biology and Evolution , 11 (5), 715–724. 
https://doi.org/10.1093/OXFORDJOURNALS.MOLBEV.A040152 Nigam, D., LaTourrette, K., Noronha Souza, P. F., & García-Ruíz, H. (2019). Genome-Wide Variation in Potyviruses. Frontiers in Plant Science . https://doi.org/10.3389/fpls.2019.01439 Nikovics, K., Simidjieva, J., Peres, A., Ayaydin, F., Pasternak, T., Davies, J. W., Boulton, M. I., Dudits, D., & Horváth, G. V. (2001). Cell-Cycle, Phase-Specific Activation of Maize streak virus Promoters. MPMI , 14 (5), 609–617. Oppong, L., Frimpong, B. N., Abrokwah, L. A., & Ofori, K. (2013). Farmers’ perceptions on maize streak virus disease, production constraints, and preferred maize varieties in the forest-transition zone of Ghana. ProJournal of Agricultural Science Research (PASR) . https://www.researchgate.net/publication/353973172 Owor, B. E., Martin, D. P., Shepherd, D. N., Edema, R., Monjane, A. L., Rybicki, E. P., Thomson, J. A., & Varsani, A. (2007). Genetic analysis of maize streak virus isolates from Uganda reveals widespread distribution of a recombinant variant. Journal of General Virology , 88 (11), 3154–3165. https://doi.org/10.1099/vir.0.83144-0 Owor, B., Martin, D., Shepherd, D., Edema, R., & Rybicki, E. (2007). Genetic analysis of Maize streak virus (MSV) isolates from Uganda reveals widespread distribution of a recombinant MSV variant in Uganda. J Gen Virol , 88 . Oyeniran, K. A. & O. J. A. (2024). Existential Origin of Life. In F.E. Olu-Ajayi and I. Osasona (Ed.), History and Philosophy of Science (pp. 1–12). Oyeniran, K. A., Hartnady, P., Claverie, S., Lefeuvre, P., Monjane, A. L., Donaldson, L., Michel, J., Arvind, L., & Martin, D. P. (2021). How virulent are emerging maize-infecting mastreviruses? Archives of Virology . https://doi.org/10.1007/s00705-020-04906-x Rambaut, A. (2018). FigTree ver 1.4.4. Institute of Evolutionary Biology, University of Edinburgh, Edinburgh. 
https://www.scirp.org/reference/referencespapers?referenceid=3470267 Roumagnac, P., Lett, J. M., Fiallo-Olivé, E., Navas-Castillo, J., Zerbini, F. M., Martin, D. P., & Varsani, A. (2022). Establishment of five new genera in the family Geminiviridae: Citlodavirus, Maldovirus, Mulcrilevirus, Opunvirus, and Topilevirus. Archives of Virology , 167 (2), 695–710. https://doi.org/10.1007/S00705-021-05309-2 Ruschhaupt, M., Martin, D. P., Lakay, F., Bezuidenhout, M., Rybicki, E. P., Jeske, H., & Shepherd, D. N. (2013). Replication modes of Maize streak virus mutants lacking RepA or the RepA-pRBR interaction motif. Virology , 442 (2), 173–179. https://doi.org/10.1016/j.virol.2013.04.012 Shepherd, D. N., Mangwende, T., Martin, D. P., Bezuidenhout, M., Thomson, J. A., & Rybicki, E. P. (2007). Inhibition of maize streak virus (MSV) replication by transient and transgenic expression of MSV replication-associated protein mutants. Journal of General Virology , 88 , 325–336. https://doi.org/10.1099/vir.0.82338-0 Shimodaira, H., & Hasegawa, M. (1999). Multiple Comparisons of Log-Likelihoods with Applications to Phylogenetic Inference. Molecular Biology and Evolution , 16 (8), 1114–1116. https://doi.org/10.1093/oxfordjournals.molbev.a026201 Smith, M. D., Wertheim, J. O., Weaver, S., Murrell, B., Scheffler, K., & Kosakovsky Pond, S. L. (2015). Less is more: An adaptive branch-site random effects model for efficient detection of episodic diversifying selection. Molecular Biology and Evolution , 32 (5), 1342–1353. https://doi.org/10.1093/molbev/msv022 Spielman, S. J., Weaver, S., Shank, S. D., Magalis, B. R., Li, M., & Kosakovsky Pond, S. L. (2019). Evolution of viral genomes: Interplay between selection, recombination, and other forces. In Methods in Molecular Biology (Vol. 1910, pp. 427–468). Humana Press Inc. https://doi.org/10.1007/978-1-4939-9074-0_14 Thines, M. (2019). An Evolutionary Framework for Host Shifts – Jumping Ships for Survival. 
New Phytologist . https://doi.org/10.1111/nph.16092 Wang, W., Zhao, H., & Han, G.-Z. (2020). Host-Virus Arms Races Drive Elevated Adaptive Evolution in Viral Receptors. Journal of Virology . https://doi.org/10.1128/jvi.00684-20 Weaver, S., Shank, S. D., Spielman, S. J., Li, M., Muse, S. V., & Kosakovsky Pond, S. L. (2018). Datamonkey 2.0: A Modern Web Application for Characterizing Selective and Other Evolutionary Processes. Molecular Biology and Evolution , 35 (3), 773–777. https://doi.org/10.1093/MOLBEV/MSX335 Wertheim, J. O., Murrell, B., Smith, M. D., Kosakovsky Pond, S. L., & Scheffler, K. (2014). RELAX: Detecting Relaxed Selection in a Phylogenetic Framework. Molecular Biology and Evolution , 32 (3), 820–832. https://doi.org/10.1093/molbev/msu400 Wright, E. A., Heckel, T., Groenendijk, J., Davies, J. W., & Boulton, M. I. (1997). Splicing features in maize streak virus virion- and complementary-sense gene expression. The Plant Journal , 12 (6), 1285–1297. https://doi.org/10.1046/j.1365-313x.1997.12061285.x Wu, B., Melcher, U., Guo, X., Wang, X., Fan, L., & Zhou, G. (2008). Assessment of codivergence of Mastreviruses with their plant hosts. BMC Evolutionary Biology , 8 (1), 1–13. https://doi.org/10.1186/1471-2148-8-335 Wu, B., Shang, X., Schubert, J., Habekuß, A., Elena, S. F., & Wang, X. (2015). Global-scale computational analysis of genomic sequences reveals the recombination pattern and coevolution dynamics of cereal-infecting geminiviruses. Scientific Reports , 5 (1), 1–10. https://doi.org/10.1038/srep08153 Yang, Z. (1994). Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: Approximate methods. Journal of Molecular Evolution , 39 (3), 306–314. https://doi.org/10.1007/BF00160154 Zhou, X., Park, B., Choi, D., & Han, K. (2018). A Generalized Approach to Predicting Protein-Protein Interactions Between Virus and Host. BMC Genomics . 
https://doi.org/10.1186/s12864-018-4924-2

Additional Declarations No competing interests reported. Supplementary Files Supplementary.zip

{"props":{"pageProps":{"initialData":{"identity":"rs-4670195","acceptedTermsAndConditions":true,"allowDirectSubmit":true,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":332583051,"identity":"15271ed4-1679-4ce8-a2a3-8275e8d44a9a","order_by":0,"name":"Kehinde A. 
Oyeniran","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAABCklEQVRIiWNgGAWjYLCCBCCWYOBhOMDAxiAHEjjwgBQtxmAtCcTYBNLCANSS2AAzBBfgl0g+9uDhHrs8yfbeg4cLymzS54cdfgi0xU5OtwG7FskZaekGCc+Si6V5ziUcnnEuLXfj7TQDoJZkY7MD2LUY3Mgxk0g4wJw4TyLH4DBv2+HcjbMTQFoOJG7DqSX/G1BLfeI8+TcgLf/TDWenfyCgJYcNqOVw4mwJHpCWAwny0jn4bZHseQZy2PHEmT1Ah/GcSzbcIJ1TcCDBALdf+NmTn0n+OFCdOOP4GePPPGV28vKz0zd/+FBhJ4dLC4NAArpTwSoNcCgHW4NulnwDHtWjYBSMglEwIgEAahJmKr5zlQIAAAAASUVORK5CYII=","orcid":"","institution":"University of Cape Town","correspondingAuthor":true,"prefix":"","firstName":"Kehinde","middleName":"A.","lastName":"Oyeniran","suffix":""},{"id":332583052,"identity":"1fe865ea-7f2c-45da-af0f-707866d21cbb","order_by":1,"name":"Mobolaji O. Tenibiaje","email":"","orcid":"","institution":"Bamidele Olumilua University of Education Science and Technology, Ikere-Ekiti","correspondingAuthor":false,"prefix":"","firstName":"Mobolaji","middleName":"O.","lastName":"Tenibiaje","suffix":""}],"badges":[],"createdAt":"2024-07-01 20:53:28","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-4670195/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-4670195/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":61765064,"identity":"07f64cfc-9114-4804-bc41-a9dbe5f2377f","added_by":"auto","created_at":"2024-08-05 10:17:13","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":466791,"visible":true,"origin":"","legend":"\u003cp\u003eMidpoint rooted maximum likelihood (ML) trees for MSV-A genes inferred using IQ-Tree2 (Minh et al., 2020)and visualised with Figtree (Rambaut, 2018). The bar indicates the number of nucleotide substitutions per codon site. The numbers shown are bootstrap support values from 5000 ultrafast replicates (Hoang et al., 2018). 
Clades with bootstrap supports of less than 50 were collapsed: (A) Coat protein (\u003cem\u003ecp\u003c/em\u003e) gene (B) Movement protein (\u003cem\u003emp\u003c/em\u003e) gene.\u003c/p\u003e","description":"","filename":"Fig.1.png","url":"https://assets-eu.researchsquare.com/files/rs-4670195/v1/898cc2e1e0ca6a7dd9ad3a54.png"},{"id":61765066,"identity":"fc084252-78ae-4cb3-9de3-3ab62a88a9d2","added_by":"auto","created_at":"2024-08-05 10:17:13","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":780649,"visible":true,"origin":"","legend":"\u003cp\u003ePositively selected sites and their amino acids are all different amino acids in same\u003c/p\u003e\n\u003cp\u003esites as estimated by MEME (Mixed Effects Model of Evolution) algorithm (Murrell et al., 2012) performed with the DataMonkey server (Weaver et al., 2018). (A) Capsid protein (\u003cem\u003ecp\u003c/em\u003e) gene. (B) Movement protein (\u003cem\u003emp\u003c/em\u003e). * Nonpolar; † Polar; ‡ Charged.\u003c/p\u003e","description":"","filename":"Fig.2.png","url":"https://assets-eu.researchsquare.com/files/rs-4670195/v1/a13c3af8d018c7a298a64ac9.png"},{"id":61765068,"identity":"e3d8b96d-0a13-484b-adce-2c78beb524ba","added_by":"auto","created_at":"2024-08-05 10:17:13","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":1205751,"visible":true,"origin":"","legend":"\u003cp\u003eOmega distributions (ω=dn/ds) under the RELAX alternative model for branches under statistically significant selection within MSV-A \u003cem\u003emp\u003c/em\u003e gene as determined by the RELAX algorithm (Wertheim et al., 2014) in the DataMonkey server (Weaver et al., 2018). Selection intensified significantly in the within the internal nodes test branches relative to the leave nodes in MSV-A \u003cem\u003emp\u003c/em\u003e gene. Table S1 shows the full RELAX model results. 
K relaxation/intensification parameter; LT likelihood ratio.\u003c/p\u003e","description":"","filename":"Fig.3.png","url":"https://assets-eu.researchsquare.com/files/rs-4670195/v1/4a95eaa095f11910ffb5c0c6.png"},{"id":67391608,"identity":"a71e7a60-b63a-4ae9-ac74-d75e15d902e1","added_by":"auto","created_at":"2024-10-24 11:17:02","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":3040338,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-4670195/v1/fd537163-48c6-4aa8-8a6b-dad48c3239dc.pdf"},{"id":61765842,"identity":"ed94a527-ea4c-427d-99be-56de728839f5","added_by":"auto","created_at":"2024-08-05 10:25:13","extension":"zip","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":20176,"visible":true,"origin":"","legend":"","description":"","filename":"Supplementary.zip","url":"https://assets-eu.researchsquare.com/files/rs-4670195/v1/569935dccec4dcf7823ef425.zip"}],"financialInterests":"No competing interests reported.","formattedTitle":"Detectable episodic positive selection in the virion strand A-strain maize streak virus genes may have a role in its host adaptation","fulltext":[{"header":"Introduction","content":"\u003cp\u003eMaize streak virus (MSV) is a type member of the \u003cem\u003eGeminiviridae\u003c/em\u003e family, a single-stranded DNA virus with great economic impacts on the cultivation and growth of the maize plant in the sub-Saharan Africa (Martin et al., \u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e2008\u003c/span\u003e; Roumagnac et al., \u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e2022\u003c/span\u003e). 
MSV seriously constrains maize production in the sub-Saharan Africa resulting in serious economic losses and low yield turnout, most especially for peasant farmers with little or no resources, and limited access to improved maize cultivars in the region (Bediako E et al., \u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e2017\u003c/span\u003e; Charles, \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e2014\u003c/span\u003e; Oppong et al., \u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e2013\u003c/span\u003e). Infected maize plants often show symptoms such as chlorotic lesions, leaf striation, yellowing, stunting, low yield turnout, and, in severe cases, death (Martin et al., \u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e1999\u003c/span\u003e; Martin \u0026amp; Rybicki, \u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e2002\u003c/span\u003e; Oyeniran et al., \u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e2021\u003c/span\u003e). Of the 11 known MSV strains A through to K, only the A-strain causes severe maize streak disease (MSD) with stunting and leaf striation symptom. MSV-A being the major MSD causing agent is believed to have become adapted to the maize plant (Ketsela et al. \u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e2022\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eMSV encodes three major genes in its genome that include the virion sense movement protein (\u003cem\u003emp\u003c/em\u003e), and coat protein (\u003cem\u003ecp\u003c/em\u003e) genes (Mu\u0026ntilde;oz-Mart\u0026iacute;n et al., \u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e2003\u003c/span\u003e; Owor et al., \u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e2007\u003c/span\u003e). 
While the complementary strand encodes the replication associated proteins (\u003cem\u003erep\u003c/em\u003e/\u003cem\u003erepA\u003c/em\u003e) genes which are saddled with initiating and moderating replication of the virus genome (Shepherd et al., \u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e2007\u003c/span\u003e). MSV genes as important drivers of its evolution perform vital functions that ensure its survival, spread and replications in susceptible hosts (Boulton, \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e2002\u003c/span\u003e; Davies et al., \u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e1997\u003c/span\u003e). Consequently, these genes are likely targets for natural selection from the perspectives of host and pathogen evolutionary arms race (Denes et al., \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e2022\u003c/span\u003e; Wang et al., \u003cspan citationid=\"CR55\" class=\"CitationRef\"\u003e2020\u003c/span\u003e). MSV movements as chiefly facilitated by its leaf hopper vectors would also mean that the virus must constantly cope with a plethora of changing environment and its likely effects on its genes. Thus, signatures of positive selection for amino acid changes responsible for host adaptation in most pathogens should be detectable in an evolutionary, analytical framework (Antonides et al., \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e2019\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eMSV coat protein (\u003cem\u003ecp\u003c/em\u003e) is a virion sense gene that is expressed from the long intergenic region (LIR) transcripts. The \u003cem\u003ecp\u003c/em\u003e, about\u0026thinsp;~\u0026thinsp;735 nucleotides (nts) long has encapsidation functions and also plays key roles in systemic spread especially by the leaf hoppers (H. Liu et al., \u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e2001\u003c/span\u003e). 
The movement protein (\u003cem\u003emp\u003c/em\u003e) is another virion sense gene of ~\u0026thinsp;310 nts that is also expressed alongside the \u003cem\u003ecp\u003c/em\u003e gene from bidirectionally transcribed LIR with main function of mediating viral movements within infected host cells (Boulton, \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e2002\u003c/span\u003e; Wright et al., \u003cspan citationid=\"CR58\" class=\"CitationRef\"\u003e1997\u003c/span\u003e). Both \u003cem\u003ecp\u003c/em\u003e and \u003cem\u003emp\u003c/em\u003e virion sense genes are also linked with MSV inter and intra host movements either via intermediate leaf hoppers spread for the \u003cem\u003ecp\u003c/em\u003e or cell-to-cell movement within infected tissues for \u003cem\u003emp\u003c/em\u003e. Further, because of the binding capability of the \u003cem\u003ecp\u003c/em\u003e gene, and the accompanying nuclear signal while facilitating partially uncoated single stranded DNA (ssDNA) cell entry (Davies et al., \u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e1997\u003c/span\u003e; Owor et al., \u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e2007\u003c/span\u003e), continuous interaction of the \u003cem\u003ecp\u003c/em\u003e gene with the constantly changing host conditions might make it undergo persistent stimuli-driven molecular evolution.\u003c/p\u003e \u003cp\u003eThe non-structural complementary sense \u003cem\u003erepA\u003c/em\u003e/\u003cem\u003erep\u003c/em\u003e proteins are expressed as either spliced \u003cem\u003erep\u003c/em\u003e and un-spliced \u003cem\u003erepA\u003c/em\u003e (Ruschhaupt et al., \u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e2013\u003c/span\u003e; Shepherd et al., \u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e2007\u003c/span\u003e), \u003cem\u003erep\u003c/em\u003e plays key roles in replication initiation while \u003cem\u003erepA\u003c/em\u003e acts as host and viral gene transcription regulator. 
The MSV \u003cem\u003erep\u003c/em\u003e, a spliced product of the C1:C2 is also believed to have roles in the activation of virion sense promoter and specifically have this role for the coat protein promoter (Horv\u0026aacute;th et al., \u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e1998\u003c/span\u003e; Nikovics et al., \u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e2001\u003c/span\u003e). The \u003cem\u003erepA\u003c/em\u003e technically moderates \u003cem\u003erep\u003c/em\u003e activities through coordinated checks and balancing mechanisms. This is necessary for MSV as they cannot suppress their promoter unlike some begomoviruses (Nikovics et al., \u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e2001\u003c/span\u003e). It is possible that regulating the expression of these proteins at varying stages of infection, and at different host cell cycle not only plays pivotal roles in coordinating the virus life cycle, it can also make these regulating genes evolutionary selection targets.\u003c/p\u003e \u003cp\u003eKey mechanisms of evolution are natural selection and genetic change. Natural selection sits specifically at the intersection of diversifying evolution. Natural selection is caused by competition and environmental changes, acts on genetic variation, produces evolution, changes gene pool, and resulting in selective survival, host expansion, and adaptation (Aguad\u0026eacute;, \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e1999\u003c/span\u003e; Deom et al., \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e2021\u003c/span\u003e; Li et al., \u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e2018\u003c/span\u003e). Fitness-based selection is a deliberate event that carefully guides the course of evolution by ensuring that organisms only pass on useful traits to the next generation. 
Furthermore, natural selection can be likened to a differential fitness driven giant sieve that separates undesirable traits from the desirable ones, ultimately producing fitter and healthier descendants (Acosta-Leal et al., \u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e2011\u003c/span\u003e; Oyeniran \u0026amp; Oyediran, 2024; Spielman et al., \u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e2019\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eHere, we intend to identify the occurrence of selection in the virion strand \u003cem\u003ecp\u003c/em\u003e and \u003cem\u003emp\u003c/em\u003e genes of the economically important MSV-A lineages that have disseminated within the sub-Saharan Africa using the publicly available sequence data. Given that natural selection as evolutionary signatures is detectable in sequence data, it is possible to estimate sites and branches within these genes that are evolving under selection pressure up to amino acid level as these could further give insights into how these genes evolve as they interact within changing host conditions.\u003c/p\u003e"},{"header":"Methods","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003eSequence selection and alignments\u003c/h2\u003e \u003cp\u003eFull coding sequences (CDS) of quality MSV-A \u003cem\u003ecp\u003c/em\u003e (n\u0026thinsp;=\u0026thinsp;115) and \u003cem\u003emp\u003c/em\u003e (n\u0026thinsp;=\u0026thinsp;115) gene sequences were retrieved from the GenBank via their accession numbers using the Linux based e-fetch program (Sayers, \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e2010\u003c/span\u003e). Sequences were selected such that they represent their different sampling regions in the sub-Saharan Africa, while also considering the availability of annotations for the gene of interest in the sequences. 
Sequences were stripped of stop codons before being aligned as codon-based multiple sequence alignments with MUSCLE (Edgar, \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2004\u003c/span\u003e) in AliView (Larsson, \u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e2014\u003c/span\u003e), and later back-translated to the corresponding codon-based multiple nucleotide sequence alignment. This method of nucleotide sequence alignment prevents the insertion of gaps after the first or second nucleotide position of a codon, which ensures biologically useful alignments with in-frame codons.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec4\" class=\"Section2\"\u003e \u003ch2\u003eRecombination analyses using recombination detection program (RDP)\u003c/h2\u003e \u003cp\u003eRDP version 5 (Martin et al., \u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e2021\u003c/span\u003e) was used to detect evidence of recombination in the datasets. Detected recombinant sequences and their tracts were completely removed from the alignment. The recombinant-free dataset from this step was used for downstream phylogeny-based selection studies.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec5\" class=\"Section2\"\u003e \u003ch2\u003eConstruction of gene trees\u003c/h2\u003e \u003cp\u003eFor each MSV gene being considered, codon-based nucleotide gene trees were constructed by providing sequence alignments to IQ-TREE v.1.6.12 (Minh et al., \u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e2020\u003c/span\u003e). We used the ModelFinder within IQ-TREE to choose the appropriate model of sequence evolution (Kalyaanamoorthy et al., \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e2017\u003c/span\u003e), as adjudged by Bayesian Information Criterion (BIC) support measures among 185 codon models. The best substitution for each position was then used to infer the best gene tree using a maximum likelihood (ML) approach. 
For branch support analysis, we performed 2000 replicates of the non-parametric Shimodaira-Hasegawa-like approximate likelihood ratio test (SH-aLRT) (Guindon et al., \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e2010\u003c/span\u003e; Shimodaira \u0026amp; Hasegawa, \u003cspan citationid=\"CR51\" class=\"CitationRef\"\u003e1999\u003c/span\u003e) and 5000 replicates of ultrafast bootstrapping (Hoang et al., \u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e2018\u003c/span\u003e). A majority-rule consensus ML tree was constructed for each gene based on the 5000 bootstrap trees and further edited in FigTree v.1.4.4 (Rambaut, \u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e2018\u003c/span\u003e).\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec6\" class=\"Section2\"\u003e \u003ch2\u003eDetecting selection\u003c/h2\u003e \u003cp\u003eSelection tests were carried out using the HyPhy package via the DataMonkey Adaptive Evolution server (Kosakovsky Pond et al., \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e2005\u003c/span\u003e; Weaver et al., \u003cspan citationid=\"CR56\" class=\"CitationRef\"\u003e2018\u003c/span\u003e), based on the codon-based nucleotide sequence alignments (which also account for silent substitutions at the codon level) and the majority-rule consensus ML trees. Selection tests are based on estimates of the nonsynonymous-to-synonymous substitution rate ratio (\u003cb\u003eω\u003c/b\u003e\u0026thinsp;=\u0026thinsp;dN/dS) using codon models and Likelihood Ratio Tests (LRTs), with neutral evolution (\u003cb\u003eω\u003c/b\u003e\u0026thinsp;=\u0026thinsp;1) as the null hypothesis; negative (purifying) selection is denoted by \u003cb\u003eω\u003c/b\u003e\u0026thinsp;\u0026lt;\u0026thinsp;1 and positive (diversifying) selection by \u003cb\u003eω\u003c/b\u003e\u0026thinsp;\u0026gt;\u0026thinsp;1. 
The HyPhy methods first carry out an initial global MG94xREV fit to optimize branch lengths and nucleotide substitution parameters, which then serve as initial parameter values in model fitting for the hypothesis testing process. An important advantage of including synonymous rate variation, which allows dS to vary across sites and branches in the phylogeny, is more efficient detection of positive selection with fewer false discoveries (Weaver et al., \u003cspan citationid=\"CR56\" class=\"CitationRef\"\u003e2018\u003c/span\u003e). These analyses were used to detect selection signals at specific codon sites within genes, and at particular branches or lineages of the MSV gene trees.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec7\" class=\"Section2\"\u003e \u003ch2\u003eGene-wide\u003c/h2\u003e \u003cp\u003eTo test for positive selection anywhere on the gene trees, we used the Branch-site Unrestricted Statistical Test for Episodic Diversification with Multi-nucleotide Substitution (BUSTED-MH) algorithm (Lucaci et al., \u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e2023\u003c/span\u003e; Murrell et al., \u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e2015\u003c/span\u003e), which takes into account the biases introduced by multi-nucleotide substitutions and by variation in synonymous substitution rates, both of which often lead to increased false positives. BUSTED-MH fits a codon model with three rate classes, constrained as ω\u003csub\u003e1\u003c/sub\u003e\u0026thinsp;\u0026le;\u0026thinsp;ω\u003csub\u003e2\u003c/sub\u003e\u0026thinsp;\u0026le;\u0026thinsp;1\u0026thinsp;\u0026le;\u0026thinsp;ω\u003csub\u003e3\u003c/sub\u003e, and estimates the proportion of sites that belong to each ω class. Positive selection is then assessed by comparing this model fit to that of a null model in which positive selection is not allowed (ω\u003csub\u003e3\u003c/sub\u003e\u0026thinsp;=\u0026thinsp;1). 
The null hypothesis is rejected if there is evidence that at least one codon site on at least one branch has undergone episodic positive selection.\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eAt sites\u003c/h3\u003e\n\u003cp\u003eTo determine whether positive selection had acted at any codon sites, the Mixed Effects Model of Evolution (MEME) algorithm was used (Murrell et al., \u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e2012\u003c/span\u003e). MEME tests whether individual sites have experienced episodic positive or diversifying selection within a proportion of branches, using a mixed-effects maximum likelihood approach. MEME infers two ω rate classes per site while simultaneously calculating the corresponding weights (i.e. the proportion of branches evolving under each rate class). For each site, a single synonymous rate (α) and two nonsynonymous rates (β- and β+) were inferred. Both β- and β+ were constrained to be less than or equal to α in the null model, while in the alternative model β+ was not constrained. Positive selection is then inferred for a site if a likelihood ratio test returns a significant β+ \u0026gt; α at that site.\u003c/p\u003e \u003cdiv id=\"Sec9\" class=\"Section2\"\u003e \u003ch2\u003eAt branches\u003c/h2\u003e \u003cp\u003eThe Adaptive Branch-Site Random Effects Likelihood (aBSREL) algorithm (Smith et al., \u003cspan citationid=\"CR52\" class=\"CitationRef\"\u003e2015\u003c/span\u003e) was used to test individual MSV-A branches (lineages) and sites for selection. All branches and sites were tested. aBSREL explores both site-level and branch-level ω heterogeneity and then infers the optimal number of ω classes from AIC\u003csub\u003ec\u003c/sub\u003e (the small sample AIC). 
The alternative model is compared with a null model that disallows positive selection in the rate classes; a Likelihood Ratio Test is performed at each branch, and the branch-specific p-values are corrected for multiple testing using the Holm-Bonferroni procedure before significance is assessed.\u003c/p\u003e \u003cp\u003eFurther, the RELAX algorithm (Wertheim et al., \u003cspan citationid=\"CR57\" class=\"CitationRef\"\u003e2014\u003c/span\u003e) was used to detect differential selection and to test whether selection pressure was significantly relaxed or intensified across branches within the MSV-A clades, with the internal nodes set as the test branches and the leaf nodes as the reference branches. RELAX employs a random effects branch-site model to test whether a set of test branches evolves under a different stringency of selection than a set of reference branches. It entails fitting a codon model with three ω classes to the entire phylogeny for the null model, while testing for changes in selection constraints linked with the selection intensity parameter k (\u0026ge;\u0026thinsp;0). In the alternative model, the selection intensity parameter serves as an exponent to the ω classes. A significant k\u0026thinsp;\u0026gt;\u0026thinsp;1 obtained from a likelihood ratio test of the null and alternative models means that selection strength was intensified along the test branches relative to the reference. 
A significant result of k\u0026thinsp;\u0026lt;\u0026thinsp;1 implies that selection strength was relaxed along the test branches.\u003c/p\u003e \u003c/div\u003e"},{"header":"Results","content":"\u003cdiv id=\"Sec11\"\u003e\n \u003ch2\u003eIndividual gene tree and its evolution\u003c/h2\u003e\n \u003cp\u003eThe best-fit codon substitution model, as determined by the Bayesian Information Criterion (BIC) within the 95% confidence limit, was MG\u0026thinsp;+\u0026thinsp;F1X4\u0026thinsp;+\u0026thinsp;G4 for both \u003cem\u003ecp\u003c/em\u003e and \u003cem\u003emp\u003c/em\u003e. In the present nomenclature, MG stands for the codon substitution model of Muse \u0026amp; Gaut (\u003cspan\u003e1994\u003c/span\u003e), while GY denotes the nonsynonymous/synonymous and transition/transversion rate ratio model of Goldman \u0026amp; Yang (\u003cspan\u003e1994\u003c/span\u003e). The frequency type F1X4 denotes overall unequal nucleotide frequencies but equal nucleotide frequencies over the three codon positions. The rate type G4 indicates the discrete Gamma model of Yang (\u003cspan\u003e1994\u003c/span\u003e) with four rate categories. The best Maximum Likelihood trees were inferred from the codon-based nucleotide alignment and substitution model. The \u003cem\u003ecp\u003c/em\u003e gene had a total tree length (the sum of branch lengths, each representing the number of substitutions per codon site) of 1.05, and the \u003cem\u003emp\u003c/em\u003e gene, 0.85. The majority-rule consensus gene trees produced by subsequent bootstrapping are presented in Fig.\u0026nbsp;1.\u003c/p\u003e\n \u003cp\u003e\u003cstrong\u003eFigure\u0026nbsp;1\u003c/strong\u003e: Midpoint-rooted maximum likelihood (ML) trees for MSV-A genes inferred using IQ-Tree2 (Minh et al., \u003cspan\u003e2020\u003c/span\u003e) and visualised with FigTree (Rambaut, \u003cspan\u003e2018\u003c/span\u003e). The bar indicates the number of nucleotide substitutions per codon site. 
The numbers shown are bootstrap support values from 5000 ultrafast replicates (Hoang et al., \u003cspan\u003e2018\u003c/span\u003e). Clades with bootstrap support of less than 50 were collapsed. (A) Coat protein (\u003cem\u003ecp\u003c/em\u003e) gene; (B) Movement protein (\u003cem\u003emp\u003c/em\u003e) gene.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec12\"\u003e\n \u003ch2\u003eDetecting selection\u003c/h2\u003e\n \u003cdiv id=\"Sec13\"\u003e\n \u003ch2\u003eGene-wide\u003c/h2\u003e\n \u003cp\u003eThe BUSTED-MH algorithm found evidence of gene-wide episodic diversifying selection in both the \u003cem\u003ecp\u003c/em\u003e and \u003cem\u003emp\u003c/em\u003e genes (LRT p-value \u0026le; 0.05) in the phylogeny. Therefore, there is evidence that at least one site on at least one test branch of these genes in the MSV-A lineage has experienced diversifying selection (Table \u003cspan\u003e1\u003c/span\u003e).\u003c/p\u003e\n \u003c/div\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec14\"\u003e\n \u003ch2\u003eAt sites\u003c/h2\u003e\n \u003cp\u003eMEME found evidence of episodic positive/diversifying selection (LRT, \u003cem\u003ep\u003c/em\u003e value\u0026thinsp;\u0026le;\u0026thinsp;0.05) at particular sites in the genes. For the \u003cem\u003ecp\u003c/em\u003e phylogeny, three out of 244 codon sites (1.23%) showed evidence for selection (Table\u0026nbsp;\u003cspan\u003e2\u003c/span\u003e). 
For \u003cem\u003emp\u003c/em\u003e, one of 101 codon sites (0.99%) showed evidence for selection (Table\u0026nbsp;\u003cspan\u003e2\u003c/span\u003e).\u003c/p\u003e\n \u003cdiv\u003e\n \u003ctable id=\"Tab1\" border=\"1\"\u003e\n \u003ccaption language=\"En\"\u003e\n \u003cdiv\u003eTable 1\u003c/div\u003e\n \u003cdiv\u003e\n \u003cp\u003eStatistical results of model fits for the Branch-site Unrestricted Statistical Test for Episodic Diversification-Multi-nucleotide Substitution (BUSTED-MH) algorithm (Lucaci et al., \u003cspan\u003e2023\u003c/span\u003e; Murrell et al., \u003cspan\u003e2015\u003c/span\u003e) run on the Datamonkey server (Weaver et al., \u003cspan\u003e2018\u003c/span\u003e), showing evidence of episodic diversifying selection in the \u003cem\u003ecp\u003c/em\u003e and \u003cem\u003emp\u003c/em\u003e genes (the null model of no positive selection, \u0026omega;\u003csub\u003e3\u003c/sub\u003e\u0026thinsp;=\u0026thinsp;1, is rejected; LRT: p\u0026thinsp;\u0026lt;\u0026thinsp;0.05).\u003c/p\u003e\n \u003c/div\u003e\n \u003c/caption\u003e\n \u003ccolgroup cols=\"7\"\u003e\u003c/colgroup\u003e\n \u003cthead\u003e\n \u003ctr\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eModel\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003e\u003cem\u003elog\u003c/em\u003e L\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003e#.\u003c/p\u003e\n \u003cp\u003eparameters\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eAIC\u003csub\u003ec\u003c/sub\u003e\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003e\u0026omega;\u003csub\u003e1\u003c/sub\u003e\u003c/p\u003e\n \u003cp\u003e(Negative)\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003e\u0026omega;\u003csub\u003e2\u003c/sub\u003e\u003c/p\u003e\n \u003cp\u003e(Neutral)\u003c/p\u003e\n \u003c/th\u003e\n \u003cth 
align=\"left\"\u003e\n \u003cp\u003e\u0026omega;\u003csub\u003e3\u003c/sub\u003e\u003c/p\u003e\n \u003cp\u003e(Positive)\u003c/p\u003e\n \u003c/th\u003e\n \u003c/tr\u003e\n \u003c/thead\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colspan=\"7\"\u003e\n \u003cp\u003e\u003cem\u003eCp\u003c/em\u003e (87 Sequences, 244 Codon sites, LRT p-value\u0026thinsp;=\u0026thinsp;0.029)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eAlternative Model\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e-2652.1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e195\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e5697.9\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.00(96.92%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.00(1.14%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e5.38(1.93%)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNull Model\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e-2654.9\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e194\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e5701.5\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.00(86.85%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.06(4.71%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e1.00(8.43%)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colspan=\"7\"\u003e\n \u003cp\u003e\u003cem\u003eMp\u003c/em\u003e (47 Sequences, 101 Codon sites, LRT p-value\u0026thinsp;=\u0026thinsp;0.024)\u003c/p\u003e\n \u003c/td\u003e\n 
\u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eAlternative Model\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e-989.2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e115\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e2214.2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.34(98.27%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.35(1.71%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e1.00e\u0026thinsp;+\u0026thinsp;10(0.02%)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNull Model\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e-992.2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e114\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e2218.6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.00(0.64%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.00(63.92%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e1.00(35.44%)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n \u003c/table\u003e\n \u003c/div\u003e\n \u003cdiv\u003e\n \u003ctable id=\"Tab2\" border=\"1\"\u003e\n \u003ccaption language=\"En\"\u003e\n \u003cdiv\u003eTable 2\u003c/div\u003e\n \u003cdiv\u003e\n \u003cp\u003eThe results of the MEME (Mixed Effects Model of Evolution) algorithm (Murrell et al., \u003cspan\u003e2012\u003c/span\u003e) performed with the DataMonkey server (Weaver et al., \u003cspan\u003e2018\u003c/span\u003e) showing codon sites under positive/diversifying selection (p\u0026thinsp;\u0026le;\u0026thinsp;0.05) for selected MSV-A genes.\u003c/p\u003e\n \u003c/div\u003e\n \u003c/caption\u003e\n 
\u003ccolgroup cols=\"8\"\u003e\u003c/colgroup\u003e\n \u003cthead\u003e\n \u003ctr\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eSite\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003e\u0026alpha;\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003e\u0026beta;-\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eP-\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003e\u0026beta;+\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eP+\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eLRT\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eP-value\u003c/p\u003e\n \u003c/th\u003e\n \u003c/tr\u003e\n \u003c/thead\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colspan=\"8\"\u003e\n \u003cp\u003e\u003cem\u003eCp\u003c/em\u003e (Three out of 244 sites; 1.23%)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e19\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.00\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.00\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.00\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e3.17\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e1.00\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e8.78\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.01\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e29\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.00\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.00\u003c/p\u003e\n 
\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.97\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e68.28\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.03\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e7.68\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.01\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e38\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.00\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.00\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.93\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e17.46\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.07\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e4.81\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.04\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colspan=\"8\"\u003e\n \u003cp\u003e\u003cem\u003eMp\u003c/em\u003e (One out of 101 sites; 0.99%)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e9\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.00\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.00\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.98\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e593.09\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.02\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e11.95\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n 
\u003cp\u003e0.00\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n \u003c/table\u003e\n \u003c/div\u003e\n \u003cp\u003eSite numbers correspond to codon sites as numbered in the peptide alignment (Fig. ). \u0026alpha;\u0026thinsp;=\u0026thinsp;synonymous substitution rate; \u0026beta;\u0026minus; = non-synonymous substitution rate for the negative/neutral evolution component; p\u0026minus; = proportion of tree evolving neutrally or under negative selection; \u0026beta;+ = non-synonymous substitution rate for the positive/neutral evolution component; p\u0026thinsp;+\u0026thinsp;=\u0026thinsp;proportion of tree evolving neutrally or under positive selection; LRT\u0026thinsp;=\u0026thinsp;likelihood ratio test statistic.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec15\"\u003e\n \u003ch2\u003eDifferent amino acids exist in positively selected sites\u003c/h2\u003e\n \u003cp\u003eFollowing the identification of positively selected sites, the amino acids at these sites were identified. For \u003cem\u003ecp\u003c/em\u003e, 60% of the amino acids at the positively selected sites are hydrophilic, while the remaining 40% are hydrophobic. For \u003cem\u003emp\u003c/em\u003e, all (100%) are hydrophobic. Across the two genes, hydrophilic amino acids slightly outnumber hydrophobic ones (Fig.\u0026nbsp;2).\u003c/p\u003e\n \u003cp\u003e\u003cstrong\u003eFigure\u0026nbsp;2\u003c/strong\u003e\u003c/p\u003e\n \u003cp\u003ePositively selected sites and the different amino acids found at each site, as estimated by the MEME (Mixed Effects Model of Evolution) algorithm (Murrell et al., \u003cspan\u003e2012\u003c/span\u003e) performed with the DataMonkey server (Weaver et al., \u003cspan\u003e2018\u003c/span\u003e). (A) Capsid protein (\u003cem\u003ecp\u003c/em\u003e) gene. (B) Movement protein (\u003cem\u003emp\u003c/em\u003e) gene. 
* Nonpolar; \u0026dagger; Polar; \u0026Dagger; Charged.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec16\"\u003e\n \u003ch2\u003eAt branches\u003c/h2\u003e\n \u003cp\u003eThe aBSREL algorithm tested each gene for branch-specific selection, searching for lineages under selection (Table\u0026nbsp;\u003cspan\u003e4\u003c/span\u003e). Selection was detected in the \u003cem\u003ecp\u003c/em\u003e and \u003cem\u003emp\u003c/em\u003e genes at only \u003cstrong\u003eone\u003c/strong\u003e of \u003cstrong\u003e130\u003c/strong\u003e branches (0.76%) and \u003cstrong\u003eone\u003c/strong\u003e of \u003cstrong\u003e60\u003c/strong\u003e branches (1.66%) of the phylogenies, respectively (LRT, \u003cem\u003eP\u003c/em\u003e-value\u0026thinsp;\u003cstrong\u003e\u0026le;\u003c/strong\u003e\u0026thinsp;0.05).\u003c/p\u003e\n \u003cp\u003eThe RELAX algorithm inferred that selection was significantly intensified across the full diversity of the isolates, as well as among those closely related to the maize-adapted MSV-Mat-A isolate (GenBank accession number: AF329878) (Fig.\u0026nbsp;3).\u003c/p\u003e\n \u003cdiv\u003e\n \u003ctable id=\"Tab3\" border=\"1\"\u003e\n \u003ccaption language=\"En\"\u003e\n \u003cdiv\u003eTable 4\u003c/div\u003e\n \u003cdiv\u003e\n \u003cp\u003eResults of the aBSREL (adaptive Branch-Site Random Effects Likelihood) algorithm (Smith et al., \u003cspan\u003e2015\u003c/span\u003e), as determined on the DataMonkey server (Weaver et al., \u003cspan\u003e2018\u003c/span\u003e), for the \u003cem\u003ecp\u003c/em\u003e and \u003cem\u003emp\u003c/em\u003e genes.\u003c/p\u003e\n \u003c/div\u003e\n \u003c/caption\u003e\n \u003ccolgroup cols=\"4\"\u003e\u003c/colgroup\u003e\n \u003cthead\u003e\n \u003ctr\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eBranch/Node\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eLRT test statistic\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n 
\u003cp\u003ep-value (corrected for multiple testing)\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003e\u0026omega; distribution over sites\u003c/p\u003e\n \u003c/th\u003e\n \u003c/tr\u003e\n \u003c/thead\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u003cem\u003eCp\u003c/em\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eAF329878_MSV-A\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e15.2025\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0.0220\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026omega; \u003csub\u003e1\u003c/sub\u003e\u0026thinsp;=\u0026thinsp;0.00 (99%) \u0026omega;\u003csub\u003e2\u003c/sub\u003e = 100000 (0.84%)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u003cem\u003eMp\u003c/em\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNode25\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e15.1669\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.0024\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e\u0026omega;\u003csub\u003e1\u003c/sub\u003e\u0026thinsp;=\u0026thinsp;0.00 (98%) \u0026omega;\u003csub\u003e2\u003c/sub\u003e\u0026thinsp;=\u0026thinsp;460 (2.3%)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n 
\u003c/table\u003e\n \u003c/div\u003e\n \u003cp\u003e\u003cstrong\u003eFigure\u0026nbsp;3\u003c/strong\u003e\u003c/p\u003e\n \u003cp\u003eOmega distributions (\u0026omega;\u0026thinsp;=\u0026thinsp;dN/dS) under the RELAX alternative model for branches under statistically significant selection within the MSV-A \u003cem\u003emp\u003c/em\u003e gene, as determined by the RELAX algorithm (Wertheim et al., \u003cspan\u003e2014\u003c/span\u003e) on the DataMonkey server (Weaver et al., \u003cspan\u003e2018\u003c/span\u003e). Selection intensified significantly within the internal-node test branches relative to the leaf nodes in the MSV-A \u003cem\u003emp\u003c/em\u003e gene. Table \u003cspan\u003eS1\u003c/span\u003e shows the full RELAX model results. K, relaxation/intensification parameter; LT, likelihood ratio.\u003c/p\u003e\n\u003c/div\u003e"},{"header":"Discussion","content":"\u003cp\u003eThe present study set out to detect signatures of selection within the genes of the maize-adapted A-strain MSV. We found evidence of selection at the gene level in both the \u003cem\u003ecp\u003c/em\u003e and \u003cem\u003emp\u003c/em\u003e genes. The \u003cem\u003ecp\u003c/em\u003e gene appears to be under stronger selection pressure than the \u003cem\u003emp\u003c/em\u003e gene, in which positive selection was detected at only one site and one node in the phylogeny. In summary, we found evidence of selection at the overall gene and codon site levels (stronger for the \u003cem\u003ecp\u003c/em\u003e gene), and at the lineage level for both the \u003cem\u003ecp\u003c/em\u003e and \u003cem\u003emp\u003c/em\u003e genes. Overall, the amino acids at these positively selected sites across the two genes are slightly more often hydrophilic than hydrophobic, with more than one amino acid observed at each site. 
Additional evidence for differential selective pressure was observed for the \u003cem\u003emp\u003c/em\u003e gene, in which selection was significantly intensified for the internal test nodes relative to the reference leaf nodes.\u003c/p\u003e \u003cp\u003eThe maximum likelihood trees indicate that the MSV-A genes have evolved in a similar pattern, diverging at broadly similar rates. The \u003cem\u003ecp\u003c/em\u003e gene appears to have evolved the most and the \u003cem\u003emp\u003c/em\u003e gene the least, with nucleotide substitution rates per codon site of 0.08 and 0.03, respectively. These differing evolutionary rates can be attributed to how these genes carry out their biological functions. The \u003cem\u003ecp\u003c/em\u003e gene, for instance, encodes a coat protein that interacts within the virus nucleus during genome packaging and with the external environment, such as host membrane receptors, and can also manipulate host transcription machinery (Mostert et al., \u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e2023\u003c/span\u003e; Zhou et al., \u003cspan citationid=\"CR62\" class=\"CitationRef\"\u003e2018\u003c/span\u003e); it may therefore evolve faster than the \u003cem\u003emp\u003c/em\u003e gene, which is mainly involved in internal cell-to-cell movement.\u003c/p\u003e \u003cp\u003eDetectable evidence of gene-wide diversifying positive selection implies that these genes have experienced, or are experiencing, molecular changes that are adaptive in nature. 
Although mutations can have effects on gene function that range from useless to deleterious, genuine instances of useful, adaptive change can still occur under positive diversifying selection. Therefore, detecting statistically significant evidence of positive selection in the \u003cem\u003ecp\u003c/em\u003e and \u003cem\u003emp\u003c/em\u003e genes implies that they have been subject to adaptive changes at codon sites or along branches in the phylogeny. The gene-wide selection analysis also serves as a benchmark for further selection analyses. At the site level, we detected episodic diversifying positive selection at individual codon sites in the \u003cem\u003ecp\u003c/em\u003e and \u003cem\u003emp\u003c/em\u003e genes, with stronger effects on \u003cem\u003ecp\u003c/em\u003e. Episodic diversifying selection was found acting on at least three codon sites, affecting a greater proportion of the \u003cem\u003ecp\u003c/em\u003e gene (1.23%) than of the \u003cem\u003emp\u003c/em\u003e gene (0.99%). 
Although these genes appear to be mostly under negative and neutral selection, it is important not to conclude hastily that the observed signature of positive selection is inconsequential, given the association of positive selection with host adaptation (Montoya et al., \u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e2021\u003c/span\u003e; Nigam et al., \u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e2019\u003c/span\u003e; Thines, \u003cspan citationid=\"CR54\" class=\"CitationRef\"\u003e2019\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eFactors that determine host adaptation for MSV-A have previously been linked to the expression of the virion-sense \u003cem\u003ecp\u003c/em\u003e and \u003cem\u003emp\u003c/em\u003e genes, also known as pathogenicity determinants (Liu et al., \u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e1999\u003c/span\u003e; Monjane et al., \u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e2011\u003c/span\u003e; Wright et al., \u003cspan citationid=\"CR58\" class=\"CitationRef\"\u003e1997\u003c/span\u003e). This could further explain the diversifying selection observed in certain proportions of these genes, and is probably also responsible for the regulated expression of these genes in the maize-adapted MSV-A, such that evolution favouring adaptation of these viruses to their maize host has, over time, translated into reduced symptom severity that readily fosters onward transmission with minimal harm to the host (Monjane et al., \u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e2020\u003c/span\u003e). 
In wheat dwarf virus (WDV), another member of the genus \u003cem\u003eMastrevirus\u003c/em\u003e, the \u003cem\u003ecp\u003c/em\u003e and \u003cem\u003emp\u003c/em\u003e genes are crucial for successful viral encapsidation and movement during infection, which may make them twice as likely to be under selection pressure (Wu et al., \u003cspan citationid=\"CR59\" class=\"CitationRef\"\u003e2008\u003c/span\u003e, \u003cspan citationid=\"CR60\" class=\"CitationRef\"\u003e2015\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eIdentifying the amino acids at the positively selected sites can provide further information on the evolutionary state of these sites. For instance, the presence of different amino acids at a positively selected site may indicate selection pressure under which multiple residues at that site are subject to evolutionary change. Although this could also mean that the virus is already well suited to its current state, it is important to consider the interplay between natural selection and codon usage in viruses, which can be tailored towards an outcome such as host adaptation. The interplay between the coincidence of codon usage in viruses and their hosts and its regulation for efficient translation and protein folding is crucial for adapting to the host environment, and has been described as a mechanism of adaptation to multiple hosts and vectors in Zika virus (Butt et al., \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e2016\u003c/span\u003e). For MSV-A, it remains to be seen how these mechanisms specifically apply, as codon usage in MSV-A relative to its hosts may be responsible for the inferred low rate of positive diversifying selection. 
Overall, amino acid changes in viral proteins are central to how viruses adapt to their hosts, and the selection acting on different sites suggests that these amino acids are changing in the course of viral evolution, most likely in ways that affect host adaptability and pathogenicity, as in other viruses with similar evolutionary patterns (Carrique et al., \u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; M\u0026auml;nz et al., \u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e2016\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eAt the branch level in the \u003cem\u003ecp\u003c/em\u003e gene, aBSREL found evidence of selection in a clade of MSV-A represented by a southern African isolate (GenBank accession number: AF329878) that is closely related to the MSV-Mat-A isolate, which has been observed to become adapted to the maize host over time (Monjane et al., \u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e2020\u003c/span\u003e). For the \u003cem\u003emp\u003c/em\u003e gene, statistically significant evidence of positive selection was also found at a node representing the diversity of MSV-A from the island of Reunion and South Africa (Pande et al., 2012; Peterschmitt et al., 1996). These variations in selection pressure between clades or lineages show that evolution proceeds under different selection pressures, and further suggest that positive selection may have a role in adaptive host evolution in A-strain MSV. When we tested all the internal nodes representing the entire sequence diversity of the \u003cem\u003emp\u003c/em\u003e gene, and those MSV-A isolates closely related to the MSV-Mat-A isolate (GenBank accession number: AF329881) that is believed to have become adapted to the maize plant, we found that selection pressure was significantly intensified. 
The MSV \u003cem\u003emp\u003c/em\u003e is a virion-sense gene that enables the virus to move from cell to cell. It is therefore plausible that the \u003cem\u003emp\u003c/em\u003e gene, like \u003cem\u003ecp\u003c/em\u003e, is also subject to stimuli-driven diversifying selection. Our results lend credence to this and further substantiate our hypothesis of selection-driven host-adaptive evolution in this economically important maize streak virus strain.\u003c/p\u003e"},{"header":"Conclusion","content":"\u003cp\u003eOur findings show that evidence of positive selection exists in the virion-strand genes of A-strain maize streak virus, specifically in isolates closely related to the well-adapted MSV-A strain that is spreading within sub-Saharan Africa. It is therefore plausible that positive selection pressure may be driving the course of host-adaptive evolution in this economically important MSV strain.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cstrong\u003eAuthor Contributions:\u0026nbsp;\u003c/strong\u003eConceptualization, Kehinde Oyeniran; Formal analysis, Kehinde Oyeniran and Mobolaji Tenibiaje; Investigation, Kehinde Oyeniran and Mobolaji Tenibiaje; Visualization, Kehinde Oyeniran; Writing \u0026ndash; original draft, Kehinde Oyeniran; Writing \u0026ndash; review \u0026amp; editing, Kehinde Oyeniran and Mobolaji Tenibiaje.\u003c/p\u003e\n\u003cp\u003eAll authors have read and agreed to the published version of the manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConflict of Interest: \u0026nbsp;\u003c/strong\u003eNone\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eAcosta-Leal, R., Duffy, S., Xiong, Z., Hammond, R. W., \u0026amp; Elena, S. F. (2011). Advances in Plant Virus Evolution: Translating Evolutionary Insights into Better Disease Management. \u003cem\u003ePhytopathology\u003c/em\u003e, \u003cem\u003e101\u003c/em\u003e(10), 1136\u0026ndash;1148. 
https://doi.org/10.1094/PHYTO-01-11-0017\u003c/li\u003e\n\u003cli\u003eAguad\u0026eacute;, M. (1999). Positive selection drives the evolution of the Acp29AB accessory gland protein in Drosophila. \u003cem\u003eGenetics\u003c/em\u003e, \u003cem\u003e152\u003c/em\u003e(2), 543\u0026ndash;551. https://doi.org/10.1093/GENETICS/152.2.543\u003c/li\u003e\n\u003cli\u003eAntonides, J., Mathur, S., \u0026amp; DeWoody, J. A. (2019). Episodic positive diversifying selection on key immune system genes in major avian lineages. \u003cem\u003eGenetica\u003c/em\u003e, \u003cem\u003e147\u003c/em\u003e(5\u0026ndash;6), 337\u0026ndash;350. https://doi.org/10.1007/s10709-019-00081-3\u003c/li\u003e\n\u003cli\u003eBediako E, A.-, A, K., Puije GC, van der, KJ, T., Frimpong K, A., G, A., Kubi A, A., Lamptey J, N., A, O., B, M., I, A., \u0026amp; FN, T. (2017). Spatio-Temporal Variations in the Incidence and Severity of Maize Streak Disease in the Volta Region of Ghana. \u003cem\u003eJournal of Plant Pathology \u0026amp; Microbiology\u003c/em\u003e, \u003cem\u003e8\u003c/em\u003e(3). https://doi.org/10.4172/2157-7471.1000401\u003c/li\u003e\n\u003cli\u003eBoulton, M. I. (2002). Functions and interactions of mastrevirus gene products. \u003cem\u003ePhysiological and Molecular Plant Pathology\u003c/em\u003e, \u003cem\u003e60\u003c/em\u003e(5), 243\u0026ndash;255. https://doi.org/10.1006/pmpp.2002.0403\u003c/li\u003e\n\u003cli\u003eButt, A. M., Nasrullah, I., Qamar, R., \u0026amp; Tong, Y. (2016). Evolution of Codon Usage in Zika Virus Genomes Is Host and Vector Specific. \u003cem\u003eEmerging Microbes \u0026amp; Infections\u003c/em\u003e. https://doi.org/10.1038/emi.2016.106\u003c/li\u003e\n\u003cli\u003eCarrique, L., Fan, H., Walker, A. P., Keown, J. R., Sharps, J., Staller, E., Barclay, W., Fodor, E., \u0026amp; Grimes, J. M. (2020). Host ANP32A Mediates the Assembly of the Influenza Virus Replicase. \u003cem\u003eNature\u003c/em\u003e. 
https://doi.org/10.1038/s41586-020-2927-z\u003c/li\u003e\n\u003cli\u003eCharles, K. (2014). Maize streak virus: A review of pathogen occurrence, biology and management options for smallholder farmers. \u003cem\u003eAfrican Journal of Agricultural Research\u003c/em\u003e, \u003cem\u003e9\u003c/em\u003e(36), 2736\u0026ndash;2742. https://doi.org/10.5897/AJAR2014.8897\u003c/li\u003e\n\u003cli\u003eDavies, J. W., Boulton, M. I., \u0026amp; Liu, H. (1997). Maize streak virus coat protein binds single- and double-stranded DNA in vitro. \u003cem\u003eJournal of General Virology\u003c/em\u003e, \u003cem\u003e78\u003c/em\u003e(6), 1265\u0026ndash;1270. https://doi.org/10.1099/0022-1317-78-6-1265\u003c/li\u003e\n\u003cli\u003eDenes, C. E., Cole, A. J., Nguyen Tran, M. T., Nizam Khalid, M. K., Hewitt, A. W., Hesselson, D., \u0026amp; Neely, G. G. (2022). The VEGAS Platform Is Unsuitable for Mammalian Directed Evolution. \u003cem\u003eACS Synthetic Biology\u003c/em\u003e. https://doi.org/10.1021/acssynbio.2c00460\u003c/li\u003e\n\u003cli\u003eDeom, C. M., Brewer, M. T., \u0026amp; Severns, P. M. (2021). Positive selection and intrinsic disorder are associated with multifunctional C4(AC4) proteins and geminivirus diversification. \u003cem\u003eScientific Reports\u003c/em\u003e, \u003cem\u003e11\u003c/em\u003e(1), 1\u0026ndash;11. https://doi.org/10.1038/s41598-021-90557-0\u003c/li\u003e\n\u003cli\u003eSayers, E. (2010). \u003cem\u003eThe E-utilities In-Depth: Parameters, Syntax and More. 2009 May 29 [Updated 2022 Nov 30]. In: Entrez Programming Utilities Help [Internet]. Bethesda (MD):\u003c/em\u003e National Center for Biotechnology Information (US).\u003c/li\u003e\n\u003cli\u003eEdgar, R. C. (2004). MUSCLE: multiple sequence alignment with high accuracy and high throughput. \u003cem\u003eNucleic Acids Research\u003c/em\u003e, \u003cem\u003e32\u003c/em\u003e(5), 1792\u0026ndash;1797. 
https://doi.org/10.1093/nar/gkh340\u003c/li\u003e\n\u003cli\u003eGoldman, N., \u0026amp; Yang, Z. (1994). A codon-based model of nucleotide substitution for protein-coding DNA sequences. \u003cem\u003eMolecular Biology and Evolution\u003c/em\u003e, \u003cem\u003e11\u003c/em\u003e(5), 725\u0026ndash;736. https://doi.org/10.1093/oxfordjournals.molbev.a040153\u003c/li\u003e\n\u003cli\u003eGuindon, S., Dufayard, J.-F., Lefort, V., Anisimova, M., Hordijk, W., \u0026amp; Gascuel, O. (2010). New Algorithms and Methods to Estimate Maximum-Likelihood Phylogenies: Assessing the Performance of PhyML 3.0. \u003cem\u003eSystematic Biology\u003c/em\u003e, \u003cem\u003e59\u003c/em\u003e(3), 307\u0026ndash;321. https://doi.org/10.1093/sysbio/syq010\u003c/li\u003e\n\u003cli\u003eHoang, D. T., Chernomor, O., von Haeseler, A., Minh, B. Q., \u0026amp; Vinh, L. S. (2018). UFBoot2: Improving the Ultrafast Bootstrap Approximation. \u003cem\u003eMolecular Biology and Evolution\u003c/em\u003e, \u003cem\u003e35\u003c/em\u003e(2), 518\u0026ndash;522. https://doi.org/10.1093/molbev/msx281\u003c/li\u003e\n\u003cli\u003eHorv\u0026aacute;th, G. V., Pettk\u0026oacute;-Szandtner, A., Nikovics, K., Bilgin, M., Boulton, M., Davies, J. W., Guti\u0026eacute;rrez, C., \u0026amp; Dudits, D. (1998). Prediction of functional regions of the maize streak virus replication-associated proteins by protein-protein interaction analysis. \u003cem\u003ePlant Molecular Biology\u003c/em\u003e, \u003cem\u003e38\u003c/em\u003e(5), 699\u0026ndash;712. https://doi.org/10.1023/A:1006076316887\u003c/li\u003e\n\u003cli\u003eKalyaanamoorthy, S., Minh, B. Q., Wong, T. K. F., von Haeseler, A., \u0026amp; Jermiin, L. S. (2017). ModelFinder: fast model selection for accurate phylogenetic estimates. \u003cem\u003eNature Methods\u003c/em\u003e, \u003cem\u003e14\u003c/em\u003e(6), 587\u0026ndash;589. https://doi.org/10.1038/nmeth.4285\u003c/li\u003e\n\u003cli\u003eKetsela, D., Oyeniran, K. A. , Berhanu, F., Fontenele, R. S. 
, Kraberger, S., \u0026amp; Varsani, A. (2022). Molecular identification and phylogenetic characterization of A-strain isolates of Maize streak virus from western Ethiopia. \u003cem\u003eArchives of Virology\u003c/em\u003e, 1\u0026ndash;11. https://doi.org/10.1007/s00705-022-05614-4\u003c/li\u003e\n\u003cli\u003eKosakovsky Pond, S. L., Frost, S. D. W., \u0026amp; Muse, S. V. (2005). HyPhy: hypothesis testing using phylogenies. \u003cem\u003eBioinformatics\u003c/em\u003e, \u003cem\u003e21\u003c/em\u003e(5), 676\u0026ndash;679. https://doi.org/10.1093/BIOINFORMATICS/BTI079\u003c/li\u003e\n\u003cli\u003eLarsson, A. (2014). AliView: A fast and lightweight alignment viewer and editor for large datasets. \u003cem\u003eBioinformatics\u003c/em\u003e, \u003cem\u003e30\u003c/em\u003e(22), 3276\u0026ndash;3278. https://doi.org/10.1093/bioinformatics/btu531\u003c/li\u003e\n\u003cli\u003eLi, X.-D., Jiang, G.-F., Yan, L.-Y., Li, R., Mu, Y., \u0026amp; Deng, W.-A. (2018). Positive Selection Drove the Adaptation of Mitochondrial Genes to the Demands of Flight and High-Altitude Environments in Grasshoppers. \u003cem\u003eFrontiers in Genetics\u003c/em\u003e, \u003cem\u003e9\u003c/em\u003e, 605. https://doi.org/10.3389/fgene.2018.00605\u003c/li\u003e\n\u003cli\u003eLiu, H., Andrew, A. P., Lucy, P., Davies, J. W., \u0026amp; Boulton, M. I. (2001). A single amino acid change in the coat protein of Maize streak virus abolishes systemic infection, but not interaction with viral DNA or movement protein. \u003cem\u003eMolecular Plant Pathology\u003c/em\u003e, \u003cem\u003e2\u003c/em\u003e(4), 223\u0026ndash;228. https://doi.org/10.1046/j.1464-6722.2001.00068.x\u003c/li\u003e\n\u003cli\u003eLiu, L., Pinner, M., Davies, J., \u0026amp; Stanley, J. (1999). Adaptation of the geminivirus bean yellow dwarf virus to dicotyledonous hosts involves both virion-sense and complementary-sense genes. 
\u003cem\u003eJ Gen Virol\u003c/em\u003e, \u003cem\u003e80\u003c/em\u003e.\u003c/li\u003e\n\u003cli\u003eLucaci, A. G., Zehr, J. D., Enard, D., Thornton, J. W., \u0026amp; Kosakovsky Pond, S. L. (2023). Evolutionary Shortcuts via Multinucleotide Substitutions and Their Impact on Natural Selection Analyses. \u003cem\u003eMolecular Biology and Evolution\u003c/em\u003e, \u003cem\u003e40\u003c/em\u003e(7). https://doi.org/10.1093/MOLBEV/MSAD150\u003c/li\u003e\n\u003cli\u003eM\u0026auml;nz, B., Graaf, M. de, M\u0026ouml;gling, R., Richard, M., Bestebroer, T. M., Rimmelzwaan, G. F., \u0026amp; M. Fouchier, R. A. (2016). Multiple Natural Substitutions in Avian Influenza a Virus PB2 Facilitate Efficient Replication in Human Cells. \u003cem\u003eJournal of Virology\u003c/em\u003e. https://doi.org/10.1128/jvi.00130-16\u003c/li\u003e\n\u003cli\u003eMartin, D. P., Shepherd, D. N., \u0026amp; Rybicki, E. P. (2008). Maize Streak Virus. \u003cem\u003eEncyclopedia of Virology\u003c/em\u003e, 263\u0026ndash;272. https://doi.org/10.1016/B978-012374410-4.00707-X\u003c/li\u003e\n\u003cli\u003eMartin, D. P., Varsani, A., Roumagnac, P., Botha, G., Maslamoney, S., Schwab, T., Kelz, Z., Kumar, V., \u0026amp; Murrell, B. (2021). RDP5: a computer program for analyzing recombination in, and removing signals of recombination from, nucleotide sequence datasets. \u003cem\u003eVirus Evolution\u003c/em\u003e, \u003cem\u003e7\u003c/em\u003e(1), 87. https://doi.org/10.1093/VE/VEAA087\u003c/li\u003e\n\u003cli\u003eMartin, D. P., Willment, J. a, \u0026amp; Rybicki, E. P. (1999). Evaluation of Maize Streak Virus Pathogenicity in Differentially Resistant Zea mays Genotypes. \u003cem\u003ePhytopathology\u003c/em\u003e, \u003cem\u003e89\u003c/em\u003e(8), 695\u0026ndash;700. https://doi.org/10.1094/Phyto.1999.89.8.695\u003c/li\u003e\n\u003cli\u003eMartin, D., \u0026amp; Rybicki, E. (2002). Investigation of Maize streak virus Pathogenicity determinants using chimaeric genomes. 
\u003cem\u003eVirology\u003c/em\u003e, \u003cem\u003e300\u003c/em\u003e.\u003c/li\u003e\n\u003cli\u003eMinh, B. Q., Schmidt, H. A., Chernomor, O., Schrempf, D., Woodhams, M. D., Von Haeseler, A., Lanfear, R., \u0026amp; Teeling, E. (2020). IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era. \u003cem\u003eMolecular Biology and Evolution\u003c/em\u003e, \u003cem\u003e37\u003c/em\u003e(5), 1530\u0026ndash;1534. https://doi.org/10.1093/MOLBEV/MSAA015\u003c/li\u003e\n\u003cli\u003eMonjane, A. L., Dellicour, S., Hartnady, P., Oyeniran, K. A., Owor, B. E., Bezuidenhout, M., Linderme, D., Syed, R. A., Donaldson, L., Murray, S., Rybicki, E. P., Kvarnheden, A., Yazdkhasti, E., Lefeuvre, P., Froissart, R., Roumagnac, P., Shepherd, D. N., Harkins, G. W., Suchard, M. A., \u0026hellip; Martin, D. P. (2020). Symptom evolution following the emergence of maize streak virus. \u003cem\u003eELife\u003c/em\u003e, \u003cem\u003e9\u003c/em\u003e. https://doi.org/10.7554/eLife.51984\u003c/li\u003e\n\u003cli\u003eMonjane, A. L., Harkins, G. W., Martin, D. P., Lemey, P., Lefeuvre, P., Shepherd, D. N., Oluwafemi, S., Simuyandi, M., Zinga, I., Komba, E. K., Lakoutene, D. P., Mandakombo, N., Mboukoulida, J., Semballa, S., Tagne, A., Tiendrebeogo, F., Erdmann, J. B., van Antwerpen, T., Owor, B. E., \u0026hellip; Varsani, A. (2011). Reconstructing the History of Maize Streak Virus Strain A Dispersal To Reveal Diversification Hot Spots and Its Origin in Southern Africa. \u003cem\u003eJournal of Virology\u003c/em\u003e, \u003cem\u003e85\u003c/em\u003e(18), 9623\u0026ndash;9636. https://doi.org/10.1128/JVI.00640-11\u003c/li\u003e\n\u003cli\u003eMontoya, V., McLaughlin, A., Mordecai, G., Miller, R. L., \u0026amp; Joy, J. B. (2021). Variable Routes to Genomic and Host Adaptation Among Coronaviruses. \u003cem\u003eJournal of Evolutionary Biology\u003c/em\u003e. https://doi.org/10.1111/jeb.13771\u003c/li\u003e\n\u003cli\u003eMostert, I., Bester, R., Burger, J. 
T., \u0026amp; Maree, H. J. (2023). Identification of Interactions Between Proteins Encoded by Grapevine Leafroll-Associated Virus 3. \u003cem\u003eViruses\u003c/em\u003e. https://doi.org/10.3390/v15010208\u003c/li\u003e\n\u003cli\u003eMu\u0026ntilde;oz-Mart\u0026iacute;n, A., Collin, S., Herreros, E., Mullineaux, P. M., Fern\u0026aacute;ndez-Lobato, M., \u0026amp; Fenoll, C. (2003). \u003cem\u003eRegulation of MSV and WDV virion-sense promoters by WDV nonstructural proteins: a role for their retinoblastoma protein-binding motifs\u003c/em\u003e. https://doi.org/10.1016/S0042-6822(02)00072-7\u003c/li\u003e\n\u003cli\u003eMurrell, B., Weaver, S., Smith, M. D., Wertheim, J. O., Murrell, S., Aylward, A., Eren, K., Pollner, T., Martin, D. P., Smith, D. M., Scheffler, K., \u0026amp; Kosakovsky Pond, S. L. (2015). Gene-Wide Identification of Episodic Selection. \u003cem\u003eMolecular Biology and Evolution\u003c/em\u003e, \u003cem\u003e32\u003c/em\u003e(5), 1365. https://doi.org/10.1093/MOLBEV/MSV035\u003c/li\u003e\n\u003cli\u003eMurrell, B., Wertheim, J. O., Moola, S., Weighill, T., Scheffler, K., \u0026amp; Kosakovsky Pond, S. L. (2012). Detecting Individual Sites Subject to Episodic Diversifying Selection. \u003cem\u003ePLOS Genetics\u003c/em\u003e, \u003cem\u003e8\u003c/em\u003e(7), e1002764. https://doi.org/10.1371/JOURNAL.PGEN.1002764\u003c/li\u003e\n\u003cli\u003eMuse, S. V., \u0026amp; Gaut, B. S. (1994). A likelihood approach for comparing synonymous and nonsynonymous nucleotide substitution rates, with application to the chloroplast genome. \u003cem\u003eMolecular Biology and Evolution\u003c/em\u003e, \u003cem\u003e11\u003c/em\u003e(5), 715\u0026ndash;724. https://doi.org/10.1093/OXFORDJOURNALS.MOLBEV.A040152\u003c/li\u003e\n\u003cli\u003eNigam, D., LaTourrette, K., Noronha Souza, P. F., \u0026amp; Garc\u0026iacute;a-Ru\u0026iacute;z, H. (2019). Genome-Wide Variation in Potyviruses. \u003cem\u003eFrontiers in Plant Science\u003c/em\u003e. 
https://doi.org/10.3389/fpls.2019.01439\u003c/li\u003e\n\u003cli\u003eNikovics, K., Simidjieva, J., Peres, A., Ayaydin, F., Pasternak, T., Davies, J. W., Boulton, M. I., Dudits, D., \u0026amp; Horv\u0026aacute;th, G. V. (2001). Cell-Cycle, Phase-Specific Activation of Maize streak virus Promoters. \u003cem\u003eMolecular Plant-Microbe Interactions\u003c/em\u003e, \u003cem\u003e14\u003c/em\u003e(5), 609\u0026ndash;617.\u003c/li\u003e\n\u003cli\u003eOppong, L., Frimpong, B. N., Abrokwah, L. A., \u0026amp; Ofori, K. (2013). Farmers\u0026rsquo; perceptions on maize streak virus disease, production constraints, and preferred maize varieties in the forest-transition zone of Ghana. \u003cem\u003eProJournal of Agricultural Science Research (PASR)\u003c/em\u003e. https://www.researchgate.net/publication/353973172\u003c/li\u003e\n\u003cli\u003eOwor, B. E., Martin, D. P., Shepherd, D. N., Edema, R., Monjane, A. L., Rybicki, E. P., Thomson, J. A., \u0026amp; Varsani, A. (2007). Genetic analysis of maize streak virus isolates from Uganda reveals widespread distribution of a recombinant variant. \u003cem\u003eJournal of General Virology\u003c/em\u003e, \u003cem\u003e88\u003c/em\u003e(11), 3154\u0026ndash;3165. https://doi.org/10.1099/vir.0.83144-0\u003c/li\u003e\n\u003cli\u003eOwor, B., Martin, D., Shepherd, D., Edema, R., \u0026amp; Rybicki, E. (2007). Genetic analysis of Maize streak virus (MSV) isolates from Uganda reveals widespread distribution of a recombinant MSV variant in Uganda. \u003cem\u003eJ Gen Virol\u003c/em\u003e, \u003cem\u003e88\u003c/em\u003e.\u003c/li\u003e\n\u003cli\u003eOyeniran, K. A. \u0026amp; O. J. A. (2024). Existential Origin of Life. In F.E. Olu-Ajayi and I. Osasona (Ed.), \u003cem\u003eHistory and Philosophy of Science\u003c/em\u003e (pp. 1\u0026ndash;12).\u003c/li\u003e\n\u003cli\u003eOyeniran, K. A., Hartnady, P., Claverie, S., Lefeuvre, P., Monjane, A. L., Donaldson, L., Michel, J., Arvind, L., \u0026amp; Martin, D. P. 
(2021). How virulent are emerging maize-infecting mastreviruses? \u003cem\u003eArchives of Virology\u003c/em\u003e. https://doi.org/10.1007/s00705-020-04906-x\u003c/li\u003e\n\u003cli\u003eRambaut, A. (2018). \u003cem\u003eFigTree ver. 1.4.4\u003c/em\u003e. Institute of Evolutionary Biology, University of Edinburgh, Edinburgh.\u003c/li\u003e\n\u003cli\u003eRoumagnac, P., Lett, J. M., Fiallo-Oliv\u0026eacute;, E., Navas-Castillo, J., Zerbini, F. M., Martin, D. P., \u0026amp; Varsani, A. (2022). Establishment of five new genera in the family Geminiviridae: Citlodavirus, Maldovirus, Mulcrilevirus, Opunvirus, and Topilevirus. \u003cem\u003eArchives of Virology\u003c/em\u003e, \u003cem\u003e167\u003c/em\u003e(2), 695\u0026ndash;710. https://doi.org/10.1007/S00705-021-05309-2\u003c/li\u003e\n\u003cli\u003eRuschhaupt, M., Martin, D. P., Lakay, F., Bezuidenhout, M., Rybicki, E. P., Jeske, H., \u0026amp; Shepherd, D. N. (2013). Replication modes of Maize streak virus mutants lacking RepA or the RepA-pRBR interaction motif. \u003cem\u003eVirology\u003c/em\u003e, \u003cem\u003e442\u003c/em\u003e(2), 173\u0026ndash;179. https://doi.org/10.1016/j.virol.2013.04.012\u003c/li\u003e\n\u003cli\u003eShepherd, D. N., Mangwende, T., Martin, D. P., Bezuidenhout, M., Thomson, J. A., \u0026amp; Rybicki, E. P. (2007). Inhibition of maize streak virus (MSV) replication by transient and transgenic expression of MSV replication-associated protein mutants. \u003cem\u003eJournal of General Virology\u003c/em\u003e, \u003cem\u003e88\u003c/em\u003e, 325\u0026ndash;336. https://doi.org/10.1099/vir.0.82338-0\u003c/li\u003e\n\u003cli\u003eShimodaira, H., \u0026amp; Hasegawa, M. (1999). Multiple Comparisons of Log-Likelihoods with Applications to Phylogenetic Inference. 
\u003cem\u003eMolecular Biology and Evolution\u003c/em\u003e, \u003cem\u003e16\u003c/em\u003e(8), 1114\u0026ndash;1116. https://doi.org/10.1093/oxfordjournals.molbev.a026201\u003c/li\u003e\n\u003cli\u003eSmith, M. D., Wertheim, J. O., Weaver, S., Murrell, B., Scheffler, K., \u0026amp; Kosakovsky Pond, S. L. (2015). Less is more: An adaptive branch-site random effects model for efficient detection of episodic diversifying selection. \u003cem\u003eMolecular Biology and Evolution\u003c/em\u003e, \u003cem\u003e32\u003c/em\u003e(5), 1342\u0026ndash;1353. https://doi.org/10.1093/molbev/msv022\u003c/li\u003e\n\u003cli\u003eSpielman, S. J., Weaver, S., Shank, S. D., Magalis, B. R., Li, M., \u0026amp; Kosakovsky Pond, S. L. (2019). Evolution of viral genomes: Interplay between selection, recombination, and other forces. In \u003cem\u003eMethods in Molecular Biology\u003c/em\u003e (Vol. 1910, pp. 427\u0026ndash;468). Humana Press Inc. https://doi.org/10.1007/978-1-4939-9074-0_14\u003c/li\u003e\n\u003cli\u003eThines, M. (2019). An Evolutionary Framework for Host Shifts \u0026ndash; Jumping Ships for Survival. \u003cem\u003eNew Phytologist\u003c/em\u003e. https://doi.org/10.1111/nph.16092\u003c/li\u003e\n\u003cli\u003eWang, W., Zhao, H., \u0026amp; Han, G.-Z. (2020). Host-Virus Arms Races Drive Elevated Adaptive Evolution in Viral Receptors. \u003cem\u003eJournal of Virology\u003c/em\u003e. https://doi.org/10.1128/jvi.00684-20\u003c/li\u003e\n\u003cli\u003eWeaver, S., Shank, S. D., Spielman, S. J., Li, M., Muse, S. V., \u0026amp; Kosakovsky Pond, S. L. (2018). Datamonkey 2.0: A Modern Web Application for Characterizing Selective and Other Evolutionary Processes. \u003cem\u003eMolecular Biology and Evolution\u003c/em\u003e, \u003cem\u003e35\u003c/em\u003e(3), 773\u0026ndash;777. https://doi.org/10.1093/MOLBEV/MSX335\u003c/li\u003e\n\u003cli\u003eWertheim, J. O., Murrell, B., Smith, M. D., Kosakovsky Pond, S. L., \u0026amp; Scheffler, K. (2014). 
RELAX: Detecting Relaxed Selection in a Phylogenetic Framework. \u003cem\u003eMolecular Biology and Evolution\u003c/em\u003e, \u003cem\u003e32\u003c/em\u003e(3), 820\u0026ndash;832. https://doi.org/10.1093/molbev/msu400\u003c/li\u003e\n\u003cli\u003eWright, E. A., Heckel, T., Groenendijk, J., Davies, J. W., \u0026amp; Boulton, M. I. (1997). Splicing features in maize streak virus virion- and complementary-sense gene expression. \u003cem\u003eThe Plant Journal\u003c/em\u003e, \u003cem\u003e12\u003c/em\u003e(6), 1285\u0026ndash;1297. https://doi.org/10.1046/j.1365-313x.1997.12061285.x\u003c/li\u003e\n\u003cli\u003eWu, B., Melcher, U., Guo, X., Wang, X., Fan, L., \u0026amp; Zhou, G. (2008). Assessment of codivergence of Mastreviruses with their plant hosts. \u003cem\u003eBMC Evolutionary Biology\u003c/em\u003e, \u003cem\u003e8\u003c/em\u003e(1), 1\u0026ndash;13. https://doi.org/10.1186/1471-2148-8-335\u003c/li\u003e\n\u003cli\u003eWu, B., Shang, X., Schubert, J., Habeku\u0026szlig;, A., Elena, S. F., \u0026amp; Wang, X. (2015). Global-scale computational analysis of genomic sequences reveals the recombination pattern and coevolution dynamics of cereal-infecting geminiviruses. \u003cem\u003eScientific Reports\u003c/em\u003e, \u003cem\u003e5\u003c/em\u003e(1), 1\u0026ndash;10. https://doi.org/10.1038/srep08153\u003c/li\u003e\n\u003cli\u003eYang, Z. (1994). Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: Approximate methods. \u003cem\u003eJournal of Molecular Evolution\u003c/em\u003e, \u003cem\u003e39\u003c/em\u003e(3), 306\u0026ndash;314. https://doi.org/10.1007/BF00160154\u003c/li\u003e\n\u003cli\u003eZhou, X., Park, B., Choi, D., \u0026amp; Han, K. (2018). A Generalized Approach to Predicting Protein-Protein Interactions Between Virus and Host. \u003cem\u003eBMC Genomics\u003c/em\u003e. 
https://doi.org/10.1186/s12864-018-4924-2\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"Maize streak virus, positive selection, coat protein, movement protein, geminiviruses","lastPublishedDoi":"10.21203/rs.3.rs-4670195/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-4670195/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eMaize streak virus (MSV) has only three genes : \u003cem\u003ecp\u003c/em\u003e encoding the coat protein, \u003cem\u003emp\u003c/em\u003e encoding the movement protein and \u003cem\u003erep\u003c/em\u003e/\u003cem\u003erepA\u003c/em\u003e encoding two distinct replication associated proteins from an alternatively spliced transcript. These genes have roles in encapsidation, movement, replication and interactions with the external environment and are thus prone to stimuli-driven molecular adaptation. 
We conducted selection analyses on publicly available, curated, recombination-free complete coding sequences of representative A-strain maize streak virus (MSV-A) \u003cem\u003ecp\u003c/em\u003e and \u003cem\u003emp\u003c/em\u003e genes. We found evidence of gene-wide selection in these two MSV genes at specific sites within the genes (\u003cem\u003ecp\u003c/em\u003e 1.23% and \u003cem\u003emp\u003c/em\u003e 0.99%). Positively selected sites have amino acids that are 60% hydrophilic and 40% hydrophobic in nature. We found significant evidence of positive selection at branches (\u003cem\u003ecp\u003c/em\u003e: 0.76 and \u003cem\u003emp\u003c/em\u003e: 1.66%) representing the diversity of the MSV-A strain in South Africa that is closely related to the MSV-Mat-A isolate (GenBank accession number: AF329881), which is well disseminated and adapted to the maize plant in sub-Saharan Africa. In the \u003cem\u003emp\u003c/em\u003e gene, selection significantly intensified across the overall diversity of the MSV-A sequences and among those closely related to the MSV-Mat-A isolate. 
These findings have revealed that these genes, despite mostly undergoing non-diversifying selection, the detectable diversifying positive selection observed could have a major role in MSV-A host adaptive evolution that has over time, ensured a degree of pathogenicity that is sufficient for onward transmission rather than killing its host.\u003c/p\u003e","manuscriptTitle":"Detectable episodic positive selection in the virion strand A-strain maize streak virus genes may have a role in its host adaptation","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2024-08-05 10:17:08","doi":"10.21203/rs.3.rs-4670195/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"cda1b549-296a-410d-8e46-eae6b5fae92e","owner":[],"postedDate":"August 5th, 2024","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2024-11-09T09:53:27+00:00","versionOfRecord":[],"versionCreatedAt":"2024-08-05 10:17:08","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-4670195","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-4670195","identity":"rs-4670195","version":["v1"]},"buildId":"qtupq5eGEP_6zYnWcrvyt","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
