The Evolution Mechanism with Protein Structure and Function and the Origin of SARS CoV-2

Research Article

Dejun Lian, Jie Lian, Qi Dong

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-6999350/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted (Version 1)

Abstract

Over the last few decades, novel viruses have emerged that present severe health risks worldwide. SARS-CoV-2, formerly known as the 2019 novel coronavirus, broke out in Wuhan (China) and caused major morbidity and mortality globally. Confirmation of intermediate hosts is essential to prevent further spread of the epidemic. The emergence of COVID-19 has triggered many studies aimed at understanding the evolution of the virus and identifying the intermediate animal host potentially involved in the transmission of SARS-CoV-2 to humans.
This study compares the RNA sequences encoding the RNA-dependent RNA polymerase (RDRP) within and between SARS-CoV-2, SARS-CoV, bat SARS-like CoVs, and other coronaviruses. These comparisons support evolutionary analysis aimed at understanding the evolution mechanism and at identifying possible virus reservoirs and the origin of COVID-19.

Keywords: SARS-CoV-2, Evolution Mechanism, Solvent Accessibility, Origin

INTRODUCTION

Over the last few decades, novel viruses have emerged that present severe health risks worldwide. SARS-CoV-2, formerly known as the 2019 novel coronavirus, broke out in Wuhan (China) and caused major morbidity and mortality globally. It first appeared in late 2019 and causes coronavirus disease 2019 (COVID-19). The WHO declared COVID-19 a pandemic on 11 March 2020. It has affected 269 million people in 224 nations and territories, with more than 5.3 million deaths. SARS-CoV-2 is the seventh coronavirus known to infect humans; the other six are HCoV-229E, HCoV-NL63, HCoV-OC43, HCoV-HKU1, SARS-CoV, and Middle East respiratory syndrome coronavirus (MERS-CoV). HCoV-229E and HCoV-NL63 are alphacoronaviruses, and the others, including SARS-CoV-2, are betacoronaviruses. SARS-CoV and MERS-CoV are considered highly pathogenic and are known to have been transmitted from bats to humans via the intermediate hosts palm civets and dromedary camels, respectively (1, 2).

RNA viruses, such as hepatitis C virus (HCV), influenza virus, and SARS-CoV-2, are notorious for their ability to evolve rapidly under selection in novel environments. The high mutation rate of RNA viruses can generate huge genetic diversity, which facilitates viral adaptation. RNA viruses therefore offer a unique opportunity for the experimental study of molecular evolution.
These viruses exhibit both high replication rates (10⁵ day⁻¹) and high mutation rates (10⁻³ to 10⁻⁵ mutations per nucleotide per replication); hence, evolutionary dynamics that would take years to unfold in even relatively simple bacteria occur within days in RNA virus colonies (3, 4). As an RNA virus, SARS-CoV-2 is a good model for studying the mechanism of RNA virus evolution. Given the huge number of SARS-CoV-2 sequences that have accumulated and the huge population size resulting from the worldwide spread of this virus, it is timely to study its evolution mechanism, which may shed light on the mechanisms of virus evolution and protein evolution more generally.

Confirmation of intermediate hosts is essential to prevent further spread of the epidemic. The emergence of COVID-19 has triggered many studies aimed at identifying the intermediate animal host potentially involved in the transmission of SARS-CoV-2 to humans. SARS-CoV-2, which has affected the human population worldwide, is suspected to have originated from bats (2). It was found to be closely related to the Sarbecoviruses RaTG13 (MN996532) and RmYN02 from the Chinese horseshoe bats Rhinolophus affinis and Rhinolophus malayanus, respectively. There is no evidence of direct transmission of Sarbecoviruses from bats to humans. The only direct Sarbecovirus infections in humans have been linked to laboratory accidents during the SARS epidemic. To date, no higher incidence of Sarbecovirus infections has been reported in anthropized ecosystems, even among employees of guano farms who come into direct contact with bat feces (guano), even though approximately 22% of bats from guano farms release coronaviruses in their feces, of which almost 5% are Sarbecoviruses. Given the lack of direct bat-to-human Sarbecovirus transmission and the need for a reservoir according to the spillover theory of zoonotic emergence, many teams have attempted to identify the animal serving as an intermediate host.
Moreover, bat viruses related to SARS-CoV-2 evade human intrinsic immunity but lack efficient transmission capacity. The Malayan or Javan pangolin (Manis javanica) was suspected of being this intermediate host on the basis of 1) its ACE2 receptor sequence, 2) the presence of Sarbecoviruses related to SARS-CoV-2 in animals smuggled from the Indomalayan region, and 3) the presence of pangolins in wet markets in China, where they are considered a delicacy and a component of the traditional pharmacopeia. However, this animal has since been exonerated from the transmission of SARS-CoV-2 to humans (61). Snakes and turtles have also been proposed as potential intermediate hosts that transmit SARS-CoV-2 to humans (62, 63). With the accumulation of sequences of SARS-CoV-2 and SARS-related viruses and with advances in the study of virus evolution mechanisms, a new search for the origin of COVID-19 is needed.

This study focuses on comparisons of the RNA-dependent RNA polymerase (RDRP) sequences within and between SARS-CoV-2, SARS-CoV, bat SARS-like CoVs, and other coronaviruses, which are helpful for studying the evolution mechanism and for finding possible virus reservoirs and the origin of COVID-19.

MATERIALS AND METHODS

Sequences used in the study

A total of approximately 30,000 sequences of SARS-CoV-2 isolates, representing the entire pandemic period, were retrieved from the GenBank and BV-BRC databases. The nucleotide sequences encoding the RDRP protein were used for comparative analysis. The RNA sequences were additionally filtered so that only sequences longer than 1000 bases were included. The sequences of SARS-related viruses analyzed in this study were also downloaded from GenBank. A list of their accession numbers is available from the author upon request. Sites that were variable within species were considered polymorphic, and sites that were identical within species were treated as invariant.
Only amino acids occurring more than once were counted, in order to avoid errors due to a single aberrant sequence. The protein structures used in this study were determined through X-ray crystallography. We used the structure with PDB code 7OYG for analysis.

Data analysis

Multiple protein and nucleotide sequences were aligned with BioEdit and edited by hand. In our analysis, the solvent accessible surface area (SAS, also abbreviated ASA) was used to estimate the proportion of each amino acid residue that is accessible to solvent. The SAS of each residue of SARS-CoV-2 RDRP was calculated from the protein structure using the DSSP program (http://www.cmbi.ru.nl/dssp.html). The relative solvent accessibility (RSA) was then obtained as the ratio of the SAS calculated from the actual protein structure to the maximum exposed surface area of the same residue in the fully extended conformation of the pentapeptide Gly-Gly-X-Gly-Gly, where X is the amino acid in question; that is, SAS values were normalized by the theoretical maximum SAS of each residue (14).

Statistical analysis

Data are expressed as means ± SD. Statistical analyses were performed using the Kruskal–Wallis and Mann–Whitney U methods. All statistical analyses were performed using SPSS version 13.0 (SPSS Inc., Chicago, IL), with additional analysis performed using Stata/MP14 (StataCorp LP). Values of p < 0.05 were considered significant.

Logistic regression and confidence intervals

We used the methods and model of Lian D (11) to understand how a set of predictor variables affects a dichotomous outcome variable (polymorphic or invariant).

The correlation of residue variability with structure

The correlation was tested using Lian D's method (11). We performed several tests to find structural correlates of a high evolutionary rate.

The correlation of per-position entropy with structure

To measure the degree of sequence conservation, we used Lian D's method.

Ka/Ks

Ka/Ks values were obtained with the Datamonkey Adaptive Evolution Server (http://www.datamonkey.org/).
The multi-partition fixed effects likelihood (FEL) and FLAC methods implemented in the HyPhy software package on the online server were then used to detect purifying selection. SARS-CoV-2 RDRP coding sequences were clustered before use.

RNA structure prediction

Minimum folding energy difference (MFED) values were calculated by comparing the minimum folding energies of the wild-type (WT) sequence with those of sequences whose order was shuffled with the NDR algorithm. Ensemble RNA structure predictions were made using the DAMBE program (10).

Nucleic acid phylogenetic analysis

Multiple nucleotide sequences were aligned with BioEdit and edited by hand. In addition to the sequences recovered here, reference sequences that cover the phylogenetic diversity of CoVs were compiled for evolutionary analyses. After alignment, gaps and ambiguously aligned regions were removed by hand. Phylogenetic trees were estimated via the maximum likelihood (ML) method implemented in PhyML v3.0 in MEGA 11, with bootstrap support values calculated from 1000 replicate trees. The best-fit nucleic acid substitution models were determined via MEGA 11 and DAMBE.

Synonymous codon usage analysis

To estimate the relative synonymous codon usage (RSCU) bias of 2019-nCoV, the available coding sequences of the 2019-nCoV genome (1 CDS, 9672 codons) and the bat-SL-CoVZC45 genome (1 CDS, 9680 codons) were used. Only coding sequences with an ATG start codon and a length that is a multiple of 3 nucleotides were retained; incorrect coding sequences were excluded. The estimations were performed via the DAMBE program. Site-specific synonymous codon usage was inspected by eye and calculated by hand. A literature search was performed in PubMed. The cumulative number of fatalities associated with individual viral infections was retrieved from GenBank.

RESULTS

Table 1. Logistic regression of amino acid polymorphism on protein structure

| Protein | N sequences | % Polymorphic | α | β | LRT | 95% CI for β | Pr(χ²(1)) |
|---|---|---|---|---|---|---|---|
| SARS-CoV-2 RDRP | ~30,000 | 39.7 | -1.84 | 2.12 | 27.6 | (1.33, 2.90) | P < 0.001 |
| IAV PB1 (757 aa) a | 107,312 | 95.1 | -1.17 | 2.41 | 36.71 | (1.61, 3.20) | P < 0.001 |
| HCV NS5B (566 aa) b | 12,402 | 95.2 | 0.92 | 3.52 | 32.07 | (2.12, 4.92) | P < 0.001 |
| E. coli & S. enterica proteins (1955 aa) c | 138 | 4.5 | -3.87 | 3.53 | 33.12 | (2.35, 4.71) | P << 0.0001 |

NOTE. CI = confidence interval; LRT = log-likelihood ratio test. a, data from ref (11); b, data from ref (10); c, data from ref (1).

DISCUSSION

SARS-CoV-2 encodes at least 27 proteins, including 15 nonstructural proteins, 4 structural proteins, and 8 auxiliary proteins. RNA-dependent RNA polymerase (RDRP), a nonstructural protein, plays a key role in viral transcription and genome replication. RNA viruses are ubiquitous intracellular parasites that are responsible for many emerging diseases, including AIDS and SARS. Here, we discuss the principal mechanisms of RNA virus evolution and highlight areas where future research is required. With the advances in next-generation sequencing and metagenomics, it is time to carry out thorough research on the mechanism of virus evolution.

In this paper, we investigated the factors that predict the relative likelihood that a site will be polymorphic within the SARS-CoV-2 RDRP protein. We found a strong positive relation between polymorphism and solvent accessibility, suggesting that amino acid sites that are more solvent-accessible are less likely to be constrained in identity. This agrees with the result that Bustamante CD obtained from a study of enzyme polymorphisms in E. coli and S. enterica (84). This finding is in accordance with work on multiple families of proteins showing that solvent accessibility affects amino acid substitution rates (22–27). It is also in accordance with work on yeast proteins and on many viral proteins (28), especially HCV NS5B and IAV PB1 (11, 12).
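The logistic model summarized in Table 1 relates the probability that a site is polymorphic to its relative solvent accessibility. As a minimal sketch only, assuming the model has the simple form logit(p) = α + β·RSA (the exact model of ref. 11 is not reproduced here) and using invented RSA values, the fit and the likelihood ratio test against an intercept-only model could look as follows. The function names and the data are hypothetical.

```python
import math

def fit_logistic(x, y, iterations=50):
    """Fit logit(p) = alpha + beta * x by Newton-Raphson (maximum likelihood)."""
    alpha, beta = 0.0, 0.0
    for _ in range(iterations):
        # Accumulate the gradient (g) and information matrix (h) terms
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(alpha + beta * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system h * delta = g
        d_alpha = (h11 * g0 - h01 * g1) / det
        d_beta = (h00 * g1 - h01 * g0) / det
        alpha += d_alpha
        beta += d_beta
        if abs(d_alpha) + abs(d_beta) < 1e-10:
            break
    return alpha, beta

def log_likelihood(x, y, alpha, beta):
    total = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(alpha + beta * xi)))
        total += math.log(p) if yi else math.log(1.0 - p)
    return total

def likelihood_ratio_test(x, y):
    """Return (alpha, beta, LRT), with LRT = 2 * (logL_full - logL_null)."""
    alpha, beta = fit_logistic(x, y)
    full = log_likelihood(x, y, alpha, beta)
    # Null model: intercept only, so p equals the observed polymorphic fraction
    frac = sum(y) / len(y)
    null = sum(math.log(frac) if yi else math.log(1.0 - frac) for yi in y)
    return alpha, beta, 2.0 * (full - null)

# Invented example data: RSA per site and whether the site is polymorphic (1) or invariant (0)
rsa = [0.02, 0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.45, 0.50, 0.60, 0.70, 0.85]
polymorphic = [0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1]

alpha, beta, lrt = likelihood_ratio_test(rsa, polymorphic)
print(f"alpha={alpha:.2f} beta={beta:.2f} LRT={lrt:.2f}")
```

A positive β in this sketch corresponds to the pattern reported in the table: more exposed sites are more likely to be polymorphic. The LRT statistic would be compared with a χ² distribution with one degree of freedom.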
The results of the logistic regression and linear regression analyses show that the amino acid polymorphisms of SARS-CoV-2 RDRP are positively correlated with RSA; the residue variability (RV) and amino acid entropy are likewise positively correlated with RSA. The logistic regression parameters of SARS-CoV-2 RDRP are nearly the same as those of IAV PB1 and differ from those of HCV and of E. coli and S. enterica, which may be attributed to the fact that the evolution times of both SARS-CoV-2 and IAV are short compared with those of HCV, E. coli, and S. enterica. The parameter β of SARS-CoV-2 RDRP and IAV PB1 is smaller than that of HCV and of E. coli and S. enterica, reflecting a relaxation of the constraint imposed by the inside-out (buried versus exposed) protein structure, so that these proteins show more amino acid polymorphism.

Unexpectedly, for SARS-CoV-2 RDRP, the amino acid entropy and the entropy computed after grouping amino acids by physico-chemical property are nearly the same, which differs from the results for HCV and IAV PB1 (11, 12). This finding can be attributed to the short period over which SARS-CoV-2 has evolved (less than six years). Within such a short time, the accumulated nucleotide mutations are all single substitutions within a coding triplet, and for single-substitution triplets the encoded amino acid changes accumulate randomly, free of the physico-chemical property constraint observed in previous studies of protein evolution and virus evolution (11, 12). This finding suggests that these are common phenomena in the mechanisms of protein evolution and virus evolution. The molecular modelling study of Sushant Kumar et al. showed that nearly half of the mutants they studied cause destabilisation (negative ΔΔG) of the protein structure. They also observed that during the initial phase of the COVID-19 pandemic the rate of occurrence of new mutations was high, but that it slowed as time progressed (9). These observations may reflect that, within a short evolution time, the restriction imposed by protein structure is relaxed.
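The comparison between per-site amino acid entropy and the entropy obtained after grouping residues by physico-chemical property can be illustrated with a short sketch. It assumes Shannon entropy in bits and a hypothetical six-class grouping; neither the grouping nor the function names are taken from the method of ref. 11.

```python
import math
from collections import Counter

# Hypothetical physico-chemical classes (an assumption made for illustration)
GROUPS = {
    "aliphatic": "AVLIMC",
    "aromatic": "FWYH",
    "polar": "STNQ",
    "positive": "KR",
    "negative": "DE",
    "special": "GP",
}
AA_TO_GROUP = {aa: name for name, members in GROUPS.items() for aa in members}

def shannon_entropy(symbols):
    """Shannon entropy (bits) of the symbol frequencies."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def column_entropies(column):
    """Return (amino acid entropy, grouped entropy) for one alignment column."""
    residues = [aa for aa in column.upper() if aa in AA_TO_GROUP]  # skip gaps and X
    aa_entropy = shannon_entropy(residues)
    grouped_entropy = shannon_entropy(AA_TO_GROUP[aa] for aa in residues)
    return aa_entropy, grouped_entropy

# Substitutions that stay within one class lower the grouped entropy ...
print(column_entropies("IIIIVVLL"))
# ... whereas substitutions across classes leave both measures equal.
print(column_entropies("KKKKEEEE"))
```

Under this sketch, a site whose variants all share one physico-chemical class has a grouped entropy well below its amino acid entropy, while a site whose variants cross classes has nearly equal values for the two, which is the pattern the paragraph above reports for SARS-CoV-2 RDRP.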
Taken together with previous findings, it can be concluded that purifying selection in RNA virus evolution is weak during a short evolution period and becomes strong after a long evolution time. This may be a common mechanism of virus evolution and protein evolution.

Apart from polymorphisms, we found that purifying selection is common for this virus. For SARS-CoV-2 RDRP, the mean dN/dS is 0.029, far smaller than 1, meaning that purifying selection on this protein is very strong. Of the 932 codons, many are monomorphic (data not shown), which may be attributed to the very short evolution time and the very strong restriction imposed by RNA secondary structure. Compared with some other RNA viruses, which generally have very high mutation rates, coronaviruses tend to evolve several-fold more slowly because of their proofreading machinery, on the order of 10⁻⁴ substitutions per site per year. SARS-CoV-2 is currently evolving substantially faster than this (albeit still slowly compared with other, non-proofreading RNA viruses), at an estimated 7 × 10⁻⁴ substitutions per site per year (2 × 10⁻⁶ per day), and has seen a remarkable lineage evolution during the pandemic, as expected from its high prevalence and ongoing adaptation to humans. Since the emergence of SARS-CoV-2, estimated to be in autumn 2019, several hundred recurrent mutations have been identified, 80% of them nonsynonymous changes in the virus proteins, including many in the RDRP protein (8).

Phylogenetic reconstruction was used to determine the evolutionary relationships and host selection of RDRP among the betacoronaviruses closely related to those of humans. To better understand the host selection of betacoronaviruses, the relationship of RDRP between SARS-CoV-2 and other closely related betacoronaviruses was analyzed. Bat SARS-like CoV RaTG13 is the nearest neighbor of SARS-CoV-2, with 96.2% overall genome sequence identity (2), and their RDRP RNA sequences share 97.8% identity.
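The identity figures quoted above are pairwise comparisons over aligned sequences. A minimal sketch of such a calculation follows, assuming two pre-aligned sequences of equal length and ignoring columns that contain a gap; the toy sequences and the function name are illustrative and are not the RDRP data.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # gapped columns are excluded from the denominator
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared

# Toy aligned sequences (invented): 10 comparable columns, 1 mismatch
toy_a = "ATGGCTGAT-A"
toy_b = "ATGGCAGATTA"
print(percent_identity(toy_a, toy_b))
```

Whether gapped columns are excluded or counted as mismatches changes the result slightly, so the convention should be stated alongside any reported identity.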
Global expansion and deep sequencing work have led to an increased number of SARS-CoV-2 genotypes. Studying selective pressure may be helpful for assessing the variability of SARS-CoV-2 and its potential for host changes. According to a previous report, selective pressure analysis revealed that some genes (ORF10 and ORF7a) are under greater selective pressure. RDRP is a key factor determining the replication of coronaviruses. The RDRP sequences from SARS-CoV, bat and pangolin SARS-like CoVs, other SARS-like CoVs, and SARS-CoV-2 were aligned and used for phylogenetic reconstruction. Phylogenetic analyses revealed that SARS-CoV-2 clusters with SARS-CoV in the Sarbecovirus subgenus, and viruses related to SARS-CoV-2 have been identified in bats and pangolins (1). Coronaviruses have long and complex genomes with high plasticity in terms of gene content. Our phylogenetic analyses of the RNA sequences encoding RDRP indicated that SARS-CoV-2 forms a cluster distinct from the SARS-related viruses (Fig. 3).

In the present study, SARS-CoV-2 was investigated for the presence of large-scale internal RNA base pairing in its RNA sequence encoding RDRP. This property, termed genome-scale ordered RNA structure (GORS), has previously been associated with host persistence in other positive-sense and negative-sense RNA viruses (65, 12), potentially through its shielding effect on viral RNA recognition in the cell. The genome of SARS-CoV-2 is remarkably structured. In contrast to replication-associated RNA structures, GORS is poorly conserved in the positions and identities of base pairing relative to other sarbecoviruses; even similarly positioned stem-loops in SARS-CoV-2 and SARS-CoV rarely share homologous pairings, which is indicative of more rapid evolutionary change in RNA structure than in the underlying coding sequences.
Sites predicted to be base-paired in SARS-CoV-2 showed less sequence diversity than unpaired sites did, suggesting that disruption of the RNA structure by mutation imposes a fitness cost on the virus that is potentially restrictive to its longer-term evolution. Although functionally uncharacterized, GORS in SARS-CoV-2 and other coronaviruses represents important elements in their cellular interactions that may contribute to their persistence and transmissibility (66). Our study revealed that the RDRP coding sequence forms a compact GORS, which restricts sequence diversity and thus reduces the mutation rate (Fig. 4). Polymorphism in the SARS-CoV-2 RDRP RNA sequence is much less common than in highly mutating RNA viruses such as HCV (9). Therefore, the RDRP coding sequence can be used to infer the phylogenetic relationships among SARS-CoV-2 and related viruses more accurately and to analyze their evolutionary relationships.

Codon usage bias analysis revealed that at most sites the RDRP codons are far from saturated (data not shown), which reflects the restriction caused by the large GORS structure. Sequence analysis revealed that the RDRP-encoding RNA sequences have a strong codon usage bias. For example, the first codon encodes Ser in most sequences; in most isolates it is TCA, and only a few isolates (< 0.1%) have TCT, TCG, or TCC. The second codon encodes Ala in most isolates, and in most of them it is GCT; only a few isolates have GCG, and GCA and GCC were not found in our analysis. The third codon encodes Asp, and in most isolates it is GAT; only one isolate has GAC. These findings all reflect the strong purifying selection acting on this virus as a result of protein structure-function restrictions and RNA GORS restrictions (66). Analysis of the RNA sequences of RDRP has shown that they are highly conserved owing to the restrictions imposed by protein structure-function and GORS (65, 12).
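The site-by-site codon usage described above (for example, the proportions of TCA, TCT, TCG and TCC at the first codon) amounts to tallying codons column by column across aligned, in-frame coding sequences. A minimal sketch follows, assuming sequences that start in frame; the four toy sequences are invented for illustration and the function names are hypothetical.

```python
from collections import Counter

def codon_counts_by_site(sequences):
    """Tally codons at each codon position across aligned, in-frame sequences."""
    n_codons = min(len(s) for s in sequences) // 3
    sites = [Counter() for _ in range(n_codons)]
    for seq in sequences:
        seq = seq.upper()
        for i in range(n_codons):
            codon = seq[3 * i : 3 * i + 3]
            if set(codon) <= set("ACGT"):  # skip gaps and ambiguous bases
                sites[i][codon] += 1
    return sites

def codon_frequencies(counter):
    """Convert one site's codon counts into proportions."""
    total = sum(counter.values())
    return {codon: n / total for codon, n in counter.items()} if total else {}

# Toy isolates (invented): codon 1 = Ser, codon 2 = Ala, codon 3 = Asp
toy = ["TCAGCTGAT", "TCAGCTGAT", "TCAGCGGAC", "TCTGCTGAT"]
for position, counts in enumerate(codon_counts_by_site(toy), start=1):
    print(position, codon_frequencies(counts))
```

Applied to a real alignment, a site dominated by one synonymous codon, with the alternatives present in only a handful of isolates, would show the kind of bias described for the first three RDRP codons.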
The isolate from 2025 presented only 7 RNA sequence mutations relative to the Wuhan isolate of SARS-CoV-2, whereas the sequence of the bat SARS isolate RaTG13 presented 63 differences. From the point of view of the virus evolution mechanism, bat SARS viruses are therefore unlikely to be the source of SARS-CoV-2. Sequence analysis showed that the mouse SARS RDRP RNA sequences are nearly the same as that of the Wuhan isolate, suggesting that mice or rats may be the source of SARS-CoV-2.

Wild rodents and shrews serve as vital sentinel species for monitoring zoonotic viruses because of their close interaction with human environments and their role as natural reservoirs for diverse viral pathogens. Although several studies have explored viral diversity and assessed pathogenic risks in wild rodents and shrews, the full extent of this diversity remains insufficiently understood. A study of wild rodents and shrews showed that their viruses of the families Coronaviridae, Hantaviridae, Arteriviridae, Astroviridae, Hepeviridae, Lispiviridae, Nairoviridae, Nodaviridae, Paramyxoviridae, Rhabdoviridae, Picornaviridae, Arenaviridae and Picobirnaviridae are highly likely to infect humans. Notably, rodent-derived Rotavirus A, Hantaan virus (HTNV), and Seoul virus (SEOV) display almost complete amino acid identity with their human-derived counterparts (70), suggesting that rodent SARS viruses may likewise be similar to human viruses and be transmitted to humans. Rodents (order Rodentia), followed by bats (order Chiroptera), constitute the largest percentage of living mammals on earth. Thus, it is not surprising that these two orders account for many of the reservoirs of the zoonotic RNA viruses discovered to date. Unlike bats, rodents, especially rats and mice, have received less research attention as SARS virus reservoirs, and it is time to perform this type of study.

The direct host of SARS-CoV-2 remains a mystery, although there are many conjectures (1).
Our sequence analysis revealed that the RDRP-encoding RNA sequences of dog SARS-CoV-2 isolates are nearly the same as those of the Wuhan isolate, indicating that dogs may be the direct hosts of SARS-CoV-2. There are several conjectures that the raccoon dog is the direct host of SARS-CoV-2 (72, 73). The Huanan market in Wuhan was the epicenter of the COVID-19 pandemic. Shortly after the Huanan market was closed on January 1, 2020, investigative teams conducted extensive swabbing of surfaces (bench tops, door handles, drains, cages, animal carcasses, etc.) within that facility. Subsequent polymerase chain reaction (PCR) analysis and metagenomic sequencing provided a spatial map of SARS-CoV-2 infection across the market, which is roughly the size of two soccer fields separated by a road, and of the animal species present among the various stalls, as animal DNA and RNA were also present (with RNA indicating a more recent presence, as it degrades relatively rapidly). Unfortunately, few live animals from the market have been tested, and none are likely hosts of SARS-CoV-2.

Raccoon dogs have received much attention (72, 73). Experiments have shown that SARS-CoV-2 easily infects raccoon dogs, which are commonly raised for fur in China but are also sold for meat in "wet" markets such as that in Wuhan, and that they shed high levels of the virus. One report describes the identification of raccoon dog mtDNA in six samples from two different stalls in the Wuhan market. A sample from a cart that tested positive for SARS-CoV-2 also had "abundant" raccoon dog genetic material, whereas far less human genetic material was found in the same sample. The researchers say this suggests, but does not prove, that the raccoon dog or dogs on the cart were more likely to have spread the virus than were humans working the stall or shopping near it (73).

In this study, we performed an evolutionary analysis using the genomic sequences of coronaviruses.
SARS-CoV first emerged in China in 2002 and then spread to 37 countries in 2003, where it caused a global outbreak with a 9.6% mortality rate. From SARS-CoV and MERS-CoV to SARS-CoV-2, bats are the natural hosts of coronaviruses, but the intermediate hosts differ. Some have argued that Wuhan is too far from the greatest density of Chinese Rhinolophus bats in Yunnan for the virus to have easily entered by natural means (73). Previous findings suggested that the snake is a probable wildlife reservoir for SARS-CoV-2 on the basis of its relative synonymous codon usage bias, which resembles that of snakes more than that of other animals (63). Pangolins and turtles have also been suggested as possible wildlife reservoirs of SARS-CoV-2 (62). Civets and dromedaries, as intermediate hosts of zoonotic SARS-CoV and MERS-CoV, live in different time and space environments. These animals nevertheless transmitted coronaviruses to humans, which makes it difficult to find the intermediate hosts of SARS-CoV-2 in the short term (62). Analysis of the viral RDRP RNA sequences helps to quickly target the possible intermediate hosts of SARS-CoV-2. Compared with markets for illegally traded animals, these markets are more common and popular. This study provides information that, like snakes, pangolins and turtles, dogs may also act as potential intermediate hosts transmitting SARS-CoV-2 to humans, and that rats or mice may be zoonotic reservoirs of this virus, although much more needs to be confirmed.

Declarations

Ethics approval and consent to participate
The authors comply with the requirements for ethics approval and consent to participate.

Availability of data and material
All models, priors and settings used in this analysis are provided in the XML

Competing interests
The authors declare no conflict of interest.

Funding statement
This research received no funding.

Author Contributions
Dejun Lian conceptualized the article, analyzed the references, and wrote the article.
Jie Lian curated the sequence data and reviewed and edited the article. Qi Dong reviewed and edited the article.

References

Chazal N. Coronavirus, the King Who Wanted More Than a Crown: From Common to the Highly Pathogenic SARS-CoV-2, Is the Key in the Accessory Genes? Front Microbiol. 2021;12:682603. 10.3389/fmicb.2021.682603. Zhou P, Yang XL, Wang XG, et al. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature. 2020;579:270–3. 10.1038/s41586-020-2012-7. Tsimring LS, Levine H. RNA Virus Evolution via a Fitness-Space Model. Phys Rev Lett. 1996;76(23):4440–3. 10.1103/PhysRevLett.76.4440. Domingo E, Holland JJ. RNA Virus Mutations and Fitness for Survival. Annu Rev Microbiol. 1997;51:151–78. 10.1146/annurev.micro.51.1.151. Moya A, Holmes EC, González-Candelas F. The population genetics and evolutionary epidemiology of RNA viruses. Nat Rev Microbiol. 2004;2(4):279–88. 10.1038/nrmicro863. Venkataraman S, Prasad BVLS, Selvarajan R. RNA Dependent RNA Polymerases: Insights from Structure, Function and Evolution. Viruses. 2018;10(2):76. 10.3390/v10020076. Ghafari M, du Plessis L, Raghwani J, Bhatt S, Xu B, Pybus OG, Katzourakis A. Purifying Selection Determines the Short-Term Time Dependency of Evolutionary Rates in SARS-CoV-2 and pH1N1 Influenza. Mol Biol Evol. 2022;39(2):msac009. 10.1093/molbev/msac009. Mehra R, Kepp KP. Structure and Mutations of SARS-CoV-2 Spike Protein: A Focused Overview. ACS Infect Dis. 2022;8(1):29–58. 10.1021/acsinfecdis.1c00433. PMID: 34856799. Kumar S, Kumari K, Azad GK. Emerging genetic diversity of SARS-CoV-2 RNA dependent RNA polymerase (RdRp) alters its B-cell epitopes. Biologicals. 2022;75:29–36. 10.1016/j.biologicals.2021.11.002. Xia X. DAMBE6: New Tools for Microbial Genomics, Phylogenetics, and Molecular Evolution. J Hered.
2017;108(4):431–437. 10.1093/jhered/esx033. Lian D. The Polymorphisms, Solvent Accessibility and Conservatism of Hepatitis C Virus Nonstructural 5B Protein. Preprint, bioRxiv 2025.02.09.637353. 10.1101/2025.02.09.637353. Lian D, Lian J, Dong Q. The Polymorphism, Solvent Accessibility and Evolution Conservation of IAV PB1 Protein. Preprint. Ramsey DC, Scherrer MP, Zhou T, Wilke CO. The Relationship Between Relative Solvent Accessibility and Evolutionary Rate in Protein Evolution. Genetics. 2011;188:479–88. 10.1534/genetics.111.128025. Tien MZ, Meyer AG, Sydykova DK, Spielman SJ, Wilke CO. Maximum allowed solvent accessibilites of residues in proteins. PLoS ONE. 2013;8(11):e80635. 10.1371/journal.pone.0080635. Shenkin PS, Erman B, Mastrandrea LD. Information-theoretical entropy as a measure of sequence variability. Proteins Struct Funct Genet. 1991;11:297–313. Williamson RM. Information Theory Analysis of the Relationship between Primary Sequence Structure and Ligand Recognition among a Class of Facilitated Transporters. J Theor Biol. 1995;174(2):179–88. 10.1006/jtbi.1995.0090. Dayhoff MO, Schwartz RM, Orcutt B. A model of evolutionary change in proteins. Atlas Protein Seq Struc. 1978;5:345–52. Manning JR, Jefferson ER, Barton GJ. The contrasting properties of conservation and correlated phylogeny in protein functional residue prediction. BMC Bioinformatics. 2008;9:51. 10.1186/1471-2105-9-51. Shen B, Vihinen M. Conservation and covariance in PH domain sequences: physicochemical profile and information theoretical analysis of XLA-causing mutations in the Btk PH domain. Protein Eng Des Sel. 2004;17(3):267–76. 10.1093/protein/gzh030. Babar MM, Zaidi NS. Protein sequence conservation and stable molecular evolution reveals Influenza Virus Nucleoprotein as a universal druggable target. Infect Genet Evol. 2015;34:200–10. 10.1016/j.meegid.2015.06.030. Goldman N, Thorne JL, Jones DT.
Assessing the impact of secondary structure and solvent accessibility on protein evolution. Genetics. 1998;149(1):445–58. 10.1093/genetics/149.1.445 . Choi SC, Hobolth A, Douglas DM, Robinson M, Kishino H, Thorne JL. Quantifying the Impact of Protein Tertiary Structure on Molecular Evolution. Mol Biol Evol. 2007;24(8):1769–82. 10.1093/molbev/msm097 . Worth CL, Gong S, Blundell TL. Structural and functional constraints in the evolution of protein families. Nat Rev Mol Cell Biol. 2009;10(10):709–20. 10.1038/nrm2762 . Echave J, Wilke CO. Biophysical models of protein evolution: Understanding the patterns of evolutionary sequence divergence. Annu Rev Biophys. 2017;46:85–103. 10.1146/annurev-biophys-070816-033819 . Echave J, Spielman SJ, Wilke CO. Causes of evolutionary rate variation among protein sites. Nat Rev Genet. 2016;17(2):109–21. 10.1038/nrg.2015.18 . Lin YS, Hsu WL, Hwang JK, Li WS. Proportion of Solvent-Exposed Amino Acids in a Protein and Rate of Protein Evolution. Mol Biol Evol. 2007;24(4):1005–11. 10.1093/molbev/msm019 . Franzosa EA, Xia Y. Structural Determinants of Protein Evolution Are Context-Sensitive at the Residue Level. Mol Biol Evol. 2009;26(10):2387–95. 10.1093/molbev/msp146 . Shahmoradi A, Sydykova DK, Spielman SJ, Jackson EL, Dawson ET, Meyer AG, Wilke CO. Predicting Evolutionary Site Variability from Structure in Viral Proteins: Buriedness, Packing, Flexibility, and Design. J Mol Evol. 2014;79:130–42. 10.1007/s00239-014-9644-x . Douglas M, Fowler S, Fields. Deep mutational scanning: a new style of protein science. Nat Methods. 2014;11(8):801–7. 10.1038/nmeth.3027 . Wellner A, Gurevich MR, Tawfik DS. Mechanisms of Protein Sequence Divergence and Incompatibility. PLoS Genet. 2013;9(7):e1003665. 10.1371/journal.pgen.1003665 . Miller S, Janin J, Lesk AM, Chothia C. Interior and surface of monomeric proteins. J Mol Biol. 1987;196:641–56. 10.1016/0022- . Aartjan JW te, Velthuis. 
Common and unique features of viral RNA-dependent polymerases. Cell Mol Life Sci. 2014;71(22):4403–4420. 10.1007/s00018-014-1695-z
Qin W, Yamashita T, Shirota Y, Lin Y, Wei W, Murakami S. Mutational analysis of the structure and functions of hepatitis C virus RNA-dependent RNA polymerase. Hepatology. 2001;33(3):728–37. 10.1053/jhep.2001.22765
Pal C, Papp B, Lercher MJ. An integrated view of protein evolution. Nat Rev Genet. 2006;7:337–48. 10.1038/nrg1838
Li X, et al. Emergence of SARS-CoV-2 through recombination and strong purifying selection. Sci Adv. 2020;6(27):eabb9153. 10.1126/sciadv.abb9153
Taverna DM, Goldstein RA. Why are proteins so robust to site mutations? J Mol Biol. 2002;315(3):479–84.
Guo HH, Choe J, Loeb LA. Protein tolerance to random amino acid change. PNAS. 2004;101:9205–10. 10.1073/pnas.0403255101
Kisters-Woike B, Vangierdegom C, Müller-Hill B. On the conservation of protein sequences in evolution. TIBS. 2000;25(9):419–21. 10.1016/s0968-0004(01)01877-1
DePristo MA, Weinreich DM, Hartl DL. Missense meanderings in sequence space: a biophysical view of protein evolution. Nat Rev Genet. 2005;6(9):678–87. 10.1038/nrg1672
Bowie JU, Reidhaar-Olson JF, Lim WA, Sauer RT. Deciphering the message in protein sequences: tolerance to amino acid substitutions. Science. 1990;247(4948):1306–10. 10.1126/science.2315699
Reidhaar-Olson JF, Sauer RT. Combinatorial cassette mutagenesis as a probe of the informational content of protein sequences. Science. 1988;241(4861):53–7. 10.1126/science.3388019
Reidhaar-Olson JF, Sauer RT. Functionally acceptable substitutions in two alpha-helical regions of lambda repressor. Proteins. 1990;7(4):306–16.
Overington J. Tertiary Structural Constraints on Protein Evolutionary Diversity: Templates, Key Residues and Structure Prediction. Proc R Soc B. 1990;241(1301):132–45.
Mirny LA, Shakhnovich EI.
Universally Conserved Positions in Protein Folds: Reading Evolutionary Signals about Stability, Folding Kinetics and Function. J Mol Biol. 1999;291:177–96. 10.1006/jmbi.1999.2911
Lin JJ, Bhattacharjee MJ, Yu CP, Tseng YY, Li WH. Many human RNA viruses show extraordinarily stringent selective constraints on protein evolution. PNAS. 2019;116(38):19009–18. https://doi.org/10.1073/pnas.1907626116
Holmes EC. The Evolutionary Genetics of Emerging Viruses. Annu Rev Ecol Evol Syst. 2009;40:353–72. 10.1146/annurev.ecolsys.110308.120248
Woo J, Robertson DL, Lovell SC. Constraints on HIV-1 diversity from protein structure. J Virol. 2010;84(24):12995–3003. 10.1128/JVI.00702-10
Saitou N, Nei M. Polymorphism and evolution of influenza A virus genes. Mol Biol Evol. 1986;3(1):57–74. 10.1093/oxfordjournals.molbev.a040381
Suzuki Y. Natural Selection on the Influenza Virus Genome. Mol Biol Evol. 2006;23(10):1902–11. 10.1093/molbev/msl050
Yeh SW, Huang TT, Liu JW, et al. Local packing density is the main structural determinant of the rate of protein sequence evolution at site level. Biomed Res Int. 2014;2014:572409. 10.1155/2014/572409
Bajaj M, Blundell T. Evolution and the Tertiary Structure of Proteins. Annu Rev Biophys Bioeng. 1984;13:453–92. 10.1146/annurev.bb.13.060184.002321
Overington J. Tertiary Structural Constraints on Protein Evolutionary Diversity: Templates, Key Residues and Structure Prediction. Proc R Soc B. 1990;241(1301):132–45.
Simmonds P, Tuplin A, Evans DJ. Detection of genome-scale ordered RNA structure (GORS) in genomes of positive-stranded RNA viruses: Implications for virus evolution and host persistence. RNA. 2004;10:1337–51.
VanInsberghe D, McBride DS, DaSilva J, Stark TJ, Lau MSY, Shepard SS, et al. Genetic drift and purifying selection shape within host influenza A virus populations during natural swine infections. PLoS Pathog. 2024;20(4):e1012131.
https://doi.org/10.1371/journal.ppat.1012131
Babar MM, Zaidi NS. Protein sequence conservation and stable molecular evolution reveals influenza virus nucleoprotein as a universal druggable target. Infect Genet Evol. 2015;34:200–10. 10.1016/j.meegid.2015.06.030
Babar MM, Zaidi NS, Tahir M. Global geno-proteomic analysis reveals cross-continental sequence conservation and druggable sites among influenza virus polymerases. Antiviral Res. 2014;112:120–31. 10.1016/j.antiviral.2014.10.013
te Velthuis AJW, Robb NC, Kapanidis AN, Fodor E. The role of the priming loop in Influenza A virus RNA synthesis. Nat Microbiol. 2016;1(5):16029. 10.1038/nmicrobiol.2016.29
Hughes AL. Near-Neutrality: the Leading Edge of the Neutral Theory of Molecular Evolution. Ann N Y Acad Sci. 2008;1133:162–79. 10.1196/annals.1438.001
Meyer AG, Wilke CO. The utility of protein structure as a predictor of site-wise dN/dS varies widely among HIV-1 proteins. J R Soc Interface. 2015;12:20150579. http://doi.org/10.1098/rsif.2015.0579
Koonin EV. Towards a postmodern synthesis of evolutionary biology. Cell Cycle. 2009;8(6):799–800. 10.4161/cc.8.6.8187
Frutos R, Serra-Cobo J, Chen T, Devaux CA. COVID-19: Time to exonerate the pangolin from the transmission of SARS-CoV-2 to humans. Infect Genet Evol. 2020;84:104493. 10.1016/j.meegid.2020.104493
Ji W, Wang W, Zhao X, Zai J, Li X. Cross-species transmission of the newly identified coronavirus 2019‐nCoV. J Med Virol. 2020;92:433–440. 10.1002/jmv.25682
Special Issue: 2019 Novel Coronavirus Origin, Evolution, Disease, Biology and Epidemiology: Part-I. J Med Virol. 2020;92(4):433–440.
Xia X. DAMBE6: New Tools for Microbial Genomics, Phylogenetics, and Molecular Evolution. J Hered. 2017;108(4):431–437. 10.1093/jhered/esx033
Simmonds P, Tuplin A, Evans DJ.
Detection of genome-scale ordered RNA structure (GORS) in genomes of positive-stranded RNA viruses: implications for virus evolution and host persistence. RNA. 2004;10:1337–51. 10.1261/rna.7640104
Simmonds P, Cuypers L, Irving WL, McLauchlan J, Cooke GS, Barnes E, STOP-HCV Consortium, Ansari MA. Impact of virus subtype and host IFNL4 genotype on large-scale RNA structure formation in the genome of hepatitis C virus. bioRxiv. 2020. 10.1101/2020.06.16.155150
Simmonds P. Pervasive RNA Secondary Structure in the Genomes of SARS-CoV-2 and Other Coronaviruses. mBio. 2020;11(6):e01661-20. 10.1128/mBio.01661-20
Jaroszewski L, Iyer M, Alisoltani A, Sedova M, Godzik A. The interplay of SARS-CoV-2 evolution and constraints imposed by the structure and functionality of its proteins. bioRxiv preprint. 10.1101/2020.08.10.244756
Zhang et al. Virome landscape of wild rodents and shrews in Central China. Microbiome. 2025;13:63. https://doi.org/10.1186/s40168-025-02059-0
Williams EP, Spruill-Harrell BM, Taylor MK, et al. Common Themes in Zoonotic Spillover and Disease Emergence: Lessons Learned from Bat- and Rodent-Borne RNA Viruses. Viruses. 2021;13(8):1509. 10.3390/v13081509
Freuling CM, Breithaupt A, Müller T, Sehl J, Balkema-Buschmann A, et al. Susceptibility of raccoon dogs for experimental SARS-CoV-2 infection. Emerg Infect Dis. 2020;26:2982–85.
Cohen J. New clues to the pandemic's origin surface, causing uproar. Science. 2023;379(6638):1175–1176. 10.1126/science.adh9055
Holmes EC. The Emergence and Evolution of SARS-CoV-2. Annu Rev Virol. 2024;11(1):21–42. 10.1146/annurev-virology-093022-013037
Crits-Christoph A, Levy JI, Pekar JE, et al.
Genetic tracing of market wildlife and viruses at the epicenter of the COVID-19 pandemic. Cell. 2024;187(19):5468–5482.e11. 10.1016/j.cell.2024.08.010
Markov PV, Ghafari M, Beer M, Lythgoe K, Simmonds P, Stilianakis NI, Katzourakis A. The evolution of SARS-CoV-2. Nat Rev Microbiol. 2023;21(6):361–379. 10.1038/s41579-023-00878-2
Cui X, Fan K, Liang X, et al. Virus diversity, wildlife-domestic animal circulation and potential zoonotic viruses of small mammals, pangolins and zoo animals. Nat Commun. 2023;14:2488. https://doi.org/10.1038/s41467-023-38202-4
Peña-Hernández MA, Alfajaro MM, et al. SARS-CoV-2-related bat viruses evade human intrinsic immunity but lack efficient transmission capacity. Nat Microbiol. 2024;9:2038–50.
Andersen KG, Rambaut A, Lipkin WI, Holmes EC, Garry RF. The proximal origin of SARS-CoV-2. Nat Med. 2020;26(4):450–452. 10.1038/s41591-020-0820-9
Liu Z, Xiao X, Wei X, Li J, Yang J, Tan H, Zhu J, Zhang Q, Wu J, Liu L. Composition and divergence of coronavirus spike proteins and host ACE2 receptors predict potential intermediate hosts of SARS-CoV-2. J Med Virol. 2020;92(6):595–601. 10.1002/jmv.25726
Ameni G, Zewude A, Tulu B, et al. A Narrative Review on the Pandemic Zoonotic RNA Virus Infections Occurred During the Last 25 Years. J Epidemiol Glob Health. 2024;14(4):1397–412. 10.1007/s44197-024-00304-7
Cagliani R, Forni D, Clerici M, Sironi M. Coding potential and sequence conservation of SARS-CoV-2 and related animal viruses. Infect Genet Evol. 2020;83:104353. 10.1016/j.meegid.2020.104353
Luk HKH, Li X, Fung J, Lau SKP, Woo PCY. Molecular epidemiology, evolution and phylogeny of SARS coronavirus. Infect Genet Evol. 2019;71:21–30. 10.1016/j.meegid.2019.03.001
Meekins DA, Gaudreault NN, Richt JA. Natural and Experimental SARS-CoV-2 Infection in Domestic and Wild Animals. Viruses. 2021;13(10):1993.
10.3390/v13101993
Bustamante CD, Townsend JP, Hartl DL. Solvent Accessibility and Purifying Selection Within Proteins of Escherichia coli and Salmonella enterica. Mol Biol Evol. 2000;17(2):301–8. 10.1093/oxfordjournals.molbev.a026310
Additional Declarations
No competing interests reported.
{"props":{"pageProps":{"initialData":{"identity":"rs-6999350","acceptedTermsAndConditions":true,"allowDirectSubmit":true,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":486739510,"identity":"d60097a1-09d2-4292-90d4-a457c0a87d5c","order_by":0,"name":"Dejun 
Lian","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAABAUlEQVRIiWNgGAWjYBACPmYgwcPAxsDAzNj44GMDSIyx8QA+LWxwLezMhw1nNjBIALU04NfCANYCBPxsadK8YC0MDPi1sDM/e/Cmgi9xOzOPsbHtDps63fbDQFtqbKJxO4zN3HDOGbbEnc08ho9zz6RJmJ1JBGo5lpbbgNsvZtK8bWyJGw4DbcltOyxhdgCohbHhMB4t7N+kef+BtZhJW4K0nH9ISAtQJW8DSAvQ+4wgLTcI2sJTJjnnGJvxhsPAQO5tS5PcdgNoSwIev/DzH98m8abmmOyG8wcbH/xss+E3O5/+8MGHGhucWqDgGBo/Ab9yEKghrGQUjIJRMApGLgAAN55bIN4tZI8AAAAASUVORK5CYII=","orcid":"","institution":"Chinese Academy of Sciences","correspondingAuthor":true,"prefix":"","firstName":"Dejun","middleName":"","lastName":"Lian","suffix":""},{"id":486739511,"identity":"6cd9d5e3-ba7e-4410-a93c-1ee02deead38","order_by":1,"name":"Jie Lian","email":"","orcid":"","institution":"Shanghai Jiao Tong University School of Medicine","correspondingAuthor":false,"prefix":"","firstName":"Jie","middleName":"","lastName":"Lian","suffix":""},{"id":486739512,"identity":"45e5ce04-5428-4d30-a5c0-ae1a7ecc8bde","order_by":2,"name":"Qi Dong","email":"","orcid":"","institution":"Chinese Academy of Sciences","correspondingAuthor":false,"prefix":"","firstName":"Qi","middleName":"","lastName":"Dong","suffix":""}],"badges":[],"createdAt":"2025-06-28 18:08:12","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-6999350/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-6999350/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":87175478,"identity":"fac91a93-df85-47f6-9698-f2ad2ac789c4","added_by":"auto","created_at":"2025-07-21 08:37:38","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":8898,"visible":true,"origin":"","legend":"\u003cp\u003eLinear correlation of residue variability(RV) of SARS CoV-2 RDRP with 
RSA\u003c/p\u003e","description":"","filename":"image1.png","url":"https://assets-eu.researchsquare.com/files/rs-6999350/v1/e1ed5a2b2b8560c6f9396d3f.png"},{"id":87175479,"identity":"3fa80b87-302e-42ed-aba2-9dc88d002bd6","added_by":"auto","created_at":"2025-07-21 08:37:38","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":12066,"visible":true,"origin":"","legend":"\u003cp\u003eLinear correlation of amino acid Entropy of RDRP with RSA\u003c/p\u003e","description":"","filename":"image2.png","url":"https://assets-eu.researchsquare.com/files/rs-6999350/v1/2005fda204390aa2f11462fc.png"},{"id":87176332,"identity":"fd765048-8b44-4aae-b954-6fe9a12962f4","added_by":"auto","created_at":"2025-07-21 08:45:38","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":86462,"visible":true,"origin":"","legend":"\u003cp\u003ePhylogenic relationship between SARS Cov-2 and SARS-related viruses\u003c/p\u003e","description":"","filename":"image3.png","url":"https://assets-eu.researchsquare.com/files/rs-6999350/v1/7b74ea0ef56518949f4cc4a2.png"},{"id":87176607,"identity":"d8ff5164-24a0-427a-914e-2501c8fa2391","added_by":"auto","created_at":"2025-07-21 08:53:38","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":169343,"visible":true,"origin":"","legend":"\u003cp\u003eSchematic drawing of the secondary structure of SARS CoV-2 RDRP encoded RNA\u003c/p\u003e","description":"","filename":"image4.png","url":"https://assets-eu.researchsquare.com/files/rs-6999350/v1/844938d910d7d415af4f6800.png"},{"id":87177441,"identity":"508701f5-6223-47c7-b594-ad2a6c6798a0","added_by":"auto","created_at":"2025-07-21 
09:01:39","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":768752,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-6999350/v1/0b89a591-2fc0-4272-96de-cd67ba474bcb.pdf"}],"financialInterests":"No competing interests reported.","formattedTitle":"The Evolution Mechanism with Protein Structure and Function and the Origin of SARS CoV-2","fulltext":[{"header":"INTRODUCTION","content":"\u003cp\u003eOver the last few decades, novel viruses have emerged that present severe health risks worldwide. The SARS-CoV-2 virus, formerly known as a novel coronavirus, broke out in Wuhan (China) and caused major morbidity and mortality globally. It first appeared in late 2019 and causes coronavirus disease 2019 (COVID-19). The WHO declared COVID-19 a pandemic on 11 March 2020. It has affected 269\u0026nbsp;million people in 224 nations and territories, with more than 5.3\u0026nbsp;million deaths. SARS-CoV‐2 is the seventh coronavirus known to infect humans; the other six are HCoV‐229E, HCoV‐NL63, HCoV‐OC43, HCoV‐HKU1, SARS‐CoV, and Middle East respiratory syndrome coronavirus (MERS‐CoV). HCoV‐229E and HCoV‐NL63 are alphacoronaviruses, and the others, including SARS‐CoV‐2, are betacoronaviruses. SARS‐CoV and MERS‐CoV are considered highly pathogenic and are known to be transmitted from bats to humans via the intermediate hosts palm civets and dromedary camels, respectively (\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e, \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eRNA viruses, such as hepatitis C virus (HCV), influenza virus, and SARS-CoV-2, are notorious for their ability to evolve rapidly under selection in novel environments. The high mutation rate of RNA viruses is known to generate huge genetic diversity, which facilitates viral adaptation. 
RNA viruses offer a unique opportunity for the experimental study of molecular evolution. These viruses exhibit both high replication rates (10\u003csup\u003e5\u003c/sup\u003e day\u003csup\u003e\u0026minus;1\u003c/sup\u003e) and high mutation rates (10\u003csup\u003e\u0026minus;\u0026thinsp;3\u003c/sup\u003e to 10\u003csup\u003e\u0026minus;\u0026thinsp;5\u003c/sup\u003e mutations per nucleotide per replication); hence, evolutionary dynamics that would take years to unfold in even relatively simple bacteria occur within days in RNA virus populations (\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e, \u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e). As an RNA virus, SARS-CoV-2 is a good model for studying the mechanisms of RNA virus evolution. Given the huge number of SARS-CoV-2 sequences now available and the huge population size resulting from the worldwide spread of this virus, it is timely to study its evolutionary mechanism, which may shed light on virus evolution and protein evolution more generally.\u003c/p\u003e\u003cp\u003eConfirmation of intermediate hosts is essential to prevent further spread of the epidemic. The emergence of COVID-19 has triggered many studies aimed at identifying the intermediate animal host potentially involved in the transmission of SARS-CoV-2 to humans.\u003c/p\u003e\u003cp\u003eSARS-CoV-2, which has recently affected the human population worldwide, is suspected to have originated from bats (\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e). It was found to be closely related to the Sarbecoviruses MN996532_RaTG13 and RmYN02 from the Chinese horseshoe bats Rhinolophus affinis and Rhinolophus malayanus, respectively. There is no evidence of direct transmission of Sarbecoviruses from bats to humans. The only direct Sarbecovirus infections in humans have been linked to laboratory accidents during the SARS epidemic. 
To date, no higher incidence of Sarbecovirus infections has been reported in anthropized ecosystems, even among employees of guano farms who come into direct contact with bat feces (guano), although approximately 22% of bats from guano farms release coronaviruses in their feces, of which almost 5% are Sarbecoviruses. Given the lack of direct bat-to-human Sarbecovirus transmission and the need for a reservoir according to the spillover theory of zoonotic emergence, many teams have attempted to identify the animal serving as an intermediate host. Moreover, SARS-CoV-2-related bat viruses evade human intrinsic immunity but lack efficient transmission capacity. The Malayan or Javan pangolin (Manis javanica) was suspected of being this intermediate host on the basis of 1) its ACE2 receptor sequence, 2) the presence of Sarbecoviruses related to SARS-CoV-2 in animals smuggled from the Indomalayan region, and 3) the presence of pangolins in wet markets in China, where they are considered a delicacy and a component of traditional pharmacopeia. However, this animal has been exonerated from the transmission of SARS-CoV-2 to humans (\u003cspan citationid=\"CR61\" class=\"CitationRef\"\u003e61\u003c/span\u003e). Snakes and turtles have also been proposed as potential intermediate hosts that transmit SARS-CoV-2 to humans (\u003cspan citationid=\"CR62\" class=\"CitationRef\"\u003e62\u003c/span\u003e, \u003cspan citationid=\"CR63\" class=\"CitationRef\"\u003e63\u003c/span\u003e). With the accumulation of sequences of SARS-CoV-2 and SARS-related viruses and advances in the study of virus evolution mechanisms, a new search for the origin of COVID-19 is needed. 
This study compares RNA-dependent RNA polymerase (RDRP) sequences within and between SARS-CoV‐2, SARS‐CoV, bat SARS‐like CoV, and other coronaviruses, in order to study the evolutionary mechanism and to identify possible virus reservoirs and the origin of COVID-19.\u003c/p\u003e"},{"header":"MATERIALS AND METHODS","content":"\u003cp\u003e\u003cb\u003eSequences used in the study\u003c/b\u003e\u003c/p\u003e\u003cp\u003eA total of approximately 30,000 sequences of SARS-CoV-2 isolates, representing the entire pandemic period, were retrieved from the GenBank and BV-BRC databases. The nucleotide sequences encoding the RDRP protein were used for comparative analysis. The RNA sequences were additionally filtered so that only sequences longer than 1000 bases were retained. The sequences of SARS-related viruses analyzed in this study were also downloaded from GenBank. A listing of their accession numbers is available from the author upon request.\u003c/p\u003e\u003cp\u003eSites that were variable within species were considered polymorphic; sites that were identical within species were treated as invariant. Only amino acids occurring more than once were counted, in order to avoid errors due to a single aberrant sequence.\u003c/p\u003e\u003cp\u003eThe protein structures used in this study were determined through X-ray crystallography. We used the structure with PDB code 7OYG for analysis.\u003c/p\u003e\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e\u003ch2\u003eData analysis\u003c/h2\u003e\u003cp\u003eMultiple protein and nucleotide sequences were aligned with BioEdit and edited by hand.\u003c/p\u003e\u003cp\u003eIn our analysis, the solvent accessible surface area (SAS) was used to estimate the proportion of each amino acid residue that is accessible to solvent. 
This was done by taking the ratio of the SAS calculated from the actual protein structure to the maximum exposed surface area in the fully extended conformation of the pentapeptide Gly-Gly-X-Gly-Gly, where X is the amino acid in question. We then normalized the SAS values by the theoretical maximum SAS of each residue (\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e) to obtain the relative solvent accessibility (RSA). The solvent accessible surface area (SAS) of each amino acid of SARS-CoV-2 RDRP was calculated using the DSSP program (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://www.cmbi.ru.nl/dssp.html\u003c/span\u003e\u003cspan address=\"http://www.cmbi.ru.nl/dssp.html\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003cspan type=\"Underline\" class=\"Underline\" name=\"Emphasis\"\u003e)\u003c/span\u003e.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec4\" class=\"Section2\"\u003e\u003ch2\u003eStatistical analysis\u003c/h2\u003e\u003cp\u003eData are expressed as means\u0026thinsp;\u0026plusmn;\u0026thinsp;SD. Group comparisons were performed using the Kruskal\u0026ndash;Wallis and Mann\u0026ndash;Whitney U tests. All statistical analyses were performed using SPSS version 13.0 (SPSS Inc., Chicago, IL), with additional analysis performed using Stata/MP14 (StataCorp LP). 
Values of p\u0026thinsp;\u0026lt;\u0026thinsp;0.05 were considered significant.\u003c/p\u003e\u003cp\u003e\u003cb\u003eLogistic Regression, Confidence Intervals\u003c/b\u003e\u003c/p\u003e\u003cp\u003eWe used the methods and model of Lian D (\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e) to assess how a set of predictor variables affects a dichotomous outcome variable (polymorphic or invariant).\u003c/p\u003e\u003cp\u003e\u003cb\u003eThe correlation of residue variability with structure\u003c/b\u003e\u003c/p\u003e\u003cp\u003eThe correlation was tested using Lian D\u0026rsquo;s method (\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e). We performed several tests to find structural correlates of high evolutionary rate.\u003c/p\u003e\u003cp\u003e\u003cb\u003eThe correlation of entropy at each position with structure\u003c/b\u003e\u003c/p\u003e\u003cp\u003eTo measure the degree of sequence conservation, we used Lian D\u0026rsquo;s method.\u003c/p\u003e\u003cp\u003e\u003cb\u003eKa/Ks\u003c/b\u003e\u003c/p\u003e\u003cp\u003eKa/Ks values were obtained from the Datamonkey Adaptive Evolution Server (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://www.datamonkey.org/\u003c/span\u003e\u003cspan address=\"http://www.datamonkey.org/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e). The multi-partition fixed effects likelihood (FEL) and FLAC methods implemented in the HyPhy software package on the online server were then used to predict purifying selection. SARS-CoV-2 RDRP coding sequences were clustered before use.\u003c/p\u003e\u003cp\u003e\u003cb\u003eRNA structure prediction.\u003c/b\u003e\u003c/p\u003e\u003cp\u003eMFED values were calculated by comparing the minimum folding energies of the wild-type (WT) sequence with those of sequences shuffled using the NDR algorithm. 
Ensemble RNA structure predictions were made using the DAMBE program (\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e).\u003c/p\u003e\u003cp\u003e\u003cb\u003eNucleic acid phylogenetic analysis\u003c/b\u003e\u003c/p\u003e\u003cp\u003eMultiple nucleotide sequences were aligned with BioEdit and edited by hand. In addition to the sequences recovered here, reference sequences that cover the phylogenetic diversity of CoVs were compiled for evolutionary analyses. After alignment, gaps and ambiguously aligned regions were removed by hand. Phylogenetic trees were estimated via the maximum likelihood (ML) method implemented in PhyML v3.0 in MEGA 11, with bootstrap support values calculated from 1000 replicate trees. The best-fit nucleic acid substitution models were determined via MEGA 11 and DAMBE.\u003c/p\u003e\u003cp\u003e\u003cb\u003eSynonymous codon usage analysis\u003c/b\u003e\u003c/p\u003e\u003cp\u003eTo estimate the relative synonymous codon usage (RSCU) bias of 2019-nCoV, the available coding sequences of the 2019‐nCoV genome (1 CDS, 9672 codons) and the bat‐SL CoVZC45 genome (1 CDS, 9680 codons) were used, retaining only coding sequences that begin with ATG and whose length is a multiple of 3 nucleotides and excluding incorrect coding sequences. The estimations were performed via the DAMBE program. Site-specific synonymous codon usage was inspected by eye and calculated by hand.\u003c/p\u003e\u003cp\u003eA literature search was performed in PubMed. 
The cumulative number of fatalities associated with individual viral infections was retrieved from GenBank.\u003c/p\u003e"},{"header":"RESULTS","content":"\u003cp\u003eTable-1 Logistic regression of amino acid polymorphism with protein structure\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"No\" id=\"Taba\" border=\"1\"\u003e\u003ccolgroup cols=\"6\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eProteins\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eN sequences\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003e% Polymorphic\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003eα\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003eβ\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c6\"\u003e\u003cp\u003eLRT (95% CI) Pr(χ2(1))\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eSARS CoV-2 RDRP\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e~\u0026thinsp;30,000\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e39.7\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" 
colname=\"c4\"\u003e\u003cp\u003e-1.84.\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e2.12\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e27.6\u003c/p\u003e\u003cp\u003e(1.33,2.90)\u003c/p\u003e\u003cp\u003eP\u0026thinsp;\u0026lt;\u0026thinsp;0.001\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eIAV PB1(757aa)a\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e107,312\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e95.1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e-1.17\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e2.41\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e36.71\u003c/p\u003e\u003cp\u003e(1.61, 3.20)\u003c/p\u003e\u003cp\u003eP\u0026thinsp;\u0026lt;\u0026thinsp;0.001\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eHCV NS5B (566aa)b\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e1,2402\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e95.2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.92\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e3.52\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e32.07\u003c/p\u003e\u003cp\u003e(2.12, 4.92)\u003c/p\u003e\u003cp\u003eP\u0026thinsp;\u0026lt;\u0026thinsp;0.001\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eE.Coli \u0026amp;S. 
Enterica\u003c/em\u003e\u003c/p\u003e\u003cp\u003eProteins(1955aa)c\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e138\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e4.5\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e-3.87\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e3.53\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e33.12\u003c/p\u003e\u003cp\u003e(2.35,4.71)\u003c/p\u003e\u003cp\u003eP\u0026thinsp;\u0026lt;\u0026thinsp;\u0026lt;\u0026thinsp;0.0001\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003ctfoot\u003e\u003ctr\u003e\u003ctd colspan=\"6\"\u003eNOTE.\u0026mdash;CI\u0026thinsp;=\u0026thinsp;confidence interval; LRT\u0026thinsp;=\u0026thinsp;log-likelihood ratio test.\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd colspan=\"6\"\u003ea data from ref (\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e)\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd colspan=\"6\"\u003eb data from ref (\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e)\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd colspan=\"6\"\u003ec data from ref (\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e)\u003c/td\u003e\u003c/tr\u003e\u003c/tfoot\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003c/div\u003e"},{"header":"DISCUSSION","content":"\u003cp\u003eSARS-CoV‐2 encodes at least 27 proteins, including 15 nonstructural proteins, 4 structural proteins, and 8 auxiliary proteins. 
RNA-dependent RNA polymerase (RDRP), a nonstructural protein, plays a key role in viral transcription and genome replication.\u003c/p\u003e\u003cp\u003eRNA viruses are ubiquitous intracellular parasites that are responsible for many emerging diseases, including AIDS and SARS. Here, we discuss the principal mechanisms of RNA virus evolution and highlight areas where future research is required. With the advances in next-generation sequencing and metagenomics, it is time to carry out thorough research on the mechanism of virus evolution.\u003c/p\u003e\u003cp\u003eIn this paper, we investigated the factors that predict the relative likelihood that a site will be polymorphic within the SARS CoV-2 protein. We found a strong positive relation between polymorphism and solvent accessibility, suggesting that amino acid sites that are more solvent-accessible are less likely to be constrained in identity. This agrees with the result that Bustamante CD obtained from the study of enzyme polymorphisms in \u003cem\u003eE. coli\u003c/em\u003e and \u003cem\u003eS. enterica\u003c/em\u003e (\u003cspan citationid=\"CR84\" class=\"CitationRef\"\u003e84\u003c/span\u003e). This finding is in accordance with work on multiple families of proteins showing that solvent accessibility affects amino acid substitution rates (\u003cspan additionalcitationids=\"CR23 CR24 CR25 CR26\" citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e). 
This result is also in accordance with work on the proteins of yeast and many virus proteins (28, ), especially HCV and IAV PB1 (\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e, \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eThe results of the logistic regression and linear regression analyses show that the amino acid polymorphisms of SARS CoV-2 RDRP are positively correlated with RSA, as shown by the positive correlation of both RV and aa entropy with RSA. The logistic regression parameters of SARS CoV-2 RDRP are nearly the same as those of IAV PB1 and differ from those of HCV and of \u003cem\u003eE. coli\u003c/em\u003e and \u003cem\u003eS. enterica\u003c/em\u003e, which may be attributed to the fact that the evolution times of both SARS CoV-2 and IAV are short compared with those of HCV and of \u003cem\u003eE. coli\u003c/em\u003e and \u003cem\u003eS. enterica\u003c/em\u003e. The parameter β of SARS CoV-2 RDRP and IAV PB1 is smaller than that of HCV and of \u003cem\u003eE. coli\u003c/em\u003e and \u003cem\u003eS. enterica\u003c/em\u003e, reflecting that the inner-out protein structure restriction is relaxed, so these proteins show more aa polymorphism. Unexpectedly, for SARS CoV-2 RDRP, the aa entropy and the entropy grouped by aa physico-chemical property are nearly the same, which differs from the results for HCV and IAV PB1 (\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e, \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e). 
This finding can be attributed to the short evolution period of SARS CoV-2 (less than six years): within a short evolution time, the accumulated na mutations are all single mutations within a coding triplet, and for single mutations of triplets the accumulation of encoded aa is random and free of the physico-chemical property restriction on aa that was observed in previous studies of protein evolution and virus evolution (\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e, \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e). These findings suggest that they are common phenomena of the mechanism of protein evolution and virus evolution.\u003c/p\u003e\u003cp\u003eThe molecular modelling study of Sushant Kumar et al. showed that nearly half of the mutants they studied cause destabilisation (negative ΔΔG) of the protein structure. They also observed that during the initial phase of the COVID19 pandemic the rate of occurrence of new mutations was high, but it slowed down as time progressed (\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e). This may reflect that within a short evolution time, the restriction imposed by protein structure is relaxed. Taken together with the previous findings, it can be concluded that purifying selection in RNA virus evolution is weakened during a short evolution period and becomes strong after a long evolution time. This may be a common mechanism of virus evolution and protein evolution.\u003c/p\u003e\u003cp\u003eApart from polymorphisms, we found that purifying selection is common for this virus. For SARS CoV-2 RDRP, the mean dN/dS is 0.029, far smaller than 1, meaning that purifying selection is very strong for this protein. Of the 932 codons, quite a lot are monomorphic (data not shown), which may be attributed to the very short evolution time and the very strong restriction imposed by RNA secondary structure. 
Compared to some other mRNA viruses, which generally have very high mutation rates, coronaviruses tend to evolve several-fold more slowly due to their proof-reading machinery, of the order of 10\u003csup\u003e\u0026minus;\u0026thinsp;4\u003c/sup\u003e substitutions per site per year. SARS-CoV-2 is currently evolving substantially faster than this (albeit still slowly compared to other non-proofreading mRNA viruses), estimated at 7 \u0026times; 10\u003csup\u003e\u0026minus;\u0026thinsp;4\u003c/sup\u003e substitutions per site per year (2 \u0026times; 10\u003csup\u003e\u0026minus;\u0026thinsp;6\u003c/sup\u003e per day), and has seen a remarkable lineage evolution during the pandemic, as expected from its high prevalence and ongoing adaptation to humans.9 Since the emergence of SARS-CoV-2, estimated to be in autumn 2019, several hundred recurrent mutations have been identified, 80% of these being nonsynonymous changes in the virus proteins,4 including many in the RDRP protein (\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e).\u003c/p\u003e\u003cp\u003ePhylogenetic reconstruction helps determine the evolutionary relationships and host selection of RDRP among the beta coronaviruses close to human viruses. To better understand the host selection of beta coronaviruses, the relationship of RDRP between SARS‐CoV‐2 and other closely related beta coronaviruses has been analyzed. Bat SARS‐like CoV RaTG13, which has 96.2% overall genome sequence identity with SARS‐CoV‐2, is its nearest neighbor (\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e), and their RDRP RNA sequences share 97.8% identity.\u003c/p\u003e\u003cp\u003eGlobal expansion and deep sequencing work have led to an increased number of SARS-CoV‐2 genotypes. Studying selective pressure may help in assessing the variability of SARS‐CoV‐2 and its potential for host changes. 
On the basis of a previous report, selective pressure analysis revealed that some genes (ORF10 and ORF7a) are under greater selective pressure. RDRP is a key factor determining the replication of coronaviruses. The RDRP sequences from SARS‐CoV, bat, or pangolin and other SARS-like CoVs and SARS‐CoV‐2 were aligned and used for phylogenetic reconstruction. Phylogenetic analyses revealed that SARS CoV-2 clusters with SARS CoV in the Sarbecovirus subgenus, and viruses related to SARS CoV-2 were identified from bats and pangolins (\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e). Coronaviruses have long and complex genomes with high plasticity in terms of gene content. Our phylogenetic analyses of RNA sequences encoding RDRP indicated that SARS CoV-2 forms a cluster distinct from SARS-related viruses (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eIn the present study, SARS CoV-2 was investigated for the presence of large-scale internal RNA base pairing in its RNA sequence encoding RDRP. This property, termed the genome scale ordered RNA structure (GORS), has been previously associated with host persistence in other positive sense RNA viruses and in negative sense RNA viruses (\u003cspan citationid=\"CR65\" class=\"CitationRef\"\u003e65\u003c/span\u003e, \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e), potentially through its shielding effect on viral RNA recognition in the cell. The genomes of SARS CoV-2 are remarkably structured; in contrast to the replication associated RNA structure, GORS is poorly conserved in the positions and identities of base pairing with other sarbecoviruses. Even similarly positioned stem loops in SARS CoV-2 and SARS CoV rarely share homologous pairings, which is indicative of more rapid evolutionary changes in RNA structure than in the underlying coding sequences. 
Sites predicted to be base paired in SARS CoV-2 showed less sequence diversity than unpaired sites did, suggesting that disruption of the RNA structure by mutation imposes a fitness cost on the virus that is potentially restrictive to its longer evolution. Although functionally uncharacterized, GORS in SARS CoV-2 and other coronaviruses represents important elements in their cellular interactions that may contribute to their persistence and transmissibility (\u003cspan citationid=\"CR66\" class=\"CitationRef\"\u003e66\u003c/span\u003e). Our study revealed that the RDRP coding sequence forms compact GORS, which restricts sequence diversity and thus reduces the mutation rate (Fig.\u0026nbsp;4). RNA sequence polymorphism in SARS CoV-2 RDRP is much less common than in highly mutating RNA viruses such as HCV (\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e). Therefore, we can use the RDRP coding sequence to infer the phylogenetic relationships among SARS CoV-2 and SARS CoV-2 related viruses more accurately and analyze their evolutionary relationships.\u003c/p\u003e\u003cp\u003eCodon usage bias analysis revealed that at most of the sites, the RDRP codons are far from saturated (data not shown), which reflects the restriction imposed by the large GORS structure. Sequence analysis revealed that the RDRP encoding RNA sequences have great codon usage bias. For example, the first codon encodes Ser in most sequences; in most isolates it is TCA, and only a few isolates (\u0026lt;\u0026thinsp;0.1%) carry TCT, TCG, or TCC. For the second codon, most of the isolates encode Ala, and the sequence in most of the isolates is GCT; only a few of the isolates carry GCG, and GCA and GCC were not found in our analysis. The third codon encodes Asp, and the sequence of most of the isolates is GAT; only one isolate is GAC. 
These findings all reflect the strong purifying selection of this virus due to protein structure-function restrictions and RNA GORS restrictions (\u003cspan citationid=\"CR66\" class=\"CitationRef\"\u003e66\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eAnalysis of the RNA sequences of RDRP has shown that they are highly conserved owing to the restrictions imposed by protein structure-function and by GORS (\u003cspan citationid=\"CR65\" class=\"CitationRef\"\u003e65\u003c/span\u003e, \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e). The isolate from 2025 presented only 7 RNA sequence mutations, whereas the sequence of the bat SARS isolate RaTG13 presented 63 RNA sequence differences compared with the SARS CoV-2 Wuhan isolate. From the point of view of the virus evolution mechanism, bat SARS viruses are therefore unlikely to be the source of SARS CoV-2. Sequence analysis showed that the mouse SARS RDRP RNA has nearly the same sequence as the Wuhan isolate, suggesting that mice or rats may be the source of SARS CoV-2. Wild rodents and shrews serve as vital sentinel species for monitoring zoonotic viruses due to their close interaction with human environments and role as natural reservoirs for diverse viral pathogens. Although several studies have explored viral diversity and assessed pathogenic risks in wild rodents and shrews, the full extent of this diversity remains insufficiently understood. A study of wild rodents and shrews showed that their Coronaviridae, Hantaviridae, Arteriviridae, Astroviridae, Hepeviridae, Lispiviridae, Nairoviridae, Nodaviridae, Paramyxoviridae, Rhabdoviridae, Picornaviridae, Arenaviridae and Picobirnaviridae are highly likely to infect humans. Notably, rodent derived Rotavirus A, HTNV, and SEOV display almost complete amino acid identity with their human derived counterparts (\u003cspan citationid=\"CR70\" class=\"CitationRef\"\u003e70\u003c/span\u003e), suggesting that rodent derived SARS viruses may also be similar to their human derived counterparts and transmit to humans. 
Rodents (order Rodentia), followed by bats (order Chiroptera), constitute the largest percentage of living mammals on earth. Thus, it is not surprising that these two orders account for many of the reservoirs of the zoonotic RNA viruses discovered to date. Unlike bats, rodents, especially rats and mice, have received less research attention as potential SARS virus reservoirs, and it is time to perform this type of study.\u003c/p\u003e\u003cp\u003eThe direct host of SARS CoV-2 remains a mystery, although there are many conjectures (\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e). Our sequence analysis revealed that the dog SARS CoV-2 RDRP encoding RNA sequences are nearly the same as those of the Wuhan isolate, indicating that dogs may be the direct hosts of SARS CoV-2. There are several conjectures that the raccoon dog is the direct host of SARS CoV-2 (\u003cspan citationid=\"CR72\" class=\"CitationRef\"\u003e72\u003c/span\u003e, \u003cspan citationid=\"CR73\" class=\"CitationRef\"\u003e73\u003c/span\u003e). The Huanan market in Wuhan was the epicenter of the COVID19 pandemic. Shortly after the Huanan market was closed on January 1, 2020, investigative teams conducted extensive swabbing of surfaces (bench tops, door handles, drains, cages, animal carcasses, etc.) within that facility. Subsequent polymerase chain reaction (PCR) analysis and metagenomic sequencing provided a spatial map of SARS CoV-2 infection across the market\u0026mdash;roughly the size of two soccer fields separated by a road\u0026mdash;and of the animal species present among the various stalls, as animal DNA and RNA were also present (with RNA indicating a more recent presence, as it degrades relatively rapidly). Unfortunately, few live animals from the market have been tested, and none are likely hosts of SARS CoV-2. 
Raccoon dogs have received much attention (\u003cspan citationid=\"CR72\" class=\"CitationRef\"\u003e72\u003c/span\u003e, \u003cspan citationid=\"CR73\" class=\"CitationRef\"\u003e73\u003c/span\u003e). Experiments have shown that SARS CoV-2 easily infects raccoon dogs\u0026mdash;commonly raised for fur in China, but also sold for meat in \u0026ldquo;wet\u0026rdquo; markets such as that in Wuhan\u0026mdash;and that they shed high levels of the virus. This report describes the identification of raccoon dog mtDNA in six samples from two different stalls on the Wuhan market. A sample from a cart that tested positive for SARS CoV-2 also had \u0026ldquo;abundant\u0026rdquo; raccoon dog genetic material. Far less human genetic material was found in the same sample. The researchers say this suggests, but does not prove, that the raccoon dog or dogs on the cart were more likely to have spread the virus than were humans working the stall or shopping near it (\u003cspan citationid=\"CR73\" class=\"CitationRef\"\u003e73\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eIn this study, we performed an evolutionary analysis using the genomic sequences of coronaviruses. SARS-CoV first emerged in China in 2002 and then spread to 37 countries in 2003, where it caused a global outbreak with a 9.6% mortality rate. From SARS‐CoV and MERS‐CoV to SARS‐CoV‐2, bats are the natural hosts of coronaviruses, but the intermediate hosts are different. Some have argued that Wuhan is too far from the greatest density of Chinese Rhinolophus bats in Yunnan for the virus to have easily entered by natural means (\u003cspan citationid=\"CR73\" class=\"CitationRef\"\u003e73\u003c/span\u003e). Previous findings suggested that the snake is a probable wildlife animal reservoir for SARS‐CoV‐2 infection on the basis of its relative synonymous codon usage bias, which resembles that of snakes more than that of other animals (\u003cspan citationid=\"CR63\" class=\"CitationRef\"\u003e63\u003c/span\u003e). 
Pangolins and turtles have also been suggested as possible wildlife animal reservoirs of SARS CoV-2 (\u003cspan citationid=\"CR62\" class=\"CitationRef\"\u003e62\u003c/span\u003e). Civets and dromedaries, as intermediate hosts of zoonotic SARS‐CoV and MERS‐CoV, live in different time and space environments. These animals nevertheless transmit coronaviruses to humans, which makes it difficult to find the intermediate hosts of SARS‐CoV‐2 in the short term (\u003cspan citationid=\"CR62\" class=\"CitationRef\"\u003e62\u003c/span\u003e). Analysis of the viral RDRP RNA sequences helps to quickly target the possible intermediate hosts of SARS‐CoV‐2. Compared with illegally traded markets, these markets are more common and popular. This study provides information that, like snakes, pangolins and turtles, dogs may also act as potential intermediate hosts transmitting SARS‐CoV‐2 to humans, and rats or mice may be zoonotic reservoirs of this virus, although much more needs to be confirmed.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eEthics approval and consent to participate\u003c/strong\u003e\u003c/p\u003e\u003cp\u003eThe authors comply with ethics approval and consent to participate.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAvailability of data and material\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAll models, priors and settings used in this analysis are provided in the XML\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare no conflict of interest.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding statement\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis research received no funding.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthor Contributions\u003c/strong\u003e\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eDejun Lian conceptualized the article, analyzed the references and wrote the article. 
Jie Lian curated sequence data, reviewed and edited the article. Qi Dong reviewed and edited the article.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eNathalie Chazal, Coronavirus, the King Who Wanted More Than a Crown: From Common to the Highly Pathogenic SARS CoV-2, Is the Key in the Accessory Genes? Front Microbiol. 2021 Jul 14:12:682603. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3389/fmicb.2021.682603\u003c/span\u003e\u003cspan address=\"10.3389/fmicb.2021.682603\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. eCollection 2021.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eZhou P, Yang XL, Wang XG, et al. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature. 2020;579:270\u0026ndash;3. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41586-020-2012-7\u003c/span\u003e\u003cspan address=\"10.1038/s41586-020-2012-7\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eTsimring LS, Levine H. RNA Virus Evolution via a Fitness-Space Model. Phys Rev Lett. 1996;76(23):4440\u0026ndash;3. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1103/PhysRevLett.76.4440\u003c/span\u003e\u003cspan address=\"10.1103/PhysRevLett.76.4440\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDomingo E, Holland JJ. RNA Virus Mutations and Fitness for Survival. Annu Rev Microbiol. 1997;51:151\u0026ndash;78. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1146/annurev.micro.51.1.151\u003c/span\u003e\u003cspan address=\"10.1146/annurev.micro.51.1.151\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eAndr\u0026eacute;s Moya 1, Holmes EC. Fernando Gonz\u0026aacute;lez-Candelas,The population genetics and evolutionary epidemiology of RNA viruses. Nat Rev Microbiol. 2004;2(4):279\u0026ndash;88. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1038/nrmicro863\u003c/span\u003e\u003cspan address=\"10.1038/nrmicro863\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSangita Venkataraman BVLS, Prasad, Selvarajan R. Viruses, RNA Dependent RNA Polymerases: Insights from Structure, Function and Evolution, 2018, 10, 76; \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3390/v10020076\u003c/span\u003e\u003cspan address=\"10.3390/v10020076\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMahan Ghafari. 1 Louis du Plessis,1 Jayna Raghwani,1 Samir Bhatt,2 Bo Xu,3 Oliver G. Pybus,1 and Aris Katzourakis.Purifying Selection Determines the Short-Term Time Dependency of Evolutionary Rates in SARS-CoV-2 and pH1N1 Influenza Mol. Biol Evol. 2022;39(2):msac009. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1093/molbev/msac009\u003c/span\u003e\u003cspan address=\"10.1093/molbev/msac009\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRukmankesh Mehra, Kasper P, Kepp. Structure and Mutations of SARS-CoV-2 Spike Protein: A Focused Overview. ACS Infect Dis. 2022;8(1):29\u0026ndash;58. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1021/acsinfecdis.1c00433\u003c/span\u003e\u003cspan address=\"10.1021/acsinfecdis.1c00433\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. Epub 2021 Dec 2.PMID: 34856799.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSushant Kumar b, Khushboo Kumari b, Gajendra Kumar Azad,Emerging genetic diversity of SARS-CoV-2 RNA dependent RNA polymerase (RdRp) alters its B-cell epitopes. Biol 2022 Jan:75:29\u0026ndash;36. doi: 10.1016/j.biologicals.2021.11.002. Epub 2021 Nov 17.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eXuhua Xia, DAMBE6: New Tools for Microbial Genomics, Phylogenetics, and, Evolution M. J Hered. 2017;108(4):431437. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1093/jhered/esx033\u003c/span\u003e\u003cspan address=\"10.1093/jhered/esx033\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDejun, Lian. The Polymorphisms, Solvent Accessibility and Conservatism of Hepatitis C Virus Nonstructural 5B Protein, Preprint, BioRixv, 2025.02.09.637353; \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1101/2025\u003c/span\u003e\u003cspan address=\"10.1101/2025\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. 02.09.637353.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDejunLian. JieLian and qi Dong The Polymorphism, Solvent Accessibility and evolution Conservation of IVA PB1 Protein, preprint.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDuncan C, Ramsey MP, Scherrer T, Zhou, Wilke CO. The Relationship Between Relative Solvent Accessibility and Evolutionary Rate in Protein Evolution. Genetics. 2011;188:479\u0026ndash;88. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1534/genetics.111.128025\u003c/span\u003e\u003cspan address=\"10.1534/genetics.111.128025\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eTien MZ, Meyer AG, Sydykova DK, Spielman SJ, Wilke CO. Maximum allowed solvent accessibilites of residues in proteins. PLoS ONE. 2013;8(11):e80635. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1371/journal.pone.0080635\u003c/span\u003e\u003cspan address=\"10.1371/journal.pone.0080635\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eShenkin PS, Erman B, Mastrandrea LD. Information-theoretical entropy as a measure of sequence variability, Proteins Struct. Funct Genet. 1991;11:297\u0026ndash;313.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWILLIAMSON RM. Information Theory Analysis of the Relationship between Primary Sequence Structure and Ligand Recognition among a Class of Facilitated Transporters. J theor Biol. 1995;174(2):179\u0026ndash;88. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1006/jtbi.1995.0090\u003c/span\u003e\u003cspan address=\"10.1006/jtbi.1995.0090\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDayhoff MO, Schwartz RM, Orcutt B. A model of evolutionary change in proteins. Atlas Protein Seq Struc. 1978;5:345\u0026ndash;52.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eManning JR, Jefferson ER, Barton GJ. The contrasting properties of conservation and correlated phylogeny in protein functional residue prediction. BMC Bioinformatics. 2008;9:51. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1186/1471-2105-9-51\u003c/span\u003e\u003cspan address=\"10.1186/1471-2105-9-51\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eShen B, Vihinen M. Conservation and covariance in PH domain sequences: physicochemical profile and Information theoretical analysis of XLA-causing mutations in the Btk PH domain. Protein engineering. Des Selection. 2004;17(3):267\u0026ndash;76. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1093/protein/gzh030\u003c/span\u003e\u003cspan address=\"10.1093/protein/gzh030\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBabar MM, Zaidi NS. 2015. Protein sequence conservation and stable molecular evolution reveals Influenza Virus Nucleoprotein as a universal druggable target. Infection, Genetics and Evolution. 34:200\u0026thinsp;\u0026ndash;\u0026thinsp;10. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.meegid\u003c/span\u003e\u003cspan address=\"10.1016/j.meegid\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. 2015.06.030.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGoldman M, Thorne JL, Jones DT. Assessing the impact of secondary structure and solvent accessibility on protein evolution. Genetics. 1998;149(1):445\u0026ndash;58. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1093/genetics/149.1.445\u003c/span\u003e\u003cspan address=\"10.1093/genetics/149.1.445\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eChoi SC, Hobolth A, Douglas DM, Robinson M, Kishino H, Thorne JL. 
Quantifying the Impact of Protein Tertiary Structure on Molecular Evolution. Mol Biol Evol. 2007;24(8):1769\u0026ndash;82. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1093/molbev/msm097\u003c/span\u003e\u003cspan address=\"10.1093/molbev/msm097\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWorth CL, Gong S, Blundell TL. Structural and functional constraints in the evolution of protein families. Nat Rev Mol Cell Biol. 2009;10(10):709\u0026ndash;20. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1038/nrm2762\u003c/span\u003e\u003cspan address=\"10.1038/nrm2762\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eEchave J, Wilke CO. Biophysical models of protein evolution: Understanding the patterns of evolutionary sequence divergence. Annu Rev Biophys. 2017;46:85\u0026ndash;103. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1146/annurev-biophys-070816-033819\u003c/span\u003e\u003cspan address=\"10.1146/annurev-biophys-070816-033819\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eEchave J, Spielman SJ, Wilke CO. Causes of evolutionary rate variation among protein sites. Nat Rev Genet. 2016;17(2):109\u0026ndash;21. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1038/nrg.2015.18\u003c/span\u003e\u003cspan address=\"10.1038/nrg.2015.18\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLin YS, Hsu WL, Hwang JK, Li WS. Proportion of Solvent-Exposed Amino Acids in a Protein and Rate of Protein Evolution. Mol Biol Evol. 2007;24(4):1005\u0026ndash;11. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1093/molbev/msm019\u003c/span\u003e\u003cspan address=\"10.1093/molbev/msm019\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eFranzosa EA, Xia Y. Structural Determinants of Protein Evolution Are Context-Sensitive at the Residue Level. Mol Biol Evol. 2009;26(10):2387\u0026ndash;95. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1093/molbev/msp146\u003c/span\u003e\u003cspan address=\"10.1093/molbev/msp146\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eShahmoradi A, Sydykova DK, Spielman SJ, Jackson EL, Dawson ET, Meyer AG, Wilke CO. Predicting Evolutionary Site Variability from Structure in Viral Proteins: Buriedness, Packing, Flexibility, and Design. J Mol Evol. 2014;79:130\u0026ndash;42. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/s00239-014-9644-x\u003c/span\u003e\u003cspan address=\"10.1007/s00239-014-9644-x\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDouglas M, Fowler S, Fields. Deep mutational scanning: a new style of protein science. Nat Methods. 2014;11(8):801\u0026ndash;7. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1038/nmeth.3027\u003c/span\u003e\u003cspan address=\"10.1038/nmeth.3027\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWellner A, Gurevich MR, Tawfik DS. Mechanisms of Protein Sequence Divergence and Incompatibility. PLoS Genet. 2013;9(7):e1003665. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1371/journal.pgen.1003665\u003c/span\u003e\u003cspan address=\"10.1371/journal.pgen.1003665\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMiller S, Janin J, Lesk AM, Chothia C. Interior and surface of monomeric proteins. J Mol Biol. 1987;196:641\u0026ndash;56. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/0022-\u003c/span\u003e\u003cspan address=\"10.1016/0022-\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eAartjan JW te, Velthuis. Common and unique features of viral RNA-dependent polymerases, Cell Mol Life Sci. 2014;71(22):4403\u0026ndash;4420. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/s00018-014-1695-z\u003c/span\u003e\u003cspan address=\"10.1007/s00018-014-1695-z\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eQin W, Yamashita T, Shirota Y, Lin Y, Wei W, Murakami S. Mutational analysis of the structure and functions of hepatitis C virus RNA-dependent RNA polymerase. Hepatology. 2001;33(3):72\u0026ndash;37. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1053/jhep.2001.22765\u003c/span\u003e\u003cspan address=\"10.1053/jhep.2001.22765\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePal C, Papp B, Martin JL. An integrated view of protein evolution. Nat Rev Genet. 2006;7:337\u0026ndash;48. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1038/nrg1838\u003c/span\u003e\u003cspan address=\"10.1038/nrg1838\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLi X, et al. Emergence of SARS-CoV-2 through recombination and strong purifying selection. Sci Adv. 2020;6(27):eabb9153. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1126/sciadv.abb9153\u003c/span\u003e\u003cspan address=\"10.1126/sciadv.abb9153\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. Print 2020 Jul.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eTaverna DM, Goldstein RA. Why are proteins so robust to site mutations? J Mol Biol. 2002;315(3):479\u0026ndash;84.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGuo HH, Choe J, Loeb LA. Protein tolerance to random amino acid change. PNAS. 2004;101(25):9205\u0026ndash;10. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1073/pnas.0403255101\u003c/span\u003e\u003cspan address=\"10.1073/pnas.0403255101\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKisters-Woike B, Vangierdegom C, M\u0026uuml;ller-Hill B. On the conservation of protein sequences in evolution. Trends Biochem Sci. 2000;25(9):419\u0026ndash;21. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/s0968-0004(01)01877-1\u003c/span\u003e\u003cspan address=\"10.1016/s0968-0004(01)01877-1\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDePristo MA, Weinreich DM, Hartl DL. Missense meanderings in sequence space: a biophysical view of protein evolution. Nat Rev Genet. 2005;6(9):678\u0026ndash;87. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1038/nrg1672\u003c/span\u003e\u003cspan address=\"10.1038/nrg1672\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBowie JU, Reidhaar-Olson JF, Lim WA, Sauer RT. Deciphering the message in protein sequences: tolerance to amino acid substitutions. Science. 1990;247(4948):1306\u0026ndash;10. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1126/science.2315699\u003c/span\u003e\u003cspan address=\"10.1126/science.2315699\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eReidhaar-Olson JF, Sauer RT. Combinatorial cassette mutagenesis as a probe of the informational content of protein sequences. Science. 1988;241(4861):53\u0026ndash;7. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1126/science.3388019\u003c/span\u003e\u003cspan address=\"10.1126/science.3388019\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eReidhaar-Olson JF, Sauer RT. Functionally acceptable substitutions in two alpha-helical regions of lambda repressor. Proteins. 1990;7(4):306\u0026ndash;16.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eOverington J. Tertiary Structural Constraints on Protein Evolutionary Diversity: Templates, Key Residues and Structure Prediction. Proc R Soc B. 1990;241(1301):132\u0026ndash;45.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMirny L, Shakhnovich EI. Universally Conserved Positions in Protein Folds: Reading Evolutionary Signals about Stability, Folding Kinetics and Function. J Mol Biol. 1999;291:177\u0026ndash;96. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1006/jmbi.1999.2911\u003c/span\u003e\u003cspan address=\"10.1006/jmbi.1999.2911\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLin JJ, Bhattacharjee MJ, Yu CP, Tseng YY, Li WH. Many human RNA viruses show extraordinarily stringent selective constraints on protein evolution. PNAS. 2019;116(38):19009\u0026ndash;18. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1073/pnas.1907626116\u003c/span\u003e\u003cspan address=\"10.1073/pnas.1907626116\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHolmes EC. The Evolutionary Genetics of Emerging Viruses. Annu Rev Ecol Evol Syst. 2009;40:353\u0026ndash;72. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1146/annurev.ecolsys.110308.120248\u003c/span\u003e\u003cspan address=\"10.1146/annurev.ecolsys.110308.120248\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWoo J, Robertson DL, Lovell SC. Constraints on HIV-1 diversity from protein structure. J Virol. 2010;84(24):12995\u0026ndash;3003. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1128/JVI.00702-10\u003c/span\u003e\u003cspan address=\"10.1128/JVI.00702-10\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSaitou N, Nei M. Polymorphism and evolution of influenza A virus genes. Mol Biol Evol. 1986;3(1):57\u0026ndash;74. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1093/oxfordjournals.molbev.a040381\u003c/span\u003e\u003cspan address=\"10.1093/oxfordjournals.molbev.a040381\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eYoshiyuki, Suzuki, Natural Selection on the Influenza Virus Genome. Mol Biol Evol. 2006;23(10):1902\u0026ndash;11. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1093/molbev/msl050\u003c/span\u003e\u003cspan address=\"10.1093/molbev/msl050\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. Epub 2006 Jul 3.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eYeh S-W, Huang T-T. Jen-Wei Liu,et.al, Local packing density is the main structural determinant of the rate of protein sequence evolution at site level. 2014. Biomed Res Int. 2014:572409. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1155/2014/572409\u003c/span\u003e\u003cspan address=\"10.1155/2014/572409\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBajaj M, Blundell T. Evolution and the Tertiary Structure of Proteins, Ann.Rev. Biophys. Bioeng.1984.13:453\u0026thinsp;\u0026ndash;\u0026thinsp;92, doi:0.1146/annurev.bb.13. 060184.002321.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eOverington J. Tertiary Structural Constraints on Protein Evolutionary Diversity: Templates, Key Residues and Structure Prediction. Proc. R. Soc. B Vol.241, Num.1301,pp132-145,1990.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSIMMONDS P, TUPLIN A and, EVANS, DJ. 
Detection of genome-scale ordered RNA structure (GORS) in genomes of positive-stranded RNA viruses: Implications for virus evolution and host persistence, RNA (2004), 10:1337\u0026ndash;51.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eVanInsberghe D, McBride DS, DaSilva J, Stark TJ, Lau MSY, Shepard SS, et al. Genetic drift and purifying selection shape within host influenza A virus populations during natural swine infections. PLoS Pathog. 2024;20(4):e1012131. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1371/journal.ppat.1012131\u003c/span\u003e\u003cspan address=\"10.1371/journal.ppat.1012131\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBabar MM. Najam-us-Sahar Sadaf Zaidi, Protein sequence conservation and stable molecular evolution reveals influenza virus nucleoprotein as a universal druggable target, Infect Genet Evol. 2015 Aug:34:200\u0026thinsp;\u0026ndash;\u0026thinsp;10.doi: 10.1016/j.meegid.2015.06.030. Epub 2015 Jun 30.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBabar MM, Zaidi N-us-S, Tahir S, Muhammad. Global geno-proteomic analysis reveals cross-continental sequence conservation and druggable sites among influenza virus polymerases. Antiviral Res. 2014;112:120\u0026ndash;31. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.antiviral. 2014.10.013\u003c/span\u003e\u003cspan address=\"10.1016/j.antiviral. 2014.10.013\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eAartjan JW, Te Velthuis NC, Robb, Achillefs N, Kapanidis. Ervin Fodor,The role of the priming loop in \u003cem\u003eInfluenza A virus\u003c/em\u003e RNA synthesis. Nat Microbiol. 2016;1(5):16029. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1038/nmicrobiol.2016.29\u003c/span\u003e\u003cspan address=\"10.1038/nmicrobiol.2016.29\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. Epub 2016 Mar 21.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eAustin L, Hughes. Near-Neutrality: the Leading Edge of the Neutral Theory of Molecular Evolution. Ann N Y Acad Sci. 2008;1133:162\u0026ndash;79. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1196/annals.1438.001\u003c/span\u003e\u003cspan address=\"10.1196/annals.1438.001\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMeyer AG, Wilke CO. The utility of protein structure as a predictor of site-wise dN/dS varies widely among HIV-1 proteins. J R Soc Interface. 2015;12:2015057920150579. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://doi.org/10.1098/rsif.2015.0579\u003c/span\u003e\u003cspan address=\"10.1098/rsif.2015.0579\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKoonin EV. Towards a postmodern synthesis of evolutionary biology. Cell Cycle. 2009;8(6):799\u0026ndash;800. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.4161/cc.8.6.8187\u003c/span\u003e\u003cspan address=\"10.4161/cc.8.6.8187\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRoger Frutos J, SerraCobo T, Chen CA, Devaux. COVID-19: Time to exonerate the pangolin from the transmission of SARSCoV2 to humans. Infect Genet Evol 2020 Oct:84:10449310.1016/j.meegid.2020.104493.Epub 2020 Aug 5.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWei Ji WW, Zhao X, Li JZ. 
Cross-species transmission of the newly identified coronavirus 2019‐nCoV, J Med Virol. 2020; 92:433\u0026ndash;440., \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1002/jmv.25682\u003c/span\u003e\u003cspan address=\"10.1002/jmv.25682\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSpecial Issue. 2019 Novel Coronavirus Origin, Evolution, Disease, Biology and Epidemiology: Part-I, 2020, Jour Med Virol 92, (4), 433440.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eXuhua Xia, DAMBE6: New Tools for Microbial Genomics, Phylogenetics, and, Evolution M. J Hered. 2017;108(4):431437. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1093/jhered/esx033\u003c/span\u003e\u003cspan address=\"10.1093/jhered/esx033\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSimmonds P, Tuplin A, Evans DJ. Detection of genomescale ordered RNA structure (GORS) in genomes of positive stranded RNA viruses: implications for virus evolution and host persistence. RNA. 2004;10:1337\u0026ndash;51. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1261/rna.7640104\u003c/span\u003e\u003cspan address=\"10.1261/rna.7640104\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSimmonds P, Cuypers L, Irving WL, McLauchlan J, Cooke GS, Barnes E, Consortium STOPHCV, Ansari MA. Impact of virus subtype and host IFNL4 genotype on largescale RNA structure formation in the genome of hepatitis C virus. bioRxiv. 2020. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1101/2020.06.16.155150\u003c/span\u003e\u003cspan address=\"10.1101/2020.06.16.155150\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSimmonds P. Pervasive RNA Secondary Structure in the Genomes of SARS CoV-2 and Other Coronaviruses mBio. 2020;11(6):e0166120. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1128/mBio.0166120\u003c/span\u003e\u003cspan address=\"10.1128/mBio.0166120\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLukasz Jaroszewski1\u0026sect;. Mallika Iyer2\u0026sect;, Arghavan Alisoltani1\u0026sect;, Mayya Sedova1 and Adam Godzik1, The interplay of SARS CoV-2 evolution and constraints imposed by the structure and functionality of its proteins BioRxiv Preprint /\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003edoi.org/10.1101/2020.08.10.244756\u003c/span\u003e\u003cspan address=\"10.1101/2020.08.10.244756\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eZhang et al. Microbiome Virome landscape of wild rodents and shrews in Central China Microbiome (2025) 1363 \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1186/s40168025020590\u003c/span\u003e\u003cspan address=\"10.1186/s40168025020590\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eEvan P, Williams, Briana M, SpruillHarrell, Mariah K Taylor Common Themes in Zoonotic Spillover and Disease Emergence: Lessons Learned from Bat and Rodent Borne, Viruses RNA et al. 
Viruses.2021;13(8):1509.\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3390/v13081509\u003c/span\u003e\u003cspan address=\"10.3390/v13081509\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eFreuling CM, Breithaupt A, M\u0026uuml;ller T, Sehl J, BalkemaBuschmann A, et al. Susceptibility of raccoon dogs for experimental SARS CoV-2 infection. Emerg Infect Dis. 2020;26:2982\u0026ndash;85.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eJon, Cohen. New clues to the pandemic's origin surface, causing uproar. Science. 2023;379(6638):11751176. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1126/science.adh9055\u003c/span\u003e\u003cspan address=\"10.1126/science.adh9055\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. Epub 2023 Mar 23.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eEdward C, Holmes. The Emergence and Evolution of SARS CoV-2. Annu Rev Virol. 2024;11(1):2142. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1146/annurevvirology093022013037\u003c/span\u003e\u003cspan address=\"10.1146/annurevvirology093022013037\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. Epub 2024 Aug 30.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eAlexander CritsChristoph, Joshua I, Levy JE, Pekar. Genetic tracing of market wildlife and viruses at the epicenter of the COVID-19 pandemic, Cell.202419;187(19):54685482e1110.1016/j.cell.2024.08.010\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePeter V, Markov M, Ghafari M, Beer K, Lythgoe P, Simmonds NI, Stilianakis A, Katzourakis. The evolution of SARS CoV-2. Nat Rev Microbiol. 2023;21(6):361379. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1038/s41579023008782\u003c/span\u003e\u003cspan address=\"10.1038/s41579023008782\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. Epub 2023 Apr 5.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eXinyuan Cui1,20, Liang KFX. et. al.,Virus diversity, wildlifedomestic animal circulation and potential zoonotic viruses of small mammals, pangolins and zoo animals. Nat Communications|. 2023;142488. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41467023382024\u003c/span\u003e\u003cspan address=\"10.1038/s41467023382024\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMario A, Pe\u0026ntilde;aHern\u0026aacute;ndez. Mia Madel Alfajaro, et. al. SARS CoV-2 related bat viruses evade human intrinsic immunity but lack efficient transmission capacity. Nat Microbiol. 2024;9:2038\u0026ndash;50.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKristian G, Andersen A, Rambaut W, Ian Lipkin EC, Holmes, Robert F, Garry. The proximal origin of SARSCoV2. Nat Med. 2020;26(4):450452. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1038/s4159102008209\u003c/span\u003e\u003cspan address=\"10.1038/s4159102008209\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eZhixin Liu X, Wei XX, Li J, Yang J, Tan H, Zhu J, Zhang Q, Wu J, Liu L. Composition and divergence of coronavirus spike proteins and host ACE2 receptors predict potential intermediate hosts of SARS CoV-2, J Med Virol.2020;92(6):595601.doi:0.1002/jmv.25726. Epub 2020 Mar 11.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGobena Ameni A, Zewude B, Tulu etal. 
A Narrative Review on the Pandemic Zoonotic RNA Virus Infections Occurred During the Last 25 Years. J Epidemiol Glob Health. 2024;14(4):1397\u0026ndash;412. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/s44197024003047\u003c/span\u003e\u003cspan address=\"10.1007/s44197024003047\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRachele Cagliani D, Forni M, Clerici M, Sironi. Coding potential and sequence conservation of SARS CoV-2 and related animal viruses. Infect Genet Evol. 2020;83:104353. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.meegid.2020.104353\u003c/span\u003e\u003cspan address=\"10.1016/j.meegid.2020.104353\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHayes KH, Luk X, Li J, Fung, Susanna KP, Lau PCY, Woo. Molecular epidemiology, evolution and phylogeny of SARS coronavirus. Infect Genet Evol 2019 Jul:71:2130.\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.meegid.2019.03.001\u003c/span\u003e\u003cspan address=\"10.1016/j.meegid.2019.03.001\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. Epub 2019 Mar 4.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDavid A, Meekins, Natasha N, Gaudreault, Juergen A, Richt. Natural and Experimental SARS CoV-2 Infection in Domestic and Wild Animals. Viruses. 2021;13(10):1993. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3390/v13101993\u003c/span\u003e\u003cspan address=\"10.3390/v13101993\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBustamante CD, Townsend JP, Hartl DL. 
Solvent Accessibility and Purifying Selection Within Proteins of \u003cem\u003eEscherichia coli\u003c/em\u003e and \u003cem\u003eSalmonella enteric\u003c/em\u003e. Mol Biol Evol. 2000;17(2):301\u0026ndash;8. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1093/oxfordjournals.molbev.a026310\u003c/span\u003e\u003cspan address=\"10.1093/oxfordjournals.molbev.a026310\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":true,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"SARS CoV-2, Evolution Mechanism, Solvent Accessibility, Origin, Search","lastPublishedDoi":"10.21203/rs.3.rs-6999350/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-6999350/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eOver the last few decades, novel viruses that present severe health risks worldwide. The SARS CoV-2 virus, formerly known as a novel coronavirus, broke out in Wuhan (China) and caused major morbidity and mortality globally. 
Confirmation of intermediate hosts is essential to prevent further spread of the epidemic. The emergence of COVID-19 has prompted many studies aimed at understanding the evolution of the virus and identifying the intermediate animal host potentially involved in the transmission of SARS-CoV-2 to humans. This study compares the RNA sequences encoding the RNA-dependent RNA polymerase (RDRP) within and between SARS-CoV-2, SARS-CoV, bat SARS-like CoV, and other coronaviruses; these comparisons support evolutionary analysis of the evolution mechanism and help identify possible virus reservoirs and the origin of COVID-19.\u003c/p\u003e","manuscriptTitle":"The Evolution Mechanism with Protein Structure and Function and the Origin of SARS CoV-2","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-07-21 08:37:33","doi":"10.21203/rs.3.rs-6999350/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"8c88c78e-5328-4264-a156-111f920dc525","owner":[],"postedDate":"July 21st, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2025-07-21T08:37:33+00:00","versionOfRecord":[],"versionCreatedAt":"2025-07-21 
08:37:33","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-6999350","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-6999350","identity":"rs-6999350","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}

