Recombinant SARS-CoV-2 Delta/Omicron BA.5 emerging in an immunocompromised long-term infected COVID-19 patient

doi:10.21203/rs.3.rs-3787764/v1

2024 · doi:10.21203/rs.3.rs-3787764/v1

preprint OA: closed

Research Square, Research Article

Ignacio Garcia, Jon Bråte, Even Fossum, Andreas Rohringer, Line V Moen, and 4 more

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-3787764/v1

This work is licensed under a CC BY 4.0 License. Status: Posted. Version 1.

Abstract

Background

The emergence of the SARS-CoV-2 virus led to a global pandemic, prompting extensive research efforts to understand its molecular biology, transmission dynamics, and pathogenesis. Recombination events have been increasingly recognized as a significant contributor to the virus's diversity and evolution, potentially leading to the emergence of novel strains with altered biological properties.
Indeed, recombinant lineages such as the XBB variant and its descendants have subsequently dominated globally. Therefore, continued surveillance and monitoring of viral genome diversity is crucial to identify and understand the emergence and spread of novel strains.

Methods

The case was discovered through routine genomic surveillance of SARS-CoV-2 cases in Norway. Samples were whole-genome sequenced on the Illumina NovaSeq platform, and SARS-CoV-2 lineage assignment was performed using Pangolin and Nextclade. Mutations were classified based on their frequency in the AY.98.1 and BA.5 lineages.

Results

In this study, we report and investigate a SARS-CoV-2 recombination event in a long-term infected immunocompromised COVID-19 patient. Several recombination events between two distinct lineages of the virus, namely AY.98.1 and BA.5, were identified, resulting in a single novel recombinant viral strain with a unique genetic signature.

Conclusions

The presence of several concomitant recombinants in the patient suggests that these events occur frequently in vivo and can provide insight into the fitness associated with the different combinations of mutations. This study underscores the importance of continued tracking of viral diversity and the potential impact of recombination events on the evolution of the SARS-CoV-2 virus.

Trial registration

Retrospectively registered

Keywords: SARS-CoV-2 recombinant immunocompromised in-patient recombination event Delta Omicron

Background

The emergence of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) caused a global pandemic that affected millions of people worldwide, with significant impacts on public health (World Health Organization, 2023), the economy (World Development Report, 2022), and social welfare (OECD, 2021).
The high rate of transmission (Meyerowitz et al., 2021 ) and the ability of the virus to cause severe respiratory illness (Zhou et al., 2020 ) have prompted extensive research efforts to understand its biology and evolution. One crucial aspect of this research is investigating the genetic variability of the virus and the potential for recombination events that can lead to the emergence of new viral strains with altered virulence and transmission characteristics. Recombination occurs when two or more different viral strains infect the same host cell, allowing for the exchange of genetic material between the viruses. In coronaviruses, these events are primarily driven by the RNA-dependent RNA polymerase, which can switch between viral templates during genome replication (Bentley & Evans, 2018 ). This can result in the formation of chimeric viruses that contain genetic material from two or more viral strains (Focosi & Maggi, 2022 ). The emergence of recombinant viruses has been increasingly recognized as a significant contributor to SARS-CoV-2 diversity and evolution. While recombinant SARS-CoV-2 viruses were observed during the first years of the COVID-19 pandemic, these variants did not circulate widely in the population (Burel et al., 2022 ; Sekizuka et al., 2022 ). However, as the number of infections rose with the spread of the Omicron variant, there was also an increase in observed recombinant strains, including the emergence of the XBB lineage, a recombination between the two BA.2.75 subvariants, BJ.1.1 and BM.1.1.1 (Parums, 2023 ), which initially resulted in extensive transmission in Singapore, India, and elsewhere in the fall of 2022 (World Health Organization, 2022). By the spring of 2023, subvariants of XBB had become dominant globally, demonstrating how recombination events can contribute to viral fitness and transmissibility. 
Chronic SARS-CoV-2 infections in immunocompromised individuals are known to accelerate viral mutagenesis, and significant mutations within the spike protein have been observed in these patients (Harari et al., 2022; Li et al., 2022). Moreover, the prolonged persistence of the infection in these patients provides a favourable time window for recombination to occur if the patient is exposed to other variants. Indeed, several recombinants have been identified in immunocompromised patients (Burel et al., 2022; Zannoli et al., 2023).

In this study, we report the identification of a recombinant SARS-CoV-2 virus in a long-term infected COVID-19 patient. During our surveillance of SARS-CoV-2, we identified the emergence of a recombinant strain between two distinct lineages, AY.98.1 and BA.5, resulting in a novel viral strain. We gathered and characterized additional samples from the same patient before and after the recombination event. Deep sequencing of all the samples suggests that recombination events occur frequently in vivo, providing further evidence of the need for continued surveillance and monitoring of viral diversity in immunocompromised patients. Our findings underscore the importance of understanding the molecular mechanisms of recombination and the potential impacts of recombination events on the evolution and emergence of novel strains of SARS-CoV-2.

Methods

Sample Extraction and Sequencing. All the samples were extracted and processed using the Swift Amplicon SARS-CoV-2 Panel (Swift Biosciences). The samples were sequenced on an Illumina NovaSeq platform at the Norwegian Sequencing Centre (NSC) NorSeq.

Generation of SARS-CoV-2 consensus sequences. SARS-CoV-2 consensus sequences were generated using the “Covid-seq” pipeline developed by the NSC (https://github.com/nsc-norway/covid-seq). Briefly, PCR primers used during library preparation were removed using NSCtrim (https://github.com/nsc-norway/NSCtrim).
Then, sequencing adapters, poorly called nucleotides and overall low-quality reads were removed using fastp (Chen et al., 2018). Next, the quality-trimmed reads were mapped to the Wuhan-Hu-1 reference genome (NC_045512.2) using Bowtie2 (Langmead & Salzberg, 2012). Consensus sequences were generated from the resulting mapping files using samtools mpileup (Danecek et al., 2021) and iVar (Grubaugh et al., 2019) with a minimum depth threshold of 10 for calling a nucleotide.

Noise calculation. We define the noise at a genome position as the sum of the frequencies of all nucleotides observed at that position minus the frequency of the most frequent nucleotide (i.e., the one called in the consensus sequence); equivalently, it is the fraction of reads that do not support the consensus base. To calculate the noise of the samples we developed a tool called NoisExtractor (https://github.com/garcia-nacho/NoisExtractor). NoisExtractor takes indexed BAM files as input and, for each position of the genome, outputs the noise, the depth, the nucleotide with the highest frequency, the nucleotide with the second highest frequency, and their respective frequencies.

Identification of coinfections/contaminations. As part of the sequencing routines at the Norwegian Institute of Public Health (NIPH), quality control is performed for each sample. In this analysis, low-quality samples and individual samples containing more than one virus are flagged, as the latter could indicate a contaminated sample or a coinfection at the patient level. To do this analysis, we developed a machine learning model. This model is based on linear regression in which noise-related parameters (e.g., mean and standard deviation of noise across the genome, binned number of positions with noise, etc.) and depth- and coverage-related parameters (e.g., binned number of missing positions, average depth, etc.) were used to classify a sample as low-quality, high-quality or contaminant.
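As a rough illustration of the per-position noise defined in the Noise calculation paragraph, the following Python sketch summarizes a single genome position from its per-base read counts. It is not the NoisExtractor code: the names `PositionSummary` and `summarize_position` are hypothetical, and the input is assumed to be a simple mapping from base to read count rather than an indexed BAM file.

```python
from typing import Dict, NamedTuple, Optional


class PositionSummary(NamedTuple):
    """Per-position values analogous to those the text lists (hypothetical layout)."""
    noise: float
    depth: int
    major_base: str
    major_freq: float
    minor_base: Optional[str]
    minor_freq: float


def summarize_position(base_counts: Dict[str, int]) -> Optional[PositionSummary]:
    """Summarize one genome position from its per-base read counts."""
    # Ignore bases that were never observed at this position.
    observed = {base: n for base, n in base_counts.items() if n > 0}
    depth = sum(observed.values())
    if depth == 0:
        return None  # no coverage, so noise is undefined

    # Rank bases by count (descending); ties are broken alphabetically
    # so that the result is deterministic.
    ranked = sorted(observed.items(), key=lambda item: (-item[1], item[0]))
    major_base, major_count = ranked[0]
    if len(ranked) > 1:
        minor_base, minor_count = ranked[1]
    else:
        minor_base, minor_count = None, 0

    # Noise: summed frequency of every base except the most frequent one.
    noise = sum(count for _, count in ranked[1:]) / depth

    return PositionSummary(
        noise=noise,
        depth=depth,
        major_base=major_base,
        major_freq=major_count / depth,
        minor_base=minor_base,
        minor_freq=minor_count / depth,
    )


if __name__ == "__main__":
    # Illustrative counts only, not data from the study.
    print(summarize_position({"A": 90, "G": 8, "T": 2, "C": 0}))
```

With the illustrative counts above, 10 of 100 reads disagree with the most frequent base, so the noise at that position is 0.10.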
To train the classification model, we used a subset of 1846 manually curated samples that were assigned to four classes: high-quality-high-contamination, high-quality-low-contamination, high-quality-no-contamination and low-quality. The code to perform the quality control and the trained model are available here: https://github.com/folkehelseinstituttet/FHI_SC2_Pipeline_Illumina.

Extraction of sequences for the major and minor variants. Once a possible contamination or coinfection was identified, the sequence of the major variant (most abundant variant) was generated by concatenating the nucleotides with the highest frequency at each position of the genome. To generate the sequence of the minor variant (second most abundant variant), we started from the major sequence and, at every position where the noise was higher than 0.1, replaced the most frequent nucleotide (major) with the second most frequent one (minor). We implemented the extraction of sequences by parsing the output of NoisExtractor in R.

Identification of recombinant sequences. To identify recombinant sequences, we developed PrecFinder (https://github.com/garcia-nacho/Precfinder). For each single mutation in a sequence, PrecFinder calculates the Bayes’ probability of the virus belonging to a particular Pangolin lineage based on the distribution of mutations in different lineages. As the ratios of the different mutations in the virus continue to evolve, the probability is calculated based on a database of sequences which is regularly updated. To find which sequences are recombinants, PrecFinder uses a 1D-convolutional neural network model. The model consists of three sets of a 1D-convolutional neural network layer (1D-CNN) followed by a 1D-MaxPooling layer. The three 1D-CNN layers have 64, 32 and 12 filters and kernel sizes of 5, 3 and 3 nucleotides respectively. The pool sizes of all the 1D-MaxPooling layers were set to 2.
Then, two feed-forward layers with 24 and 12 units respectively were included. Finally, a softmax classification layer outputs the score used to classify the sample. The input of the model consists of a Bayes’ probability matrix of n by m dimensions, where n is the number of unique lineages present in the database and m is the maximum number of mutations present in at least one sequence of the database. To train the model we used binary cross-entropy as the loss function, Adagrad as the optimizer and a batch size of 128. The training of the model was scheduled for 60 epochs, but it was stopped early if there was no improvement in accuracy after eight epochs. The weights of the model with the highest accuracy were saved. Moreover, F1, precision and recall were calculated. As a training set, we used the sequences present in the database together with a synthetic set of recombinant sequences generated from them. Sequences assigned to different lineages were recombined in silico through one, two or three breakpoints randomly selected in the genome. Moreover, we augmented the dataset by reordering the n rows of the training matrices. The model was implemented and trained using Keras (Chollet et al., 2015) and TensorFlow v2.8 (Abadi et al., 2015) in R. Recombinant sequences were also identified using the program sc2rf (https://github.com/lenaschimmel/sc2rf).

Lineage assignments. SARS-CoV-2 lineage assignment was performed using Pangolin (O’Toole et al., 2021) and Nextclade (Aksamentov et al., 2021), and the mutations at nucleotide and amino acid levels were identified using Nextclade (Aksamentov et al., 2021). To identify the AY.98.1- and BA.5-specific mutations, 2000 AY.98.1 and BA.5 sequences were downloaded from NCBI GenBank using covSampler (Cheng et al., 2022).
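The in-silico recombination used to build the synthetic training set (one, two or three randomly selected breakpoints between sequences of different lineages) can be sketched in Python as below. This is a minimal sketch under stated assumptions, not the PrecFinder implementation: the function name `recombine_in_silico` is hypothetical, the two parents are assumed to be aligned to the same coordinates and length, and the recombinant is assumed to alternate between parents at each breakpoint, starting with the first parent.

```python
import random
from typing import List, Tuple


def recombine_in_silico(
    parent_a: str,
    parent_b: str,
    n_breakpoints: int,
    rng: random.Random,
) -> Tuple[str, List[int]]:
    """Build a chimeric sequence that switches parent at random breakpoints."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must be aligned to the same length")
    if not 1 <= n_breakpoints <= len(parent_a) - 1:
        raise ValueError("n_breakpoints must be between 1 and len(parent) - 1")

    # Draw distinct cut positions strictly inside the sequence so that
    # every segment contains at least one nucleotide.
    breakpoints = sorted(rng.sample(range(1, len(parent_a)), n_breakpoints))

    parents = (parent_a, parent_b)
    segments = []
    start = 0
    for index, end in enumerate(breakpoints + [len(parent_a)]):
        # Even-numbered segments come from parent_a, odd-numbered from parent_b.
        segments.append(parents[index % 2][start:end])
        start = end
    return "".join(segments), breakpoints


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the demo repeatable
    # Toy "genomes": real inputs would be aligned SARS-CoV-2 sequences.
    delta_like = "A" * 30
    omicron_like = "C" * 30
    n_cuts = rng.choice([1, 2, 3])  # one, two or three breakpoints, as in the text
    chimera, cuts = recombine_in_silico(delta_like, omicron_like, n_cuts, rng)
    print(cuts, chimera)
```

Passing an explicit `random.Random` instance keeps the generated set reproducible, which is convenient when the same synthetic recombinants need to be regenerated for retraining.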
Sequences of low quality and/or with a wrong lineage assignment according to Nextclade were removed, and the mutations present in the remaining sequences were extracted using Nextclade (Aksamentov et al., 2021). Based on the frequency of the mutations present in the AY.98.1 and BA.5 lineages, the mutations present in our sequences were classified as AY.98.1-specific, BA.5-specific or other, where other means that the mutation is found in neither lineage or in both. All plots to visualize Pangolin lineages were generated in R (R Core Team, 2022) using the library ggplot2 (Wickham, 2016).

Cultivation of recombinant virus. Vero E6/TMPRSS2 cells (NIBSC #100978) were cultivated in complete Dulbecco’s Modified Eagle Medium (cDMEM) supplemented with 10% fetal bovine serum (FBS) and 1 mg/ml G418. In a biosafety level 3 (BSL3) laboratory, clinical samples collected from the patient were added to the cells at approx. 60% confluency in a T-25 flask for 1 h at 37°C. The inoculum was subsequently removed and replaced with fresh viral culture medium (DMEM supplemented with 2% FBS, 100 units/ml penicillin, 100 µg/ml streptomycin and 25 mM HEPES). The infected cells were incubated for 3–4 days at 37°C, and the supernatant was then diluted 1:1000 and passaged onto fresh cells for a second passage. After 3–4 more days the second passage of virus was harvested. Both the first and the second passage of the virus were sequenced.

Fitness estimation. To estimate the fitness of the different virus strains, we identified the substitutions that they carried at the amino acid level using Nextclade. Then, we connected those mutations with the fitness effects estimated by Bloom and Neher (2023). The fitness of each variant was computed as the sum of the fitness effects of the individual mutations present in the sample. If a sequence contained mutations absent from the fitness database, no fitness was assigned to those mutations.
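The additive fitness estimate described above can be sketched as follows. This is a hedged illustration rather than the authors' code: the function name `variant_fitness` is hypothetical, the lookup table is assumed to be a plain mapping from amino acid mutation to fitness effect, and the numeric values in the example are placeholders, not estimates from Bloom and Neher (2023).

```python
from typing import Dict, Iterable, List, Tuple


def variant_fitness(
    mutations: Iterable[str],
    fitness_table: Dict[str, float],
) -> Tuple[float, List[str]]:
    """Sum per-mutation fitness effects; report mutations without an estimate."""
    total = 0.0
    unscored = []
    for mutation in mutations:
        if mutation in fitness_table:
            total += fitness_table[mutation]
        else:
            # Mutations absent from the table contribute nothing to the sum,
            # mirroring the rule stated in the text.
            unscored.append(mutation)
    return total, unscored


if __name__ == "__main__":
    # Placeholder effects for illustration only (NOT published estimates).
    example_table = {"S:H49Y": 0.5, "N:P326L": -0.25}
    example_variant = ["S:H49Y", "N:P326L", "ORF1ab:V86F"]
    print(variant_fitness(example_variant, example_table))
```

Returning the list of unscored mutations alongside the sum makes it visible how much of a variant's genotype is not covered by the fitness table, which matters because the estimate ignores deletions, insertions and any interaction between mutations.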
Results

Identification of a co-infection. As part of our SARS-CoV-2 surveillance at the Norwegian Institute of Public Health (NIPH), the purity and consistency of the sequence data are monitored. Through this quality control, we identified a sample with high levels of noise, or sequence variation, after the mapping of the reads, which typically indicates either contamination or co-infection (Fig. 1A). To rule out contamination or other sequencing artifacts as causes of the observation, we repeated the entire analysis, from RNA extraction, cDNA generation, PCR amplification and library preparation to sequencing. The re-processed sample showed exactly the same noise pattern (Fig. 1A and 2A, “day 0”). Strikingly, the sequence variation was restricted mainly to the first two-thirds of the genome (Fig. 1A), which warranted further investigation. We attempted to re-create the genomic sequences of a potential major and minor variant in the sample (i.e., co-occurring strains of different abundances; see Methods for details on the generation of the major and minor sequences) and found that the major sequence was classified as a Delta variant (Pangolin lineage: AY.98.1/Nextclade clade: 21J) and the minor sequence as a variant of Omicron (BA.5/22B) (Fig. 1A and Fig. 1B).

Identification of Delta/Omicron recombinant strains. Fine-grained mutation-profile analyses of the two strains (i.e., major and minor variants) showed that both sequences were actually recombinants. The major strain had AY.98.1-specific mutations in the first 15 kb of the genome (flanked by G210T and G15451A) (Fig. 1B). Then, there was an 8 kb region overlapping with the Spike gene that contained only BA.5 mutations (flanked by C17410T and C25584T). The minor strain, in contrast, had only BA.5-specific mutations in the first 15.7 kb of the genome, flanked by the BA.5-specific C44T and C15714T mutations. Then, there was a 3 kb region containing a mixture of Delta and Omicron mutations.
Next, there was a 4 kb region covering the Spike gene that contained BA.5 mutations only (flanked by the C21618T and C25584T BA.5 mutations) (Fig. 1B). The final part of the genome was identical in both strains and carried only AY.98.1-specific mutations (Fig. 1B). This suggests that the sample isolated on day 0 was actually a mixture of at least two recombinant strains, one AY.98.1-BA.5-AY.98.1 recombinant and one BA.5-AY.98.1 recombinant. Two independent recombinant-detection tools, sc2rf and PrecFinder, classified both strains as recombinants (Fig. 1B, Fig. S1A and Fig. S1B).

Virus evolution in an immunocompromised long-term infected COVID-19 patient. Driven by these results, we became interested in tracing the sample's origin, and we found that it had been obtained from a long-term infected COVID-19 patient. Since the patient was already being monitored at the hospital, we could obtain five additional samples, collected between 288 and 10 days before the day 0 sample (Fig. 2A and 2B). All these samples had much lower levels of noise than the day 0 sample, suggesting that the patient was infected by only a single viral strain at these time points. Some degree of noise was observed on day −171 and day −130, but analysis of any major and minor strains in these samples, as well as in all the other samples taken before day 0, showed only AY.98.1 strains without any BA.5-specific mutations (Fig. 2C). This suggests that the recombination events happened sometime in the ten days before day 0. To gain information on the relative fitness of the two recombinants, we decided to follow up the patient to see the viruses competing in vivo. We collected five extra samples from the patient at 22, 69, 70, 93 and 103 days after day 0 (Fig. 2A and Fig. 2B). We found that the new samples had low noise levels, consistent with the presence of just a single lineage (Fig. 2A).
Indeed, when we extracted major and minor variants from these new samples, we found that all of them belonged to a new recombinant lineage. This surviving lineage was the recombinant with an AY.98.1 backbone and a BA.5 Spike. We noticed that the lineage was similar but not identical to the major lineage found in the sample collected on day 0 (Fig. 2C). Looking at the mutation profiles (Fig. 2C) together with the noise across the genome (Fig. 2A), it seems unlikely that the minor lineage found on day 0 was the one that outcompeted the other lineages. We therefore hypothesized two scenarios that could lead to the results observed 22 days after day 0. (i) The two recombinant lineages found on day 0 recombined again, so that the one that became dominant afterwards lost four BA.5-specific mutations (i.e., C17410T, A18163G, C19955T, and A20055G) and acquired three AY.98.1-specific mutations (i.e., C16466T, C19220T, C19524T). (ii) Alternatively, the sample taken on day 0 already contained a mixture of at least three recombinants: a BA.5-AY.98.1 recombinant similar to the minor variant on day 0, an AY.98.1-BA.5-AY.98.1 recombinant similar to the major variant on day 0, and another AY.98.1-BA.5-AY.98.1 recombinant similar to the major variant present in the sample taken 22 days after day 0. A mixture of three such recombinants with approximate ratios of 65%, 10% and 25% respectively would produce a noise pattern like the one observed on day 0 (Fig. S1C versus Fig. 2A, “day 0”). Although it is impossible to distinguish between these scenarios a posteriori, we believe that the second scenario is more plausible. In either case, both scenarios require multiple recombination events, suggesting that recombination between different SARS-CoV-2 lineages may occur frequently during co-infection.

Cultivation of the recombinant lineage. To investigate the ability of the recombinant strain to propagate in vitro, we cultivated the virus extracted from the sample taken 70 days after day 0.
We found that the virus was able to replicate and that after two passages the sequence was the same as that of the original recombinant (Fig. 2C, “day +70 in-vitro”). Interestingly, we found that all the noisy positions present in the original sample were also noisy in the cultured samples (Fig. S1D), suggesting that the sample contained a mix of strains with different mutations at these positions. However, none of them seemed to provide a strong fitness advantage, at least in vitro.

Fitness advantage of the recombinant. We hypothesized that the competition of two similar viruses inside a patient would be the perfect arena to infer which mutations provide fitness advantages in vivo. After excluding the mutations gained or lost because of the recombination on day 0, we found 21 mutations with presence/absence patterns suggesting that they had been gained and/or lost during the evolution of the virus within the patient (Fig. 3). We found 13 mutations that were incorporated into the genome at some point during the infection (Fig. 3B). We found seven mutations that were gained and subsequently lost, and one mutation (S:H49Y) with a pattern suggesting that it might have been gained by the virus at two different time points (day −171 and day 93). Then, we analysed the fitness difference associated with these 21 mutations by linking them to the fitness differences calculated by Bloom and Neher (2023). Surprisingly, we found that seven of the thirteen mutations gained and fixed in the genome are estimated to yield a negative fitness difference; conversely, six of the eight mutations gained and lost had positive fitness differences. Finally, we estimated the fitness of all the variants identified by adding the fitness effects associated with all the mutations present in their genomes. We found that although the virus gained fitness during the infection, the main driver of the fitness increase was the recombination event (Fig. 3C).
Moreover, the presence of a significant number of gained-and-lost mutations (Fig. 3B), together with the differences in the estimated fitness between major and minor variants (Fig. 3C), suggests the presence of different subvariants with different mutations competing within the patient.

Discussion

Homologous recombination in coronaviruses is thought to occur when the RNA-dependent RNA polymerase (RdRp) separates from one RNA template while keeping the nascent RNA and then continues building the strand at the same position using a different template molecule (Focosi & Maggi, 2022). Although coronaviruses have evolved to use recombination as part of their replication processes to produce a pool of recombined RNA molecules, the role of this viral molecular mechanism in generating novel recombinant lineages remains uncertain. To our knowledge, this is the first report of a recombinant SARS-CoV-2 virus between the Omicron BA.5 and Delta AY.98 lineages and the first time that we have witnessed consecutive sequencing snapshots of the competition of several SARS-CoV-2 lineages in one infected individual. Although other studies have found recombinant viruses in sequential samples acquired from long-term infected patients (Burel et al., 2022), this is the first time that we have obtained and analysed samples in which at least two recombinant lineages were competing with each other in vivo. Moreover, we have developed and released a set of tools to detect and analyse this type of event in the future (i.e., PrecFinder, NoisExtractor and the co-infection detection tool). The results of this study reveal the emergence of a recombinant virus with an AY.98.1 backbone and a BA.5 Spike gene, isolated from a long-term infected COVID-19 patient in Norway.
The most likely scenario for this recombinant to arise is that, while at the hospital, the long-term patient infected with an AY.98.1 virus came into contact with another person infected with a BA.5 virus, leading to a coinfection, and that the two lineages recombined shortly afterwards. The recombined strain that eventually became the dominant strain in the patient probably arose within 10 days prior to the first detection of the recombinant lineage. However, our observations suggest that there were actually multiple recombination events within the patient, both between the Omicron and Delta variants and, secondarily, between different recombinants. These results suggest that recombination can occur frequently during coinfection, and they highlight the importance of close monitoring and early detection of such events. Moreover, our findings suggest that several recombinant viruses may have been competing in the patient. The fact that some apparently harmful mutations were retained over time, while beneficial mutations were lost, suggests that the evolution of the virus within the patient might be affected by clonal interference, a phenomenon where beneficial mutations may disappear from a population because of competition between sub-variants carrying the different mutations (Strelkowa & Lässig, 2012). By analysing the fitness associated with each of the observed mutations in the viral population in the long-term infected patient, we found that recombination had a major impact on the fitness increase of the virus. Indeed, the fitness gained due to mutations acquired or lost during the infection seems to be lower than the fitness gained through the incorporation of an Omicron Spike into a Delta backbone via recombination. Recombination in betacoronaviruses may therefore serve as a powerful mechanism to overcome clonal interference and ensure mixing of genetic material between lineages.
Clonal interference is strongest in asexual organisms or when there is strong linkage disequilibrium. Indeed, one hypothesis that might explain the success of RNA viruses is that their high recombination rates help them to overcome the burdens of clonality. However, it is possible that our fitness estimations differ from the actual fitness of the virus, for three reasons. First, the database that we used to estimate the fitness was constructed using epidemiological data gathered from viral databases, and it is possible that the mutations important for the fitness of the virus at the population level differ from the mutations important for its transmission between cells within the patient. This might be especially relevant for patients with a weakened immune system who are unable to clear the infection for months. Second, when we computed the overall fitness of each lineage, we did not account for epistatic relationships between mutations, and it is possible that genetic interactions between mutations become important determinants of the overall fitness of the virus. Third, the fitness associated with deletions of amino acids was not taken into account, since the dataset does not have information about the fitness changes due to sequence deletions or insertions. Therefore, further research is needed to investigate the potential implications of the mutations gained by the virus during its evolution within the patient (i.e., ORF7a:E22D, ORF1ab:V86F, ORF1ab:V4102A, ORF1ab:N1080I, ORF1ab:P1427S, ORF1ab:C1889Y, ORF1ab:S1272G, ORF1ab:Q4100H, ORF1ab:M1156I, ORF1ab:A2909V, N:P326L, ORF1ab:I3619V, ORF1ab:T1538I) in terms of fitness, transmissibility, virulence, and vaccine effectiveness.
Conclusions

The identification of recombinant viruses in a long-term infected COVID-19 patient raises questions about the potential for similar events to occur in other patients and populations, as well as the implications for ongoing efforts to control the spread of the virus. Our findings highlight the importance of continued surveillance and monitoring of SARS-CoV-2 genomes, particularly in high-risk populations such as long-term infected immunocompromised COVID-19 patients, to detect and respond to potential recombination events and other evolutionary changes in the virus. These patients are possibly among the most likely sources of novel recombinant SARS-CoV-2 variants. Overall, our study provides important insights into the genetic diversity and evolution of SARS-CoV-2 and underscores the need for ongoing research and surveillance efforts to better understand and combat this global health threat.

Abbreviations

SARS-CoV-2: Severe Acute Respiratory Syndrome coronavirus-2
ENA: European Nucleotide Archive
NIPH: Norwegian Institute of Public Health

Declarations

Data availability

Sequences in fastq and fasta format are stored in the European Nucleotide Archive (ENA) under the Project ID PRJEB71327.

Acknowledgments

We acknowledge all the hard work carried out in the clinic and its contribution to the infection control monitoring that has led to the discovery of important single cases such as this one. We also acknowledge the contributions of providers of publicly available sequences. We are also very grateful to the whole team of highly skilled technicians involved in whole genome sequencing of the samples at the NIPH, and especially Rasmus Kopperud Riis. We also sincerely thank the Norwegian Sequencing Centre (NSC) NorSeq for partnering during the pandemic to achieve large-volume sequencing capacity.
Ethics Approval

Ethical approval has not been sought for these analyses since the work has been carried out as part of the monitoring of infectious diseases at the national public health institute covered by the national infection control act. The ethics committee/scientific department at the local hospital, Levanger Hospital, has nevertheless been consulted and approval has been given to publish the results in their current form.

Consent for publication

The ethics committee/scientific advice department at the local hospital, Levanger Hospital, has been consulted and approval has been given to publish the results in their current form without consent from the actual patient.

Competing interests

The authors declare that they have no competing interests.

Funding

Not applicable

Author contributions

IG, OH and KB conceptualized the study. OF and KZ provided the samples and the clinical information. IG and JB analysed the data. EF performed the cultivation of the virus in vitro. IG, JB, EF, LM, AR, OH and KB wrote the manuscript. All authors reviewed and edited the manuscript, and approved the final version.

References

Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G. S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jozefowicz, R., Jia, Y., Kaiser, L., Kudlur, M., Levenberg, J., Mané, D., Schuster, M., & others (2015). TensorFlow: Large-scale machine learning on heterogeneous systems. Software available from tensorflow.org.
Aksamentov, I., Roemer, C., Hodcroft, E., & Neher, R. (2021). Nextclade: clade assignment, mutation calling and quality control for viral genomes. Journal of Open Source Software, 6, 3773. https://doi.org/10.21105/joss.03773
Bentley, K., & Evans, D. J. (2018). Mechanisms and consequences of positive-strand RNA virus recombination. Journal of General Virology, 99(10), 1345-1356. https://doi.org/10.1099/jgv.0.001142
Bloom, J. D., & Neher, R. A. (2023).
Fitness effects of mutations to SARS-CoV-2 proteins. Virus Evolution, 9(2), vead055. https://doi.org/10.1093/ve/vead055
Burel, E., Colson, P., Lagier, J.-C., Levasseur, A., Bedotto, M., Lavrard-Meyer, P., Fournier, P.-E., La Scola, B., & Raoult, D. (2022). Sequential Appearance and Isolation of a SARS-CoV-2 Recombinant between Two Major SARS-CoV-2 Variants in a Chronically Infected Immunocompromised Patient. Viruses, 14(6), 1266. https://www.mdpi.com/1999-4915/14/6/1266
Carabelli, A. M., Peacock, T. P., Thorne, L. G., Harvey, W. T., Hughes, J., COVID-19 Genomics UK Consortium, Peacock, S. J., Barclay, W. S., de Silva, T. I., & Towers, G. J. (2023). SARS-CoV-2 variant biology: immune escape, transmission and fitness. Nature Reviews Microbiology, 1-16.
Chen, S., Zhou, Y., Chen, Y., & Gu, J. (2018). fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics, 34(17), i884-i890. https://doi.org/10.1093/bioinformatics/bty560
Cheng, Y., Ji, C., Han, N., Li, J., Xu, L., Chen, Z., Yang, R., Zhou, H.-Y., & Wu, A. (2022). covSampler: A subsampling method with balanced genetic diversity for large-scale SARS-CoV-2 genome data sets. Virus Evolution, 8(2). https://doi.org/10.1093/ve/veac071
Chollet, F., & others. (2015). Keras. https://keras.io
Danecek, P., Bonfield, J. K., Liddle, J., Marshall, J., Ohan, V., Pollard, M. O., Whitwham, A., Keane, T., McCarthy, S. A., Davies, R. M., & Li, H. (2021). Twelve years of SAMtools and BCFtools. GigaScience, 10(2). https://doi.org/10.1093/gigascience/giab008
Focosi, D., & Maggi, F. (2022). Recombination in Coronaviruses, with a Focus on SARS-CoV-2. Viruses, 14(6). https://doi.org/10.3390/v14061239
Grubaugh, N. D., Gangavarapu, K., Quick, J., Matteson, N. L., De Jesus, J. G., Main, B. J., Tan, A. L., Paul, L. M., Brackney, D. E., Grewal, S., Gurfield, N., Van Rompay, K. K. A., Isern, S., Michael, S. F., Coffey, L. L., Loman, N. J., & Andersen, K. G. (2019).
An amplicon-based sequencing framework for accurately measuring intrahost virus diversity using PrimalSeq and iVar. Genome Biology, 20(1), 8. https://doi.org/10.1186/s13059-018-1618-7
Harari, S., Tahor, M., Rutsinsky, N., Meijer, S., Miller, D., Henig, O., Halutz, O., Levytskyi, K., Ben-Ami, R., Adler, A., Paran, R., & Stern, A. (2022). Drivers of adaptive evolution during chronic SARS-CoV-2 infections. Nature Medicine, 28, 1501-1508. https://doi.org/10.1038/s41591-022-01882-4
Langmead, B., & Salzberg, S. L. (2012). Fast gapped-read alignment with Bowtie 2. Nature Methods, 9(4), 357-359. https://doi.org/10.1038/nmeth.1923
Li, P., de Vries, A. C., Kamar, N., Peppelenbosch, M. P., & Pan, Q. (2022). Monitoring and managing SARS-CoV-2 evolution in immunocompromised populations. Lancet Microbe, 3(5), e325-e326. https://doi.org/10.1016/S2666-5247(22)00061-1
Meyerowitz, E. A., Richterman, A., Gandhi, R. T., & Sax, P. E. (2021). Transmission of SARS-CoV-2: A Review of Viral, Host, and Environmental Factors. Ann Intern Med, 174(1), 69-79. https://doi.org/10.7326/m20-5008
O’Toole, Á., Scher, E., Underwood, A., Jackson, B., Hill, V., McCrone, J. T., Colquhoun, R., Ruis, C., Abu-Dahab, K., Taylor, B., Yeats, C., du Plessis, L., Maloney, D., Medd, N., Attwood, S. W., Aanensen, D. M., Holmes, E. C., Pybus, O. G., & Rambaut, A. (2021). Assignment of epidemiological lineages in an emerging pandemic using the pangolin tool. Virus Evolution, 7(2). https://doi.org/10.1093/ve/veab064
OECD. (2021). Risks that matter 2020: The long reach of COVID-19. https://doi.org/10.1787/44932654-en
World Health Organization. (2023). WHO Coronavirus (COVID-19) Dashboard. https://covid19.who.int/
Parums, D. V. (2023). Editorial: The XBB.1.5 ('Kraken') Subvariant of Omicron SARS-CoV-2 and its Rapid Global Spread. Med Sci Monit, 29, e939580. https://doi.org/10.12659/msm.939580
R Core Team. (2022). A language and environment for statistical computing.
R Foundation for Statistical Computing. https://www.R-project.org
Sekizuka, T., Saito, M., Itokawa, K., Sasaki, N., Tanaka, R., Eto, S., Someno, R., Ogamino, A., Yokota, E., Saito, T., & Kuroda, M. (2022). Recombination between SARS-CoV-2 Omicron BA.1 and BA.2 variants identified in a traveller from Nepal at the airport quarantine facility in Japan. Journal of Travel Medicine, 29(6). https://doi.org/10.1093/jtm/taac051
Strelkowa, N., & Lässig, M. (2012). Clonal interference in the evolution of influenza. Genetics, 192(2), 671-682. https://doi.org/10.1534/genetics.112.143396
Wickham, H. (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org
World Development Report 2022: Finance for an Equitable Recovery. (2022). https://doi.org/10.1596/978-1-4648-1730-4
World Health Organization. (2022). COVID-19 weekly epidemiological update, edition 115, 26 October 2022. https://apps.who.int/iris/handle/10665/363853
Zannoli, S., Brandolini, M., Marino, M. M., Denicolò, A., Mancini, A., Taddei, F., Arfilli, V., Manera, M., Gatti, G., Battisti, A., Grumiro, L., Scalcione, A., Dirani, G., & Sambri, V. (2023). SARS-CoV-2 coinfection in immunocompromised host leads to the generation of recombinant strain. International Journal of Infectious Diseases, 131, 65-70. https://doi.org/10.1016/j.ijid.2023.03.014
Zhou, P., Yang, X.-L., Wang, X.-G., Hu, B., Zhang, L., Zhang, W., Si, H.-R., Zhu, Y., Li, B., Huang, C.-L., Chen, H.-D., Chen, J., Luo, Y., Guo, H., Jiang, R.-D., Liu, M.-Q., Chen, Y., Shen, X.-R., Wang, X., . . . Shi, Z.-L. (2020). A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature, 579(7798), 270-273. https://doi.org/10.1038/s41586-020-2012-7

Additional Declarations

No competing interests reported.

Supplementary Files

FigS121112023.pdf

Figure S1. Recombinant viruses confirmation. A. Classification of the major and minor sequences obtained on day 0 according to PrecFinder. B.
Classification of the major and minor sequences obtained on day 0 according to sc2rf. C. Simulated noise of the scenario in which three distinct recombinants with ratios 65%, 25% and 10% were mixed. The parental lineage of the different fragments of the genome was represented with red (BA.5) and blue (AY.98.1) colors. D. Noise ratio of the sample obtained from the patient on day 70 (left) and the extracted virus cultivated for one (middle) or two passages (right). The noise outliers and missing positions were labeled with red and blue dots respectively as described in Fig. 1A. The nucleotide positions for the bases with high noise were also labeled.
{"props":{"pageProps":{"initialData":{"identity":"rs-3787764","acceptedTermsAndConditions":true,"allowDirectSubmit":true,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":263722030,"identity":"4d9b17e7-07ce-4a16-88fd-a623d3ada839","order_by":0,"name":"Ignacio Garcia","email":"","orcid":"","institution":"Norwegian Institute of Public Health","correspondingAuthor":false,"prefix":"","firstName":"Ignacio","middleName":"","lastName":"Garcia","suffix":""},{"id":263722031,"identity":"7fe9d4b3-84e4-4e1e-8218-bdd7cc18c251","order_by":1,"name":"Jon Bråte","email":"","orcid":"","institution":"Norwegian Institute of Public Health","correspondingAuthor":false,"prefix":"","firstName":"Jon","middleName":"","lastName":"Bråte","suffix":""},{"id":263722032,"identity":"4279d315-6ba9-4fdf-8541-a3adadc99c72","order_by":2,"name":"Even Fossum","email":"","orcid":"","institution":"Norwegian Institute of Public Health","correspondingAuthor":false,"prefix":"","firstName":"Even","middleName":"","lastName":"Fossum","suffix":""},{"id":263722036,"identity":"2e279e43-ed39-4211-bbce-a065f14b70d0","order_by":3,"name":"Andreas Rohringer","email":"","orcid":"","institution":"Norwegian Institute of Public Health","correspondingAuthor":false,"prefix":"","firstName":"Andreas","middleName":"","lastName":"Rohringer","suffix":""},{"id":263722038,"identity":"b3d3acf4-0684-4bd1-a257-39c8afbbdb63","order_by":4,"name":"Line V Moen","email":"","orcid":"","institution":"Norwegian Institute of Public 
Health","correspondingAuthor":false,"prefix":"","firstName":"Line","middleName":"V","lastName":"Moen","suffix":""},{"id":263722039,"identity":"dd53b751-95d6-402d-8b10-ca604a77a3b0","order_by":5,"name":"Olav Hungnes","email":"","orcid":"","institution":"Norwegian Institute of Public Health","correspondingAuthor":false,"prefix":"","firstName":"Olav","middleName":"","lastName":"Hungnes","suffix":""},{"id":263722040,"identity":"714cda0a-863c-4840-be1a-cdd2b1898cb2","order_by":6,"name":"Olav Fjaere","email":"","orcid":"","institution":"Levanger Hospital, Nord-Trøndelag Hospital Trust","correspondingAuthor":false,"prefix":"","firstName":"Olav","middleName":"","lastName":"Fjaere","suffix":""},{"id":263722041,"identity":"cc195866-5caf-4fbd-8890-1d1d21da0e17","order_by":7,"name":"Kyriakos Zaragkoulias","email":"","orcid":"","institution":"Levanger Hospital, Nord-Trøndelag Hospital Trust","correspondingAuthor":false,"prefix":"","firstName":"Kyriakos","middleName":"","lastName":"Zaragkoulias","suffix":""},{"id":263722042,"identity":"82b2a0cc-01bc-4afe-b22d-844222183289","order_by":8,"name":"Karoline Bragstad","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAABA0lEQVRIiWNgGAWjYBACCRDxwADGAoHjjc8YEghpSUBoAbLOHDYjQgucBdJyI9kMr8Mk288+fJBQcEeOQbr34ccvf/7I8918zPbgAQNQBAeQ5kk3NkgweGbMIHPcWFq2zcBw5u1kdoMEBqAIDiDHkMYmkWBwOLFBIo1BWrLBgHHD7fxjEgkMQBFcWvifgbXUA7Uw/5b4Y2C/4eZhNrxapCUgtiQwABmSH9gMEjfcYMavRXLGM2agXw4btskcY7NmbDNOnnkmGWwITr9InE9jfPDhz2F5fuk25ps//sjZ9h0/zCb5o+IwzhCDAzYgZuaBcw0IaoAAxh9EKhwFo2AUjIKRBQBxEFTPEGjBBAAAAABJRU5ErkJggg==","orcid":"","institution":"Norwegian Institute of Public Health","correspondingAuthor":true,"prefix":"","firstName":"Karoline","middleName":"","lastName":"Bragstad","suffix":""}],"badges":[],"createdAt":"2023-12-21 
15:44:17","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-3787764/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-3787764/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":49234692,"identity":"9a98b66e-8e41-42ee-a72f-edd77898cfc5","added_by":"auto","created_at":"2024-01-05 17:48:02","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":101065,"visible":true,"origin":"","legend":"\u003cp\u003eSARS-CoV-2 coinfection on a long-term infected patient. \u003cstrong\u003eA\u003c/strong\u003e. Noise ratio of a sample and Pangolin and Nextclade classification of major and minor lineages extracted from the sample. The vertical lines represent the noise for each of the positions on the genome. The positions with a noise higher than 5 times the median of the noise across the genome between the standard deviation of the noise were considered noise outliers and labeled with a red dot. The positions with mission coverage were labeled with a blue dot. \u003cstrong\u003eB.\u003c/strong\u003e Pangolin lineage probability plot for the mutations present on the major and minor lineages.\u003c/p\u003e","description":"","filename":"1.png","url":"https://assets-eu.researchsquare.com/files/rs-3787764/v1/b351a952b1bc3176c7c22a5b.png"},{"id":49236543,"identity":"fa12fc44-5429-415d-bd89-56a3598baa48","added_by":"auto","created_at":"2024-01-05 17:56:02","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":148425,"visible":true,"origin":"","legend":"\u003cp\u003eIdentification of SARS-CoV-2 recombinants. \u003cstrong\u003eA.\u003c/strong\u003eNoise ratio for all the consecutive samples analyzed. The noise outliers and missing positions were labeled with red and blue dots respectively as described in Fig. 1A. 
\u003cstrong\u003eB.\u003c/strong\u003e Timeline of the sequences obtained from the patient together with the lineage that was obtained in each of them. \u0026nbsp;\u003cstrong\u003eC.\u003c/strong\u003e Nucleotide substitutions for all the sequences. The AY.98.1-specific mutations and BA.5-specific mutations were represented with blue and red dots respectively. Non-specific or unknown mutations were represented as gray dots.\u003c/p\u003e","description":"","filename":"2.png","url":"https://assets-eu.researchsquare.com/files/rs-3787764/v1/2fb3bd7699977dcb726fff6d.png"},{"id":49234694,"identity":"c96a633d-089d-467c-8445-b1685814e431","added_by":"auto","created_at":"2024-01-05 17:48:02","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":228902,"visible":true,"origin":"","legend":"\u003cp\u003eFitness calculation.\u003cstrong\u003e A.\u003c/strong\u003e Amino acid substitutions for all the sequences. \u003cstrong\u003eB.\u003c/strong\u003e Mutations gained and/or lost during the infection. The mutations present on the major and minor lineages were represented as red and blue dots respectively. Purple dots represent mutations present in both major and minor lineages. The fitness associated with each of the mutations was obtained from the calculations performed by Bloom and Neher (2023) and represented as a barplot on the right. The fitness of mutations gained during the infection was represented with green bars and the fitness of the mutations \u003cem\u003egained-and-lost\u003c/em\u003e was represented with yellow bars. \u003cstrong\u003eC.\u003c/strong\u003e Line and scatter plot of the overall fitness calculated for each sample. The individual fitness values of the mutations present in a sample were summed. 
The total fitness of the major lineages and minor lineages were represented with red and blue dots respectively and the line connects the fitness of the major lineages.\u003c/p\u003e","description":"","filename":"3.png","url":"https://assets-eu.researchsquare.com/files/rs-3787764/v1/f38f176940cd555984c0089a.png"},{"id":49237795,"identity":"7bf488a2-6114-43ff-8c1f-b6d926bce9c5","added_by":"auto","created_at":"2024-01-05 18:04:02","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":691582,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-3787764/v1/ad20aa97-abda-4260-a452-6f496893cd1c.pdf"},{"id":49234695,"identity":"6cd9ff4e-21a5-423a-bfc4-96f90f3f4cea","added_by":"auto","created_at":"2024-01-05 17:48:03","extension":"pdf","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":304272,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eFigure S1.\u003c/strong\u003e Recombinant viruses confirmation. \u003cstrong\u003eA.\u003c/strong\u003eClassification of the major and minor sequences obtained on day\u003cem\u003e 0\u003c/em\u003eaccording to PrecFinder. \u003cstrong\u003eB.\u003c/strong\u003e Classification of the major and minor sequences obtained on day 0 according to sc2rf. \u003cstrong\u003eC.\u003c/strong\u003e Simulated noise of the scenario in which three distinct recombinants with ratios 65%m 25% and 10% were mixed. The parental lineage of the different fragments of the genome was represented with red (BA.5) and blue (AY.98.1) colors. \u003cstrong\u003eD. \u003c/strong\u003eNoise ratio of the sample obtained from the patient on day 70 (left) and the extracted virus cultivated for one (middle) or two passages (right). The noise outliers and missing positions were labeled with red and blue dots respectively as described in Fig. 1A. 
The nucleotide positions for the bases with high noise were also labeled.\u003c/p\u003e","description":"","filename":"FigS121112023.pdf","url":"https://assets-eu.researchsquare.com/files/rs-3787764/v1/68cca6616df860e61fe2baea.pdf"}],"financialInterests":"No competing interests reported.","formattedTitle":"Recombinant SARS-CoV-2 Delta/Omicron BA.5 emerging in an immunocompromised long-term infected COVID-19 patient","fulltext":[{"header":"Background","content":"\u003cp\u003eThe emergence of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) caused a global pandemic that affected millions of people worldwide, with significant impacts on public health (World Health Organization, 2023), the economy (\u003cem\u003eWorld Development Report\u003c/em\u003e, 2022), and social welfare (OECD, \u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e2021\u003c/span\u003e). The high rate of transmission (Meyerowitz et al., \u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e2021\u003c/span\u003e) and the ability of the virus to cause severe respiratory illness (Zhou et al., \u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e2020\u003c/span\u003e) have prompted extensive research efforts to understand its biology and evolution. One crucial aspect of this research is investigating the genetic variability of the virus and the potential for recombination events that can lead to the emergence of new viral strains with altered virulence and transmission characteristics.\u003c/p\u003e \u003cp\u003eRecombination occurs when two or more different viral strains infect the same host cell, allowing for the exchange of genetic material between the viruses. In coronaviruses, these events are primarily driven by the RNA-dependent RNA polymerase, which can switch between viral templates during genome replication (Bentley \u0026amp; Evans, \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e2018\u003c/span\u003e). 
This can result in the formation of chimeric viruses that contain genetic material from two or more viral strains (Focosi \u0026amp; Maggi, \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e2022\u003c/span\u003e). The emergence of recombinant viruses has been increasingly recognized as a significant contributor to SARS-CoV-2 diversity and evolution.\u003c/p\u003e \u003cp\u003eWhile recombinant SARS-CoV-2 viruses were observed during the first years of the COVID-19 pandemic, these variants did not circulate widely in the population (Burel et al., \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e2022\u003c/span\u003e; Sekizuka et al., \u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e2022\u003c/span\u003e). However, as the number of infections rose with the spread of the Omicron variant, there was also an increase in observed recombinant strains, including the emergence of the XBB lineage, a recombination between the two BA.2.75 subvariants, BJ.1.1 and BM.1.1.1 (Parums, \u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e2023\u003c/span\u003e), which initially resulted in extensive transmission in Singapore, India, and elsewhere in the fall of 2022 (World Health Organization, 2022). By the spring of 2023, subvariants of XBB had become dominant globally, demonstrating how recombination events can contribute to viral fitness and transmissibility.\u003c/p\u003e \u003cp\u003eChronic SARS-CoV-2 infections in immunocompromised individuals are known to accelerate the viral mutagenesis and significant mutations within the spike protein have been observed in these patients (Harari et al., \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2022\u003c/span\u003e; Li et al., \u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e2022\u003c/span\u003e). Moreover, the prolonged persistence of the infection in these patients provides a favourable time window for recombination to occur if the patient is exposed to other variants. 
Indeed, several recombinants have been identified in immunocompromised patients (Burel \u003cem\u003eet al\u003c/em\u003e., 2022; Zannoli et al., \u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e2023\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eIn this study, we report the identification of a recombinant SARS-CoV-2 virus in a long-term infected COVID-19 patient. During our surveillance of SARS-CoV-2, we identified the emergence of a recombinant strain between two distinct lineages, AY.98 and BA.5, resulting in a novel viral strain. We gathered and characterized additional samples from the same patient before and after the recombination event. Deep sequencing of all the samples suggests that recombination events occur frequently \u003cem\u003ein vivo\u003c/em\u003e, providing further evidence of the need for continued surveillance and monitoring of viral diversity in immunocompromised patients. Our findings underscore the importance of understanding the molecular mechanisms of recombination and the potential impacts of recombination events on the evolution and emergence of novel strains of SARS-CoV-2.\u003c/p\u003e"},{"header":"Methods","content":"\u003cp\u003e \u003cspan type=\"ItalicUnderline\" class=\"ItalicUnderline\" name=\"Emphasis\"\u003eSample Extraction and Sequencing.\u003c/span\u003e \u003c/p\u003e \u003cp\u003eAll the samples were extracted and processed using the Swift Amplicon SARS-CoV-2 Panel (Swift Biosciences). The samples were sequenced on an Illumina NovaSeq platform at the Norwegian Sequencing Centre (NSC) NorSeq.\u003c/p\u003e \u003cp\u003e \u003cspan type=\"ItalicUnderline\" class=\"ItalicUnderline\" name=\"Emphasis\"\u003eGeneration of SARS-CoV-2 consensus sequences.\u003c/span\u003e \u003c/p\u003e \u003cp\u003eSARS-CoV-2 consensus sequences were generated using the \u0026ldquo;Covid-seq\u0026rdquo; pipeline developed by the NSC (\u003cspan class=\"ExternalRef\"\u003e\u003cspan 
class=\"RefSource\"\u003ehttps://github.com/nsc-norway/covid-seq\u003c/span\u003e\u003cspan address=\"https://github.com/nsc-norway/covid-seq\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e). Briefly, PCR primers used during library preparation were removed using NSCTrim (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://github.com/nsc-norway/NSCtrim\u003c/span\u003e\u003cspan address=\"https://github.com/nsc-norway/NSCtrim\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e). Then, sequencing adapters, poorly called nucleotides and overall low-quality reads and adapters were removed using fastp (Chen et al., \u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e2018\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eNext, the high-quality-trimmed reads were mapped to the \u003cem\u003eWuhan-Hu-1\u003c/em\u003e reference genome (NC_045512.2) using Bowtie2 (Langmead \u0026amp; Salzberg, \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e2012\u003c/span\u003e). Consensus sequences were generated from the resulting mapping files using samtools, mpileup (Danecek et al., \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e2021\u003c/span\u003e) and iVar (Grubaugh et al., \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e2019\u003c/span\u003e) with a minimum depth threshold of 10 for calling a nucleotide.\u003c/p\u003e \u003cp\u003e \u003cspan type=\"ItalicUnderline\" class=\"ItalicUnderline\" name=\"Emphasis\"\u003eNoise calculation.\u003c/span\u003e \u003c/p\u003e \u003cp\u003eWe define noise as the sum of the ratios of all the nucleotides minus the ratio of the most frequent nucleotide (i.e., the one called in the consensus sequence). 
To calculate the noise of the samples we developed a tool called NoisExtractor (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://github.com/garcia-nacho/NoisExtractor\u003c/span\u003e\u003cspan address=\"https://github.com/garcia-nacho/NoisExtractor\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e). NoisExtractor uses indexed bam files as inputs and for each position of the genome it outputs the noise, depth, the nucleotide with highest frequency and the nucleotide with the second highest frequency and their frequencies respectively.\u003c/p\u003e \u003cp\u003e \u003cspan type=\"ItalicUnderline\" class=\"ItalicUnderline\" name=\"Emphasis\"\u003eIdentification of coinfections/contaminations.\u003c/span\u003e \u003c/p\u003e \u003cp\u003eAs a part of the sequencing routines at NIPH, a quality control is performed for each sample. In this analysis low-quality samples and individual samples containing more than one virus are flagged, as this could indicate a contaminated sample or a coinfection at the patient level. To do this analysis, we developed a machine learning model. This model is based on linear regression in which noise-related parameters (e.g., mean and standard deviation of noise across the genome, binned number of positions with noise, etc) and depth and coverage-related parameters (e.g., binned number of missing positions, average depth, etc) were used to classify a sample as low-quality, high-quality or contaminant. To train the classification model, we used a subset of 1846 manually curated samples that were assigned into 4 different classes: \u003cem\u003ehigh-quality-high-contamination\u003c/em\u003e, \u003cem\u003ehigh-quality-low-contamination\u003c/em\u003e, \u003cem\u003ehigh-quality-no-contamination\u003c/em\u003e and \u003cem\u003elow-quality\u003c/em\u003e. 
The code to perform the quality control and the trained model are available at: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://github.com/folkehelseinstituttet/FHI_SC2_Pipeline_Illumina\u003c/span\u003e\u003cspan address=\"https://github.com/folkehelseinstituttet/FHI_SC2_Pipeline_Illumina\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003cspan type=\"ItalicUnderline\" class=\"ItalicUnderline\" name=\"Emphasis\"\u003eExtraction of sequences for the major and minor variants.\u003c/span\u003e \u003c/p\u003e \u003cp\u003eOnce a possible contamination or coinfection was identified, the sequence of the major variant (most abundant variant) was generated by concatenating the nucleotides with highest frequency at each position of the genome. To generate the sequence of the minor variant (second most abundant variant), the positions at which the noise was higher than 0.1 were modified: the nucleotide with the highest frequency (major) was replaced by the one with the second highest frequency (minor). We implemented the extraction of sequences by parsing the output of NoisExtractor in R.\u003c/p\u003e \u003cp\u003e \u003cspan type=\"ItalicUnderline\" class=\"ItalicUnderline\" name=\"Emphasis\"\u003eIdentification of recombinant sequences.\u003c/span\u003e \u003c/p\u003e \u003cp\u003eTo identify recombinant sequences, we developed PrecFinder (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://github.com/garcia-nacho/Precfinder\u003c/span\u003e\u003cspan address=\"https://github.com/garcia-nacho/Precfinder\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e). For each single mutation in a sequence, PrecFinder calculates the Bayes\u0026rsquo; probability of the virus belonging to a particular Pangolin lineage based on the distribution of mutations in different lineages. 
As the ratios of the different mutations in the virus continue to evolve, the probability is calculated based on a database of sequences which is regularly updated.\u003c/p\u003e \u003cp\u003eTo find which sequences are recombinants, PrecFinder uses a \u003cem\u003e1D-convolutional neural network\u003c/em\u003e model. The model consists of three sets of a \u003cem\u003e1D-convolutional neural network layer\u003c/em\u003e (1D-CNN) followed by a \u003cem\u003e1D-MaxPooling\u003c/em\u003e layer. The three \u003cem\u003e1D-CNN\u003c/em\u003e layers have 64, 32 and 12 filters and kernel sizes of 5, 3 and 3 nucleotides respectively. The pool sizes of all the \u003cem\u003e1D-MaxPooling\u003c/em\u003e layers were set to 2. Then, two \u003cem\u003efeed-forward\u003c/em\u003e layers with 24 and 12 units respectively were included. Finally, a \u003cem\u003esoftmax\u003c/em\u003e classification layer outputs the score to classify the sample. The input of the model consists of a Bayes\u0026rsquo; probability matrix of \u003cem\u003en\u003c/em\u003e by \u003cem\u003em\u003c/em\u003e dimensions, where \u003cem\u003en\u003c/em\u003e is the number of unique lineages present in the database and \u003cem\u003em\u003c/em\u003e is the maximum number of mutations present in at least one sequence of the database. To train the model we used \u003cem\u003ebinary-crossentropy\u003c/em\u003e as loss function, \u003cem\u003eadagrad\u003c/em\u003e as optimizer and a batch size of 128. The training of the model was scheduled for 60 epochs but it was \u003cem\u003eearly-stopped\u003c/em\u003e if there was no improvement in the accuracy after eight epochs. The weights of the model with highest accuracy were saved. Moreover, F1, precision and recall were calculated. As a training set, we used the sequences present in the database which consists of a synthetic set of recombinant sequences that were generated using the sequences present in the database. 
Sequences assigned to different lineages were recombined \u003cem\u003ein silico\u003c/em\u003e through one, two or three \u003cem\u003ebreaking-points\u003c/em\u003e randomly selected in the genome. Moreover, we augmented the dataset through the reordering the \u003cem\u003en\u003c/em\u003e rows of the training set. The model was implemented and trained using Keras (Chollet \u003cem\u003eet al\u003c/em\u003e., 2015) and TensorFlow v2.8 (Abadi et al., \u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e2015\u003c/span\u003e) in R.\u003c/p\u003e \u003cp\u003eRecombinant sequences were also identified using the program sc2rf (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://github.com/lenaschimmel/sc2rf\u003c/span\u003e\u003cspan address=\"https://github.com/lenaschimmel/sc2rf\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003cspan type=\"ItalicUnderline\" class=\"ItalicUnderline\" name=\"Emphasis\"\u003eLineage assignments.\u003c/span\u003e \u003c/p\u003e \u003cp\u003eSARS-CoV-2 lineage assignment was performed using Pangolin (O\u0026rsquo;Toole et al., \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e2021\u003c/span\u003e) and Nextclade (Aksamentov et al., \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2021\u003c/span\u003e) and the mutations at nucleotide and amino acid levels were identified using Nextclade (Aksamentov et al., \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2021\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eTo identify the AY.98.1 and BA.5 specific mutations, 2000 AY.98.1 and BA.5 sequences were downloaded from NCBI GenBank using cov-sampler (Cheng et al., \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e2022\u003c/span\u003e). 
Sequences with low-quality and/or wrong lineage assignment according to Nextclade were removed, and the mutations present in the remaining sequences were extracted using Nextclade (Aksamentov et al., \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2021\u003c/span\u003e). Based on the frequency of the mutations present on the AY.98.1 and BA.5 lineages, the mutations present in our sequences were classified either as AY.98.1-specific, BA.5-specific or other, where other means that the mutation is not found on any of the lineages or that it can be found in both. All plots to visualize Pangolin lineages were generated in R (R Core Team, \u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e2022\u003c/span\u003e) using the library ggplot2 (Wickham, \u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e2016\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003cspan type=\"ItalicUnderline\" class=\"ItalicUnderline\" name=\"Emphasis\"\u003eCultivation of recombinant virus.\u003c/span\u003e \u003c/p\u003e \u003cp\u003eVero E6/TMPRSS2 cells (NIBSC #100978) were cultivated in complete Dulbecco\u0026rsquo;s Modified Eagle Medium (cDMEM) supplemented with 10% fetal bovine serum (FBS) and 1mg/ml G418. In a biosafety level 3 (BSL3) laboratory, clinical samples collected from the patient were added to the cells at approx. 60% confluency in a T-25 flask for 1h at 37\u0026deg;C. The inoculate was subsequently removed and replaced with fresh viral culture medium (DMEM supplemented with 2% FBS, 100 units/ml penicillin, 100 ug/ml streptomycin and 25 mM HEPES). The infected cells were incubated for 3\u0026ndash;4 days at 37\u0026deg;C and the supernatant was then diluted 1:1000 and passaged onto fresh cells for a second passage. After 3\u0026ndash;4 more days the second passage of virus was harvested. 
Both the first and the second passage of the virus were sequenced.\u003c/p\u003e \u003cp\u003e \u003cspan type=\"ItalicUnderline\" class=\"ItalicUnderline\" name=\"Emphasis\"\u003eFitness estimation.\u003c/span\u003e \u003c/p\u003e \u003cp\u003eTo estimate the fitness of the different virus strains, we identified the substitutions that they carried at the amino acid level using Nextclade. Then, we matched those mutations to the fitness effects estimated by Bloom and Neher (\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e2023\u003c/span\u003e). The fitness of each variant was computed as the sum of the fitness effects of the individual mutations present in the sample. If a sequence contained mutations absent from the fitness database, no fitness was assigned to those mutations.\u003c/p\u003e"},{"header":"Results","content":"\u003cp\u003e \u003cspan type=\"ItalicUnderline\" class=\"ItalicUnderline\" name=\"Emphasis\"\u003eIdentification of a co-infection.\u003c/span\u003e \u003c/p\u003e \u003cp\u003eAs part of our SARS-CoV-2 surveillance at the Norwegian Institute of Public Health (NIPH), the purity and consistency of the sequence data are monitored. Through this quality control, we identified a sample with high levels of noise, or sequence variation, after the mapping of the reads, which typically indicates either contamination or co-infection (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eA). To rule out contamination or other sequencing artifacts as causes for the observation, we repeated the entire analysis from RNA extraction, cDNA generation, PCR amplification, library preparation, to sequencing. 
The re-processed sample showed exactly the same noise pattern (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eA and \u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e2\u003c/span\u003eA \u0026ldquo;day 0\u0026rdquo;).\u003c/p\u003e \u003cp\u003eStrikingly, the sequence variation was restricted mainly to the first two-thirds of the genome (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eA), which warranted further investigation. We attempted to re-create the genomic sequences of a potential major and minor variant in the sample (i.e., co-occurring strains of different abundances) (see Methods for details on the generation of the major and minor sequences) and we found that the major sequence was classified as a delta variant (Pangolin lineage: AY.98.1/NextClade clade: 21J) and the minor sequence as a variant of omicron (BA.5/21B) (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eA and Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eB).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003cspan type=\"ItalicUnderline\" class=\"ItalicUnderline\" name=\"Emphasis\"\u003eIdentification of delta/omicron recombinant strains.\u003c/span\u003e \u003c/p\u003e \u003cp\u003eFine-grained mutation-profile analyses of the two strains (i.e., major and minor variants) showed that both sequences were actually recombinants. The major strain had AY.98.1-specific mutations in the first 15 kb of the genome (flanked by \u003cem\u003eG210T\u003c/em\u003e and \u003cem\u003eG15451A\u003c/em\u003e) (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eB). Then, there was an 8 kb region overlapping with the Spike gene that contained only BA.5 mutations (flanked by \u003cem\u003eC17410T\u003c/em\u003e and \u003cem\u003eC25584T\u003c/em\u003e). 
The minor strain, in contrast, had only BA.5-specific mutations in the first 15.7 kb of the genome, flanked by the BA.5-specific \u003cem\u003eC44T\u003c/em\u003e and \u003cem\u003eC15714T\u003c/em\u003e mutations. Then, there was a 3 kb region with a mixture of delta and omicron mutations. Next, there was a 4 kb region covering the Spike gene that contained only BA.5 mutations (flanked by the \u003cem\u003eC21618T\u003c/em\u003e and \u003cem\u003eC25584T\u003c/em\u003e BA.5 mutations) (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eB). The final part of the genome was identical in both strains and carried only AY.98.1-specific mutations (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eB).\u003c/p\u003e \u003cp\u003eThis suggests that the sample isolated on \u003cem\u003eday 0\u003c/em\u003e was actually a mixture of at least two recombinant strains, one AY.98.1-BA.5-AY.98.1 recombinant and one BA.5-AY.98.1 recombinant.\u003c/p\u003e \u003cp\u003eTwo independent recombinant-detection tools, \u003cem\u003esc2rf\u003c/em\u003e and \u003cem\u003ePrecFinder\u003c/em\u003e, classified both strains as recombinants (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eB, Fig. \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003eA and Fig. \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003eB).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003cspan type=\"ItalicUnderline\" class=\"ItalicUnderline\" name=\"Emphasis\"\u003eVirus evolution in an immunocompromised long-term infected COVID-19 patient.\u003c/span\u003e \u003c/p\u003e \u003cp\u003eDriven by these results, we became interested in tracing the sample's origin, and we found that it had been obtained from a long-term infected COVID-19 patient. 
Since the patient was already being monitored at the hospital, we could obtain five additional samples collected between 288 and 10 days before the \u003cem\u003eday 0\u003c/em\u003e sample (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e2\u003c/span\u003eA and \u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e2\u003c/span\u003eB). All these samples had much lower levels of noise than \u003cem\u003eday 0\u003c/em\u003e, suggesting that the patient was infected by only a single viral strain at these time points. Some degree of noise was observed on \u003cem\u003eday \u0026minus;\u0026thinsp;171\u003c/em\u003e and \u003cem\u003eday \u0026minus;\u0026thinsp;130\u003c/em\u003e, but analysis of any major and minor strains in these samples, as well as all the other samples taken before \u003cem\u003eday 0\u003c/em\u003e, showed only AY.98.1 strains without any BA.5-specific mutations (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e2\u003c/span\u003eC). This suggests that the recombination events happened sometime in the ten days before \u003cem\u003eday 0\u003c/em\u003e.\u003c/p\u003e \u003cp\u003eTo gain information on the relative fitness of the two recombinants, we decided to follow up the patient to observe the viruses competing \u003cem\u003ein vivo\u003c/em\u003e. We collected five extra samples from the patient at 22, 69, 70, 93 and 103 days after \u003cem\u003eday 0\u003c/em\u003e (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e2\u003c/span\u003eA and Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e2\u003c/span\u003eB). We found that the new samples had low noise levels, consistent with the presence of just a single lineage (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e2\u003c/span\u003eA). 
Indeed, when we extracted major and minor variants from these new samples, we found that all of them belonged to a new recombinant lineage. This survivor lineage was the recombinant with an AY.98.1 backbone and a BA.5 Spike. We noticed that the lineage was similar but not identical to the major lineage found in the sample collected on \u003cem\u003eday 0\u003c/em\u003e (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e2\u003c/span\u003eC).\u003c/p\u003e \u003cp\u003eBy looking at the mutation profiles (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e2\u003c/span\u003eC) together with the noise across the genome (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e2\u003c/span\u003eA) it seems unlikely that the minor lineage found on \u003cem\u003eday 0\u003c/em\u003e was the one that outcompeted the other lineages. We therefore hypothesized two scenarios that could lead to the results observed 22 days after \u003cem\u003eday 0\u003c/em\u003e. (i) The two recombinant lineages found on \u003cem\u003eday 0\u003c/em\u003e recombined again, so that the one that became dominant afterwards lost four BA.5-specific mutations (i.e., \u003cem\u003eC17410T\u003c/em\u003e, A18163G, C19955T, and A20055G) and acquired three AY.98.1-specific mutations (i.e., C16466T, C19220T, C19524T). (ii) Alternatively, in the sample taken on \u003cem\u003eday 0\u003c/em\u003e there was indeed a mixture of at least three recombinants: a BA.5-AY.98.1 similar to the minor variant on \u003cem\u003eday 0\u003c/em\u003e, an AY.98.1-BA.5-AY.98.1 similar to the major variant on \u003cem\u003eday 0\u003c/em\u003e, and another AY.98.1-BA.5-AY.98.1 recombinant similar to the major variant present in the sample taken 22 days after \u003cem\u003eday 0\u003c/em\u003e. A mixture of three such recombinants with approximate ratios of 65%, 10% and 25% respectively, would produce a noise pattern like the one observed on day 0 (Fig. 
\u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003eC versus Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e2\u003c/span\u003eA \u0026ldquo;\u003cem\u003eday 0\u003c/em\u003e\u0026rdquo;). Although it is impossible to distinguish between these scenarios \u003cem\u003ea posteriori\u003c/em\u003e, we believe that the second scenario is more plausible. In either case, multiple recombination events are required, suggesting that recombination between different SARS-CoV-2 lineages may occur frequently during co-infection.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003cspan type=\"ItalicUnderline\" class=\"ItalicUnderline\" name=\"Emphasis\"\u003eCultivation of the recombinant lineage.\u003c/span\u003e \u003c/p\u003e \u003cp\u003eTo investigate the ability of the recombinant strain to propagate \u003cem\u003ein vitro\u003c/em\u003e, we cultivated the virus extracted from the sample taken 70 days after \u003cem\u003eday 0\u003c/em\u003e. We found that the virus was able to replicate and that after two passages the sequence was the same as the original recombinant (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e2\u003c/span\u003eC, \u003cem\u003e\u0026ldquo;day\u0026thinsp;+\u0026thinsp;70 in-vitro\u0026rdquo;\u003c/em\u003e). Interestingly, we found that all the noisy positions present in the original sample were also noisy in the cultured samples (Fig. \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003eD), suggesting that the sample contained a mix of strains with different mutations at these positions. 
However, none of them seemed to provide a strong fitness advantage, at least \u003cem\u003ein vitro\u003c/em\u003e.\u003c/p\u003e \u003cp\u003e \u003cspan type=\"ItalicUnderline\" class=\"ItalicUnderline\" name=\"Emphasis\"\u003eFitness advantage of the recombinant.\u003c/span\u003e \u003c/p\u003e \u003cp\u003eWe hypothesized that the competition of two similar viruses inside a patient would be the perfect arena to infer which mutations would provide fitness advantages \u003cem\u003ein vivo\u003c/em\u003e. After excluding the mutations gained or lost because of the recombination on \u003cem\u003eday 0\u003c/em\u003e, we found 21 mutations with presence/absence patterns that suggested they had been gained and/or lost during the evolution of the virus within the patient (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e3\u003c/span\u003e). We found 13 mutations that were incorporated into the genome at some point during the infection (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e3\u003c/span\u003eB). We also found seven mutations that were gained and subsequently lost, and one mutation (S:H49Y) with a pattern that suggests that it might have been gained by the virus at two different timepoints (day \u0026minus;\u0026thinsp;171 and day 93).\u003c/p\u003e \u003cp\u003eThen, we analysed the fitness difference associated with these 21 mutations by associating them with the fitness differences calculated by Bloom and Neher (\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e2023\u003c/span\u003e). 
Surprisingly, we found that seven of the thirteen mutations gained and fixed in the genome are estimated to yield a negative fitness difference; conversely, six of eight mutations gained and lost had positive fitness differences.\u003c/p\u003e \u003cp\u003eFinally, we estimated the fitness of all the variants identified by adding the fitness associated with all the mutations present in their genomes. 
We found that although the virus has gained fitness during the infection, the main driver for the fitness increase was the recombination event (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e3\u003c/span\u003eC). Moreover, the presence of a significant number of \u003cem\u003egained-and-lost\u003c/em\u003e mutations (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e3\u003c/span\u003eB) together with the differences in the estimated fitness between major and minor variants (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e3\u003c/span\u003eC) suggests the presence of different subvariants with different mutations competing within the patient.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e"},{"header":"Discussion","content":"\u003cp\u003eHomologous recombination in coronaviruses is thought to occur when the enzyme RNA-dependent RNA polymerase (RdRp) separates from one RNA template while keeping the nascent RNA and then continues building the strand at the same position using a different template molecule (Focosi \u0026amp; Maggi, \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e2022\u003c/span\u003e). Although coronaviruses have evolved to use recombination as part of their replication processes to produce a pool of recombined RNA molecules, the role of this viral molecular mechanism in generating novel recombinant lineages remains uncertain.\u003c/p\u003e \u003cp\u003eTo our knowledge, this is the first report of a recombinant SARS-CoV-2 virus between the omicron BA.5 and delta AY.98.1 lineages and the first time that we have witnessed consecutive \u003cem\u003esequencing snapshots\u003c/em\u003e of the competition of several SARS-CoV-2 lineages in one infected individual. 
Although other studies have found recombinant viruses in sequential samples acquired from long-term infected patients (Burel et al., \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e2022\u003c/span\u003e), this is the first time that we have obtained and analysed samples in which at least two recombinant lineages were competing with each other \u003cem\u003ein vivo\u003c/em\u003e. Moreover, we have developed and released a set of tools to detect and analyse this type of event in the future (i.e., PrecFinder, NoisExtractor, Co-infection detection tool).\u003c/p\u003e \u003cp\u003eThe results of this study reveal the emergence of a recombinant virus with an AY.98.1 backbone and a BA.5 Spike gene isolated from a long-term infected COVID-19 patient in Norway. The most likely scenario for this recombinant to arise is that, while at the hospital, the long-term patient infected with an AY.98.1 virus came into contact with another person infected by a BA.5 virus, leading to a coinfection, and that shortly after the two lineages recombined. The recombined strain that eventually became the dominant strain in the patient probably arose within 10 days prior to the first detection of the recombinant lineage. However, our observations suggest that there were actually multiple recombination events within the patient, both between the omicron and delta variants and, secondarily, between different recombinants. These results suggest that recombination can occur frequently during coinfection, and they highlight the importance of close monitoring and early detection of such events.\u003c/p\u003e \u003cp\u003eMoreover, our findings suggest that several recombinant viruses may have been competing in the patient. 
The fact that some apparently harmful mutations were retained over time, while beneficial mutations were lost, suggests that the evolution of the virus within the patient might be affected by clonal interference, a phenomenon where beneficial mutations may disappear from a population because of competition between sub-variants carrying the different mutations (Strelkowa \u0026amp; L\u0026auml;ssig, \u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e2012\u003c/span\u003e). By analysing the fitness associated with each of the observed mutations in the viral population in the long-term infected patient, we found that recombination had a major impact on the fitness increase of the virus. Indeed, the fitness gained due to mutations acquired or lost during the infection seems to be lower than the fitness gained because of the incorporation of an \u003cem\u003eOmicron Spike\u003c/em\u003e into a \u003cem\u003eDelta backbone\u003c/em\u003e via recombination. Recombination in betacoronaviruses may therefore serve as a powerful mechanism to overcome clonal interference and ensure mixing of genetic material between lineages. Clonal interference is strongest in asexual organisms, or when there is strong linkage disequilibrium. Indeed, one hypothesis that might explain the success of RNA viruses is that their high recombination rates might help them to overcome the burdens of clonality.\u003c/p\u003e \u003cp\u003eHowever, it is possible that our fitness estimations differ from the actual fitness of the virus for three reasons. First, the database that we used to estimate the fitness was constructed using epidemiological data gathered from viral databases and it is possible that the mutations important for the fitness of the virus at the population level differ from the mutations important for its transmission between cells within the patient. 
This might be especially relevant for patients with a weakened immune system who are unable to clear the infection for months. Second, when we computed the overall fitness of each lineage, we did not account for epistatic relationships between mutations and it is possible that genetic interactions between mutations become important determinants for the overall fitness of the virus. Third, the fitness associated with deletions of amino acids was not taken into account since the dataset does not have information about the fitness changes due to sequence deletions or insertions.\u003c/p\u003e \u003cp\u003eTherefore, further research is needed to investigate the potential implications of the mutations gained by the virus during its evolution within the patient (i.e., ORF7a:E22D, ORF1ab:V86F, ORF1ab:V4102A, ORF1ab:N1080I, ORF1ab:P1427S, ORF1ab:C1889Y, ORF1ab:S1272G, ORF1ab:Q4100H, ORF1ab:M1156I, ORF1ab:A2909V, N:P326L, ORF1ab:I3619V, ORF1ab:T1538I) in terms of fitness, transmissibility, virulence, and vaccine effectiveness.\u003c/p\u003e"},{"header":"Conclusions","content":"\u003cp\u003eThe identification of recombinant viruses in a long-term infected COVID-19 patient raises questions about the potential for similar events to occur in other patients and populations, as well as the implications for ongoing efforts to control the spread of the virus.\u003c/p\u003e \u003cp\u003eOur findings highlight the importance of continued surveillance and monitoring of SARS-CoV-2 genomes, particularly in high-risk populations such as long-term infected immunocompromised COVID-19 patients, to detect and respond to potential recombination events and other evolutionary changes in the virus. 
These patients are possibly among the most probable sources of novel recombinant SARS-CoV-2 variants.\u003c/p\u003e \u003cp\u003eOverall, our study provides important insights into the genetic diversity and evolution of SARS-CoV-2 and underscores the need for ongoing research and surveillance efforts to better understand and combat this global health threat.\u003c/p\u003e"},{"header":"Abbreviations","content":"\u003cp\u003eSARS-CoV-2\u0026nbsp; \u0026nbsp; \u0026nbsp;\u0026nbsp;Severe Acute Respiratory Syndrome coronavirus-2\u003c/p\u003e\n\u003cp\u003eENA\u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp;\u0026nbsp;European Nucleotide Archive\u003c/p\u003e\n\u003cp\u003eNIPH \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp;Norwegian Institute of Public Health\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cem\u003e\u003cstrong\u003eData availability \u003c/strong\u003e\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eSequences in fastq and fasta format are stored in the European Nucleotide Archive (ENA) under the\u003c/p\u003e\n\u003cp\u003eProject ID PRJEB71327.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003e\u003cstrong\u003eAcknowledgments\u003c/strong\u003e\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eWe acknowledge all the hard work carried out in the clinic and their contribution to the infection control monitoring which has led to the discovery of such important single cases as this one. We also acknowledge the contributions of providers of publicly available sequences. We are also very grateful to the whole team of highly skilled technicians involved in whole genome sequencing of the samples at the NIPH, and especially Rasmus Kopperud Riis. 
We also sincerely thank the Norwegian Sequencing Centre (NSC) NorSeq for partnering during the pandemic to achieve large volume sequencing capacity.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003e\u003cstrong\u003eEthics Approval\u003c/strong\u003e\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eEthical approval has not been sought for these analyses since the work has been carried out as part of the monitoring of infectious diseases at the national public health institute covered by the national infection control act. The ethics committee/scientific department at the local hospital, Levanger Hospital, has nevertheless been consulted and approval has been given to publish the results in current form.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003e\u003cstrong\u003eConsent for publication\u003c/strong\u003e\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eThe ethics committee/scientific advice department, at the local hospital, Levanger Hospital, has been consulted and approval has been given to publish the results in current form without consent from the actual patient.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003e\u003cstrong\u003eCompeting interests\u003c/strong\u003e\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare that they have no competing interests.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003e\u003cstrong\u003eFunding\u003c/strong\u003e\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e\u003cem\u003eAuthor contributions\u003c/em\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eIG, OH and KB conceptualized the study. OF and KZ provided the samples and the clinical information. IG and JB analysed the data. EF performed the cultivation of the virus in vitro. \u0026nbsp;IG, JB, EF, LM, AR, OH and KB wrote the manuscript. 
All authors reviewed and edited the manuscript, and approved the final version.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n \u003cli\u003eAbadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G. S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jozefowicz R., , Jia, Y., Kaiser, L., Kudlur, M., Levenberg, J., Man\u0026eacute;, D., Schuster, M., \u0026amp; others (2015) TensorFlow: Large-scale machine learning on heterogeneous systems. \u003cem\u003eSoftware available from tensorflow.org\u003c/em\u003e.\u003c/li\u003e\n \u003cli\u003eAksamentov, I., Roemer, C., Hodcroft, E., \u0026amp; Neher, R. (2021). Nextclade: clade assignment, mutation calling and quality control for viral genomes. \u003cem\u003eJournal of Open Source Software\u003c/em\u003e,\u003cem\u003e\u0026nbsp;6\u003c/em\u003e, 3773. https://doi.org/10.21105/joss.03773\u003c/li\u003e\n \u003cli\u003eBentley, K., \u0026amp; Evans, D. J. (2018). Mechanisms and consequences of positive-strand RNA virus recombination. \u003cem\u003eJournal of General Virology\u003c/em\u003e,\u003cem\u003e\u0026nbsp;99\u003c/em\u003e(10), 1345-1356. https://doi.org/https://doi.org/10.1099/jgv.0.001142\u003c/li\u003e\n \u003cli\u003eBloom, J. D., \u0026amp; Neher, R. A. (2023). Fitness effects of mutations to SARS-CoV-2 proteins. \u003cem\u003eVirus Evolution\u003c/em\u003e, \u003cem\u003e9\u003c/em\u003e(2):vead55. https://doi.org/10.1093/ve/vead055\u003c/li\u003e\n \u003cli\u003eBurel, E., Colson, P., Lagier, J.-C., Levasseur, A., Bedotto, M., Lavrard-Meyer, P., Fournier, P.-E., La Scola, B., \u0026amp; Raoult, D. (2022). Sequential Appearance and Isolation of a SARS-CoV-2 Recombinant between Two Major SARS-CoV-2 Variants in a Chronically Infected Immunocompromised Patient. \u003cem\u003eViruses\u003c/em\u003e,\u003cem\u003e\u0026nbsp;14\u003c/em\u003e(6), 1266. 
https://www.mdpi.com/1999-4915/14/6/1266\u003c/li\u003e\n \u003cli\u003eCarabelli, A. M., Peacock, T. P., Thorne, L. G., Harvey, W. T., Hughes, J., 6, C.-G. U. C. d. S. T. I., Peacock, S. J., Barclay, W. S., de Silva, T. I., \u0026amp; Towers, G. J. (2023). SARS-CoV-2 variant biology: immune escape, transmission and fitness. \u003cem\u003eNature Reviews Microbiology\u003c/em\u003e, 1-16.\u003c/li\u003e\n \u003cli\u003eChen, S., Zhou, Y., Chen, Y., \u0026amp; Gu, J. (2018). fastp: an ultra-fast all-in-one FASTQ preprocessor. \u003cem\u003eBioinformatics\u003c/em\u003e,\u003cem\u003e\u0026nbsp;34\u003c/em\u003e(17), i884-i890. https://doi.org/10.1093/bioinformatics/bty560\u003c/li\u003e\n \u003cli\u003eCheng, Y., Ji, C., Han, N., Li, J., Xu, L., Chen, Z., Yang, R., Zhou, H.-Y., \u0026amp; Wu, A. (2022). covSampler: A subsampling method with balanced genetic diversity for large-scale SARS-CoV-2 genome data sets. \u003cem\u003eVirus Evolution\u003c/em\u003e,\u003cem\u003e\u0026nbsp;8\u003c/em\u003e(2). https://doi.org/10.1093/ve/veac071\u003c/li\u003e\n \u003cli\u003eChollet, F. \u0026amp; others, (2015). Keras. https://keras.io\u003c/li\u003e\n \u003cli\u003eDanecek, P., Bonfield, J. K., Liddle, J., Marshall, J., Ohan, V., Pollard, M. O., Whitwham, A., Keane, T., McCarthy, S. A., Davies, R. M., \u0026amp; Li, H. (2021). Twelve years of SAMtools and BCFtools. \u003cem\u003eGigaScience\u003c/em\u003e,\u003cem\u003e\u0026nbsp;10\u003c/em\u003e(2). https://doi.org/10.1093/gigascience/giab008\u003c/li\u003e\n \u003cli\u003eFocosi, D., \u0026amp; Maggi, F. (2022). Recombination in Coronaviruses, with a Focus on SARS-CoV-2. \u003cem\u003eViruses\u003c/em\u003e,\u003cem\u003e\u0026nbsp;14\u003c/em\u003e(6). https://doi.org/10.3390/v14061239\u003c/li\u003e\n \u003cli\u003eGrubaugh, N. D., Gangavarapu, K., Quick, J., Matteson, N. L., De Jesus, J. G., Main, B. J., Tan, A. L., Paul, L. M., Brackney, D. E., Grewal, S., Gurfield, N., Van Rompay, K. K. A., Isern, S., Michael, S. 
F., Coffey, L. L., Loman, N. J., \u0026amp; Andersen, K. G. (2019). An amplicon-based sequencing framework for accurately measuring intrahost virus diversity using PrimalSeq and iVar. \u003cem\u003eGenome Biology\u003c/em\u003e,\u003cem\u003e\u0026nbsp;20\u003c/em\u003e(1), 8. https://doi.org/10.1186/s13059-018-1618-7\u003c/li\u003e\n \u003cli\u003eHarari, S., Tahor, M., Rutsinsky, N., Meijer, S., Miller, D., Henig, O., Halutz, O., Levytskyi, K., Ben-Ami, R., Adler, A., Paran, R., \u0026amp; Adi Stern (2022). Drivers of adaptive evolution during chronic SARS-CoV-2 infections. \u003cem\u003eNature Medicine, 28, 1501-1508\u003c/em\u003e. https://doi.org/10.1038/s41591-022-01882-4\u003c/li\u003e\n \u003cli\u003eLangmead, B., \u0026amp; Salzberg, S. L. (2012). Fast gapped-read alignment with Bowtie 2. \u003cem\u003eNature Methods\u003c/em\u003e,\u003cem\u003e\u0026nbsp;9\u003c/em\u003e(4), 357-359. https://doi.org/10.1038/nmeth.1923\u003c/li\u003e\n \u003cli\u003eLi, P., de Vries, A. C., Kamar, N., Peppelenbosch, M. P., Pan, Q. (2022). Monitoring and managing SARS-CoV-2 evolution in immunocompromised populations. \u003cem\u003eLancet Microbe\u003c/em\u003e, 3(5), e325-e326. https://doi.org/10.1016/S2666-5247(22)00061-1\u003c/li\u003e\n \u003cli\u003eMeyerowitz, E. A., Richterman, A., Gandhi, R. T., \u0026amp; Sax, P. E. (2021). Transmission of SARS-CoV-2: A Review of Viral, Host, and Environmental Factors. \u003cem\u003eAnn Intern Med\u003c/em\u003e,\u003cem\u003e\u0026nbsp;174\u003c/em\u003e(1), 69-79. https://doi.org/10.7326/m20-5008\u003c/li\u003e\n \u003cli\u003eO\u0026rsquo;Toole, \u0026Aacute;., Scher, E., Underwood, A., Jackson, B., Hill, V., McCrone, J. T., Colquhoun, R., Ruis, C., Abu-Dahab, K., Taylor, B., Yeats, C., du Plessis, L., Maloney, D., Medd, N., Attwood, S. W., Aanensen, D. M., Holmes, E. C., Pybus, O. G., \u0026amp; Rambaut, A. (2021). Assignment of epidemiological lineages in an emerging pandemic using the pangolin tool. 
\u003cem\u003eVirus Evolution\u003c/em\u003e,\u003cem\u003e\u0026nbsp;7\u003c/em\u003e(2). https://doi.org/10.1093/ve/veab064\u003c/li\u003e\n \u003cli\u003eOECD. (2021). Risks that matter 2020: The long reach of COVID-19. https://doi.org/doi:https://doi.org/10.1787/44932654-en Organisation, W. H. (2023). \u003cem\u003eWHO Coronavirus (COVID-19) Dashboard\u003c/em\u003e. https://covid19.who.int/\u003c/li\u003e\n \u003cli\u003eParums, D. V. (2023). Editorial: The XBB.1.5 (\u0026apos;Kraken\u0026apos;) Subvariant of Omicron SARS-CoV-2 and its Rapid Global Spread. \u003cem\u003eMed Sci Monit\u003c/em\u003e,\u003cem\u003e\u0026nbsp;29\u003c/em\u003e, e939580. https://doi.org/10.12659/msm.939580\u003c/li\u003e\n \u003cli\u003eR Core Team. (2022). A language and environment for statistical computing. \u003cem\u003eR Foundation for Statistical Computing\u003c/em\u003e, https://www.R-project.org\u003c/li\u003e\n \u003cli\u003eSekizuka, T., Saito, M., Itokawa, K., Sasaki, N., Tanaka, R., Eto, S., Someno, R., Ogamino, A., Yokota, E., Saito, T., \u0026amp; Kuroda, M. (2022). Recombination between SARS-CoV-2 Omicron BA.1 and BA.2 variants identified in a traveller from Nepal at the airport quarantine facility in Japan. \u003cem\u003eJournal of Travel Medicine\u003c/em\u003e,\u003cem\u003e\u0026nbsp;29\u003c/em\u003e(6). https://doi.org/10.1093/jtm/taac051\u003c/li\u003e\n \u003cli\u003eStrelkowa, N., \u0026amp; L\u0026auml;ssig, M. (2012). Clonal interference in the evolution of influenza. Genetics, \u003cem\u003e192\u003c/em\u003e(2), 671-682. https://doi.org/10.1534/genetics.112.143396\u003c/li\u003e\n \u003cli\u003eWickham, H. (2016). \u003cem\u003eggplot2: Elegant Graphics for Data Analysis\u003c/em\u003e. Springer-Verlag New York. https://ggplot2.tidyverse.org\u003c/li\u003e\n \u003cli\u003e\u003cem\u003eWorld Development Report 2022: Finance for an Equitable Recovery\u003c/em\u003e. (2022). 
https://doi.org/10.1596/978-1-4648-1730-4\u003c/li\u003e\n \u003cli\u003eWorld Health, O. (2022). \u003cem\u003eCOVID-19 weekly epidemiological update, edition 115, 26 October 2022\u003c/em\u003e. https://apps.who.int/iris/handle/10665/363853\u003c/li\u003e\n \u003cli\u003eZannoli, S., Brandolini, M., Marino, M. M., Denicol\u0026ograve;, A., Mancini, A., Taddei, F., Arfilli, V., Manera, M., Gatti, G., Battisti, A., Grumiro, L., Scalcione, A., Dirani, G., Sambri, V. (2023). SARS-CoV-2 coinfection in immunocompromised host leads to the generation of recombinant strain. \u003cem\u003eInternational Journal of Infectious Diseases\u003c/em\u003e, 131, 65-70. https://doi.org/10.1016/j.ijid.2023.03.014\u003c/li\u003e\n \u003cli\u003eZhou, P., Yang, X.-L., Wang, X.-G., Hu, B., Zhang, L., Zhang, W., Si, H.-R., Zhu, Y., Li, B., Huang, C.-L., Chen, H.-D., Chen, J., Luo, Y., Guo, H., Jiang, R.-D., Liu, M.-Q., Chen, Y., Shen, X.-R., Wang, X., . . . Shi, Z.-L. (2020). A pneumonia outbreak associated with a new coronavirus of probable bat origin. \u003cem\u003eNature\u003c/em\u003e,\u003cem\u003e\u0026nbsp;579\u003c/em\u003e(7798), 270-273. 
https://doi.org/10.1038/s41586-020-2012-7\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":true,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"SARS-CoV-2, recombinant, immunocompromised, in-patient recombination event, Delta, Omicron","lastPublishedDoi":"10.21203/rs.3.rs-3787764/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-3787764/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003e\u003cem\u003eBackground\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eThe emergence of the SARS-CoV-2 virus led to a global pandemic, prompting extensive research efforts to understand its molecular biology, transmission dynamics, and pathogenesis. Recombination events have been increasingly recognized as a significant contributor to the virus's diversity and evolution, potentially leading to the emergence of novel strains with altered biological properties. Indeed, recombinant lineages such as the XBB variant and its descendants have subsequently dominated globally. 
Therefore, continued surveillance and monitoring of viral genome diversity are crucial to identify and understand the emergence and spread of novel strains.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eMethods\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eThe case was discovered through routine genomic surveillance of SARS-CoV-2 cases in Norway. Samples were whole-genome sequenced on the Illumina NovaSeq platform, and SARS-CoV-2 lineage assignment was performed using Pangolin and Nextclade. Mutations were classified based on their frequency in the AY.98.1 and BA.5 lineages.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eResults\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eWe report and investigate a SARS-CoV-2 recombination event in a long-term infected immunocompromised COVID-19 patient. Several recombination events between two distinct lineages of the virus, namely AY.98.1 and BA.5, were identified, resulting in a single novel recombinant viral strain with a unique genetic signature.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eConclusions\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eThe presence of several concomitant recombinants in the patient suggests that these events occur frequently \u003cem\u003ein vivo\u003c/em\u003e and can provide insight into the fitness associated with the different combinations of mutations. 
This study underscores the importance of continued tracking of viral diversity and the potential impact of recombination events on the evolution of the SARS-CoV-2 virus.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eTrial registration\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eRetrospectively registered\u003c/p\u003e","manuscriptTitle":"Recombinant SARS-CoV-2 Delta/Omicron BA.5 emerging in an immunocompromised long-term infected COVID-19 patient","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2024-01-05 17:47:56","doi":"10.21203/rs.3.rs-3787764/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"0bef909d-0fd0-4482-a0c5-52c86261b1e4","owner":[],"postedDate":"January 5th, 2024","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2024-01-06T05:14:14+00:00","versionOfRecord":[],"versionCreatedAt":"2024-01-05 17:47:56","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-3787764","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-3787764","identity":"rs-3787764","version":["v1"]},"buildId":"qtupq5eGEP_6zYnWcrvyt","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}

