Laboratory validation of a clinical metagenomic next-generation sequencing assay for respiratory virus detection and discovery
Charles Chiu, Jessica Tan, Venice Servellita, Doug Stryke, Emily Kelly, and 21 more
This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-4492202/v1 This work is licensed under a CC BY 4.0 License. Status: Published. Journal publication published 12 Nov, 2024 in Nature Communications. Version 1 posted.
Abstract Tools for rapid identification of novel and/or emerging viruses are urgently needed for clinical diagnosis of unexplained infections and pandemic preparedness. 
Here we developed and clinically validated a largely automated metagenomic next-generation sequencing (mNGS) assay for agnostic detection of respiratory viral pathogens from upper respiratory swab and bronchoalveolar lavage samples in <24 hours. The mNGS assay achieved mean limits of detection of 543 copies/mL, viral load quantification with 100% linearity, and 93.6% sensitivity, 93.8% specificity, and 93.7% accuracy compared to gold-standard clinical multiplex RT-PCR. Performance increased to 97.9% overall predictive agreement after discrepancy testing and clinical adjudication, which was superior to that of RT-PCR (95.0% overall agreement). To enable discovery of novel, sequence-divergent human viruses with pandemic potential, de novo assembly and translated nucleotide algorithms were incorporated into the automated SURPI+ computational pipeline used by the mNGS assay for pathogen detection. Using in silico analysis, we showed after removal of all human viral sequences from the reference database that 70 (100%) of 70 representative human viral pathogens could still be identified based on homology to related animal or plant viruses. Our assay, which was granted breakthrough device designation from the US Food and Drug Administration (FDA) in August of 2023, demonstrates the feasibility of routine mNGS testing in clinical and public health laboratories, thus enabling a robust and rapid response to the next viral respiratory pandemic. 
Health sciences/Medical research/Translational research Health sciences/Diseases/Infectious diseases/Viral infection Biological sciences/Microbiology/Virology/Metagenomics Biological sciences/Microbiology/Clinical microbiology Biological sciences/Microbiology/Infectious-disease diagnostics metagenomic next-generation sequencing assay development agnostic detection respiratory virus detection pandemic preparedness SARS-CoV-2 viral diagnostics SURPI+ computational pipeline for pathogen detection viral load quantification diagnostic assay performance viral multiplex RT-PCR
Introduction Respiratory infections are among the most common infections globally and are associated with significant morbidity and mortality 1-3 . Despite their importance, half of adult patients hospitalized in the United States with community-acquired pneumonia, which is most commonly caused by respiratory viruses, have no causative pathogen identified 2-5 . Respiratory infections caused by viruses can be especially challenging to diagnose because of the diversity of potential agents 6-8 . In particular, emerging pandemic viruses represent an unpredictable threat which traditional diagnostic tools such as nucleic acid amplification tests have not been designed to detect 9 . The importance of unbiased assays for rapid identification of viral pathogens, especially those with sequence-divergent genomes, became evident during the discovery of SARS-CoV-2 10,11 . Metagenomic next-generation sequencing (mNGS) has emerged as an attractive diagnostic method for identifying causative agents in unexplained infections as it provides a comprehensive and agnostic approach by which all potential pathogens can be identified in a single assay without the need for specific primers and probes 12,13 . 
mNGS has been used for broadly diagnosing infections, whether viral, bacterial, fungal, or parasitic, from multiple specimen types 14-16 , and its clinical utility has been demonstrated for neurological and bloodstream infections 16-18 . However, despite the favorable performance of mNGS testing as shown by multiple studies, general adoption of mNGS technologies in clinical microbiology laboratories has been hindered by high costs, complex protocols, lack of automation, insufficient standardization of bioinformatic pipelines, prolonged turnaround times (24-72 hours), lack of regulatory guidelines for clinical validation, and overall lower sensitivity for detection of common pathogens relative to targeted approaches such as polymerase chain reaction (PCR) assays 19 . Here we describe the development, optimization, and clinical validation of a streamlined and largely automated mNGS laboratory-developed test (LDT) with a sample-to-result turnaround time of less than 24 hours for identification of common as well as unexpected and/or novel viral respiratory pathogens. The computational SURPI+ pipeline used by the mNGS assay was modified to provide enhanced analysis capabilities, including viral load quantification, incorporation of curated reference genome databases such as FDA dAtabase for Reference Grade micrObial Sequences (FDA-ARGOS), and sensitive identification of novel, sequence-divergent viruses by de novo assembly and translated nucleotide alignment. We comprehensively evaluated assay performance metrics, including limits of detection, linearity, precision, inclusivity and exclusivity, contamination, interference, matrix effect, stability, accuracy, and capacity to detect novel viruses. 
Results Development and Optimization of an mNGS Assay for Detection of Viral Respiratory Pathogens We developed an mNGS assay for the detection of viral pathogens from respiratory secretions, including upper respiratory swab and bronchoalveolar lavage (BAL) fluid samples (Figure 1) . We leveraged our 7-year experience running clinical mNGS assays for pathogen detection from cerebrospinal fluid 20 by optimizing the sample preparation and bioinformatics analysis protocols to maximize sensitivity and decrease assay sample-to-result turnaround time. We tested different combinations of centrifugation, heat, and addition of a DNA/RNA stabilization medium prior to total nucleic acid extraction and found that centrifugation alone produced the highest yield of detected viral reads. To decrease turnaround times, we used a 15-minute protocol for human rRNA depletion and reduced incubation times for the reverse transcription and second-strand cDNA synthesis steps to 15 and 9 minutes, respectively. The final assay used 450 μL of sample input volume and consisted of the following steps: (1) centrifugation (~15 min), total nucleic acid extraction and DNase treatment for isolation of total RNA (~1 hr), (2) cDNA synthesis with ribosomal RNA (rRNA) depletion (~1 hr), (3) barcoded adapter ligation, library PCR amplification and purification on an automated instrument (~6.5 hr), (4) library pooling (~5 min), (5) Illumina (San Diego, CA) sequencing (5 or 13 hr, depending on whether a MiniSeq or NextSeq sequencer is used), and (6) bioinformatics analysis for viral detection and quantification using the SURPI+ pipeline (~1 hr). Overall sample-to-answer assay turnaround time was 14 - 24 hours. We used MS2 phage and External RNA Controls Consortium (ERCC) RNA Spike-In Mix (Invitrogen, Waltham, MA) added into each sample as internal qualitative and quantitative controls, respectively. 
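As a rough check on the reported 14 - 24 hour turnaround, the approximate step durations listed above can simply be summed. The sketch below is illustrative only: the dictionary names and step labels are assumptions introduced here, and the durations are the approximate values quoted in the text rather than measured timings.

```python
# Approximate step durations in minutes, taken from the workflow described above.
# Step labels and variable names are illustrative, not part of the assay software.
FIXED_STEPS_MIN = {
    "centrifugation": 15,
    "extraction_and_dnase": 60,
    "cdna_synthesis_rrna_depletion": 60,
    "automated_library_prep": 390,  # ~6.5 hr
    "library_pooling": 5,
    "bioinformatics": 60,
}

# Sequencing time depends on the instrument (5 hr or 13 hr).
SEQUENCING_MIN = {"MiniSeq": 5 * 60, "NextSeq": 13 * 60}


def turnaround_hours(sequencer: str) -> float:
    """Sum all fixed steps plus the sequencing run and convert to hours."""
    total_min = sum(FIXED_STEPS_MIN.values()) + SEQUENCING_MIN[sequencer]
    return total_min / 60


for name in SEQUENCING_MIN:
    print(f"{name}: ~{turnaround_hours(name):.1f} hr")
```

Both totals fall inside the stated 14 - 24 hour window; the sum ignores hand-offs and queueing between steps, which would add to these figures in practice.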
The MS2 phage and ERCC sequencing results were also used to evaluate and interpret the background level in the sample, generally originating from the human host (Supplementary Tables 1 and 2) . A commercial reference panel (Accuplex Panel, SeraCare, Milford, MA) consisting of quantified SARS-CoV-2, influenza A, influenza B, and respiratory syncytial virus (RSV) was spiked into pooled virus-negative nasopharyngeal swab matrix (see Methods for details) as an external positive control (PC) for the assay, with pooled virus-negative nasopharyngeal swabs from healthy uninfected donors as the negative matrix serving as an external negative control (NC). The SURPI+ computational pipeline, run as a container on either a server or cloud, was used for the identification of viral respiratory pathogens from mNGS data 21,22 . Three enhancements were made ( Figure 2A ). First, we added the capability for viral load quantification using the PC and a standard curve generated for each sample from the ERCC reads. Second, “tagging” of GenBank accession numbers in the SURPI+ database was incorporated to allow inclusion of curated viral reference genomes, such as those deposited in the FDA-ARGOS database 23 , for virus identification by alignment and results reporting. Third, a custom algorithm consisting of de novo assembly of metagenomic reads and translated nucleotide, or amino acid, alignment of the reads to a viral protein database was developed to enable detection of novel, sequence-divergent viruses 23 . Following the review of clinical charts, we investigated the correlation between viral load, quantified in copies per milliliter (cp/mL), and the severity of infection, which was categorized on a scale ranging from asymptomatic to mild, moderate, and severe (Figure 2B). We observed significant differences in median viral loads between patients with asymptomatic/mild and moderate/severe infections (P < 0.001) (Supplemental Fig. 5a). 
Further stratification of patients into asymptomatic, mild, moderate, and severe infections highlighted an increasing trend in viral load concentrations. Through pairwise comparisons, we noted significant differences between asymptomatic and moderate (P < 0.01), as well as between mild and moderate (P < 0.01) infections. Overall, differences in median viral loads across all severity levels were significant (P < 0.001) (Supplemental Fig. 5b). Quality control metrics were based on those previously established for a validated cerebrospinal fluid mNGS assay 21 and include a minimum of 5 million preprocessed reads per sample, >75% of data with quality score >30 (Q>30), and successful detection of the internal spiked MS2 phage control and all four respiratory viruses in the PC. A threshold criterion of ≥3 non-overlapping viral reads or contigs aligning to the target viral genome was considered a positive detection. Overall, 93% (156 of 167) of both positive (n=111) and negative (n=56) nasopharyngeal swab samples met QC metrics; those that did not meet QC metrics were excluded from the analysis. Analytical Sensitivity We adopted Clinical and Laboratory Standards Institute (CLSI) guidelines for NGS-based infectious diseases testing (MM24) 24 and validation of multiplex nucleic acid assays (MM17) 25 to conduct a comprehensive evaluation of assay performance metrics (Table 1) . To determine limits of detection (LoD), negative nasopharyngeal swab matrix was spiked with the Accuplex Verification Panel and diluted at concentrations ranging from 5,000 to 100 copies/mL, with 10 to 40 replicates at each concentration. By 95% probit analysis, the LoD was determined for each of the four representative organisms in the panel (SARS-CoV-2, Influenza A, Influenza B, and RSV). We found LoDs ranging from 439 to 706 copies/mL for the four respiratory viruses in the positive control (Figure 3) . 
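The quality control metrics and the positive-detection threshold stated above amount to a small set of rules, sketched below. This is a minimal illustration under stated assumptions, not the SURPI+ implementation: the class, function, and field names are hypothetical, and only the numeric thresholds and the four PC viruses come from the text.

```python
from dataclasses import dataclass

# Thresholds quoted in the text; all names here are hypothetical.
MIN_PREPROCESSED_READS = 5_000_000
MIN_FRACTION_Q30 = 0.75          # more than 75% of data with quality score >30
MIN_NONOVERLAPPING_HITS = 3      # reads or contigs aligning to the target genome
PC_VIRUSES = {"SARS-CoV-2", "influenza A", "influenza B", "RSV"}


@dataclass
class SampleQC:
    preprocessed_reads: int
    fraction_q30: float
    ms2_detected: bool


def sample_passes_qc(sample: SampleQC) -> bool:
    """Per-sample checks: read depth, base quality, internal MS2 control."""
    return (
        sample.preprocessed_reads >= MIN_PREPROCESSED_READS
        and sample.fraction_q30 > MIN_FRACTION_Q30
        and sample.ms2_detected
    )


def run_passes_qc(viruses_detected_in_pc: set) -> bool:
    """Run-level check: all four viruses must be detected in the external PC."""
    return PC_VIRUSES <= viruses_detected_in_pc


def is_positive(nonoverlapping_hits: int) -> bool:
    """A virus is called positive at >=3 non-overlapping reads or contigs."""
    return nonoverlapping_hits >= MIN_NONOVERLAPPING_HITS
```

For example, `sample_passes_qc(SampleQC(6_200_000, 0.91, True))` is true, while a sample with the same depth and quality but no MS2 detection fails and would be excluded from analysis.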
The achieved average LoD of 550 copies/mL was comparable within one log to reported LoDs from specific reverse transcription-polymerase chain reaction (RT-PCR) assays for detection of viral respiratory pathogens 26 . Linearity To evaluate the assay’s capability to accurately quantitate viral load for detected viruses, a linearity panel was generated using five log dilutions of a quantified high-titer SARS-CoV-2 positive nasal swab sample and compared to a commercially available AccuSpan™ HCV RNA Linearity Panel. For both panels, the calculated linearity was 100% after running duplicate or triplicate replicates across a minimum of four 10-fold dilutions (Supplementary Figure 1). The absolute log10 deviation of calculated from expected viral loads was <0.52 log10, which was favorable in comparison to the interquartile ranges for virus-specific qPCR assays between different laboratories 27 . Precision We measured intra-assay precision by testing two PC and two NC samples within the same run using different barcodes across 20 runs and inter-assay precision by testing 20 PC and 20 NC samples using different barcodes across 20 separate runs. Essential agreement (EA) was 100% and intra- and inter-assay precision were within our a priori established limits of <10% and <30% (log-transformed coefficients of variation in reads per million), respectively (Table 1) . Inclusivity and Exclusivity To evaluate the ability of the mNGS assay to detect a wide range of targets (inclusivity), we obtained commercially available culture supernatants from 17 respiratory viruses representing different sublineages and subspecies. Viruses were spiked into negative control matrix at concentrations ranging from 1.3 x 10^3 to 1.2 x 10^7 50% tissue culture infective dose (TCID50) per mL in a 1:10 ratio (Table 2) . All 17 (100%) of 17 viruses in these contrived samples were correctly identified by the mNGS assay at the sublineage or subspecies level. 
Additionally, we identified subtypes of rhinovirus and enterovirus from PCR-positive clinical samples that were not differentiated by multiplex RT-PCR ( Supplementary Figure 2A ). We also evaluated the ability of the mNGS assay to identify uncommon or rare viral pathogens associated with respiratory infections (n=8 virus-positive tracheal aspirate samples) or central nervous system (CNS) infections (n=4 cerebrospinal fluid samples) in severely ill hospitalized patients (Table 2, Supplementary Figure 2B). The assay detected 11 (100%) of 11 viruses in these samples. To assess the exclusivity of the mNGS assay, we spiked two mixtures of microorganisms, including a previously reported positive control mNGS panel consisting of 7 representative pathogens 21 and a commercial reference panel consisting of 10 bacterial and fungal species, into negative nasopharyngeal swab matrix and analyzed multiple aliquots (Table 1 and Supplementary Table 3) . Detected reads from non-viral pathogenic organisms did not result in any false-positive detections for viral pathogens. Contamination, Interference, Matrix Effect and Stability We evaluated potential cross-contamination between nearby sample wells and carryover contamination across successive runs from 10 SARS-CoV-2 high-titer clinical samples (cycle threshold, or Ct = 16-20) and 24 controls loaded in a modified checkerboard pattern (with at least one space between samples) on a 96-well plate, to mimic a single run on the Illumina NextSeq instrument. Only one possible cross-contamination event was observed, with a single SARS-CoV-2 read detected in one of the negative control wells at a subthreshold reporting level. We also evaluated the effects of interference from human RNA, bacterial DNA, and potential interfering substances on mNGS assay performance. 
Hemolysis, lipids, bilirubin, and human genomic RNA spiked into PC matrix at concentrations of 0.1 – 100 µg/mL did not interfere with respiratory virus detection, but background DNA/RNA spiked into PC matrix at concentrations ≥1 x 10^7 cells/mL resulted in failure to detect viruses due to high background. To evaluate the potential matrix effect from samples with high host background, we analyzed 14 PCR-positive highly mucoid bronchoalveolar lavage (BAL) samples obtained from lung transplant or cystic fibrosis patients undergoing surveillance bronchoscopy (Supplementary Table 4) . All 14 samples had high host background, and 13 (92.9%) of 14 samples had very high host background. As a result, 6 (42.9%) of 14 samples had neither detection of the internal spiked MS2 phage control nor of a respiratory virus, and were thus excluded from further analysis, as they did not pass sequencing quality control criteria (Supplementary Table 1). The respiratory viral pathogen was detected in all (100%) of the remaining 8 samples. We concluded that highly mucoid samples can inhibit the assay due to high host background. Finally, we evaluated mNGS assay stability; qualitative detection was not affected by keeping samples for up to 7 days at 4°C or subjecting the samples to 3 freeze/thaw cycles. 
Accuracy To evaluate accuracy, 191 residual samples after routine clinical testing were obtained from the UCSF Clinical Microbiology Laboratory, including 110 virus-positive samples (104 upper respiratory swab samples and 6 BAL fluids) from patients with acute respiratory infection (Supplementary Dataset 1) , along with 81 virus-negative samples (52 upper respiratory swab samples and 29 BAL fluids) (Figure 4) . As more than one target may be positive with mNGS and respiratory viral multiplex panel (RVP) testing using FDA-approved in vitro diagnostic (IVD) assays, sensitivity/specificity analyses were performed by assessing each result independently to assign true/false-positive/negative calls (see Methods for details). Compared to results from RVP RT-PCR testing, the mNGS assay exhibited 93.6% (103 of 110) sensitivity, 93.8% (76 of 81) specificity, and 93.7% (179 of 191) accuracy. Discrepancy testing and clinical adjudication (DTCA) of 14 mNGS positive-RVP negative samples using blinded chart review by two board-certified infectious diseases physicians (PB and CYC) and orthogonal assays run by the California Department of Public Health Viral and Rickettsial Disease Laboratory confirmed the presence of 9 respiratory viruses missed by RVP, allowing them to be reclassified as true positives (Supplementary Table 5) . Viruses detected by mNGS but not targeted by RVP were not considered false-positive results. In one case, while the original RVP and orthogonal PCR testing returned negative results, mNGS identified rhinovirus C with high confidence. A review of the viral sequences revealed 12 non-overlapping reads across the human rhinovirus C genome (Supplementary Figure 3) . Cross-contamination was ruled out, as no other sample in the sequencing batch tested positive for rhinovirus. A nucleotide BLAST (blastn) search confirmed sequences with high homology (95-98% identity) to known rhinovirus C strains (Supplementary Data 1) . 
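The sensitivity, specificity, and accuracy reported above against RVP RT-PCR follow directly from the stated counts. The short sketch below reproduces that arithmetic; the function name is hypothetical, and the false-negative and false-positive counts (7 and 5) are derived here from the reported fractions (110 - 103 and 81 - 76) rather than quoted from the text.

```python
def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Return sensitivity, specificity, and accuracy as percentages."""
    return {
        "sensitivity": 100 * tp / (tp + fn),
        "specificity": 100 * tn / (tn + fp),
        "accuracy": 100 * (tp + tn) / (tp + fn + tn + fp),
    }


# Counts relative to RVP RT-PCR as the comparator:
# 103 of 110 positives detected, 76 of 81 negatives correctly negative.
metrics = diagnostic_metrics(tp=103, fn=7, tn=76, fp=5)

for name, value in metrics.items():
    print(f"{name}: {value:.1f}%")
```

Rounded to one decimal place, this gives 93.6%, 93.8%, and 93.7%, matching the values in the text.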
Although the exact primer binding sites for the clinical RT-PCR assays used in the current study are unknown, we identified, for the rhinovirus C sample, the presence of mismatches in primer and probe regions from previously reported RT-PCR assays targeting the 5’-untranslated region (UTR) 28,29 (Supplementary Figure 3C), which explained the detection by mNGS despite negative RT-PCR results. Similarly, DTCA was performed on the 7 mNGS negative / RVP positive samples along with repeating the RVP assay (if possible, on a different instrument). This reassessment resulted in 5.5 samples being reclassified as true negatives (1 sample harbored two organisms adjudicated as one true negative and one false negative) (Supplementary Table 6) . Compared to a composite standard that incorporates discrepancy testing and clinical adjudication, positive, negative, and overall predictive agreements of the mNGS assay were 97.8% (110.5 of 113), 98.1% (76.5 of 78), and 97.9% (187 of 191), respectively. Detection of divergent viruses To benchmark the capability of the modified SURPI+ pipeline for detection of novel, highly divergent viruses in silico , we created a simulated sequencing output file containing many known human viral pathogens of clinical and public health significance, including those with pandemic potential (Figure 5, left) . We then removed all viral reference sequences of the same type (for example, all human polyomaviruses, coronaviruses, or parainfluenza viruses) or corresponding to the same genus or species from the SURPI+ 2019 reference database (Figure 5, middle) . Next, we used the SURPI+ pipeline to analyze the simulated sequencing file against both the original and “filtered” reference databases. In this analysis, 98.6% (69 of 70) of human viruses were detected at a sequencing depth of 100 reads per million (RPM) and 100% (70 of 70) at 1000 RPM based on homology to known animal or plant viruses (Figure 5, right) . 
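Detection by homology in this analysis relies on translated nucleotide alignment, in which each read is translated in all six reading frames and compared at the amino acid level, where divergent viruses retain more similarity than at the nucleotide level. The sketch below illustrates only the six-frame translation step with the standard genetic code. It is not the SURPI+ implementation; the function names are hypothetical, and translated aligners perform this step internally.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons; unknown codons (e.g. containing N) become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translate(read: str) -> list:
    """Return translations for frames +1, +2, +3, then -1, -2, -3."""
    read = read.upper()
    strands = (read, reverse_complement(read))
    return [translate(strand[offset:]) for strand in strands for offset in range(3)]


frames = six_frame_translate("ATGGCC")
print(frames)
```

Each of the six peptide strings would then be searched against a viral protein database; a hit in any frame can flag a read whose nucleotide sequence has no close match in the reference database.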
Of note, bunyaviruses pathogenic to humans, which are among the most divergent viruses, were still identified by translated nucleotide (amino acid) alignment to plant viruses (for example, detection of Venezuelan equine encephalitis virus based on homology to vanilla latent virus in Figure 3 ). Discussion We validated a clinical mNGS assay in a CLIA laboratory as a Laboratory Developed Test (LDT) for agnostic viral respiratory pathogen detection intended to aid in patient diagnosis and public health surveillance. Our main goal was to develop, optimize, and streamline a protocol for respiratory viral mNGS testing that could be deployed and run routinely in clinical or public health laboratories. The mNGS assay developed here has favorable performance characteristics compared to clinical RVP testing, including a limit of detection of ~500 copies/mL, viral load quantification with 100% linearity, and sensitivity, specificity, and accuracy ranging from 93.6 – 93.8%. However, in contrast to targeted assays such as RVP, the mNGS assay is capable of detecting, in principle, all known as well as novel viral pathogens in respiratory samples. In addition, mNGS assay performance was found to be superior to RVP (97.9% versus 95.0% overall agreement) after discrepancy testing and clinical adjudication. The correlations we observed between viral load and disease severity highlight the potential for complementary quantitative viral load measurements to help distinguish between asymptomatic infection and/or colonization and overt and/or severe respiratory disease, thereby informing clinical management and treatment, as has been previously demonstrated for certain non-respiratory viruses such as CMV 30 . Following completion of the validation, our assay received breakthrough device designation from the US Food and Drug Administration (FDA) in August of 2023. 
Widespread implementation of highly accurate, rapid mNGS assays such as this, with enhanced capacity to detect novel viruses, will support robust preparation for and rapid response to the next viral pandemic. Speed is a critical factor for diagnosis of respiratory infections, especially in critically ill patients with lower respiratory involvement and in outbreak investigations of novel or emerging viruses with pandemic potential. We also aimed to develop an assay that could be deployed widely in clinical and public health laboratories. Thus, we optimized many of the steps of the mNGS assay and moved the key RNA/cDNA library preparation step to an automated platform, the MagicPrep NGS system (Tecan Genomics, Inc., Männedorf, Switzerland). We further demonstrated that sequencing can be performed on the Illumina MiniSeq using the Rapid Reagent Kit for a faster 5-hour turnaround time or on the Illumina NextSeq 550Dx using the Mid-Output Reagent Kit for a 13-hour turnaround time, depending on laboratory needs and priorities. Altogether, these modifications resulted in an assay with a turnaround time of 14-24 hours and ~2 hours of hands-on technician time. Orthogonal testing and clinical adjudication performed on discordant results demonstrated that the RVP assay is an imperfect gold standard on which to judge mNGS performance. The mNGS assay was able to not only detect uncommon infections from viruses not covered on existing RVP panels, but also, in multiple cases, detect viruses that would in principle be detectable by RVP but tested negative. Unlike RVP, mNGS does not rely on specific primers or probes and is thus less susceptible to primer failure due to viral evolution, as evidenced by the mNGS-positive, RVP-negative rhinovirus case presented here. Such primer failure can result in decreased assay sensitivity or false-negative results due to viral mutation, which is an inevitable feature of SARS-CoV-2 and many other RNA viruses 31 . 
Notably, a previous study evaluating the usefulness of published PCR primers in detecting rhinovirus infection reported that none of the published rhinovirus-specific PCR primer pairs could detect all human rhinoviruses in 101 genotyped clinical specimens 32 . In addition, the broader sampling of the viral genome by mNGS may result in increased sensitivity of virus detection compared to RVP due to increased robustness to variability in the relative levels of viral gene expression by infected cells 33 . Most of the false-negative mNGS samples were confirmed as true negatives after chart review and repeating the RVP assay. Most likely, these represented false-positive results during the original RVP run, given the high cycle thresholds (>36), suggesting low viral titers, or samples that had degraded over time and/or after multiple freezing and thawing cycles. In the study, we used several approaches to demonstrate the capacity of the mNGS assay to identify novel and/or emerging viruses with divergent genomes. The assay was successful in detecting uncommon and unusual viral pathogens associated with both severe respiratory infections (bronchoalveolar lavage fluid) and central nervous system infections (CSF spiked into respiratory sample matrix). mNGS testing also enabled subtyping of specific viral strains with increased virulence, such as enterovirus D68, which has been linked to acute flaccid myelitis in children 34,35 , and rhinovirus C, which has been associated with invasive pulmonary and bloodstream infection in immunocompromised patients 36,37 . Importantly, the mNGS assay was also able to detect DNA viruses, such as adenovirus and bocavirus, in both clinical and contrived samples, despite the incorporation of DNase treatment in the protocol. Detection of DNA viruses is presumably based on detection of transcribed viral mRNA in infected cells, although it may also be enabled by incomplete DNA digestion by the DNase enzyme. 
To evaluate the capacity for mNGS testing using a modified SURPI+ computational pipeline to identify novel viruses, we performed an in silico analysis of a contrived metagenomic dataset consisting of reads from the genomes of human viruses of pandemic potential spiked into background using a reference database depleted of all known human viral sequences. This analysis was done to simulate whether “novel” human viruses with pandemic potential could be identified based on homology to known plant and animal viruses. All 70 of the human viral pathogens tested were successfully identified, including those with only remote homology to other viruses. Indeed, chikungunya virus, in the Alphavirus genus of the Togaviridae family, was only identified (after removal of all alphaviruses) because of distant homology to vanilla latent virus in the family Alphaflexiviridae . Notably, alphaflexiviruses contain a distinct lineage of alphavirus-like replication proteins that lack a recognized protease domain 38 . Here we show in silico that the pipeline is able to detect highly diverse viruses from families that are known to be potentially pathogenic to humans and that emerge from animal reservoirs (for example, Bunyaviridae, Flaviviridae, and Adenoviridae). If a novel, highly divergent virus from an uncharacterized family were detected, with little to no homology, much more work would be needed to ascertain its clinical significance, or whether it is even capable of infecting humans, including formal assessment of Koch’s postulates with modifications by Rivers for causality 39 . Our validation study has limitations. First, we tested very few bronchoalveolar lavage fluid samples from patients with acute respiratory infection (n=6) and very few clinical samples harboring rare or unusual respiratory viruses (n=7), and further validation of assay performance with these kinds of samples is needed. 
Second, mNGS testing was performed exclusively on samples from US patients, so viral pathogen diversity may not represent all populations globally. Third, we did not formally prove that the mNGS assay would be able to detect a novel, sequence-divergent virus, but instead demonstrated the ability of the test to detect such a virus using an in silico analysis, an approach which nonetheless has been used in previous studies to benchmark mNGS bioinformatic pipelines for viral pathogen discovery 40,41 . Finally, we did not address the utility of the mNGS assay for routine diagnosis in patients with unexplained infections, or for outbreak surveillance in public health, which will likely require future prospective clinical and/or epidemiologic investigation. Even though the respiratory mNGS assay described here has demonstrated high performance characteristics for sensitivity and specificity for the detection of viral pathogens, it is currently unlikely to replace multiplex respiratory panels as a first-line test since these are inexpensive and have more rapid turnaround times than mNGS. The projected costs of ~$300 USD per sample (Supplementary Table 7) make the respiratory mNGS assay more expensive than standard RVP tests, for which costs in our clinical laboratory range from $77 to $149 USD. However, the benefits of greatly expanded scope of detection, capability to identify novel emerging viruses, and comparable performance likely outweigh the costs for certain clinical and public health scenarios. The test could be particularly useful in public health laboratories that are more likely to receive and test samples from patients infected with unusual or novel viruses that are not part of the standard RVP testing. Of note, a modified protocol based on the assay was used to identify adeno-associated virus 2 in co-infections with adenoviruses and herpesviruses in cases of acute severe hepatitis in children as part of a nationwide US outbreak 42 . 
The mNGS assay could also be implemented as a second-line test in clinical laboratories for patients with presumed viral bronchiolitis and pneumonia when RVP testing is negative. This strategy would be useful for diagnosis of rare and/or unexpected infections in immunocompromised patients or returning travelers, for whom there is a wider differential diagnosis. Resource availability Lead Contact Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Charles Chiu ( [email protected] ). Materials Availability This study did not generate any new reagents. Data and Code Availability Human-subtracted raw sequence data were submitted to the Sequence Read Archive (SRA) database (BioProject accession number PRJNA1084017 and umbrella BioProject accession number PRJNA171119). Sequence metadata, custom scripts and code for data analyses and visualization are available in a Zenodo data repository (https://doi.org/10.5281/zenodo.10553379). Methods details Human Sample Collection Residual laboratory-confirmed virus-positive upper respiratory swab or BAL samples from clinical patient testing were retrieved from the UCSF Clinical Microbiology Laboratory and stored according to protocols approved by the UCSF Institutional Review Board (protocol no. 11-05519) . Acceptable upper respiratory swab samples included (1) bilateral nasopharyngeal swabs, (2) bilateral anterior nares swabs, (3) oropharyngeal swabs, (4) combined nasopharyngeal and oropharyngeal swabs, and (5) combined oropharyngeal/mid-turbinate nasal swabs. All samples were required to meet minimal sample handling, storage, and volume requirements for inclusion in our study. Samples were stored at 4°C for <24 hr prior to being de-identified, aliquoted, and stored in a -80°C freezer prior to mNGS processing, thus undergoing one freeze-thaw cycle. Inclusion and Ethics All residual samples meeting minimal requirements were included in the study. 
Samples were de-identified prior to processing.

External controls preparation

External positive control (PC) was prepared by spiking a pooled negative nasal swab matrix with a commercially available reference material, the Accuplex Verification Panel (SeraCare, Milford, MA). This panel consists of a mixture of non-infectious SARS-CoV-2, influenza A, influenza B, and RSV genomes encapsidated in a synthetic protein coat to mimic the structure of a viral capsid. This PC material was “spiked in” at a titer of approximately 10^4 copies/mL for each virus control, which is 1–2 logs higher than the estimated limit of detection of the assay (~500 copies/mL). The negative matrix was prepared by pooling nasopharyngeal swab samples from asymptomatic individuals and was used as an external negative control (NC).

Nucleic acid extraction

500 µL of upper respiratory swab or BAL fluid was centrifuged at 16,000 x g for 10 minutes. The MagMAX™ Viral/Pathogen II (MVP II) Nucleic Acid Isolation Kit (Thermo Fisher Scientific, Waltham, MA) and the KingFisher™ Flex Purification System with a 96 deep-well head (Thermo Fisher Scientific, Waltham, MA) were used for total nucleic acid extraction. This protocol was modified to include DNase treatment as a host depletion step during extraction. Bacteriophage MS2 (Zeptometrix, Buffalo, NY) was added to all samples, including the negative control, as an internal qualitative control.

Library preparation and sequencing

Simultaneous reverse transcription of purified RNA, spiked in with ERCC RNA controls (Invitrogen, Waltham, MA), and ribosomal RNA (rRNA) depletion were carried out using NEBNext® Ultra™ II RNA First Strand Synthesis Module (New England Biolabs, Ipswich, MA) and QIAseq FastSelect-rRNA HMR Kit (Qiagen, Germantown, MD), respectively, followed by second strand cDNA synthesis using Sequenase™ Version 2.0 DNA Polymerase (Thermo Fisher Scientific, Waltham, MA).
Complementary DNA (cDNA) was purified using AMPure XP beads (Beckman Coulter, Brea, CA) and loaded on the MagicPrep NGS instrument (Tecan Genomics, Inc., Männedorf, Switzerland) to undergo end-repair, adapter ligation and barcoding, amplification (25 cycles) and purification. Libraries were quantified and normalized using the Qubit dsDNA HS Assay (Thermo Fisher Scientific, Waltham, MA) on the Qubit Flex (Thermo Fisher Scientific, Waltham, MA). Final pooled libraries were sequenced as single-end reads on either the Illumina (San Diego, CA) MiniSeq using the Rapid Reagent Kit (100 cycles) or on the Illumina NextSeq 550 using the Mid-Output or High-Output Kit (150 cycles).

Bioinformatics

The SURPI+ computational pipeline, run as a container (v1.0.0) on either a secure server or cloud infrastructure, was used for identification of respiratory viral pathogens from mNGS data. Reads were preprocessed by trimming of adapters and removal of low-complexity and low-quality sequences, followed by computational subtraction of human reads. The Scalable Nucleotide Alignment Program (SNAP) 43 nucleotide aligner was run using an edit distance of 16 against the National Center for Biotechnology Information (NCBI) nucleotide (NT) database (March 2019, with inclusion of the SARS-CoV-2 Wuhan-Hu-1 genome, accession number NC_045512) filtered to retain only viral reads. The pipeline was modified to include “tagging”, or annotation, of entries from reference sequences that constitute a subset of the NCBI NT database, such as FDA-ARGOS 23. Note that the FDA-ARGOS database, while quality controlled and regulated, contains only 1,428 microbial strains, the majority of which are bacterial. It had also not been updated with recent viruses such as SARS-CoV-2; thus, we did not detect any reads matching FDA-ARGOS viral genomes in this study. The pipeline is also able to accommodate additional reference databases as needed, such as GISAID 44.
The pipeline was also modified to include optional de novo assembly of reads into contiguous sequences (contigs) and translated nucleotide sequence alignment of both reads and contigs using SPAdes 45 and DIAMOND 46, respectively. Viral reads are identified using DIAMOND at an e-value cutoff of 10^-5. Coverage maps were automatically generated by mapping reads classified by SURPI as viral to the most likely reference genome. Quality control metrics for the assay were based on those previously established for cerebrospinal fluid 21, and include a minimum of 5 million preprocessed reads per sample, >75% of data with quality score >30 (Q>30), and successful detection of the 4 respiratory viruses in the PC and the internal spiked MS2 phage control. A criterion of ≥3 non-overlapping viral reads or contigs aligning to the target viral genome was considered a positive detection.

Evaluation of mNGS analytical performance characteristics

The automated standard operating procedures and sequencing runs for these clinical validation studies were performed by a state-licensed clinical laboratory scientist. LoD was determined for each of the four representative organisms in the PC by probit analysis using a series of dilutions ranging from 100 to 5,000 copies/mL, with 10 to 40 replicates at each concentration. Linearity was demonstrated by plotting the standard curve. To validate the quantification using the ERCC and the positive control, we serially diluted an HCV-positive plasma sample to known concentrations ranging from 4 x 10^6 to 4 x 10^3 copies/mL in triplicate. We then compared the measured values to the known concentrations. Precision was determined using repeat analysis of two PC and two NC samples across 20 runs (intra-assay reproducibility) and by testing 20 PC and 20 NC across 20 separate runs (inter-assay reproducibility). To assess inclusivity, commercially available cultured supernatants were obtained to assess the assay’s ability to detect the intended targets.
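The probit-based LoD determination described above can be illustrated with a minimal sketch. It is not the authors' code (the Statistical analyses section states that they used scipy, numpy, and statsmodels); it fits the same kind of model with the Python standard library only. The dilution counts are hypothetical, and the 95% hit rate used to define the LoD is an assumption, since the text does not state the detection probability that was used.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: cdf, pdf, inv_cdf

def fit_probit(log_conc, n_tested, n_detected, max_iter=100, tol=1e-10):
    """Fit P(detect) = Phi(a + b * x) by Fisher scoring; return (a, b)."""
    a, b = 0.0, 0.0
    for _ in range(max_iter):
        u0 = u1 = i00 = i01 = i11 = 0.0
        for x, n, k in zip(log_conc, n_tested, n_detected):
            eta = a + b * x
            # Clamp the probability to keep the weights finite.
            p = min(max(STD_NORMAL.cdf(eta), 1e-9), 1 - 1e-9)
            phi = STD_NORMAL.pdf(eta)
            w = phi / (p * (1 - p))
            resid = (k - n * p) * w      # contribution to the score vector
            u0 += resid
            u1 += resid * x
            info = n * phi * w           # contribution to the Fisher information
            i00 += info
            i01 += info * x
            i11 += info * x * x
        det = i00 * i11 - i01 * i01
        da = (i11 * u0 - i01 * u1) / det  # solve the 2x2 system I * delta = U
        db = (i00 * u1 - i01 * u0) / det
        a, b = a + da, b + db
        if abs(da) < tol and abs(db) < tol:
            break
    return a, b

def lod_at_hit_rate(a, b, hit_rate=0.95):
    """Concentration (copies/mL) at which the fitted detection probability equals hit_rate."""
    return 10 ** ((STD_NORMAL.inv_cdf(hit_rate) - a) / b)

# Hypothetical dilution series (illustration only, not data from the study).
CONC = [100, 250, 500, 1000, 2500, 5000]   # copies/mL
N_TESTED = [20, 20, 20, 20, 20, 20]        # replicates per concentration
N_DETECTED = [6, 11, 16, 19, 20, 20]       # replicates with a positive call
LOG_CONC = [math.log10(c) for c in CONC]

a, b = fit_probit(LOG_CONC, N_TESTED, N_DETECTED)
print(f"intercept={a:.3f}, slope={b:.3f}, LoD95={lod_at_hit_rate(a, b):.0f} copies/mL")
```

Fitting on log10 concentration is a common convention for dilution series; a different link scale or hit-rate definition would change the reported LoD.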
Each of the 17 respiratory viruses, with titers ranging from 1.3 x 10^4 to 1.2 x 10^8 TCID50/mL, was spiked into the negative control matrix at a 1:10 dilution. These viruses represented known sublineages and subspecies, and we evaluated their identification by our assay. We also tested samples of confirmed virus-positive BAL (n=7) and CSF (n=4) spiked into negative matrix to evaluate the detection of unusual viruses. To assess the exclusivity of the mNGS assay, we spiked a previously established mixture of seven representative pathogenic organisms into negative matrix to check for false-positive detection of viral pathogens. We evaluated cross-contamination between adjacent sample wells and carryover contamination across successive runs from samples with high viral loads. Interference was determined using PC spiked with known amounts of hemolytic blood, lipids, bilirubin, human RNA, and bacterial DNA/RNA. The effect of mucus in virus-positive BAL fluid was also assessed. Stability was determined by keeping samples for up to 7 days at 4°C or subjecting the samples to 3 freeze/thaw cycles.

Accuracy was determined using 191 clinical samples comprising 110 virus-positive samples (103 upper respiratory swab samples and 7 BAL fluids) from patients with acute respiratory infection, along with 81 virus-negative samples (52 upper respiratory swab samples and 29 BAL fluids). Samples were obtained from patients at the University of California, San Francisco (UCSF). The viral RT-PCR comparator assays that were used include the Genmark ePlex (Carlsbad, CA), Luminex NxTAG (Austin, TX), and/or Luminex Verigene RP Flex Respiratory Pathogen Panels. mNGS results were compared with original clinical testing and then with a composite reference standard including discrepancy testing and clinical adjudication.
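The accuracy comparison above comes down to tallying true/false positives and negatives against the comparator. The Statistical analyses section adds that each sample counts as exactly one observation, with its weight split across results when a sample has more than one. The sketch below shows one way to compute this; the function names and example labels are hypothetical, and the equal split of 1/n across n results generalizes the 0.5/0.5 example given in the text, which is an assumption.

```python
from collections import Counter

def weighted_counts(samples):
    """Tally TP/FP/TN/FN so that every sample contributes a total weight of 1.

    `samples` is a list of samples; each sample is a list of per-result
    labels ('TP', 'FP', 'TN', 'FN'), one label per assessed result.
    """
    totals = Counter()
    for results in samples:
        if not results:
            raise ValueError("each sample needs at least one result")
        weight = 1.0 / len(results)  # equal split: an assumption
        for label in results:
            totals[label] += weight
    return totals

def performance(totals):
    """Sensitivity, specificity, and accuracy from weighted counts.

    Assumes at least one positive and one negative reference result,
    otherwise a denominator is zero.
    """
    tp, fp, tn, fn = (totals.get(k, 0.0) for k in ("TP", "FP", "TN", "FN"))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical results for four samples (illustration only).
example = [
    ["TP"],         # pathogen detected by both methods
    ["TP", "FP"],   # true pathogen plus a contaminant: 0.5 TP + 0.5 FP
    ["TN"],         # negative by both methods
    ["FN"],         # missed by mNGS
]
print(performance(weighted_counts(example)))
```

Keeping the total weight per sample at 1 means a sample with several detections does not count more heavily than a sample with one.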
In the second comparison, when results were discordant, orthogonal testing was performed using a different instrument or an independent CLIA laboratory (the California Department of Public Health) in addition to clinical adjudication to reclassify mNGS results. The second comparison was reported as positive percent agreement (PPA) and negative percent agreement (NPA), as selective discrepancy testing can bias sensitivity and specificity results.

Orthogonal discrepancy testing at the California Department of Public Health

Specimens were tested by real-time PCR based on CDC protocols using a viral respiratory panel, an unpublished CDPH laboratory-developed test (LDT). Viruses that can be detected by this panel include human metapneumovirus, respiratory syncytial virus, adenovirus, parainfluenza virus (types 1, 2, 3, and 4), enterovirus/rhinovirus, and human coronaviruses 229E, OC43, NL63, and HKU1.

In silico analysis for identification of novel and/or divergent viruses using the SURPI+ pipeline

To assess detection of novel and/or divergent viruses, an in silico analysis was performed. Representative viral reference genomes corresponding to outbreak viruses of clinical and public health significance with pandemic potential were retrieved from the NCBI GenBank database, partitioned into non-overlapping segments, and then randomly sampled and spiked in silico into a negative nasal swab matrix sequencing library. We then took a higher-level set of taxonomic identifiers (species, genus, and/or family) corresponding to these viruses and removed all entries with these taxonomic identifiers from the SURPI+ reference dataset. Next, we used the SURPI+ pipeline to analyze the simulated sequencing file against both the original and “restricted reference” databases and evaluated the performance of the pipeline in detecting “simulated” novel and/or divergent viruses that lacked a reference sequence.
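The in silico procedure above has two mechanical steps: generating simulated reads from a genome, and building a "restricted reference" that lacks the target taxa. The sketch below illustrates both under stated assumptions. The function names, segment length, taxon labels, and accession strings are all hypothetical placeholders, and the code does not reproduce SURPI+ or its database format.

```python
import random

def partition_genome(sequence, segment_length):
    """Split a genome into consecutive non-overlapping segments.

    A trailing fragment shorter than segment_length is dropped.
    """
    if segment_length <= 0:
        raise ValueError("segment_length must be positive")
    last_start = len(sequence) - segment_length
    return [sequence[i:i + segment_length]
            for i in range(0, last_start + 1, segment_length)]

def sample_segments(segments, n_reads, seed=0):
    """Randomly pick up to n_reads segments (without replacement) to spike in."""
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    return rng.sample(segments, min(n_reads, len(segments)))

def restrict_reference(reference, excluded_taxa):
    """Drop every reference entry whose lineage contains an excluded taxon.

    `reference` maps an accession to the set of taxonomic identifiers
    (species, genus, family) in its lineage.
    """
    return {acc: lineage for acc, lineage in reference.items()
            if not (lineage & excluded_taxa)}

# Hypothetical reference entries (placeholder names, not real accessions).
EXAMPLE_REFERENCE = {
    "acc_001": {"family_X", "genus_X1", "species_X1a"},  # simulated "novel" virus
    "acc_002": {"family_X", "genus_X2", "species_X2a"},  # related virus, other genus
    "acc_003": {"family_Y", "genus_Y1", "species_Y1a"},  # unrelated virus
}

genome = "ACGT" * 25                      # toy 100-nt sequence
segments = partition_genome(genome, 30)   # three 30-nt segments
spiked_reads = sample_segments(segments, 2, seed=1)
restricted = restrict_reference(EXAMPLE_REFERENCE, {"genus_X1"})
print(len(segments), len(spiked_reads), sorted(restricted))
```

Removing a genus while keeping the rest of its family mimics the situation in which a new virus can only be recognized through its relatives; removing the whole family is the stricter variant.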
Statistical analyses

Sensitivity and specificity analyses were performed as follows: because more than one target may be positive with mNGS and RVP, each result was independently assessed in every sample and classified as true-positive, false-positive, true-negative, or false-negative. However, the total number of observations was kept constant (one sample = one observation = 1). For instance, if a test detected two organisms, namely the real culprit pathogen and a contaminant, the former was assigned 0.5 true-positive (TP) and the latter 0.5 false-positive (FP), so that their sum was always equal to 1. In addition, because the RVP comparator includes a limited number of targets, mNGS-positive, RVP-negative results for organisms that were not RVP targets were not considered false-positive results. Statistical analyses were performed using scipy (version 1.5.3) and rstatix (version 0.7.0) packages as implemented in Python (version 3.7.12) and R (version 4.0.3), respectively. Probit regression analyses were done using scipy (version 1.5.3), numpy (version 1.19.1) and statsmodels (version 0.12.2) as implemented in Python software (version 3.7.12).

Declarations

Acknowledgments

We thank the staff at the UCSF Clinical Microbiology Laboratory for help in collecting nasopharyngeal swab and bronchoalveolar lavage fluid samples. This work was financially supported in part by BARDA EZ-BAA award 75A50122C00022 (C.Y.C.), US CDC grants 75D30122C15360 and 75D30121C12641 (C.Y.C.), Abbott Laboratories (C.Y.C.), and the Chan-Zuckerberg Biohub (C.Y.C.). The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review or approval of the manuscript; and decision to submit the manuscript for publication.
Disclaimer

The content of this paper is solely the responsibility of the authors and does not represent the official views or opinions of the National Institutes of Health, Kaiser Permanente, California Department of Public Health or the California Health and Human Services Agency. Use of trade names and commercial sources is for identification only and does not imply endorsement by the California Department of Public Health or the California Health and Human Services Agency.

Competing interests

C.Y.C. is a founder of Delve Bio and on the scientific advisory board for Delve Bio, Flightpath Biosciences, Biomeme, Mammoth Biosciences, BiomeSense and Poppy Health. He is also an inventor on US patent 11380421, “Pathogen detection using next generation sequencing”, under which algorithms for taxonomic classification, filtering and pathogen detection are used by SURPI+ software. C.Y.C. receives research support from Delve Bio and Abbott Laboratories, Inc. The other authors declare no competing interests.

Author contributions

J. Tan, V.S., D.S., and C.Y.C. conceived and designed the study. J. Tan, V.S., D.S., N.S., A.F., H.J.H., J.N., M.O., N.B., J. Tang, D.I., B.F., H.R., M.H., C.M., D.A.W., and C.Y.C. coordinated the sequencing efforts and laboratory studies. J. Tan, A.C., H.C., and S.Y. processed samples. J. Tan, V.S., D.S., E.K., A.C., H.C., S.Y., M.D.L., P.B., and C.Y.C. analyzed data. J. Tan, N.S., A.F., J.N., M.O., P.M.M., and C.L. collected samples. J. Tan, V.S., E.K., P.B., M.D.L., and C.Y.C. wrote the manuscript. J. Tan, V.S., E.K., P.B., and C.Y.C. prepared the figures. J. Tan, V.S., D.S., E.K., N.S., A.F., H.J.H., J.N., M.O., N.B., J. Tang, D.I., B.F., H.R., M.H., D.A.W., P.M.M., C.R.L., M.D.L., P.B., and C.Y.C. edited the manuscript. J. Tan, V.S., E.K., M.D.L., P.B., and C.Y.C. revised the manuscript. All authors read the manuscript and agree to its contents.

References

DALYs, G.B.D. , et al.
Global, regional, and national disability-adjusted life years (DALYs) for 306 diseases and injuries and healthy life expectancy (HALE) for 188 countries, 1990-2013: quantifying the epidemiological transition. Lancet 386 , 2145-2191, doi: 10.1016/S0140-6736(15)61340-X (2015). Jain, S. , et al. Community-Acquired Pneumonia Requiring Hospitalization among U.S. Adults. N Engl J Med 373 , 415-427, doi: 10.1056/NEJMoa1500245 (2015). Jain, S. , et al. Community-acquired pneumonia requiring hospitalization among U.S. children. N Engl J Med 372 , 835-845, doi: 10.1056/NEJMoa1405870 (2015). Musher, D.M. & Thorner, A.R. Community-acquired pneumonia. N Engl J Med 371 , 1619-1628, doi: 10.1056/NEJMra1312885 (2014). Charlton, C.L. , et al. Practical Guidance for Clinical Microbiology Laboratories: Viruses Causing Acute Respiratory Tract Infections. Clin Microbiol Rev 32 , doi: 10.1128/CMR.00042-18 (2019). Evans, S.E. , et al. Nucleic Acid-based Testing for Noninfluenza Viral Pathogens in Adults with Suspected Community-acquired Pneumonia. An Official American Thoracic Society Clinical Practice Guideline. Am J Respir Crit Care Med 203 , 1070-1087, doi: 10.1164/rccm.202102-0498ST (2021). Jain, S. Epidemiology of Viral Pneumonia. Clin Chest Med 38 , 1-9, doi: 10.1016/j.ccm.2016.11.012 (2017). Schlaberg, R. , et al. Viral Pathogen Detection by Metagenomics and Pan-Viral Group Polymerase Chain Reaction in Children With Pneumonia Lacking Identifiable Etiology. J Infect Dis 215 , 1407-1415, doi: 10.1093/infdis/jix148 (2017). Jones, K.E. , et al. Global trends in emerging infectious diseases. Nature 451 , 990-993, doi: 10.1038/nature06536 (2008). Zhou, P. , et al. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature 579 , 270-273, doi: 10.1038/s41586-020-2012-7 (2020). Lu, R. , et al. Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding. 
Lancet 395 , 565-574, doi: 10.1016/S0140-6736(20)30251-8 (2020). Chiu, C.Y. & Miller, S.A. Clinical metagenomics. Nat Rev Genet 20 , 341-355, doi: 10.1038/s41576-019-0113-7 (2019). Simner, P.J., Miller, S. & Carroll, K.C. Understanding the Promises and Hurdles of Metagenomic Next-Generation Sequencing as a Diagnostic Tool for Infectious Diseases. Clin Infect Dis 66 , 778-788, doi: 10.1093/cid/cix881 (2018). Blauwkamp, T.A. , et al. Analytical and clinical validation of a microbial cell-free DNA sequencing test for infectious disease. Nat Microbiol 4 , 663-674, doi: 10.1038/s41564-018-0349-6 (2019). Gaston, D.C. , et al. Evaluation of Metagenomic and Targeted Next-Generation Sequencing Workflows for Detection of Respiratory Pathogens from Bronchoalveolar Lavage Fluid Specimens. J Clin Microbiol 60 , e0052622, doi: 10.1128/jcm.00526-22 (2022). Wilson, M.R. , et al. Clinical Metagenomic Sequencing for Diagnosis of Meningitis and Encephalitis. N Engl J Med 380 , 2327-2340, doi: 10.1056/NEJMoa1803396 (2019). Lee, R.A., Al Dhaheri, F., Pollock, N.R. & Sharma, T.S. Assessment of the Clinical Utility of Plasma Metagenomic Next-Generation Sequencing in a Pediatric Hospital Population. J Clin Microbiol 58 , doi: 10.1128/JCM.00419-20 (2020). Han, D. , et al. The Real-World Clinical Impact of Plasma mNGS Testing: an Observational Study. Microbiol Spectr 11 , e0398322, doi: 10.1128/spectrum.03983-22 (2023). Miller, S. & Chiu, C. The Role of Metagenomics and Next-Generation Sequencing in Infectious Disease Diagnosis. Clin Chem 68 , 115-124, doi: 10.1093/clinchem/hvab173 (2021). Benoit, P. , et al. Metagenomic next-generation sequencing of cerebrospinal fluid for diagnosis of central nervous system infections: 7-year performance of a clinically validated test. medRxiv , doi: (2024). Miller, S. , et al. Laboratory validation of a clinical metagenomic sequencing assay for pathogen detection in cerebrospinal fluid. Genome Res 29 , 831-842, doi: 10.1101/gr.238170.118 (2019). 
Naccache, S.N. , et al. A cloud-compatible bioinformatics pipeline for ultrarapid pathogen identification from next-generation sequencing of clinical samples. Genome Res 24 , 1180-1192, doi: 10.1101/gr.171934.113 (2014). Sichtig, H. , et al. FDA-ARGOS is a database with public quality-controlled reference genomes for diagnostic use and regulatory science. Nat Commun 10 , 3313, doi: 10.1038/s41467-019-11306-6 (2019). Clinical Laboratory Standards Institute. Molecular Methods for Genotyping and Strain Typing of Infectious Organisms, 1st Edition. Vol. 24 (ed. Institute, C.a.L.S.) (Clinical and Laboratory Standards Institute, Wayne, Pennsylvania, 2021). Clinical Laboratory Standards Institute. Validation and Verification of Multiplex Nucleic Acid Assays, 2nd Edition. Vol. 9 (ed. Institute, C.a.L.S.) (Clinical and Laboratory Standards Institute, Wayne, Pennsylvania, 2018). Espy, M.J. , et al. Real-time PCR in clinical microbiology: applications for routine laboratory testing. Clin Microbiol Rev 19 , 165-256, doi: 10.1128/CMR.19.1.165-256.2006 (2006). Hayden, R.T. , et al. Progress in Quantitative Viral Load Testing: Variability and Impact of the WHO Quantitative International Standards. J Clin Microbiol 55 , 423-430, doi: 10.1128/JCM.02044-16 (2017). Andeweg, A.C., Bestebroer, T.M., Huybreghs, M., Kimman, T.G. & de Jong, J.C. Improved detection of rhinoviruses in clinical samples by using a newly developed nested reverse transcription-PCR assay. J Clin Microbiol 37 , 524-530, doi: 10.1128/JCM.37.3.524-530.1999 (1999). Lu, X. , et al. Real-time reverse transcription-PCR assay for comprehensive detection of human rhinoviruses. J Clin Microbiol 46 , 533-539, doi: 10.1128/JCM.01739-07 (2008). Razonable, R.R. & Hayden, R.T. Clinical utility of viral load in management of cytomegalovirus infection after solid organ transplantation. Clin Microbiol Rev 26 , 703-727, doi: 10.1128/CMR.00015-13 (2013). Clark, C., Schrecker, J., Hardison, M. & Taitel, M.S. 
Validation of reduced S-gene target performance and failure for rapid surveillance of SARS-CoV-2 variants. PLoS One 17 , e0275150, doi: 10.1371/journal.pone.0275150 (2022). Faux, C.E. , et al. Usefulness of published PCR primers in detecting human rhinovirus infection. Emerg Infect Dis 17 , 296-298, doi: 10.3201/eid1702.101123 (2011). Russell, A.B., Trapnell, C. & Bloom, J.D. Extreme heterogeneity of influenza virus infection in single cells. Elife 7 , doi: 10.7554/eLife.32303 (2018). Greninger, A.L. , et al. A novel outbreak enterovirus D68 strain associated with acute flaccid myelitis cases in the USA (2012-14): a retrospective cohort study. Lancet Infect Dis 15 , 671-682, doi: 10.1016/S1473-3099(15)70093-9 (2015). Messacar, K. , et al. Enterovirus D68 and acute flaccid myelitis-evaluating the evidence for causality. Lancet Infect Dis 18 , e239-e247, doi: 10.1016/S1473-3099(18)30094-X (2018). Lupo, J. , et al. Disseminated rhinovirus C8 infection with infectious virus in blood and fatal outcome in a child with repeated episodes of bronchiolitis. J Clin Microbiol 53 , 1775-1777, doi: 10.1128/JCM.03484-14 (2015). Sayama, A. , et al. Comparison of Rhinovirus A-, B-, and C-Associated Respiratory Tract Illness Severity Based on the 5'-Untranslated Region Among Children Younger Than 5 Years. Open Forum Infect Dis 9 , ofac387, doi: 10.1093/ofid/ofac387 (2022). Kreuze, J.F. , et al. ICTV Virus Taxonomy Profile: Alphaflexiviridae. J Gen Virol 101 , 699-700, doi: 10.1099/jgv.0.001436 (2020). Guo, C. & Wu, J.Y. Pathogen Discovery in the Post-COVID Era. Pathogens 13 , doi: 10.3390/pathogens13010051 (2024). Wood, D.E. & Salzberg, S.L. Kraken: ultrafast metagenomic sequence classification using exact alignments. Genome Biol 15 , R46, doi: 10.1186/gb-2014-15-3-r46 (2014). Flygare, S. , et al. Taxonomer: an interactive metagenomics analysis portal for universal pathogen detection and host mRNA expression profiling. Genome Biol 17 , 111, doi: 10.1186/s13059-016-0969-1 (2016). 
Servellita, V. , et al. Adeno-associated virus type 2 in US children with acute severe hepatitis. Nature 617 , 574-580, doi: 10.1038/s41586-023-05949-1 (2023). Zaharia, M. , et al. Alignment in a SNAP: Cancer Diagnosis in the Genomic Age. Laboratory Investigation 92 , 458a-458a, doi: (2012). Shu, Y. & McCauley, J. GISAID: Global initiative on sharing all influenza data - from vision to reality. Euro Surveill 22 , doi: 10.2807/1560-7917.ES.2017.22.13.30494 (2017). Bankevich, A. , et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol 19 , 455-477, doi: 10.1089/cmb.2012.0021 (2012). Buchfink, B., Xie, C. & Huson, D.H. Fast and sensitive protein alignment using DIAMOND. Nat Methods 12 , 59-60, doi: 10.1038/nmeth.3176 (2015).

Tables

Table 1. Performance characteristics of the UCSF viral respiratory mNGS assay

Metrics | Method | Expected target | Results
Limit of detection (LoD) | Detection of PC dilution by probit analysis | 90% R² = 100%
Precision | Intra-Assay: PC and NC within the same run across 20 runs | Concordance 100% EA; Log-transformed CV <10% | Concordance 100% EA; Log-transformed CV <10%
Precision | Inter-Assay: PC and NC across 20 separate runs | 100% EA; <30% | 100% EA; <30%
Inclusivity | Detection of viruses from diluted culture supernatant | 100% detection | 100% detection (17/17)
Inclusivity | Detection of viruses in positive BAL/CSF diluted samples | 100% detection | 100% detection (11/11)
Exclusivity | Detection of viruses in known organism mixtures a | No false-positive | No false-positive
Contamination | Detection of cross-contamination on the sample wells | No carryover contamination | Cross-contamination of 0.1% between adjacent wells but no carryover contamination
Interference | Detection of PC spiked with hemolytic blood | Detection at all concentrations | Detection at all concentrations
Interference | Detection of PC spiked with human RNA | Detection at all concentrations | Detection at all concentrations
Interference | Detection of PC spiked with bacterial DNA/RNA | Detection at concentration ≤ 10^7 cells/mL | Detection at concentration ≤ 10^7 cells/mL
Interference | Detection of virus-positive overtly mucoid BAL samples | Detection in all BAL samples | Target detected in 13/14 (92.9%) valid sample runs
Stability | Detection of targets in samples held at 4°C for 7 days or after 3 freeze-thaw cycles | 100% concordance | 100% concordance
Accuracy | Detection in virus positive and negative samples (n=191) | Sensitivity > 90%; Specificity > 90%; Accuracy > 90%; PPA > 90%; NPA > 90% | Original testing: Sensitivity: 93.6%; Specificity: 93.8%; Accuracy: 93.7%. After discrepancy testing and clinical adjudication: PPA: 98.7%; NPA: 98.1%; Overall: 97.9%
Detection of divergent viruses | Detection by an in silico analysis of divergent viruses (n=70) | Sensitivity >95%; Specificity >95% | Sensitivity: 98.6%; Specificity: 100%

(PC) Positive control consisting of 4 respiratory viruses spiked into pooled nasopharyngeal swab matrix; (IC) spiked internal control consisting of an RNA MS2 phage; (NC) Negative control; (EA) Essential agreement; (CV) Coefficient of variation; (PPA) positive percent agreement;
(NPA) negative percent agreement.

a Two mixtures were assessed. The first mixture included detectable concentrations of CMV, HIV, Klebsiella pneumoniae, Streptococcus agalactiae, Aspergillus niger, Cryptococcus neoformans and Toxoplasma gondii, and corresponds to positive control material from a previously validated CSF assay 21. The second mixture was a commercial reference panel, the ZymoBIOMICS Microbial Community Standard (Zymo Research, Tustin, CA), and consisted of 10 bacterial and fungal pathogens at varying concentrations (Listeria monocytogenes - 12%, Pseudomonas aeruginosa - 12%, Bacillus subtilis - 12%, Escherichia coli - 12%, Salmonella enterica - 12%, Lactobacillus fermentum - 12%, Enterococcus faecalis - 12%, Staphylococcus aureus - 12%, Saccharomyces cerevisiae - 2%, and Cryptococcus neoformans - 2%) that were spiked into negative nasopharyngeal swab matrix.

Table 2. Detection of a broad range of viruses in contrived samples

Contrived sample type | Virus correctly identified by mNGS assay
Positive cerebrospinal fluid (CSF) spiked in negative matrix | Lymphocytic Choriomeningitis Virus (LCMV); Herpes simplex virus 2 (HSV-2); Varicella-zoster virus (VZV); Herpes simplex virus 1 (HSV-1) and Epstein-Barr Virus (EBV)
Positive bronchoalveolar lavage (BAL) spiked in negative matrix | Parainfluenza Virus Type 4; Parechovirus A; Influenza C Virus; Human Bocavirus; Primate Bocaparvovirus 1; Coronavirus 229E; Coronavirus NL63
Viral culture fluid spiked in negative control matrix (1:10) | Adenovirus Type 1; Coronavirus 229E; Coronavirus NL63; Coxsackie Virus Type A1; Echovirus; Human Metapneumovirus 16; Influenza B Virus; Measles Virus; Mumps Virus; Parainfluenza Virus Type 2; Parainfluenza Virus Type 3; Parainfluenza Virus Type 4A; Parechovirus Type 1; Rhinovirus A16; Rhinovirus B14; Rubella Virus; Influenza B Virus

Supplementary Files

SupplementaryDataset1.xlsx: Supplementary Dataset 1. Clinical diagnosis and disease severity for patients whose respiratory samples were analyzed as part of the mNGS accuracy evaluation. Abbreviations: CAR-T, chimeric antigen receptor T-cell; COVID-19, coronavirus disease 2019; CMV, cytomegalovirus; CXR, chest x-ray; Flu, influenza; ICU, intensive care unit; PCR, polymerase chain reaction; RSV, respiratory syncytial virus; SOB, shortness of breath.

SupplementaryMaterialver5.docx: Supplementary Material
1","display":"","copyAsset":false,"role":"figure","size":152846,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eSchematic of the mNGS assay workflow. (A)\u003c/strong\u003e RNA from respiratory samples is extracted and treated with DNase. Internal control is added to assess human background during sequencing. Human rRNA is depleted during cDNA synthesis. Libraries are generated on the automated Tecan MagicPrep NGS instrument. Libraries are normalized, pooled, and loaded onto the sequencer. \u003cstrong\u003e(B)\u003c/strong\u003e Sequences are processed using SURPI+ software for alignment and classification. Reads are preprocessed by trimming of adapters and removal of low-quality/low-complexity sequences, followed by computational subtraction of human reads. Reads are mapped to the closest matched genome to identify non-overlapping regions using the NCBI GenBank and FDA-ARGOS databases. To aid in analysis, automated result summaries, heat maps of raw/normalized read counts, and coverage/pairwise identity plots are generated for clinical interpretation. Total turnaround time is between 14 and 22 hours depending on the type of sequencer used.\u003c/p\u003e","description":"","filename":"Figure1revisedver2.png","url":"https://assets-eu.researchsquare.com/files/rs-4492202/v1/d86c2f9c7c3ce20508a14a15.png"},{"id":57494964,"identity":"f1027145-262a-4514-9f7e-f2c98e162f1e","added_by":"auto","created_at":"2024-05-31 12:27:11","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":124412,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eEnhancements to the SURPI+ Bioinformatics Pipeline for Pathogen Identification\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e(A)\u003c/strong\u003e Schematic diagram of SURPI+ software independent modifications. First, we enabled reporting of the estimated viral load using a quantitative internal spiked control (ERCC). 
A standard curve is generated for each sample using the normalized ERCC results and absolute quantification by comparison of the ERCC data with the external PC. Second, translated viral reads and nucleotides are aligned to reference sequences in GenBank NR and GenBank NT, respectively. Each read is annotated with the lowest taxonomic rank that comprises a given threshold fraction of the read's total alignments to that rank, including alignments to the reference-grade sequences in the FDA-ARGOS database. Number of reads mapped to GenBank NT and FDA-ARGOS is shown on the clinical report. Third, de novo viral genome assembly and translated nucleotide (amino acid) alignments are done using the SPAdes and DIAMOND algorithms, respectively. After assembly of contigs, both the assembled contigs and the unaligned reads are then processed through DIAMOND to identify sequences that may correspond to novel, highly divergent viruses. \u003cstrong\u003e(B)\u003c/strong\u003e Representative viral reference genomes corresponding to outbreak viruses of clinical and public health significance with pandemic potential are retrieved from the NCBI GenBank database, partitioned into non-overlapping segments, and then randomly sampled and spiked \u003cem\u003ein silico \u003c/em\u003einto a negative nasal swab matrix sequencing library. A higher-level set of taxonomic identifiers (species, genus, and/or family) corresponding to these viruses are removed from the SURPI+ reference dataset and the simulated sequencing file is analyzed using both the original and “restricted reference” databases. \u003cstrong\u003e(C)\u003c/strong\u003e Viruses can be detected using the modified SURPI+ pipeline despite lacking a taxonomic reference at levels down to 10-100 reads per million (RPM). 
Abbreviations: EEEV, Eastern equine encephalitis virus; ERCC, External RNA Controls Consortium; FDA-ARGOS, FDA dAtabase for Reference Grade micrObial Sequences; HFV, hemorrhagic fever virus; HIV, human immunodeficiency virus; JCPyV, JC polyomavirus; PC, positive control; PyV, polyomavirus; TSPyV, trichodysplasia spinulosa polyomavirus; SURPI, sequence-based ultrarapid pathogen identification; VEEV, Venezuelan equine encephalitis virus; WEEV, Western equine encephalitis virus.\u003c/p\u003e","description":"","filename":"Figure2revisedver3.png","url":"https://assets-eu.researchsquare.com/files/rs-4492202/v1/eb2ab307f5b4214d1a740a96.png"},{"id":57494383,"identity":"4774ac3c-dd9d-4e17-9c0c-9113ad37a81f","added_by":"auto","created_at":"2024-05-31 12:19:11","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":53195,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eLimits of detection (LoD) study\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eProbit regression analysis curves plotting the viral titer in copies/mL (y-axis) against the calculated detection probability (x-axis) of (A) SARS-CoV-2, (B) Influenza A, (C) Influenza B and (D) Respiratory Syncytial Virus (RSV). The detection probability corresponding to 95% is denoted with a blue circle for each virus. Shaded areas represent the 95% confidence intervals for each curve. Probit analyses were done using Python software (version 3.7.12). 
Results show a LoD ranging from 439 to 706 copies/mL for the 4 respiratory viruses in the positive control.\u003c/p\u003e","description":"","filename":"Figure3revised.png","url":"https://assets-eu.researchsquare.com/files/rs-4492202/v1/a6351e59efd03e8faa8a4d9b.png"},{"id":57493879,"identity":"e9340900-0f50-4898-b15e-186afc3c6cca","added_by":"auto","created_at":"2024-05-31 12:11:11","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":64462,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eAccuracy evaluation for the mNGS assay\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003ePie charts and 2x2 contingency tables showing the distribution of detected viruses and performance metrics.\u003cstrong\u003e (A) \u003c/strong\u003emNGS against RVP, \u003cstrong\u003e(B)\u003c/strong\u003e mNGS against DTCA, and \u003cstrong\u003e(C) \u003c/strong\u003eClinical RVP against DTCA. RVP testing using FDA IVD assays includes detection of respiratory syncytial virus, parainfluenza viruses 1 to 3, metapneumovirus, rhinovirus/enterovirus, influenza A virus, influenza B virus, and adenovirus.\u003cstrong\u003e \u003c/strong\u003eDiscrepant samples that were mNGS-positive/RVP-negative or mNGS-negative/RVP-positive underwent orthogonal testing by targeted virus-specific PCR at the state public health laboratory and medical chart review for the most likely diagnosis by clinical adjudication. 
Abbreviations: mNGS, metagenomic next-generation sequencing; PCR, polymerase chain reaction; RVP, respiratory viral panel; DTCA, discrepancy testing and clinical adjudication; PPA, positive percent agreement; NPA, negative percent agreement; OPA, overall percent agreement; RSV, respiratory syncytial virus; FDA, Food and Drug Administration; IVD, in vitro diagnostic.\u003c/p\u003e","description":"","filename":"Figure4revisedver2.png","url":"https://assets-eu.researchsquare.com/files/rs-4492202/v1/df3ea29feca356707cd49c0e.png"},{"id":57493884,"identity":"0ca45d09-fb2b-4781-bcb7-87b01813ae23","added_by":"auto","created_at":"2024-05-31 12:11:11","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":137944,"visible":true,"origin":"","legend":"\u003cp\u003eLegend not included with this version.\u003c/p\u003e","description":"","filename":"Figure5revisedver2.png","url":"https://assets-eu.researchsquare.com/files/rs-4492202/v1/95eac8a685e593123a03bbcc.png"},{"id":68892902,"identity":"15dfdeb1-ef37-461a-9e5e-a627819f14f7","added_by":"auto","created_at":"2024-11-13 08:07:33","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1522047,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-4492202/v1/e9a7f5b7-db98-41b5-a336-7b3a5f64e24e.pdf"},{"id":57493883,"identity":"650574e4-5c47-4d2e-9106-953bb9997f7a","added_by":"auto","created_at":"2024-05-31 12:11:11","extension":"xlsx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":21036,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eSupplementary Dataset 1. Clinical diagnosis and disease severity for patients whose respiratory samples were analyzed as part of the mNGS accuracy evaluation. 
\u003c/strong\u003eAbbreviations: CAR-T, chimeric antigen receptor T-cell; COVID-19, coronavirus disease 2019; CMV, cytomegalovirus; CXR, chest x-ray; Flu, influenza; ICU, intensive care unit; PCR, polymerase chain reaction; RSV, respiratory syncytial virus; SOB, shortness of breath.\u003c/p\u003e","description":"","filename":"SupplementaryDataset1.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-4492202/v1/eb8b82f73ff19fbb059e8efc.xlsx"},{"id":57493886,"identity":"bdc8cded-0f2d-4f32-99ff-196355600a07","added_by":"auto","created_at":"2024-05-31 12:11:11","extension":"docx","order_by":2,"title":"","display":"","copyAsset":false,"role":"supplement","size":2794466,"visible":true,"origin":"","legend":"Supplementary Material","description":"","filename":"SupplementaryMaterialver5.docx","url":"https://assets-eu.researchsquare.com/files/rs-4492202/v1/a342a710cae397b01821e587.docx"}],"financialInterests":"\u003cb\u003eYes\u003c/b\u003e there is potential Competing Interest.\nC.Y.C. is a founder of Delve Bio and on the scientific advisory board for Delve Bio, Flightpath Biosciences, Biomeme, Mammoth Biosciences, BiomeSense and Poppy Health. He is also an inventor on US patent 11380421, “Pathogen detection using next generation sequencing”, under which algorithms for taxonomic classification, filtering and pathogen detection are used by SURPI+ software. C.Y.C. receives research support from Delve Bio and Abbott Laboratories, Inc. The other authors declare no competing interests.","formattedTitle":"Laboratory validation of a clinical metagenomic next-generation sequencing assay for respiratory virus detection and discovery","fulltext":[{"header":"Introduction","content":"\u003cp\u003eRespiratory infections are among the most common infections globally and are associated with significant morbidity and mortality\u003csup\u003e1-3\u003c/sup\u003e. 
Despite their importance, half of adult patients hospitalized in the United States with community-acquired pneumonia, which is most commonly caused by respiratory viruses, have no causative pathogen identified\u003csup\u003e2-5\u003c/sup\u003e. Respiratory infections caused by viruses can be especially challenging to diagnose because of the diversity of potential agents\u003csup\u003e6-8\u003c/sup\u003e. In particular, emerging pandemic viruses represent an unpredictable threat which traditional diagnostic tools such as nucleic acid amplification tests have not been designed to detect\u003csup\u003e9\u003c/sup\u003e. The importance of unbiased assays for rapid identification of viral pathogens, especially those with sequence-divergent genomes, became evident during the discovery of SARS-CoV-2\u003csup\u003e10,11\u003c/sup\u003e.\u003c/p\u003e\n\u003cp\u003eMetagenomic next-generation sequencing (mNGS) has emerged as an attractive diagnostic method for identifying causative agents in unexplained infections as it provides a comprehensive and agnostic approach by which all potential pathogens can be identified in a single assay without the need for specific primers and probes\u003csup\u003e12,13\u003c/sup\u003e.\u0026nbsp;mNGS has been used for broadly diagnosing infections, whether viral, bacterial, fungal, or parasitic, from multiple specimen types\u003csup\u003e14-16\u003c/sup\u003e, and its clinical utility has been demonstrated for neurological and bloodstream infections\u003csup\u003e16-18\u003c/sup\u003e.\u003c/p\u003e\n\u003cp\u003eHowever, despite the favorable performance of mNGS testing as shown by multiple studies, general adoption of mNGS technologies in clinical microbiology laboratories has been hindered by high costs, complex protocols, lack of automation, insufficient standardization of bioinformatic pipelines, prolonged turnaround times (24-72 hours), lack of regulatory guidelines for clinical validation, and overall lower sensitivity for detection 
of common pathogens relative to targeted approaches such as polymerase chain reaction (PCR) assays\u003csup\u003e19\u003c/sup\u003e.\u003c/p\u003e\n\u003cp\u003eHere we describe the development, optimization, and clinical validation of a streamlined and largely automated mNGS laboratory-developed test (LDT) with a sample-to-result turnaround time of less than 24 hours for identification of common as well as unexpected and/or novel viral respiratory pathogens. The computational SURPI+ pipeline used by the mNGS assay was modified to provide enhanced analysis capabilities, including viral load quantification, incorporation of curated reference genome databases such as FDA dAtabase for Reference Grade micrObial Sequences (FDA-ARGOS), and sensitive identification of novel, sequence-divergent viruses by \u003cem\u003ede novo\u003c/em\u003e assembly and translated nucleotide alignment. We comprehensively evaluated assay performance metrics, including limits of detection, linearity, precision, inclusivity and exclusivity, contamination, interference, matrix effect, stability, accuracy, and capacity to detect novel viruses.\u003c/p\u003e"},{"header":"Results","content":"\u003cp\u003e\u003cstrong\u003eDevelopment and Optimization of an mNGS Assay for Detection of Viral Respiratory Pathogens\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe developed an mNGS assay for the detection of viral pathogens from respiratory secretions, including upper respiratory swab and bronchoalveolar lavage (BAL) fluid samples \u003cstrong\u003e(Figure 1)\u003c/strong\u003e. We leveraged our 7-year experience running clinical mNGS assays for pathogen detection from cerebrospinal fluid\u003csup\u003e20\u003c/sup\u003e by optimizing the sample preparation and bioinformatics analysis protocols to maximize sensitivity and decrease assay sample-to-result turnaround time. 
We tested different combinations of centrifugation, heat, and addition of a DNA/RNA stabilization medium prior to total nucleic acid extraction and found that centrifugation alone produced the highest yield of detected viral reads. To decrease turnaround times, we used a 15-minute protocol for human rRNA depletion and reduced incubation times for the reverse transcription and second-strand cDNA synthesis steps to 15 and 9 minutes, respectively. The final assay used 450 μL of sample input volume and consisted of the following steps: (1) centrifugation (~15 min), total nucleic acid extraction and DNase treatment for isolation of total RNA (~1 hr), (2) cDNA synthesis with ribosomal RNA (rRNA) depletion (~1 hr), (3) barcoded adapter ligation, library PCR amplification and purification on an automated instrument (~6.5 hr), (4) library pooling (~5 min), (5) Illumina (San Diego, CA)\u0026nbsp;sequencing (5 or 13 hr, depending on whether a MiniSeq or NextSeq sequencer is used), and (6) bioinformatics analysis for viral detection and quantification using the SURPI+ pipeline (~1 hr). Overall sample-to-answer assay turnaround time was 14 - 24 hours. We used MS2 phage and External RNA Controls Consortium (ERCC) RNA Spike-In Mix (Invitrogen, Waltham, MA) added into each sample as internal qualitative and quantitative controls, respectively. 
The MS2 phage and ERCC sequencing results were also used to evaluate and interpret the background level in the sample, generally originating from the human host \u003cstrong\u003e(Supplementary Tables 1 and 2)\u003c/strong\u003e. A commercial reference panel (Accuplex Panel, SeraCare, Milford, MA) consisting of quantified SARS-CoV-2, influenza A, influenza B, and respiratory syncytial virus (RSV) was spiked into pooled virus-negative nasopharyngeal swab matrix (see Methods for details) as an external positive control (PC) for the assay, with pooled virus-negative nasopharyngeal swabs from healthy uninfected donors as the negative matrix serving as an external negative control (NC).\u003c/p\u003e\n\u003cp\u003eThe SURPI+ computational pipeline, run as a container on either a server or cloud, was used for the identification of viral respiratory pathogens from mNGS data\u003csup\u003e21,22\u003c/sup\u003e. Three enhancements were made (\u003cstrong\u003eFigure 2A\u003c/strong\u003e). First, we added the capability for viral load quantification using the PC and a standard curve generated for each sample from the ERCC reads. 
Second, “tagging” of GenBank accession numbers in the SURPI+ database was incorporated to allow inclusion of curated viral reference genomes, such as those deposited in the FDA-ARGOS database\u003csup\u003e23\u003c/sup\u003e, for virus identification by alignment and results reporting\u003cstrong\u003e.\u0026nbsp;\u003c/strong\u003eThird, a custom algorithm consisting of \u003cem\u003ede novo\u003c/em\u003e assembly of metagenomic reads and translated nucleotide, or amino acid, alignment of the reads to a viral protein database was developed to enable detection of novel, sequence-divergent viruses\u003csup\u003e23\u003c/sup\u003e.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eFollowing the review of clinical charts, we investigated the correlation between viral load concentration, quantified in copies per milliliter (cp/mL), and the severity of the infection, which was categorized on a scale ranging from asymptomatic to mild, moderate, and severe \u003cstrong\u003e(Figure 2B).\u0026nbsp;\u003c/strong\u003eWe observed significant differences in median viral loads between patients with asymptomatic/mild and moderate/severe infections (P \u0026lt; 0.001) (Supplemental Fig. 5a). Further stratification of patients into asymptomatic, mild, moderate, and severe infections highlighted an increasing trend in viral load concentrations. Through pairwise comparisons, we noted significant differences between asymptomatic and moderate (P \u0026lt; 0.01), as well as between mild and moderate (P \u0026lt; 0.01) infections. Overall, differences in median viral loads across all severity levels were significant (P \u0026lt; 0.001) (Supplemental Fig. 
5b).\u003c/p\u003e\n\u003cp\u003eQuality control metrics were based on those previously established for a validated cerebrospinal fluid mNGS assay\u003csup\u003e21\u003c/sup\u003e and include a minimum of 5 million preprocessed reads per sample, \u0026gt;75% of data with quality score \u0026gt;30 (Q\u0026gt;30), and successful detection of the internal spiked MS2 phage control and all four respiratory viruses in the PC. A threshold criterion of ≥3 non-overlapping viral reads or contigs aligning to the target viral genome was considered a positive detection. Overall,\u0026nbsp;93% (156 of 167) of positive (n=111) and negative (n=56) nasopharyngeal swab samples met QC metrics; those that did not were excluded from the analysis.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAnalytical Sensitivity\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe adopted Clinical and Laboratory Standards Institute (CLSI) guidelines for NGS-based infectious diseases testing (MM24)\u003csup\u003e24\u003c/sup\u003e and validation of multiplex nucleic acid assays (MM17)\u003csup\u003e25\u003c/sup\u003e to conduct a comprehensive evaluation of assay performance metrics\u003cstrong\u003e\u0026nbsp;(Table 1)\u003c/strong\u003e. To determine limits of detection (LoD), negative nasopharyngeal swab matrix was spiked with the Accuplex Verification Panel and diluted at concentrations ranging from 5,000 to 100 copies/mL, with 10 to 40 replicates at each concentration. By 95% probit analysis, the LoD was determined for each of the four representative organisms in the panel (SARS-CoV-2, Influenza A, Influenza B, and RSV). We found LoDs ranging from 439 to 706 copies/mL for the four respiratory viruses in the positive control \u003cstrong\u003e(Figure 3)\u003c/strong\u003e. 
The achieved average LoD of 550 copies/mL was comparable within one log to reported LoDs from specific reverse transcription-polymerase chain reaction (RT-PCR) assays for detection of viral respiratory pathogens\u003csup\u003e26\u003c/sup\u003e.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eLinearity\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eTo evaluate the assay’s capability to accurately quantitate viral load for detected viruses, a linearity panel was generated using five log dilutions of a quantified high-titer SARS-CoV-2 positive nasal swab sample and compared to a commercially available AccuSpan\u003csup\u003eTM\u003c/sup\u003e HCV RNA Linearity Panel. For both panels, the calculated linearity was 100% after running duplicate or triplicate replicates across a minimum of four 10-fold dilutions \u003cstrong\u003e(Supplementary Figure 1).\u0026nbsp;\u003c/strong\u003eThe absolute log\u003csub\u003e10\u003c/sub\u003e deviation of calculated from expected viral loads was \u0026lt;0.52 log\u003csub\u003e10\u003c/sub\u003e, which was favorable in comparison to the interquartile ranges for virus-specific qPCR assays between different laboratories\u003csup\u003e27\u003c/sup\u003e.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003ePrecision\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe measured intra-assay precision by testing two PC and two NC samples within the same run using different barcodes across 20 runs and inter-assay precision by testing 20 PC and 20 NC samples using different barcodes across 20 separate runs. 
Essential agreement (EA) was 100% and intra- and inter-assay precision were within our \u003cem\u003ea priori\u0026nbsp;\u003c/em\u003eestablished limits of \u0026lt;10% and \u0026lt;30% (log-transformed coefficients of variation in reads per million), respectively \u003cstrong\u003e(Table 1)\u003c/strong\u003e.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eInclusivity and Exclusivity\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eTo evaluate the ability of the mNGS assay to detect a wide range of targets (inclusivity), we obtained commercially available culture supernatants from 17 respiratory viruses representing different sublineages and subspecies. Viruses were spiked into negative control matrix at concentrations ranging from 1.3 x 10\u003csup\u003e3\u003c/sup\u003e to 1.2 x 10\u003csup\u003e7\u0026nbsp;\u003c/sup\u003e50% tissue culture infective dose (TCID50) per mL in a 1:10 ratio \u003cstrong\u003e(Table 2)\u003c/strong\u003e. All 17 (100%) of 17 viruses in these contrived samples were correctly identified by the mNGS assay at the sublineage or subspecies level. Additionally, we identified subtypes of rhinovirus and enterovirus from PCR-positive clinical samples that were not differentiated by multiplex RT-PCR (\u003cstrong\u003eSupplementary Figure 2A\u003c/strong\u003e). We also evaluated the ability of the mNGS assay to identify uncommon or rare viral pathogens associated with respiratory infections (n=8 virus-positive tracheal aspirate samples) or central nervous system (CNS) infections (n=4 cerebrospinal fluid samples) in severely ill hospitalized patients \u003cstrong\u003e(Table 2, Supplementary Figure 2B).\u0026nbsp;\u003c/strong\u003eThe assay detected 11 (100%) of 11 viruses in these samples. 
To assess the exclusivity of the mNGS assay, we spiked two mixtures of microorganisms, including a previously reported positive control mNGS panel consisting of 7 representative pathogens\u003csup\u003e21\u003c/sup\u003e and a commercial reference panel consisting of 10 bacterial and fungal species, into negative nasopharyngeal swab matrix and analyzed multiple aliquots \u003cstrong\u003e(Table 1 and Supplementary Table 3)\u003c/strong\u003e\u003cstrong\u003e.\u0026nbsp;\u003c/strong\u003eDetected reads from non-viral pathogenic organisms did not result in any false-positive detections for viral pathogens. \u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eContamination, Interference, Matrix Effect and Stability\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe evaluated potential cross-contamination between nearby sample wells and carryover contamination across successive runs from 10 SARS-CoV-2 high-titer clinical samples and 24 controls (cycle threshold, or C\u003csub\u003et\u003c/sub\u003e = 16-20) loaded in a modified checkerboard pattern (with at least one space between samples) on a 96-well plate, to mimic a single run on the Illumina NextSeq instrument. Only one possible cross-contamination event was observed, with a single SARS-CoV-2 read detected in one of the negative control wells at a subthreshold reporting level. \u0026nbsp;We also evaluated the effects of interference from human RNA, bacterial DNA, and potential interfering substances on mNGS assay performance. Hemolysis, lipids, bilirubin, and human genomic RNA spiked into PC matrix at concentrations of 0.1 – 100 µg/mL did not interfere with respiratory virus detection, but background DNA/RNA spiked into PC matrix at concentrations ≥1 x 10\u003csup\u003e7\u003c/sup\u003e cells/mL resulted in failure to detect viruses due to high background. 
To evaluate the potential matrix effect from samples with high host background, we analyzed 14 PCR-positive highly mucoid bronchoalveolar lavage (BAL) samples obtained from lung transplant or cystic fibrosis patients undergoing surveillance bronchoscopy \u003cstrong\u003e(Supplementary Table 4)\u003c/strong\u003e. All 14 samples had high host background, and 13 (92.9%) of 14 samples had very high host background. As a result, 6 (42.9%) of 14 samples had neither detection of the internal spiked MS2 phage control nor of a respiratory virus, and were thus excluded from further analysis, as they did not pass sequencing quality control criteria \u003cstrong\u003e(Supplementary Table 1).\u0026nbsp;\u003c/strong\u003eThe respiratory viral pathogen was detected in all (100%) of the remaining 8 samples. We concluded that highly mucoid samples can inhibit the assay due to high host background.\u0026nbsp;Finally, we evaluated mNGS assay stability; qualitative detection was not affected by keeping samples for up to 7 days at 4°C or subjecting the samples to 3 freeze/thaw cycles.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAccuracy\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eTo evaluate accuracy, 191 residual samples after routine clinical testing were obtained from the UCSF Clinical Microbiology Laboratory, including 110 virus-positive samples (104 upper respiratory swab samples and 6 BAL fluids) from patients with acute respiratory infection \u003cstrong\u003e(Supplementary Dataset 1)\u003c/strong\u003e, along with 81 virus-negative samples (52 upper respiratory swab samples and 29 BAL fluids) \u003cstrong\u003e(Figure 4)\u003c/strong\u003e. As more than one target may be positive with mNGS and respiratory viral multiplex panel (RVP) testing using FDA-approved in vitro diagnostic (IVD) assays, sensitivity/specificity analyses were performed by assessing each result independently to assign true/false-positive/negative calls (see Methods for details). 
Compared to results from RVP RT-PCR testing, the mNGS assay exhibited 93.6% (103 of 110) sensitivity, 93.8% (76 of 81) specificity, and 93.7% (179 of 191) accuracy.\u003c/p\u003e\n\u003cp\u003eDiscrepancy testing and clinical adjudication (DTCA) of 14 mNGS positive-RVP negative samples using blinded chart review by two board-certified infectious diseases physicians (PB and CYC) and orthogonal assays run by the California Department of Public Health Viral and Rickettsial Disease Laboratory confirmed the presence of 9 respiratory viruses missed by RVP, allowing them to be reclassified as true positives \u003cstrong\u003e(Supplementary Table 5)\u003c/strong\u003e. Viruses detected by mNGS but not targeted by RVP were not considered false-positive results. In one case, while the original RVP and orthogonal PCR testing returned negative results, mNGS identified rhinovirus C with high confidence. A review of the viral sequences revealed 12 non-overlapping reads across the human rhinovirus C genome \u003cstrong\u003e(Supplementary Figure 3)\u003c/strong\u003e. Cross-contamination was ruled out, as no other sample in the sequencing batch tested positive for rhinovirus. A nucleotide BLAST (blastn) search confirmed sequences with high homology (95-98% identity) to known rhinovirus C strains \u003cstrong\u003e(Supplementary Data 1)\u003c/strong\u003e. 
Although the exact primer binding sites for the clinical RT-PCR assays used in the current study are unknown, we identified, for the rhinovirus C sample, the presence of mismatches in primer and probe regions from previously reported RT-PCR assays targeting the 5’-untranslated region (UTR)\u003csup\u003e28,29\u003c/sup\u003e \u003cstrong\u003e(Supplementary Figure 3C),\u003c/strong\u003e which explained the detection by mNGS despite negative RT-PCR results.\u003c/p\u003e\n\u003cp\u003eSimilarly, DTCA was performed on the 7 mNGS negative / RVP positive samples along with repeating the RVP assay (if possible, on a different instrument). This reassessment resulted in 5.5 samples being reclassified as true negatives (1 sample harbored two organisms adjudicated as one true negative and one false negative) \u003cstrong\u003e(Supplementary Table 6)\u003c/strong\u003e. Compared to a composite standard that incorporates discrepancy testing and clinical adjudication, positive, negative, and overall predictive agreements of the mNGS assay were 98.7% (110.5 of 113), 98.1% (76.5 of 78), and 97.9% (187 of 191), respectively.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eDetection of divergent viruses\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eTo benchmark the capability of the modified SURPI+ pipeline for detection of novel, highly divergent viruses \u003cem\u003ein silico\u003c/em\u003e, we created a simulated sequencing output file containing many known human viral pathogens of clinical and public health significance, including those with pandemic potential \u003cstrong\u003e(Figure 5, left)\u003c/strong\u003e. We then removed all viral reference sequences of the same type (for example, all human polyomaviruses, coronaviruses, or parainfluenza viruses) or corresponding to the same genus or species from the SURPI+ 2019 reference database \u003cstrong\u003e(Figure 5, middle)\u003c/strong\u003e. 
Next, we used the SURPI+ pipeline to analyze the simulated sequencing file against both the original and “filtered” reference databases. In this analysis, 98.6% (69 of 70) of human viruses were detected at a sequencing depth of 100 reads per million (RPM) and 100% (70 of 70) at 1000 RPM based on homology to known animal or plant viruses \u003cstrong\u003e(Figure 5, right)\u003c/strong\u003e. Of note, bunyaviruses pathogenic to humans, which are among the most divergent viruses, were still identified by translated nucleotide (amino acid) alignment to plant viruses (for example, detection of Venezuelan equine encephalitis virus based on homology to vanilla latent virus in \u003cstrong\u003eFigure 3\u003c/strong\u003e).\u003c/p\u003e"},{"header":"Discussion","content":"\u003cp\u003eWe validated a clinical mNGS assay in a CLIA laboratory as a Laboratory Developed Test (LDT) for agnostic viral respiratory pathogen detection intended to aid in patient diagnosis and public health surveillance. Our main goal was to develop, optimize, and streamline a protocol for respiratory viral mNGS testing that could be deployed and run routinely in clinical or public health laboratories. The mNGS assay developed here has favorable performance characteristics compared to clinical RVP testing, including a limit of detection of ~500 copies/mL, viral load quantification with 100% linearity, and sensitivity, specificity, and accuracy ranging from 93.6 \u0026ndash; 93.8%. However, in contrast to targeted assays such as RVP, the mNGS assay is capable of detecting, in principle, all known as well as novel viral pathogens in respiratory samples. In addition, mNGS assay performance was found to be superior to RVP (97.9% versus 95.0% overall agreement) after discrepancy testing and clinical adjudication. 
The correlations\u0026nbsp;we observed between viral load and disease severity highlight the potential for complementary quantitative viral load measurements to help distinguish between asymptomatic infection and/or colonization and overt and/or severe respiratory disease, thereby informing clinical management and treatment, as has been previously demonstrated for certain non-respiratory viruses such as CMV\u003csup\u003e30\u003c/sup\u003e. Following completion of the validation, our assay received breakthrough device designation from the US Food and Drug Administration (FDA) in August of 2023. Widespread implementation of highly accurate, rapid mNGS assays such as this, with enhanced capacity to detect novel viruses, will support robust preparation for and rapid response to the next viral pandemic.\u003c/p\u003e\n\u003cp\u003eSpeed is a critical factor for diagnosis of respiratory infections, especially in critically ill patients with lower respiratory involvement and in outbreak investigations of novel or emerging viruses with pandemic potential. We also aimed to develop an assay that could be widely deployed in clinical and public health laboratories. Thus, we optimized many of the steps of the mNGS assay and moved the key RNA/cDNA library preparation step to an automated platform, the MagicPrep NGS system (Tecan Genomics, Inc., M\u0026auml;nnedorf, Switzerland). We further demonstrated that sequencing can be performed on the Illumina MiniSeq using the Rapid Reagent Kit for a faster 5-hour turnaround time or on the Illumina NextSeq 550Dx using the Mid-Output Reagent Kit for a 13-hour turnaround time, depending on laboratory needs and priorities. 
Altogether, these modifications resulted in an assay with a turnaround time of 14-24 hours and ~2 hours of hands-on technician time.\u003c/p\u003e\n\u003cp\u003eOrthogonal testing and clinical adjudication performed on discordant results demonstrated that the RVP assay is an imperfect gold standard on which to judge mNGS performance. The mNGS assay was able to not only detect uncommon infections from viruses not covered on existing RVP panels, but also, in multiple cases, detect viruses that would in principle be detectable by RVP but tested negative. Unlike RVP, mNGS does not rely on specific primers or probes and is thus less susceptible to primer failure due to viral evolution, as evidenced by the mNGS-positive, RVP-negative rhinovirus case presented here. Primer failure caused by viral mutation, which is an inevitable feature of SARS-CoV-2 and many other RNA viruses\u003csup\u003e31\u003c/sup\u003e, can result in decreased assay sensitivity or false-negative results. Notably, a previous study evaluating the usefulness of published PCR primers in detecting rhinovirus infection reported that none of the published rhinovirus-specific PCR primer pairs could detect all human rhinoviruses in 101 genotyped clinical specimens\u003csup\u003e32\u003c/sup\u003e. In addition, the broader sampling of the viral genome by mNGS may result in increased sensitivity of virus detection compared to RVP due to increased robustness to variability in the relative levels of viral gene expression by infected cells\u003csup\u003e33\u003c/sup\u003e. Most of the false-negative mNGS samples were confirmed as true negative after chart review and repeating the RVP assay. 
Most likely, these represented false-positive results during the original RVP run, given the high cycle thresholds (\u0026gt;36), suggesting low viral titers, or samples that had degraded over time and/or after multiple freezing and thawing cycles.\u003c/p\u003e\n\u003cp\u003eIn this study, we used several approaches to demonstrate the capacity of the mNGS assay to identify novel and/or emerging viruses with divergent genomes. The assay was successful in detecting uncommon and unusual viral pathogens associated with both severe respiratory infections (bronchoalveolar lavage fluid) and central nervous system infections (CSF spiked into respiratory sample matrix). mNGS testing also enabled subtyping of specific viral strains with increased virulence, such as enterovirus D68, which has been linked to acute flaccid myelitis in children\u003csup\u003e34,35\u003c/sup\u003e, and rhinovirus C, which has been associated with invasive pulmonary and bloodstream infection in immunocompromised patients\u003csup\u003e36,37\u003c/sup\u003e. Importantly, the mNGS assay was also able to detect DNA viruses, such as adenovirus and bocavirus, in both clinical and contrived samples, despite the incorporation of DNase treatment in the protocol. Detection of DNA viruses is presumably based on detection of transcribed viral mRNA in infected cells, although it may also be enabled by incomplete DNA digestion by the DNase enzyme.\u003c/p\u003e\n\u003cp\u003eTo evaluate the capacity for mNGS testing using a modified SURPI+ computational pipeline to identify novel viruses, we performed an \u003cem\u003ein silico\u0026nbsp;\u003c/em\u003eanalysis of a contrived metagenomic dataset consisting of reads from the genomes of human viruses of pandemic potential spiked into background using a reference database depleted of all known human viral sequences. 
This analysis was done to simulate whether \u0026ldquo;novel\u0026rdquo; human viruses with pandemic potential could be identified based on homology to known plant and animal viruses. All 70 of the human viral pathogens tested were successfully identified, including those with only remote homology to other viruses. Indeed, chikungunya virus, in the \u003cem\u003eAlphavirus\u003c/em\u003e genus of the \u003cem\u003eTogaviridae\u003c/em\u003e family, was only identified (after removal of all alphaviruses) because of distant homology to vanilla latent virus in the family \u003cem\u003eAlphaflexiviridae\u003c/em\u003e. Notably, alphaflexiviruses contain a distinct lineage of alphavirus-like replication proteins that lack a recognized protease domain\u003csup\u003e38\u003c/sup\u003e. Here we show \u003cem\u003ein silico\u003c/em\u003e that the pipeline is able to detect highly diverse viruses from families that are known to be potentially pathogenic to humans and that emerge from animal reservoirs (for example, \u003cem\u003eBunyaviridae, Flaviviridae,\u0026nbsp;\u003c/em\u003eand \u003cem\u003eAdenoviridae\u003c/em\u003e). If a novel, highly divergent virus from an uncharacterized family were detected, with little to no homology, much more work would be needed to ascertain its clinical significance, or whether it is even capable of infecting humans, including formal assessment of Koch\u0026rsquo;s postulates with modifications by Rivers for causality\u003csup\u003e39\u003c/sup\u003e.\u003c/p\u003e\n\u003cp\u003eOur validation study has limitations. First, we tested very few bronchoalveolar lavage fluid samples from patients with acute respiratory infection (n=6) and very few clinical samples harboring rare or unusual respiratory viruses (n=7), and further validation of assay performance with these kinds of samples is needed. Second, mNGS testing was performed exclusively on samples from US patients, so viral pathogen diversity may not represent all populations globally. 
Third, we did not formally prove that the mNGS assay would be able to detect a novel, sequence-divergent virus, but instead demonstrated the ability of the test to detect such a virus using an \u003cem\u003ein silico\u003c/em\u003e analysis, an approach which nonetheless has been used in previous studies to benchmark mNGS bioinformatic pipelines for viral pathogen discovery\u003csup\u003e40,41\u003c/sup\u003e. Finally, we did not address the utility of the mNGS assay for routine diagnosis in patients with unexplained infections, or for outbreak surveillance in public health, which will likely require future prospective clinical and/or epidemiologic investigation.\u003c/p\u003e\n\u003cp\u003eEven though the respiratory mNGS assay described here has demonstrated high performance characteristics for sensitivity and specificity for the detection of viral pathogens, it is currently unlikely to replace multiplex respiratory panels \u003cu\u003eas a first-line test\u003c/u\u003e since these are inexpensive and have more rapid turnaround times than mNGS. The projected costs of ~$300 USD per sample \u003cstrong\u003e(Supplementary Table 7)\u0026nbsp;\u003c/strong\u003emake the respiratory mNGS assay more expensive than standard RVP tests, for which costs in our clinical laboratory range from $77 to $149 USD. However, the benefits of greatly expanded scope of detection, capability to identify novel emerging viruses, and comparable performance likely outweigh the costs for certain clinical and public health scenarios. The test could be particularly useful in public health laboratories that are more likely to receive and test samples from patients infected with unusual or novel viruses that are not part of the standard RVP testing. 
Of note, a modified protocol based on the assay was used to identify adeno-associated virus 2 in co-infections with adenoviruses and herpesviruses in cases of acute severe hepatitis in children as part of a nationwide US outbreak\u003csup\u003e42\u003c/sup\u003e. The mNGS assay could also be implemented as a second-line test in clinical laboratories for patients with presumed viral bronchiolitis and pneumonia when RVP testing is negative. This strategy would be useful for diagnosis of rare and/or unexpected infections in immunocompromised patients or returning travelers, for whom there is a wider differential diagnosis.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eResource availability\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eLead Contact\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eFurther information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Charles Chiu (
[email protected]).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eMaterials Availability\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis study did not generate any new reagents.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eData and Code Availability\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eHuman-subtracted raw sequence data were submitted to the Sequence Read Archive (SRA) database (BioProject accession number PRJNA1084017 and umbrella BioProject accession number PRJNA171119). Sequence metadata, custom scripts and code for data analyses and visualization are available in a Zenodo data repository (https://doi.org/10.5281/zenodo.10553379).\u003c/p\u003e"},{"header":"Methods details","content":"\u003cp\u003e\u003cstrong\u003eHuman Sample Collection\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eResidual laboratory-confirmed virus-positive upper respiratory swab or BAL samples from clinical patient testing were retrieved from the UCSF Clinical Microbiology Laboratory and stored according to protocols approved by the UCSF Institutional Review Board (protocol no. 11-05519). Acceptable upper respiratory swab samples included (1) bilateral nasopharyngeal swabs, (2) bilateral anterior nares swabs, (3) oropharyngeal swabs, (4) combined nasopharyngeal and oropharyngeal swabs, and (5) combined oropharyngeal/mid-turbinate nasal swabs. All samples were required to meet minimal sample handling, storage, and volume requirements for inclusion in our study. Samples were stored at 4°C for \u0026lt;24 hr prior to being de-identified, aliquoted, and stored in a -80°C freezer prior to mNGS processing, thus undergoing one freeze-thaw cycle.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eInclusion and Ethics\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAll residual samples meeting minimal requirements were included in the study. Samples were de-identified prior to processing. 
\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eExternal controls preparation\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eExternal positive control (PC) was prepared by spiking a pooled negative nasal swab matrix with a commercially available reference material, the Accuplex Verification Panel\u0026nbsp;(SeraCare, Milford, MA). This panel consists of a mixture of non-infectious SARS-CoV-2, influenza A, influenza B, and RSV genomes encapsidated in a synthetic protein coat to mimic the structure of a viral capsid. This PC material was “spiked in” at a titer of approximately 10\u003csup\u003e4\u003c/sup\u003e copies/mL for each virus control, which is 1–2 logs higher than the estimated limit of detection of the assay(~500 copies/mL).\u0026nbsp;The negative matrix was prepared by pooling nasopharyngeal swab samples from asymptomatic individuals and was used as an external negative control (NC).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eNucleic acid extraction\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e500 µL of upper respiratory swab or BAL fluid was centrifuged at 16,000 x g for 10 minutes. The\u0026nbsp;MagMAX™ Viral/Pathogen II (MVP II) Nucleic Acid Isolation Kit\u0026nbsp;(Thermo Fisher Scientific, Waltham, MA) and the KingFisher™\u0026nbsp;Flex Purification System with a 96 deep-well head (Thermo Fisher Scientific, Waltham, MA) were used for total nucleic acid extraction. This protocol\u0026nbsp;was modified to include DNase treatment as a host depletion step during extraction. 
\u0026nbsp;Bacteriophage MS2 (Zeptometrix, Buffalo, NY) was added to all samples including the negative control as an internal qualitative control.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eLibrary preparation and sequencing\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eSimultaneous reverse transcription of purified\u0026nbsp;RNA, spiked in with ERCC RNA controls (Invitrogen, Waltham, MA), and ribosomal RNA (rRNA) depletion were carried out using NEBNext® Ultra™ II RNA First Strand Synthesis Module (New England Biolabs, Ipswich, MA) and QIAseq FastSelect-rRNA HMR Kit (Qiagen, Germantown, MD), respectively,\u0026nbsp;followed by second strand cDNA synthesis using Sequenase™ Version 2.0 DNA Polymerase (Thermo Fisher Scientific, Waltham, MA). Complementary DNA (cDNA) was purified using AMPure XP beads (Beckman Coulter, Brea, CA) and loaded on the MagicPrep NGS instrument (Tecan Genomics, Inc., Männedorf, Switzerland) to undergo end-repair, adapter ligation and barcoding, amplification (25 cycles) and purification. Libraries were quantified and normalized using the Qubit dsDNA HS Assay (Thermo Fisher Scientific,\u0026nbsp;Waltham, MA) on the Qubit Flex (Thermo Fisher Scientific, Waltham, MA). Final pooled libraries were sequenced as single-end reads on either the Illumina (San Diego, CA)\u0026nbsp;MiniSeq using the Rapid Reagent Kit (100 cycles) or on the Illumina NextSeq 550 using the Mid-Output or High-Output Kit (150 cycles).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eBioinformatics\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe SURPI+ computational pipeline, run as a container (v1.0.0) on either a secure server or cloud infrastructure, was used for identification of respiratory viral pathogens from mNGS data. Reads were preprocessed by trimming of adapters and removal of low-complexity and low-quality sequences, followed by computational subtraction of human reads. 
The Scalable Nucleotide Alignment Program (SNAP)\u003csup\u003e43\u003c/sup\u003e nucleotide aligner was run using an edit distance of 16 against the National Center for Biotechnology Information (NCBI) nucleotide (NT) database (March 2019, with inclusion of the SARS-CoV-2 Wuhan-Hu-1 genome accession number NC_045512) filtered to retain only viral reads. The pipeline was modified to include “tagging”, or annotation, of entries from reference sequences that constitute a subset of the NCBI NT database, such as FDA-ARGOS\u003csup\u003e23\u003c/sup\u003e. Note that the FDA-ARGOS database, while\u0026nbsp;quality controlled and regulated,\u0026nbsp;contains only 1,428 microbial strains, the majority of which are bacterial. It had also not been updated with recent viruses such as SARS-CoV-2; thus, we did not detect any reads matching to viral genomes in this study. The pipeline is also able to accommodate additional reference databases as needed such as GISAID\u003csup\u003e44\u003c/sup\u003e.\u0026nbsp;The pipeline was also modified to include optional de novo assembly of reads into contiguous sequences (contigs) and translated nucleotide sequence alignment of both reads and contigs using SPAdes\u003csup\u003e45\u003c/sup\u003e and DIAMOND\u003csup\u003e46\u003c/sup\u003e, respectively. Viral reads are identified using DIAMOND at an e-value cutoff of 10\u003csup\u003e-5\u003c/sup\u003e. Coverage maps were automatically generated by mapping reads classified by SURPI as viral to the most likely reference genome.\u003c/p\u003e\n\u003cp\u003eQuality control metrics for the assay were based on those previously established for cerebrospinal fluid\u003csup\u003e21\u003c/sup\u003e, and include a minimum of 5 million preprocessed reads per sample, \u0026gt;75% of data with quality score \u0026gt;30 (Q\u0026gt;30), and successful detection of the 4 respiratory viruses in the PC and the internal spiked MS2 phage control. 
A criterion of ≥3 non-overlapping viral reads or contigs aligning to the target viral genome was considered a positive detection.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eEvaluation of mNGS analytical performance characteristics\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe automated standard operating procedures and sequencing runs for these clinical validation studies were performed by a state-licensed clinical laboratory scientist. LoD was determined for each of the four representative organisms in the PC by probit analysis using a series of dilutions ranging from 100 to 5,000 copies/mL, with 10 to 40 replicates at each concentration. Linearity was demonstrated by plotting the standard curve. To validate the quantification using the ERCC and the positive control, we serially diluted an HCV-positive plasma to known concentrations ranging from 4 x 10\u003csup\u003e6\u003c/sup\u003e to 4 x 10\u003csup\u003e3\u0026nbsp;\u003c/sup\u003ecopies/mL in triplicate. We then compared the quantitative measure to the known measure. Precision was determined using repeat analysis of two PC and two NC samples across 20 runs (intra-assay reproducibility) and by testing 20 PC and 20 NC across 20 separate runs (inter-assay reproducibility). To assess inclusivity, commercially available cultured supernatants were obtained to assess the assay’s ability to detect the intended targets. Each of the 17 respiratory viruses, with titers ranging from 1.3 x 10\u003csup\u003e4\u003c/sup\u003e to 1.2 x 10\u003csup\u003e8\u003c/sup\u003e TCID50/mL, was spiked into the negative control matrix at 1:10 dilutions. These viruses represented known sublineages and subspecies and we evaluated their identification by our assay. We also tested samples of confirmed virus-positive BAL (n=7) and CSF samples (n=4) spiked into negative matrix to evaluate the detection of unusual viruses. 
To assess the exclusivity of the mNGS assay, we spiked a previously established mixture of seven representative pathogenic organisms to check for false-positive detection of viral pathogens. We evaluated cross-contamination between adjacent sample wells and carryover contamination across successive runs from samples with high viral loads. Interference was determined using PC spiked with known amounts of hemolytic blood, lipids, bilirubin, human RNA, and bacterial DNA/RNA. The effect of mucus in BAL positive fluid was also assessed. Stability was determined by keeping samples for up to 7 days at 4°C or subjecting the samples to 3 freeze/thaw cycles. Accuracy was determined using 191 clinical samples comprising 110 virus-positive samples (103 upper respiratory swab samples and 7 BAL fluids) from patients with acute respiratory infection, along with 81 virus-negative samples (52 upper respiratory swab samples and 29 BAL fluids). Samples were obtained from patients at the University of California, San Francisco (UCSF). The viral RT-PCR comparator assays that were used include the Genmark ePlex (Carlsbad, CA), Luminex NxTAG (Austin, TX), and/or Luminex Verigene RP Flex Respiratory Pathogen Panels. mNGS results were compared with original clinical testing and then with a composite reference standard including discrepancy testing and clinical adjudication. In the second comparison, when results were discordant, orthogonal testing was performed using a different instrument or an independent CLIA laboratory (the California Department of Public Health) in addition to clinical adjudication to reclassify mNGS results. 
The second comparison was reported as positive percent agreement (PPA) and negative percent agreement (NPA), as selective discrepancy testing can bias sensitivity and specificity results.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eOrthogonal discrepancy testing at the California Department of Public Health\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eSpecimens were tested by real-time PCR based on CDC protocols using a viral respiratory panel, an unpublished CDPH laboratory-developed test (LDT). Viruses that can be detected by this panel include human metapneumovirus, respiratory syncytial virus, adenovirus, parainfluenza virus (types 1, 2, 3, and 4), enterovirus/rhinovirus, and human coronaviruses 229E, OC43, NL63, and HKU1. \u0026nbsp; \u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e\u003cem\u003eIn\u0026nbsp;\u003c/em\u003esilico analysis for identification of novel and/or divergent viruses using the SURPI+ pipeline\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eTo measure accurate detection of novel and/or divergent viruses, an \u003cem\u003ein silico\u0026nbsp;\u003c/em\u003eanalysis was performed. Representative viral reference genomes corresponding to outbreak viruses of clinical and public health significance with pandemic potential were retrieved from the NCBI GenBank database, partitioned into non-overlapping segments, and then randomly sampled and spiked \u003cem\u003ein silico\u0026nbsp;\u003c/em\u003einto a negative nasal swab matrix sequencing library. We then took a higher-level set of taxonomic identifiers (species, genus, and/or family) corresponding to these viruses and removed all entries with these taxonomic identifiers from the SURPI+ reference dataset. 
Next, we used the SURPI+ pipeline to analyze the simulated sequencing file against both the original and “restricted reference” databases and evaluated the performance of the pipeline in detecting “simulated” novel and/or divergent viruses that lacked a reference sequence.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eStatistical analyses\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eSensitivity and specificity analyses were performed as follows: as more than one target may be positive with mNGS and RVP, each result was independently assessed in every sample, and true/false-negative/positive calls were assigned to each result accordingly. However, the total number of observations was kept constant (one sample = one observation = 1). For instance, if a test detected two organisms, namely the real culprit pathogen and a contaminant, the former was assigned 0.5 true-positive (TP) and the latter 0.5 false-positive (FP), so that their sum was always equal to 1. In addition, as we used RVP as a comparator which includes a limited number of targets, mNGS positive-RVP negative results that were not a target for the RVP were not considered as false-positive results.\u003c/p\u003e\n\u003cp\u003eStatistical analyses were performed using scipy (version 1.5.3) and rstatix (version 0.7.0) packages as implemented in Python (version 3.7.12) and R (version 4.0.3), respectively. Probit regression analyses were done using scipy (version 1.5.3), numpy (version 1.19.1) and statsmodels (version 0.12.2) as implemented in Python software (version 3.7.12).\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eAcknowledgments\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe thank the staff at the UCSF Clinical Microbiology Laboratory for help in collecting nasopharyngeal swab and bronchoalveolar lavage fluid samples. 
This work was financially supported in part by BARDA EZ-BAA award 75A50122C00022 (C.Y.C.), US CDC grants 75D30122C15360 and 75D30121C12641 (C.Y.C.), Abbott Laboratories (C.Y.C.), and the Chan-Zuckerberg Biohub (C.Y.C.). The funders had no role\u0026nbsp;in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review or approval of the manuscript; and decision to submit the manuscript for publication.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eDisclaimer\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe content of this paper is solely the responsibility of the authors and does not represent the official views or opinions of the National Institutes of Health, Kaiser Permanente, California Department of Public Health or the California Health and Human Services Agency. Use of trade names and commercial sources is for identification only and does not imply endorsement by the California Department of Public Health or the California Health and Human Services Agency.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eC.Y.C. is a founder of Delve Bio and on the scientific advisory board for Delve Bio, Flightpath Biosciences, Biomeme, Mammoth Biosciences, BiomeSense and Poppy Health. He is also an inventor on US patent 11380421, “Pathogen detection using next generation sequencing”, under which algorithms for taxonomic classification, filtering and pathogen detection are used by SURPI+ software. C.Y.C. receives research support from Delve Bio and Abbott Laboratories, Inc. The other authors declare no competing interests.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthor contributions\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eJ. Tan, V.S., D.S., and C.Y.C conceived and designed the study. J. Tan, V.S., D.S., N.S., A.F., H.J.H., J.N., M.O., N.B., J. 
Tang, D.I., B.F., H.R., M.H., C.M., D.A.W., and C.Y.C coordinated the sequencing efforts and laboratory studies. J. Tan, A.C., H.C., and S.Y. processed samples. J. Tan, V.S., D.S., E.K., A.C., H.C., S.Y., M.D.L., P.B., and C.Y.C. analyzed data. J. Tan, N.S., A.F., J.N., M.O., P.M.M., and C.L. collected samples. J. Tan, V.S., E.K., P.B., M.D.L and C.Y.C. wrote the manuscript. J. Tan, V.S., E.K., P.B., and C.Y.C. prepared the figures. J. Tan, V.S., D.S., E.K., N.S., A.F., H.J.H., J.N., M.O., N.B., J. Tang, D.I., B.F., H.R., M.H., D.A.W., P.M.M., C.R.L., M.D.L., P.B., and C.Y.C edited the manuscript. J. Tan, V.S., E.K., M.D.L., P.B., and C.Y.C. revised the manuscript. All authors read the manuscript and agree to its contents.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n \u003cli\u003eDALYs, G.B.D.\u003cem\u003e, et al.\u003c/em\u003e Global, regional, and national disability-adjusted life years (DALYs) for 306 diseases and injuries and healthy life expectancy (HALE) for 188 countries, 1990-2013: quantifying the epidemiological transition. \u003cem\u003eLancet\u003c/em\u003e \u003cstrong\u003e386\u003c/strong\u003e, 2145-2191, doi: 10.1016/S0140-6736(15)61340-X (2015).\u003c/li\u003e\n \u003cli\u003eJain, S.\u003cem\u003e, et al.\u003c/em\u003e Community-Acquired Pneumonia Requiring Hospitalization among U.S. Adults. \u003cem\u003eN Engl J Med\u003c/em\u003e \u003cstrong\u003e373\u003c/strong\u003e, 415-427, doi: 10.1056/NEJMoa1500245 (2015).\u003c/li\u003e\n \u003cli\u003eJain, S.\u003cem\u003e, et al.\u003c/em\u003e Community-acquired pneumonia requiring hospitalization among U.S. children. \u003cem\u003eN Engl J Med\u003c/em\u003e \u003cstrong\u003e372\u003c/strong\u003e, 835-845, doi: 10.1056/NEJMoa1405870 (2015).\u003c/li\u003e\n \u003cli\u003eMusher, D.M. \u0026amp; Thorner, A.R. Community-acquired pneumonia. 
\u003cem\u003eN Engl J Med\u003c/em\u003e \u003cstrong\u003e371\u003c/strong\u003e, 1619-1628, doi: 10.1056/NEJMra1312885 (2014).\u003c/li\u003e\n \u003cli\u003eCharlton, C.L.\u003cem\u003e, et al.\u003c/em\u003e Practical Guidance for Clinical Microbiology Laboratories: Viruses Causing Acute Respiratory Tract Infections. \u003cem\u003eClin Microbiol Rev\u003c/em\u003e \u003cstrong\u003e32\u003c/strong\u003e, doi: 10.1128/CMR.00042-18 (2019).\u003c/li\u003e\n \u003cli\u003eEvans, S.E.\u003cem\u003e, et al.\u003c/em\u003e Nucleic Acid-based Testing for Noninfluenza Viral Pathogens in Adults with Suspected Community-acquired Pneumonia. An Official American Thoracic Society Clinical Practice Guideline. \u003cem\u003eAm J Respir Crit Care Med\u003c/em\u003e \u003cstrong\u003e203\u003c/strong\u003e, 1070-1087, doi: 10.1164/rccm.202102-0498ST (2021).\u003c/li\u003e\n \u003cli\u003eJain, S. Epidemiology of Viral Pneumonia. \u003cem\u003eClin Chest Med\u003c/em\u003e \u003cstrong\u003e38\u003c/strong\u003e, 1-9, doi: 10.1016/j.ccm.2016.11.012 (2017).\u003c/li\u003e\n \u003cli\u003eSchlaberg, R.\u003cem\u003e, et al.\u003c/em\u003e Viral Pathogen Detection by Metagenomics and Pan-Viral Group Polymerase Chain Reaction in Children With Pneumonia Lacking Identifiable Etiology. \u003cem\u003eJ Infect Dis\u003c/em\u003e \u003cstrong\u003e215\u003c/strong\u003e, 1407-1415, doi: 10.1093/infdis/jix148 (2017).\u003c/li\u003e\n \u003cli\u003eJones, K.E.\u003cem\u003e, et al.\u003c/em\u003e Global trends in emerging infectious diseases. \u003cem\u003eNature\u003c/em\u003e \u003cstrong\u003e451\u003c/strong\u003e, 990-993, doi: 10.1038/nature06536 (2008).\u003c/li\u003e\n \u003cli\u003eZhou, P.\u003cem\u003e, et al.\u003c/em\u003e A pneumonia outbreak associated with a new coronavirus of probable bat origin. 
\u003cem\u003eNature\u003c/em\u003e \u003cstrong\u003e579\u003c/strong\u003e, 270-273, doi: 10.1038/s41586-020-2012-7 (2020).\u003c/li\u003e\n \u003cli\u003eLu, R.\u003cem\u003e, et al.\u003c/em\u003e Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding. \u003cem\u003eLancet\u003c/em\u003e \u003cstrong\u003e395\u003c/strong\u003e, 565-574, doi: 10.1016/S0140-6736(20)30251-8 (2020).\u003c/li\u003e\n \u003cli\u003eChiu, C.Y. \u0026amp; Miller, S.A. Clinical metagenomics. \u003cem\u003eNat Rev Genet\u003c/em\u003e \u003cstrong\u003e20\u003c/strong\u003e, 341-355, doi: 10.1038/s41576-019-0113-7 (2019).\u003c/li\u003e\n \u003cli\u003eSimner, P.J., Miller, S. \u0026amp; Carroll, K.C. Understanding the Promises and Hurdles of Metagenomic Next-Generation Sequencing as a Diagnostic Tool for Infectious Diseases. \u003cem\u003eClin Infect Dis\u003c/em\u003e \u003cstrong\u003e66\u003c/strong\u003e, 778-788, doi: 10.1093/cid/cix881 (2018).\u003c/li\u003e\n \u003cli\u003eBlauwkamp, T.A.\u003cem\u003e, et al.\u003c/em\u003e Analytical and clinical validation of a microbial cell-free DNA sequencing test for infectious disease. \u003cem\u003eNat Microbiol\u003c/em\u003e \u003cstrong\u003e4\u003c/strong\u003e, 663-674, doi: 10.1038/s41564-018-0349-6 (2019).\u003c/li\u003e\n \u003cli\u003eGaston, D.C.\u003cem\u003e, et al.\u003c/em\u003e Evaluation of Metagenomic and Targeted Next-Generation Sequencing Workflows for Detection of Respiratory Pathogens from Bronchoalveolar Lavage Fluid Specimens. \u003cem\u003eJ Clin Microbiol\u003c/em\u003e \u003cstrong\u003e60\u003c/strong\u003e, e0052622, doi: 10.1128/jcm.00526-22 (2022).\u003c/li\u003e\n \u003cli\u003eWilson, M.R.\u003cem\u003e, et al.\u003c/em\u003e Clinical Metagenomic Sequencing for Diagnosis of Meningitis and Encephalitis. 
\u003cem\u003eN Engl J Med\u003c/em\u003e \u003cstrong\u003e380\u003c/strong\u003e, 2327-2340, doi: 10.1056/NEJMoa1803396 (2019).\u003c/li\u003e\n \u003cli\u003eLee, R.A., Al Dhaheri, F., Pollock, N.R. \u0026amp; Sharma, T.S. Assessment of the Clinical Utility of Plasma Metagenomic Next-Generation Sequencing in a Pediatric Hospital Population. \u003cem\u003eJ Clin Microbiol\u003c/em\u003e \u003cstrong\u003e58\u003c/strong\u003e, doi: 10.1128/JCM.00419-20 (2020).\u003c/li\u003e\n \u003cli\u003eHan, D.\u003cem\u003e, et al.\u003c/em\u003e The Real-World Clinical Impact of Plasma mNGS Testing: an Observational Study. \u003cem\u003eMicrobiol Spectr\u003c/em\u003e \u003cstrong\u003e11\u003c/strong\u003e, e0398322, doi: 10.1128/spectrum.03983-22 (2023).\u003c/li\u003e\n \u003cli\u003eMiller, S. \u0026amp; Chiu, C. The Role of Metagenomics and Next-Generation Sequencing in Infectious Disease Diagnosis. \u003cem\u003eClin Chem\u003c/em\u003e \u003cstrong\u003e68\u003c/strong\u003e, 115-124, doi: 10.1093/clinchem/hvab173 (2021).\u003c/li\u003e\n \u003cli\u003eBenoit, P.\u003cem\u003e, et al.\u003c/em\u003e Metagenomic next-generation sequencing of cerebrospinal fluid for diagnosis of central nervous system infections: 7-year performance of a clinically validated test. \u003cem\u003emedRxiv\u003c/em\u003e, doi: (2024).\u003c/li\u003e\n \u003cli\u003eMiller, S.\u003cem\u003e, et al.\u003c/em\u003e Laboratory validation of a clinical metagenomic sequencing assay for pathogen detection in cerebrospinal fluid. \u003cem\u003eGenome Res\u003c/em\u003e \u003cstrong\u003e29\u003c/strong\u003e, 831-842, doi: 10.1101/gr.238170.118 (2019).\u003c/li\u003e\n \u003cli\u003eNaccache, S.N.\u003cem\u003e, et al.\u003c/em\u003e A cloud-compatible bioinformatics pipeline for ultrarapid pathogen identification from next-generation sequencing of clinical samples. 
\u003cem\u003eGenome Res\u003c/em\u003e \u003cstrong\u003e24\u003c/strong\u003e, 1180-1192, doi: 10.1101/gr.171934.113 (2014).\u003c/li\u003e\n \u003cli\u003eSichtig, H.\u003cem\u003e, et al.\u003c/em\u003e FDA-ARGOS is a database with public quality-controlled reference genomes for diagnostic use and regulatory science. \u003cem\u003eNat Commun\u003c/em\u003e \u003cstrong\u003e10\u003c/strong\u003e, 3313, doi: 10.1038/s41467-019-11306-6 (2019).\u003c/li\u003e\n \u003cli\u003eClinical and Laboratory Standards Institute. Molecular Methods for Genotyping and Strain Typing of Infectious Organisms, 1st Edition. Vol. 24 (ed. Clinical and Laboratory Standards Institute) (Clinical and Laboratory Standards Institute, Wayne, Pennsylvania, 2021).\u003c/li\u003e\n \u003cli\u003eClinical and Laboratory Standards Institute. Validation and Verification of Multiplex Nucleic Acid Assays, 2nd Edition. Vol. 9 (ed. Clinical and Laboratory Standards Institute) (Clinical and Laboratory Standards Institute, Wayne, Pennsylvania, 2018).\u003c/li\u003e\n \u003cli\u003eEspy, M.J.\u003cem\u003e, et al.\u003c/em\u003e Real-time PCR in clinical microbiology: applications for routine laboratory testing. \u003cem\u003eClin Microbiol Rev\u003c/em\u003e \u003cstrong\u003e19\u003c/strong\u003e, 165-256, doi: 10.1128/CMR.19.1.165-256.2006 (2006).\u003c/li\u003e\n \u003cli\u003eHayden, R.T.\u003cem\u003e, et al.\u003c/em\u003e Progress in Quantitative Viral Load Testing: Variability and Impact of the WHO Quantitative International Standards. \u003cem\u003eJ Clin Microbiol\u003c/em\u003e \u003cstrong\u003e55\u003c/strong\u003e, 423-430, doi: 10.1128/JCM.02044-16 (2017).\u003c/li\u003e\n \u003cli\u003eAndeweg, A.C., Bestebroer, T.M., Huybreghs, M., Kimman, T.G. \u0026amp; de Jong, J.C. Improved detection of rhinoviruses in clinical samples by using a newly developed nested reverse transcription-PCR assay. 
\u003cem\u003eJ Clin Microbiol\u003c/em\u003e \u003cstrong\u003e37\u003c/strong\u003e, 524-530, doi: 10.1128/JCM.37.3.524-530.1999 (1999).\u003c/li\u003e\n \u003cli\u003eLu, X.\u003cem\u003e, et al.\u003c/em\u003e Real-time reverse transcription-PCR assay for comprehensive detection of human rhinoviruses. \u003cem\u003eJ Clin Microbiol\u003c/em\u003e \u003cstrong\u003e46\u003c/strong\u003e, 533-539, doi: 10.1128/JCM.01739-07 (2008).\u003c/li\u003e\n \u003cli\u003eRazonable, R.R. \u0026amp; Hayden, R.T. Clinical utility of viral load in management of cytomegalovirus infection after solid organ transplantation. \u003cem\u003eClin Microbiol Rev\u003c/em\u003e \u003cstrong\u003e26\u003c/strong\u003e, 703-727, doi: 10.1128/CMR.00015-13 (2013).\u003c/li\u003e\n \u003cli\u003eClark, C., Schrecker, J., Hardison, M. \u0026amp; Taitel, M.S. Validation of reduced S-gene target performance and failure for rapid surveillance of SARS-CoV-2 variants. \u003cem\u003ePLoS One\u003c/em\u003e \u003cstrong\u003e17\u003c/strong\u003e, e0275150, doi: 10.1371/journal.pone.0275150 (2022).\u003c/li\u003e\n \u003cli\u003eFaux, C.E.\u003cem\u003e, et al.\u003c/em\u003e Usefulness of published PCR primers in detecting human rhinovirus infection. \u003cem\u003eEmerg Infect Dis\u003c/em\u003e \u003cstrong\u003e17\u003c/strong\u003e, 296-298, doi: 10.3201/eid1702.101123 (2011).\u003c/li\u003e\n \u003cli\u003eRussell, A.B., Trapnell, C. \u0026amp; Bloom, J.D. Extreme heterogeneity of influenza virus infection in single cells. \u003cem\u003eElife\u003c/em\u003e \u003cstrong\u003e7\u003c/strong\u003e, doi: 10.7554/eLife.32303 (2018).\u003c/li\u003e\n \u003cli\u003eGreninger, A.L.\u003cem\u003e, et al.\u003c/em\u003e A novel outbreak enterovirus D68 strain associated with acute flaccid myelitis cases in the USA (2012-14): a retrospective cohort study. 
\u003cem\u003eLancet Infect Dis\u003c/em\u003e \u003cstrong\u003e15\u003c/strong\u003e, 671-682, doi: 10.1016/S1473-3099(15)70093-9 (2015).\u003c/li\u003e\n \u003cli\u003eMessacar, K.\u003cem\u003e, et al.\u003c/em\u003e Enterovirus D68 and acute flaccid myelitis-evaluating the evidence for causality. \u003cem\u003eLancet Infect Dis\u003c/em\u003e \u003cstrong\u003e18\u003c/strong\u003e, e239-e247, doi: 10.1016/S1473-3099(18)30094-X (2018).\u003c/li\u003e\n \u003cli\u003eLupo, J.\u003cem\u003e, et al.\u003c/em\u003e Disseminated rhinovirus C8 infection with infectious virus in blood and fatal outcome in a child with repeated episodes of bronchiolitis. \u003cem\u003eJ Clin Microbiol\u003c/em\u003e \u003cstrong\u003e53\u003c/strong\u003e, 1775-1777, doi: 10.1128/JCM.03484-14 (2015).\u003c/li\u003e\n \u003cli\u003eSayama, A.\u003cem\u003e, et al.\u003c/em\u003e Comparison of Rhinovirus A-, B-, and C-Associated Respiratory Tract Illness Severity Based on the 5\u0026apos;-Untranslated Region Among Children Younger Than 5 Years. \u003cem\u003eOpen Forum Infect Dis\u003c/em\u003e \u003cstrong\u003e9\u003c/strong\u003e, ofac387, doi: 10.1093/ofid/ofac387 (2022).\u003c/li\u003e\n \u003cli\u003eKreuze, J.F.\u003cem\u003e, et al.\u003c/em\u003e ICTV Virus Taxonomy Profile: Alphaflexiviridae. \u003cem\u003eJ Gen Virol\u003c/em\u003e \u003cstrong\u003e101\u003c/strong\u003e, 699-700, doi: 10.1099/jgv.0.001436 (2020).\u003c/li\u003e\n \u003cli\u003eGuo, C. \u0026amp; Wu, J.Y. Pathogen Discovery in the Post-COVID Era. \u003cem\u003ePathogens\u003c/em\u003e \u003cstrong\u003e13\u003c/strong\u003e, doi: 10.3390/pathogens13010051 (2024).\u003c/li\u003e\n \u003cli\u003eWood, D.E. \u0026amp; Salzberg, S.L. Kraken: ultrafast metagenomic sequence classification using exact alignments. 
\u003cem\u003eGenome Biol\u003c/em\u003e \u003cstrong\u003e15\u003c/strong\u003e, R46, doi: 10.1186/gb-2014-15-3-r46 (2014).\u003c/li\u003e\n \u003cli\u003eFlygare, S.\u003cem\u003e, et al.\u003c/em\u003e Taxonomer: an interactive metagenomics analysis portal for universal pathogen detection and host mRNA expression profiling. \u003cem\u003eGenome Biol\u003c/em\u003e \u003cstrong\u003e17\u003c/strong\u003e, 111, doi: 10.1186/s13059-016-0969-1 (2016).\u003c/li\u003e\n \u003cli\u003eServellita, V.\u003cem\u003e, et al.\u003c/em\u003e Adeno-associated virus type 2 in US children with acute severe hepatitis. \u003cem\u003eNature\u003c/em\u003e \u003cstrong\u003e617\u003c/strong\u003e, 574-580, doi: 10.1038/s41586-023-05949-1 (2023).\u003c/li\u003e\n \u003cli\u003eZaharia, M.\u003cem\u003e, et al.\u003c/em\u003e Alignment in a SNAP: Cancer Diagnosis in the Genomic Age. \u003cem\u003eLaboratory Investigation\u003c/em\u003e \u003cstrong\u003e92\u003c/strong\u003e, 458a-458a, doi: (2012).\u003c/li\u003e\n \u003cli\u003eShu, Y. \u0026amp; McCauley, J. GISAID: Global initiative on sharing all influenza data - from vision to reality. \u003cem\u003eEuro Surveill\u003c/em\u003e \u003cstrong\u003e22\u003c/strong\u003e, doi: 10.2807/1560-7917.ES.2017.22.13.30494 (2017).\u003c/li\u003e\n \u003cli\u003eBankevich, A.\u003cem\u003e, et al.\u003c/em\u003e SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. \u003cem\u003eJ Comput Biol\u003c/em\u003e \u003cstrong\u003e19\u003c/strong\u003e, 455-477, doi: 10.1089/cmb.2012.0021 (2012).\u003c/li\u003e\n \u003cli\u003eBuchfink, B., Xie, C. \u0026amp; Huson, D.H. Fast and sensitive protein alignment using DIAMOND. \u003cem\u003eNat Methods\u003c/em\u003e \u003cstrong\u003e12\u003c/strong\u003e, 59-60, doi: 10.1038/nmeth.3176 (2015).\u003c/li\u003e\n\u003c/ol\u003e"},{"header":"Tables","content":"\u003cp\u003e\u003cstrong\u003eTable 1. 
Performance characteristics of the UCSF viral respiratory mNGS assay\u003c/strong\u003e\u003c/p\u003e\n\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\" width=\"945\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd width=\"9.947089947089948%\" valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eMetrics\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"35.026455026455025%\" valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eMethod\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"25.07936507936508%\" colspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eExpected target\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"29.947089947089946%\" colspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eResults\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"9.947089947089948%\" valign=\"top\"\u003e\n \u003cp\u003eLimit of detection (LoD)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"35.026455026455025%\" valign=\"top\"\u003e\n \u003cp\u003eDetection of PC dilution by probit analysis\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"25.07936507936508%\" colspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003e\u0026lt;1000 copies/mL\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"12.91005291005291%\" valign=\"top\"\u003e\n \u003cp\u003eTarget\u003c/p\u003e\n \u003cp\u003eSARS-CoV-2\u003c/p\u003e\n \u003cp\u003eInfluenza A\u003c/p\u003e\n \u003cp\u003eInfluenza B\u003c/p\u003e\n \u003cp\u003eRSV\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"17.037037037037038%\" valign=\"top\"\u003e\n \u003cp\u003eLoD\u003c/p\u003e\n \u003cp\u003e439 copies/mL\u003c/p\u003e\n \u003cp\u003e706 copies/mL\u003c/p\u003e\n \u003cp\u003e493 copies/mL\u003c/p\u003e\n \u003cp\u003e563 copies/mL\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"9.947089947089948%\" valign=\"top\"\u003e\n 
\u003cp\u003eLinearity\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"35.026455026455025%\" valign=\"top\"\u003e\n \u003cp\u003eCorrelation of PC with assay quantification\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"25.07936507936508%\" colspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003eR\u003csup\u003e2\u003c/sup\u003e \u0026gt; 90%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"12.91005291005291%\" valign=\"top\"\u003e\n \u003cp\u003eR\u003csup\u003e2\u003c/sup\u003e = 100 %\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"17.037037037037038%\" valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"9.947089947089948%\" rowspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003ePrecision\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"35.026455026455025%\" valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003cp\u003eIntra-Assay: PC and NC within the same run across 20 runs.\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"10.476190476190476%\" valign=\"top\"\u003e\n \u003cp\u003eConcordance\u003c/p\u003e\n \u003cp\u003e100% EA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"14.603174603174603%\" valign=\"top\"\u003e\n \u003cp\u003eLog-transformed CV\u003c/p\u003e\n \u003cp\u003e\u0026lt;10%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"12.91005291005291%\" valign=\"top\"\u003e\n \u003cp\u003eConcordance\u003c/p\u003e\n \u003cp\u003e100% EA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"17.037037037037038%\" valign=\"top\"\u003e\n \u003cp\u003eLog-transformed CV\u003c/p\u003e\n \u003cp\u003e\u0026lt;10%\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"38.89541715628672%\" valign=\"top\"\u003e\n \u003cp\u003eInter-Assay: PC and NC across 20 separate runs\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"11.63337250293772%\" valign=\"top\"\u003e\n \u003cp\u003e100% EA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd 
width=\"16.216216216216218%\" valign=\"top\"\u003e\n \u003cp\u003e\u0026lt;30%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"14.336075205640423%\" valign=\"top\"\u003e\n \u003cp\u003e100% EA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"18.91891891891892%\" valign=\"top\"\u003e\n \u003cp\u003e\u0026lt;30%\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"9.947089947089948%\" rowspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003eInclusivity\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"35.026455026455025%\" valign=\"top\"\u003e\n \u003cp\u003eDetection of viruses from diluted culture supernatant\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"25.07936507936508%\" colspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003e100% detection\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"29.947089947089946%\" colspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003e100% detection (17/17)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"38.89541715628672%\" valign=\"top\"\u003e\n \u003cp\u003eDetection of viruses in positive BAL/CSF diluted samples\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"27.849588719153935%\" colspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003e100% detection\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"33.25499412455934%\" colspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003e100% detection (11/11)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"9.947089947089948%\" valign=\"top\"\u003e\n \u003cp\u003eExclusivity\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"35.026455026455025%\" valign=\"top\"\u003e\n \u003cp\u003eDetection of viruses in known organism mixtures\u003csup\u003ea\u003c/sup\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"25.07936507936508%\" colspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003eNo false-positive\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"29.947089947089946%\" 
colspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003eNo false-positive\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"9.947089947089948%\" valign=\"top\"\u003e\n \u003cp\u003eContamination\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"35.026455026455025%\" valign=\"top\"\u003e\n \u003cp\u003eDetection of cross-contamination on the sample wells\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"25.07936507936508%\" colspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003eNo carryover contamination\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"29.947089947089946%\" colspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003eCross-contamination of 0.1% between adjacent wells but no carryover contamination\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"9.947089947089948%\" rowspan=\"4\" valign=\"top\"\u003e\n \u003cp\u003eInterference\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"35.026455026455025%\" valign=\"top\"\u003e\n \u003cp\u003eDetection of PC spiked with hemolytic blood\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"25.07936507936508%\" colspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003eDetection at all concentrations\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"29.947089947089946%\" colspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003eDetection at all concentrations\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"38.89541715628672%\" valign=\"top\"\u003e\n \u003cp\u003eDetection of PC spiked with Human RNA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"27.849588719153935%\" colspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003eDetection at all concentrations\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"33.25499412455934%\" colspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003eDetection at all concentrations\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"38.89541715628672%\" 
valign=\"top\"\u003e\n \u003cp\u003eDetection of PC spiked with bacterial DNA/RNA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"27.849588719153935%\" colspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003eDetection at concentration \u0026le; 10\u003csup\u003e7\u0026nbsp;\u003c/sup\u003ecells/mL\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"33.25499412455934%\" colspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003eDetection at concentration \u0026le; 10\u003csup\u003e7\u0026nbsp;\u003c/sup\u003ecells/mL\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"38.89541715628672%\"\u003e\n \u003cp\u003eDetection of virus-positive overtly mucoid BAL samples\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"27.849588719153935%\" colspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003eDetection in all BAL samples\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"33.25499412455934%\" colspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003eTarget detected in 13/14 (92.9%) valid sample runs\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"9.947089947089948%\" valign=\"top\"\u003e\n \u003cp\u003eStability\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"35.026455026455025%\" valign=\"top\"\u003e\n \u003cp\u003eDetection of targets in samples held at 4\u0026deg;C for 7 days or after 3 freeze-thaw cycles\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"25.07936507936508%\" colspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003e100% concordance\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"29.947089947089946%\" colspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003e100% concordance\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"9.947089947089948%\" valign=\"top\"\u003e\n \u003cp\u003eAccuracy\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"35.026455026455025%\" valign=\"top\"\u003e\n \u003cp\u003eDetection in virus positive and negative samples (n=191)\u003c/p\u003e\n 
\u003c/td\u003e\n \u003ctd width=\"25.07936507936508%\" colspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003eSensitivity \u0026gt; 90%\u003c/p\u003e\n \u003cp\u003eSpecificity \u0026gt; 90%\u003c/p\u003e\n \u003cp\u003eAccuracy \u0026gt; 90%\u003c/p\u003e\n \u003cp\u003ePPA \u0026gt; 90%\u003c/p\u003e\n \u003cp\u003eNPA \u0026gt; 90%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"12.91005291005291%\" valign=\"top\"\u003e\n \u003cp\u003eOriginal testing\u0026nbsp;\u003c/p\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003cp\u003eSensitivity: 93.6%\u003c/p\u003e\n \u003cp\u003eSpecificity: 93.8%\u003c/p\u003e\n \u003cp\u003eAccuracy: 93.7%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"17.037037037037038%\" valign=\"top\"\u003e\n \u003cp\u003eAfter discrepancy testing and clinical adjudication\u003c/p\u003e\n \u003cp\u003ePPA: 98.7%\u003c/p\u003e\n \u003cp\u003eNPA: 98.1%\u003c/p\u003e\n \u003cp\u003eOverall: 97.9%\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"9.947089947089948%\" valign=\"top\"\u003e\n \u003cp\u003eDetection of divergent viruses\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"35.026455026455025%\" valign=\"top\"\u003e\n \u003cp\u003eDetection by an \u003cem\u003ein silico\u003c/em\u003e analysis of divergent viruses (n=70)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"25.07936507936508%\" colspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003eSensitivity \u0026gt; 95%\u003c/p\u003e\n \u003cp\u003eSpecificity \u0026gt; 95%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"29.947089947089946%\" colspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003eSensitivity: 98.6%\u003c/p\u003e\n \u003cp\u003eSpecificity: 100%\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\n\u003cp\u003e(PC) Positive control consisting of 4 respiratory viruses spiked into pooled nasopharyngeal swab matrix; (IC) spiked internal control consisting of an RNA MS2 phage; (NC) Negative control; 
(EA) Essential agreement; (CV) Coefficient of variation; (PPA) positive percent agreement; (NPA) negative percent agreement.\u003c/p\u003e\n\u003cp\u003e\u003csup\u003ea\u003c/sup\u003eTwo mixtures were assessed. The first mixture included detectable concentrations of CMV, HIV, \u003cem\u003eKlebsiella pneumoniae\u003c/em\u003e, \u003cem\u003eStreptococcus agalactiae\u003c/em\u003e, \u003cem\u003eAspergillus niger\u003c/em\u003e, \u003cem\u003eCryptococcus neoformans\u003c/em\u003e and \u003cem\u003eToxoplasma gondii\u003c/em\u003e, and corresponds to positive control material from a previously validated CSF assay\u003csup\u003e21\u003c/sup\u003e. The second mixture was a commercial reference panel, the ZymoBIOMICS Microbial Community Standard (Zymo Research, Tustin, CA), and consisted of 10 bacterial and fungal pathogens at varying concentrations (\u003cem\u003eListeria monocytogenes\u003c/em\u003e - 12%, \u003cem\u003ePseudomonas aeruginosa\u003c/em\u003e - 12%, \u003cem\u003eBacillus subtilis\u003c/em\u003e - 12%, \u003cem\u003eEscherichia coli\u003c/em\u003e - 12%, \u003cem\u003eSalmonella enterica\u003c/em\u003e - 12%, \u003cem\u003eLactobacillus fermentum\u003c/em\u003e - 12%, \u003cem\u003eEnterococcus faecalis\u003c/em\u003e - 12%, \u003cem\u003eStaphylococcus aureus\u003c/em\u003e - 12%, \u003cem\u003eSaccharomyces cerevisiae\u003c/em\u003e - 2%, and \u003cem\u003eCryptococcus neoformans\u003c/em\u003e - 2%) that were spiked into negative nasopharyngeal swab matrix.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTable 2. 
Detection of a broad range of viruses in contrived samples\u003c/strong\u003e\u003c/p\u003e\n\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd width=\"34.370370370370374%\"\u003e\n \u003cp\u003e\u003cstrong\u003eContrived Sample type\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"65.62962962962963%\" colspan=\"2\"\u003e\n \u003cp\u003e\u003cstrong\u003eCorrectly identified Virus by mNGS assay\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"34.370370370370374%\" rowspan=\"4\"\u003e\n \u003cp\u003ePositive cerebrospinal fluid (CSF) spiked in negative matrix\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"65.62962962962963%\" colspan=\"2\"\u003e\n \u003cp\u003eLymphocytic Choriomeningitis Virus (LCMV)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"100%\" colspan=\"2\"\u003e\n \u003cp\u003eHerpes simplex virus 2 (HSV-2)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"100%\" colspan=\"2\"\u003e\n \u003cp\u003eVaricella-zoster virus (VZV)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"100%\" colspan=\"2\"\u003e\n \u003cp\u003eHerpes simplex virus 1 (HSV-1) and Epstein-Barr Virus (EBV)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"34.370370370370374%\" rowspan=\"4\"\u003e\n \u003cp\u003ePositive bronchoalveolar lavage (BAL) spiked in negative matrix\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"32.2962962962963%\"\u003e\n \u003cp\u003eParainfluenza Virus Type 4\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"33.333333333333336%\"\u003e\n \u003cp\u003eParechovirus A\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"49.20993227990971%\"\u003e\n \u003cp\u003eInfluenza C Virus\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd 
width=\"50.79006772009029%\"\u003e\n \u003cp\u003eHuman Bocavirus\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"49.20993227990971%\"\u003e\n \u003cp\u003ePrimate Bocaparvovirus 1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"50.79006772009029%\"\u003e\n \u003cp\u003eCoronavirus 229E\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"49.20993227990971%\"\u003e\n \u003cp\u003eCoronavirus NL63\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"50.79006772009029%\" valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"34.370370370370374%\" rowspan=\"9\"\u003e\n \u003cp\u003eViral culture fluid spiked in negative control matrix (1:10)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"32.2962962962963%\"\u003e\n \u003cp\u003eAdenovirus Type 1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"33.333333333333336%\"\u003e\n \u003cp\u003eCoronavirus 229E\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"49.20993227990971%\"\u003e\n \u003cp\u003eCoronavirus NL63\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"50.79006772009029%\"\u003e\n \u003cp\u003eCoxsackie Virus Type A1\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"49.20993227990971%\"\u003e\n \u003cp\u003eEchovirus\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"50.79006772009029%\"\u003e\n \u003cp\u003eHuman Metapneumovirus 16\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"49.20993227990971%\"\u003e\n \u003cp\u003eInfluenza B Virus\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"50.79006772009029%\"\u003e\n \u003cp\u003eMeasles Virus\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"49.20993227990971%\"\u003e\n \u003cp\u003eMumps Virus\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd 
width=\"50.79006772009029%\"\u003e\n \u003cp\u003eParainfluenza Virus Type 2\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"49.20993227990971%\"\u003e\n \u003cp\u003eParainfluenza Virus Type 3\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"50.79006772009029%\"\u003e\n \u003cp\u003eParainfluenza Virus Type 4A\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"49.20993227990971%\"\u003e\n \u003cp\u003eParechovirus Type 1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"50.79006772009029%\"\u003e\n \u003cp\u003eRhinovirus A16\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"49.20993227990971%\"\u003e\n \u003cp\u003eRhinovirus B14\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"50.79006772009029%\"\u003e\n \u003cp\u003eRubella Virus\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd width=\"49.20993227990971%\"\u003e\n \u003cp\u003eInfluenza B Virus\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd width=\"50.79006772009029%\" valign=\"top\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":true,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"nature-portfolio","isNatureJournal":true,"hasQc":false,"allowDirectSubmit":false,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"","title":"Nature Portfolio","twitterHandle":"","acdcEnabled":false,"dfaEnabled":false,"editorialSystem":"ejp","reportingPortfolio":"","inReviewEnabled":true,"inReviewRevisionsEnabled":false},"keywords":"metagenomic next-generation sequencing, assay development, agnostic detection, respiratory virus detection, pandemic preparedness, SARS-CoV-2, viral diagnostics, SURPI+ computational pipeline for pathogen detection, viral load quantification, diagnostic assay performance, viral multiplex RT-PCR","lastPublishedDoi":"10.21203/rs.3.rs-4492202/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-4492202/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eTools for rapid identification of novel and/or emerging viruses are urgently needed for clinical diagnosis of unexplained infections and pandemic preparedness. Here we developed and clinically validated a largely automated metagenomic next-generation sequencing (mNGS) assay for agnostic detection of respiratory viral pathogens from upper respiratory swab and bronchoalveolar lavage samples in \u0026lt;24 hours. The mNGS assay achieved mean limits of detection of 543 copies/mL, viral load quantification with 100% linearity, and 93.6% sensitivity, 93.8% specificity, and 93.7% accuracy compared to gold-standard clinical multiplex RT-PCR. Performance increased to 97.9% overall predictive agreement after discrepancy testing and clinical adjudication, which was superior to that of RT-PCR (95.0% overall agreement). 
To enable discovery of novel, sequence-divergent human viruses with pandemic potential, \u003cem\u003ede novo \u003c/em\u003eassembly and translated nucleotide algorithms were incorporated into the automated SURPI+ computational pipeline used by the mNGS assay for pathogen detection. Using \u003cem\u003ein silico\u003c/em\u003e analysis, we showed after removal of all human viral sequences from the reference database that 70 (100%) of 70 representative human viral pathogens could still be identified based on homology to related animal or plant viruses. Our assay, which was granted breakthrough device designation from the US Food and Drug Administration (FDA) in August of 2023, demonstrates the feasibility of routine mNGS testing in clinical and public health laboratories, thus enabling a robust and rapid response to the next viral respiratory pandemic.\u003c/p\u003e","manuscriptTitle":"Laboratory validation of a clinical metagenomic next-generation sequencing assay for respiratory virus detection and discovery","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2024-05-31 12:11:06","doi":"10.21203/rs.3.rs-4492202/v1","editorialEvents":[],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"nature-communications","isNatureJournal":true,"hasQc":false,"allowDirectSubmit":false,"externalIdentity":"NCOMMS","sideBox":"Learn more about [Nature Communications](http://www.nature.com/ncomms/)","snPcode":"","submissionUrl":"https://mts-ncomms.nature.com/","title":"Nature Communications","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"ejp","reportingPortfolio":"Nature Communications","inReviewEnabled":true,"inReviewRevisionsEnabled":false}}],"origin":"","ownerIdentity":"65b80b89-19d6-40f5-8461-33afb4a21bb6","owner":[],"postedDate":"May 31st, 2024","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"published-in-journal","subjectAreas":[{"id":32610315,"name":"Health sciences/Medical research/Translational research"},{"id":32610316,"name":"Health sciences/Diseases/Infectious diseases/Viral infection"},{"id":32610317,"name":"Biological sciences/Microbiology/Virology/Metagenomics"},{"id":32610318,"name":"Biological sciences/Microbiology/Clinical microbiology"},{"id":32610319,"name":"Biological sciences/Microbiology/Infectious-disease diagnostics"}],"tags":[],"updatedAt":"2024-11-13T08:07:27+00:00","versionOfRecord":{"articleIdentity":"rs-4492202","link":"https://doi.org/10.1038/s41467-024-51470-y","journal":{"identity":"nature-communications","isVorOnly":false,"title":"Nature Communications"},"publishedOn":"2024-11-12 05:00:00","publishedOnDateReadable":"November 12th, 2024"},"versionCreatedAt":"2024-05-31 
12:11:06","video":"","vorDoi":"10.1038/s41467-024-51470-y","vorDoiUrl":"https://doi.org/10.1038/s41467-024-51470-y","workflowStages":[]},"version":"v1","identity":"rs-4492202","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-4492202","identity":"rs-4492202","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}