Host–Microbiome Archetypes Differentiate Infection from Pathogen Carriage in the Human Lower Airway | Research Square

Charles Langelier, Emily Lydon, Padmini Deosthale, Abigail Glascock, and 9 more

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-8171822/v1 This work is licensed under a CC BY 4.0 License. Status: Under Review. Version 1 posted.

Abstract

Accurately distinguishing lower respiratory tract infection (LRTI) from incidental pathogen carriage (IPC) is clinically challenging. The host immunologic and microbial factors that define the states of LRTI and IPC are poorly understood. We performed host-microbe metatranscriptomic profiling of tracheal aspirate from 326 mechanically ventilated children with clinically adjudicated LRTI (n = 207), IPC (n = 70), or non-infectious acute respiratory illnesses (n = 49).
In the airway microbiome, LRTI was characterized by reduced alpha diversity and taxonomic richness, while IPC was characterized by greater total bacterial abundance, enrichment in respiratory anaerobes, and increased metabolic activity. In terms of host response, patients with LRTI exhibited a distinct lower airway transcriptional signature of innate and adaptive immune activation compared to those with IPC, whose transcriptional profiles were similar to those of uninfected controls. Mediation analyses suggested that the airway microbiome influences the host response to pathogens. An integrated host-microbe metatranscriptomic classifier discriminated LRTI from IPC and controls with an AUC = 0.89 (95% confidence interval (CI) 0.85–0.92). The single gene FABP4, when combined with alpha diversity, performed similarly, and FABP4 protein alone achieved an AUC = 0.88 (95% CI 0.82–0.93). Together, our findings reveal distinct ecological and immunologic archetypes that define LRTI and IPC, and support data-driven, biology-informed LRTI diagnostics that incorporate host and microbe.

Biological sciences/Microbiology/Microbial communities/Microbiome; Health sciences/Molecular medicine; Health sciences/Diseases/Infectious diseases; Biological sciences/Microbiology/Microbial communities/Metagenomics

INTRODUCTION

The upper and lower airways harbor robust microbial communities which together represent the human respiratory microbiome 1 . While primarily comprised of commensal bacteria during states of health, it has been well established that respiratory pathobionts, or microbes with the potential to cause disease, can be incidentally carried within the airway microbiome without eliciting signs or symptoms of infection 2 . This phenomenon of ‘colonization’ or incidental pathogen carriage (IPC), which involves a dynamic relationship between pathobiont, microbiome and host immune response, remains incompletely understood.
IPC frequently complicates the management of acute respiratory illness by confounding accurate lower respiratory tract infection (LRTI) diagnosis. For instance, in patients hospitalized for respiratory failure, the underlying cause often remains unclear for days, as both infectious and non-infectious conditions can present with overlapping clinical features. LRTI diagnostic tests that rely almost exclusively on pathogen detection cannot differentiate between true infection and IPC, nor discern the presence of a non-infectious etiology 3 , 4 . As a result, detection of any potential pathogen, particularly in the setting of respiratory failure, can lead to an LRTI diagnosis, even in the absence of true infection 5 , 6 . This contributes to unnecessary antimicrobial use and missed opportunities to diagnose and treat alternative causes of respiratory failure, such as cardiac conditions or autoinflammatory diseases 7 – 9 . While IPC occurs across the age spectrum, its incidence is higher in children than in adults 10 – 12 . For instance, an estimated 33–90% of children incidentally carry Streptococcus pneumoniae in the airway, compared to < 5% of adults 13 – 16 . Similarly, Moraxella catarrhalis colonizes the nasopharynx in 30–100% of infants but only 1–5% of adults 17 – 19 . Viral IPC is also far more common among children; based on population surveillance studies, an estimated 25% of asymptomatic young children incidentally carry at least one viral pathogen, in contrast to only 2% of adults 20 – 22 . The high baseline rates of respiratory IPC in children may be further increased in the setting of hospitalization and critical illness. The physiological disturbances of critical illness, including disruption of epithelial barriers, immune dysregulation, and introduction of endotracheal tubes, can reshape the respiratory microenvironment, promoting shifts in microbial community composition and facilitating colonization by opportunistic pathogens 23 , 24 .
Among patients who require mechanical ventilation for non-infectious indications, airway colonization with potentially pathogenic organisms occurs in nearly half within the first 24 hours 25 , 26 . Despite its clinical relevance and frequent occurrence, the host and microbial factors that distinguish IPC from LRTI remain incompletely understood. No studies have yet evaluated IPC in the context of the lower airway microbiome and host response, and tests capable of accurately distinguishing LRTI from IPC do not yet exist. To address these gaps, we studied a prospective cohort of critically ill children hospitalized for acute respiratory failure and performed metatranscriptomic RNA sequencing on lower respiratory samples to simultaneously profile both host and microbe. We identify striking host and microbial biosignatures that distinguish the two states, and then leverage findings to build accurate diagnostic classifiers with the potential to advance acute respiratory illness management. RESULTS Patient cohort, clinical adjudication, and pathogen detection Critically ill children with all-cause respiratory failure requiring mechanical ventilation (n = 457) were prospectively enrolled at eight U.S. hospitals between 2/2015 and 12/2017 ( Figure S1 ). 27 – 29 Tracheal aspirate was obtained within 24 hours of intubation and stored in an RNA stabilizing agent. High-quality RNA sequencing data capturing both the respiratory microbiome and host transcriptome were generated on 343 patients. LRTI cases were identified by structured, retrospective clinical adjudication following ICU discharge performed by ≥ 2 physicians trained in critical care medicine or infectious diseases using the CDC/NHSN PNU1 criteria 30 . Adjudicators, who had access to all clinical data in the electronic health record and were blinded to metatranscriptomic results 31 , identified 224 patients (65.3%) with LRTI. 
Alternative, non-infectious etiologies of respiratory failure were adjudicated in 119 (34.7%) patients, and included neurologic conditions, anatomic airway abnormalities, toxin exposures, cardiac conditions, trauma and autoimmune disease. To comprehensively identify respiratory pathogens in the lower airway, a combination of standard-of-care clinical microbiologic testing and respiratory metatranscriptomics was performed. Following this process, 207 patients received a clinical diagnosis of LRTI and had a respiratory pathogen detected (“LRTI” group). Of those with a clear non-infectious cause of respiratory failure, 70 had a pathogen detected (“IPC” group) and 49 patients did not (“CTRL” group). Seventeen patients with clinically adjudicated LRTI but negative microbiologic testing were excluded from the analysis. We found no differences in sex, race, ethnicity, or comorbidities between groups. Patients in the LRTI group had a younger median age of 0.6 years (interquartile range (IQR) 0.2–2.1), compared to 1.6 years (IQR 0.9–6.4) in the IPC group and 9.5 years (IQR 1.3–14.4) in the CTRL group, reflecting the typical epidemiologic differences in pediatric respiratory failure. 4 , 32 Ventilator days and ICU length of stay were slightly longer in LRTI versus IPC, though hospital length of stay was similar and mortality was lower. Antibiotic usage prior to intubation was similar across all groups, and notably, most patients received antibiotics during their hospital course (LRTI 99.0%, IPC 91.4%, CTRL 83.7%). Respiratory syncytial virus (RSV) was the most common respiratory pathogen in the cohort, identified in 52.7% of the LRTI group and 12.9% of the IPC group (P < 0.001) (Fig. 1 ). Other pathogens that statistically differed in prevalence between groups included human metapneumovirus (LRTI 6.8%, IPC 0.0%, P = 0.02), Haemophilus influenzae (LRTI 30.0%, IPC 17.1%, P = 0.04), and Pseudomonas aeruginosa (LRTI 1.9%, IPC 10.0%, P = 0.01). 
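The three study groups follow from crossing the clinical adjudication with pathogen detection, and the reported counts can be checked with simple arithmetic. The sketch below is illustrative only: the function name `assign_group` and the boolean inputs are hypothetical, and the synthetic cohort is built from the counts stated in the text rather than from patient data.

```python
from collections import Counter

def assign_group(adjudicated_lrti, pathogen_detected):
    """Map adjudication and microbiology results to a study group.

    Returns None for adjudicated LRTI without a detected pathogen,
    which the text describes as excluded from the analysis.
    (Hypothetical helper, not the authors' code.)
    """
    if adjudicated_lrti:
        return "LRTI" if pathogen_detected else None
    return "IPC" if pathogen_detected else "CTRL"

# Synthetic cohort reconstructed from the counts reported in the text.
cohort = (
    [(True, True)] * 207      # adjudicated LRTI, pathogen detected
    + [(True, False)] * 17    # adjudicated LRTI, negative microbiology
    + [(False, True)] * 70    # no LRTI, pathogen detected
    + [(False, False)] * 49   # no LRTI, no pathogen
)
counts = Counter(assign_group(a, p) for a, p in cohort)
analyzed = sum(counts[g] for g in ("LRTI", "IPC", "CTRL"))
print(counts, analyzed)
```

With these inputs, the 343 sequenced patients split into 224 adjudicated LRTI and 119 non-infectious cases, and removing the 17 excluded patients leaves the 326 analyzed.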
Several other pathogens were identified at similar rates between groups, including rhinovirus, Moraxella catarrhalis , Staphylococcus aureus , and Streptococcus pneumoniae . Co-detection of both bacteria and viruses was common, comprising 59.4% of LRTI cases and 25.7% of IPC cases. The respiratory microbiome differs between LRTI, IPC, and controls We first sought to compare the composition and function of the lung microbiome between LRTI and IPC, hypothesizing that both biologically relevant and diagnostically useful distinctions may exist. Alpha diversity demonstrated notable differences, with LRTI characterized by a lower Shannon Diversity Index (SDI) compared to either IPC (P adj =2.8e-8) or CTRL (P adj =2.6e-14). In contrast, SDI did not differ between IPC and CTRL groups (P adj =0.59) (Fig. 2 a). Similarly, we found that community richness (total number of unique species in the lung microbiome) was significantly higher in both IPC and CTRL groups compared to those with LRTI (P adj =1.2e-11 and 1.3e-9, respectively) (Fig. 2 b). Community composition also differed between LRTI, IPC, and CTRL groups based on the Bray-Curtis index (P < 0.001) (Fig. 2 c). Despite these distinct microbiome archetypes, the detected pathogen was typically the most abundant microbe in the airway microbiome in both LRTI and IPC ( Figure S2 ). We next compared total bacterial abundance (measured in reads per million, RPM) between groups. Surprisingly, total bacterial abundance was highest in IPC compared to both LRTI (P adj =2.2e-3) and CTRL (P adj =2.7e-3) (Fig. 2 d). In contrast, total viral RPM was highest in LRTI, while the IPC group exhibited an intermediate state with greater viral abundance compared to the CTRL group (P adj =1.3e-9) (Fig. 2 e). Sensitivity analyses subsetting by viral-bacterial co-detection did not markedly change the observed relationships between LRTI, IPC, and CTRL groups ( Figure S3 ). 
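To make the alpha diversity and richness comparisons concrete, the following minimal sketch computes the Shannon Diversity Index and species richness from per-taxon read counts. The natural-log base and the toy count vectors are assumptions for illustration; the text does not state which log base or abundance filtering was used.

```python
import math

def shannon_diversity(counts):
    """Shannon Diversity Index, H = -sum(p_i * ln p_i), from taxon counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def richness(counts):
    """Number of taxa with at least one read."""
    return sum(1 for c in counts if c > 0)

# Toy communities (hypothetical): one dominated by a single taxon,
# one with reads spread evenly across the same number of taxa.
dominated = [970, 10, 10, 10]
even = [250, 250, 250, 250]

print(shannon_diversity(dominated), shannon_diversity(even))
print(richness(dominated), richness(even))
```

Both toy communities have the same richness, but the dominated community has a much lower SDI, which is the pattern the text describes for LRTI relative to IPC and CTRL.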
We next examined pathogen abundance and found that rhinovirus RPM was significantly higher in LRTI compared to IPC (P = 0.047), while mean RSV abundance trended higher in LRTI, but the difference did not reach statistical significance (P = 0.060) ( Figure S4 ). H. influenzae abundance did not differ between groups (P = 0.84), while M. catarrhalis RPM was unexpectedly higher in IPC compared to LRTI (P = 9.1e-3). To further characterize the microbiome features of LRTI and IPC, we carried out differential taxonomic abundance analysis using ANCOM-BC 33 . While RSV was the only microbe with significantly greater abundance in LRTI, we found that IPC was enriched with classically commensal taxa including Prevotella , Neisseria , Porphyromonas , and Streptococcus species (Fig. 2 f, Supp. Data 1 ). We subsequently evaluated pathogen burden, quantified as the proportion of implicated pathogen reads in the microbiome, and found that LRTI was characterized by a significantly greater pathogen burden compared to IPC (63.4% versus 36.6% mean pathogen proportion, P = 5.9e-7) (Fig. 2 g). We hypothesized that virulence factor expression might differ between LRTI and IPC, and thus carried out an exploratory assessment using the MetaVF database 34 . We found that the expression of two H. influenzae virulence factors, HMW1/2 and HxuABC , was higher in LRTI (P = 0.02 and P = 0.03, respectively) (Fig. 2 h, Figure S5, Supp. Data 2 ). HMW1/2 are adhesin proteins that facilitate H. influenzae adherence to the respiratory epithelium 35 , and HxuABC is a specialized ATP-binding cassette transporter for iron acquisition from the host 36 . We additionally assessed microbiome functional differences by profiling metabolic pathways using HUMAnN 37 . We found that compared to LRTI, IPC was characterized by higher expression of metabolic pathways essential for fatty acid beta-oxidation, citrulline biosynthesis, and arginine degradation (Fig. 2 i, Supp. Data 3 ).
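The abundance measures used above, reads per million (RPM) and pathogen burden, are simple ratios. The sketch below shows one plausible way to compute them; the choice of total sequenced reads as the RPM denominator, the function names, and the toy counts are all assumptions, since the text does not spell out the normalization.

```python
def reads_per_million(taxon_reads, total_reads):
    """Normalize a taxon's read count to reads per million sequenced reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return taxon_reads / total_reads * 1_000_000

def pathogen_burden(microbial_reads, pathogens):
    """Proportion of microbial reads assigned to the implicated pathogen(s)."""
    total = sum(microbial_reads.values())
    if total <= 0:
        raise ValueError("no microbial reads")
    implicated = sum(n for taxon, n in microbial_reads.items() if taxon in pathogens)
    return implicated / total

# Hypothetical sample: counts are invented for illustration.
sample = {"RSV": 600, "Prevotella": 300, "Neisseria": 100}
print(reads_per_million(50, 2_000_000))   # 25.0
print(pathogen_burden(sample, {"RSV"}))   # 0.6
```

A sample with a burden near 0.6 would sit close to the mean pathogen proportion reported for LRTI, whereas IPC samples averaged a smaller share of pathogen reads.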
Taken together, our findings suggested that IPC is characterized by a more diverse, taxonomically rich, abundant, and metabolically active respiratory microbiome compared to LRTI. Host airway transcriptional responses distinguish LRTI from IPC and controls. We next tested the hypothesis that the pulmonary host response would differ between LRTI, IPC, and CTRL groups by evaluating the lower airway transcriptome. Principal component analysis demonstrated that LRTI was characterized by a distinct transcriptional signature compared to IPC or CTRL groups (PERMANOVA P adj =0.002), which did not differ (P adj =0.25) (Fig. 3 a, Figure S6 ). This finding was underscored by hierarchical clustering of the top 20 most differentially expressed (DE) genes between LRTI and CTRL groups, which generally separated LRTI from non-LRTI cases, though several IPC cases clustered among LRTI cases (Fig. 3 b). IPC and CTRL patients did not clearly separate based on hierarchical clustering. To better understand host gene expression differences between the three groups at a more granular level, we performed pairwise differential expression analyses, adjusting for age and sex. We identified distinct host signatures that differentiated LRTI from IPC and CTRL groups, with 3517 and 2856 DE genes, respectively (Fig. 3 c, Fig. 3 d, Supp. Data 4 ). In contrast, IPC and CTRL groups demonstrated minimal differences, with only 2 DE genes (Fig. 3 e, Supp. Data 4 ). Among the genes DE between LRTI and both CTRL and IPC groups, we noted that FABP4 , which is expressed in macrophages and encodes a lipid chaperone that modulates leukotriene stability 38 , was a clear outlier in both fold change and statistical significance (Fig. 3 f ). To characterize biological pathways encompassing the DE genes, we carried out gene set enrichment analyses (GSEA) (Fig. 3 g, Supp. Data 5) . 
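The differential expression results are reported with Benjamini-Hochberg (BH) adjusted P values, per the Figure 3 legend. As a reminder of what that adjustment does, here is a minimal sketch of the standard BH step-up procedure; the input P values are invented, and this is not a description of the specific software the authors used.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted P values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

raw = [0.01, 0.04, 0.03, 0.20]  # hypothetical raw P values
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
print(adj, significant)
```

In this toy example only the smallest raw P value survives the 0.05 threshold after adjustment, even though three raw values were below 0.05.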
Canonical infection-related pathways, including interferon signaling, antigen presentation, adaptive immune signaling, and neutrophil degranulation, were all upregulated in LRTI versus IPC, as expected. However, we also noted that interferon signaling pathways were upregulated in IPC compared to CTRL patients, albeit to a lesser extent.

Figure 3. a. Principal component analysis (PCA) of the lower respiratory tract transcriptome. Adjusted P value calculated by PERMANOVA between LRTI and non-LRTI groups. b. Heat map demonstrating hierarchical clustering of patients in each group (LRTI, IPC, CTRL) based on the top 20 differentially expressed (DE) genes between LRTI and CTRL groups. Color bar indicates normalized, Z-score scaled expression of each gene. c. Volcano plot of DE genes between LRTI and CTRL, with Benjamini-Hochberg (BH) adjusted P values < 0.05 colored. d. Volcano plot highlighting DE genes between LRTI and IPC. e. Volcano plot of DE results comparing IPC versus CTRL. f. Normalized FABP4 expression across groups, with Benjamini-Hochberg adjusted P values from the DE analyses. g. Gene-set enrichment analysis (GSEA) highlighting immune pathway enrichment in IPC compared to LRTI (purple) and IPC compared to CTRL (yellow). Top 20 DE pathways shown for the LRTI/IPC comparison, then overlaid with the IPC/CTRL comparison, showing only significant pathways (P adj <0.05). Point size scales inversely with P adj value.

Given that interferon signaling is a central feature of the anti-viral host immune response, we hypothesized that the immunologic features of IPC may differ between viral and bacterial pathogens. To investigate this, we performed differential expression analyses within patients who had viral (n = 223) or bacterial (n = 195) pathogens detected. Aligning with our primary composite analysis, both viral and bacterial LRTI were characterized by distinct airway transcriptional signatures with respect to the CTRL group (3424 and 3140 DE genes, respectively) (Fig. 
4 a, Supp. Data 6 ). Compared to IPC, viral and bacterial LRTI also exhibited distinct host signatures, although with fewer DE genes (1860 and 1991, respectively). Few transcriptomic differences were observed between IPC and CTRL groups, although a subtle signature of 29 DE genes, primarily interferon-stimulated genes (ISGs, e.g., ISG15, IFIH1, OAS3 ), characterized viral IPC (Fig. 4 b). In contrast, there were zero DE genes between bacterial IPC and CTRL groups, suggesting fundamental differences in immune activation between patients incidentally carrying viral versus bacterial pathogens. We next examined expression of individual genes classically associated with anti-viral and anti-bacterial defense. The viral subgroups demonstrated a gradation in ISG expression 39 , ranging from highest in viral LRTI to lowest in CTRL (Fig. 4 c). Given the similar pattern with viral abundance in our microbiome analysis (Fig. 2 e), we hypothesized that the differences in interferon signaling might simply be related to viral load differences between groups. A regression of ISG expression against viral RPM showed that ISG expression did positively correlate with viral load, as expected (adjusted R 2 = 0.45). Interestingly, however, when stratified by group, IPC patients demonstrated a proportionally attenuated response compared to those with LRTI for any given viral load (P = 1.8e-3 for the ISG IFIH1 ) (Fig. 4 d). When age was added as a covariate, this finding did not change (P = 2.0e-3), and other ISGs ( ISG15 , IFI44 ) exhibited the same pattern ( Figure S7 ). While these findings could be due entirely to intrinsic differences in innate immune response activation between individuals, we considered the possibility that the lung microbiome might modulate, at least to some extent, inflammatory gene expression in the setting of pathogen exposure.
To investigate this, we evaluated the relationship between SDI and ISG expression and found that as lung microbiome diversity increased, ISG expression decreased (adjusted R 2 = 0.29, regression P = 1.3e-17) (Fig. 4 e). Furthermore, we found that after adjusting for SDI, between-group differences in the relationship between viral load and ISG expression disappeared (P = 0.50), suggesting that the lung microbiome modulates anti-viral host responses (Fig. 4 f). Mediation analysis suggested that lower SDI was independently associated with higher IFIH1 expression, and that microbiome composition may partially mediate the relationship between viral pathogen presence and interferon signaling, explaining ~ 43% of the group effect on IFIH1 expression (Fig. 4 g). We performed a parallel analysis focused on bacterial LRTI and IPC ( Figure S8 ). In contrast to our observations with viral pathogens, the expression of canonical anti-bacterial innate immunity genes ( GZMB 40 , CD64 41 , and TLR1 42 ) remained relatively constant across a range of bacterial pathogen loads (e.g., for GZMB , adjusted R 2 = 0.07). However, as with viral pathogens, the bacterial IPC group exhibited consistently lower innate immunity gene expression compared to the LRTI group (e.g. for GZMB , P adj =3.0e-04). Applying the same mediation analysis to GZMB demonstrated that lower lung microbiome alpha diversity was independently associated with higher innate immunity gene expression, explaining ~ 31% of the group effect on GZMB expression (Fig. 4 h). Lastly, we assessed the impact of adjusting for SDI in our original pairwise differential expression analyses and found that doing so markedly reduced the lower airway transcriptional differences between LRTI and IPC (Fig. 4 i, Supp. Data 7 ). 
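The "proportion of the group effect explained" figures above can be illustrated with a product-of-coefficients mediation sketch. This is an assumption about the general form of such an analysis, not the authors' method: it uses plain least squares, takes group (LRTI versus IPC) as the exposure, SDI as the mediator, and gene expression as the outcome, and runs on invented toy numbers. The helper names `ols` and `proportion_mediated` are hypothetical.

```python
def ols(columns, y):
    """Least-squares coefficients [intercept, b1, ...] via normal equations."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
        + [sum(X[r][i] * y[r] for r in range(n))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def proportion_mediated(group, mediator, outcome):
    """Indirect effect (a * b) divided by the total effect of group on outcome."""
    total = ols([group], outcome)[1]                 # total effect c
    a = ols([group], mediator)[1]                    # group -> mediator
    _, direct, b = ols([group, mediator], outcome)   # direct effect c', mediator effect b
    return (a * b) / total

# Toy data (hypothetical): group 1 has lower diversity and higher expression.
group = [0, 0, 0, 0, 1, 1, 1, 1]
sdi = [3.2, 2.8, 3.1, 2.9, 2.1, 1.9, 2.2, 1.8]
expr = [1 + 3 * g - m for g, m in zip(group, sdi)]
pm = proportion_mediated(group, sdi, expr)
print(round(pm, 3))
```

In this toy setup a quarter of the group effect on expression flows through diversity; the remaining three quarters is the direct effect. As the Discussion notes, estimates of this kind from an observational cohort cannot establish causality or directionality.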
Integration of host and microbial features enables accurate LRTI diagnosis

Having identified such distinct microbiome and host immune response differences between groups, we next sought to translate our findings into proof-of-concept diagnostic tests. Using LASSO regularized regression, we built diagnostic classifiers to distinguish true infection from the alternative clinically encountered states of IPC or non-infectious acute respiratory illness (Fig. 5 a). Given prior work demonstrating the utility of FABP4 as a pneumonia diagnostic biomarker 43 , we evaluated its performance alone or in combination with alpha diversity. Both FABP4 and SDI performed well individually, although the combination achieved even better classification performance with an area under the receiver operating characteristic curve (AUC) of 0.87 (95% confidence interval (CI) 0.83–0.91) based on 5-fold cross validation (CV). A multi-gene host transcriptional classifier in combination with SDI performed comparably with an AUC of 0.89 (95% CI 0.85–0.92, Fig. 5 b, Table S1 ) and yielded classifier scores that effectively distinguished LRTI from patients in either the IPC or CTRL groups (Fig. 5 c). Considering that a single protein biomarker could have distinct practical utility as a clinical diagnostic, we tested whether protein levels of FABP4 could also effectively differentiate LRTI from IPC and CTRL groups in a subset of patients with FABP4 protein measurements from the lower airway (n = 134). Indeed, the LRTI group had markedly different levels of FABP4 compared to both IPC (P = 6.0e-4) and CTRL (P = 2.3e-9) groups (Fig. 5 d). Consistent with our findings at the transcriptional level, no difference in FABP4 protein levels between IPC and CTRL groups was observed (Fig. 3 f). Notably, we found that respiratory FABP4 alone performed as well as the integrated host/microbe metatranscriptomic classifier (AUC = 0.88, 95% CI 0.82–0.93) (Fig. 
5 e), suggesting promise as a clinical biomarker for both LRTI diagnosis and distinguishing LRTI from IPC.

DISCUSSION

Differentiating LRTI from IPC remains a frequent and unresolved challenge in the care of patients with acute respiratory illness. The resulting diagnostic uncertainty drives antimicrobial overuse and reflects a key gap in our understanding of host-microbe interactions in the lower airway. Here, we deploy metatranscriptomic profiling to holistically characterize the host and microbial features of LRTI and IPC, identifying distinct inflammatory signatures and respiratory microbiome ecology that distinguish these two states of pathobiont existence. Leveraging these findings, we develop host-microbe and practical single biomarker LRTI diagnostic classifiers, offering a path toward more precise, biologically informed diagnostics. Although the implicated pathogens were frequently the most abundant microbes in the lower airway microbiome in both patients with LRTI and IPC, microbiome alpha and beta diversity and taxonomic richness were strikingly different. LRTI was marked by a collapse of alpha diversity, reflecting ecologic disruption that is characteristic of infection 27 , 44 , 45 , whereas IPC resembled uninfected controls, with a diverse and taxonomically rich community composition. Unexpectedly, total bacterial abundance was greater in IPC than in LRTI or controls, which may be explained by the enrichment of commensal taxa such as Prevotella , Neisseria , and Porphyromonas and may reflect a more resilient and balanced barrier microbiota. The IPC state was further characterized by increased expression of diverse metabolic programs (e.g. energy production, fatty-acid β-oxidation, and amino-acid biosynthesis) and, for the H. influenzae virulence factors assessed (HMW1/2 and HxuABC), lower virulence factor expression than in LRTI. Globally, these findings suggest that IPC is characterized by a more robust, diverse, and metabolically active microbiome, tolerant of carriage but resilient to pathogen invasion.
Host inflammatory gene expression in the lower airway also differed markedly between LRTI and IPC. We found that LRTI elicited a distinct transcriptional signature compared to either IPC or controls, comprised of thousands of DE genes related to innate and adaptive immune signaling. In contrast, the lower airway transcriptome of IPC largely resembled that of controls. Sensitivity analyses demonstrated that while no detectable transcriptional differences existed between bacterial IPC and controls, a subtle signal of interferon-stimulated genes distinguished viral IPC from controls. This pattern is consistent with prior reports of interferon activation during asymptomatic viral carriage 46 , 47 , but contrasts with studies in neonates demonstrating that nasopharyngeal colonization with M. catarrhalis , H. influenzae , and S. pneumoniae correlates with mucosal immune shifts and the future development of asthma 48 , 49 . The discrepancy may be due to developmental stage (neonates were not included in our study and may be uniquely susceptible as their respiratory microbiomes are being established) or differences in upper versus lower respiratory tract biology 50 . Regression analyses demonstrated that ISG expression is induced in a viral load-dependent manner, consistent with prior studies 51 , 52 . Intriguingly, however, ISG activation was consistently diminished in the setting of IPC across a range of viral loads. This suggested that IPC may be characterized by a global attenuation of the pathogen recognition-innate immune activation axis. Similar regression analyses involving bacterial cases did not demonstrate a dose-dependent relationship with respect to innate immune gene expression, perhaps reflecting fundamental differences in the coupling of bacterial antigens to the transcriptional activation of host innate immune responses.
That said, we observed consistently higher expression of anti-bacterial immune genes (e.g., GZMB , CD64 , TLR1 ) in LRTI compared to IPC, across a broad range of pathogen abundance, suggesting a fundamental set point difference between the two states, agnostic to pathogen class. Our mediation analysis supports a role for the lung microbiome in moderating the intensity of inflammatory responses to potential invading pathogens. These findings, while proof-of-concept in nature, align with murine models in which microbiome disruption amplifies innate inflammatory responses and influenza-associated lung injury 53 . We estimated that microbiome factors explained ~ 43% of the relationship between group and inflammatory gene expression in viral cases, and ~ 31% in bacterial cases. While an important contribution, this suggests that other mediators (e.g., host genetics, epigenetic modifications, immune memory to related pathogens) primarily account for host response differences between LRTI and IPC. Regardless, our findings underscore the complex, bi-directional relationship between host and microbe that determines whether a pathobiont will cause invasive disease or co-exist innocuously in a microbial community. We acknowledge that a full mediation analysis and determination of causality and directionality is not possible given the inherent limitations of an observational cohort 54 . Distinguishing LRTI from IPC and non-infectious acute respiratory illnesses remains a clinical challenge and underscores the need for better diagnostic tests to guide antimicrobial therapy and patient care. We illustrate that simple host and microbial biomarkers can be used independently, or combined, to build clinically translatable diagnostic tests to address this need. For instance, FABP4 in combination with SDI accurately distinguished children with proven LRTI from those with other causes of acute respiratory failure, including those with IPC, achieving an AUC of 0.87.
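For readers less familiar with the AUC values quoted for the classifiers, the sketch below computes the area under the ROC curve from classifier scores using the pairwise (Mann-Whitney) formulation: the probability that a randomly chosen LRTI case scores higher than a randomly chosen non-LRTI case. The labels and scores are invented, the function name `roc_auc` is hypothetical, and the sketch does not reproduce the LASSO model or the cross-validation scheme described in the Results.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Hypothetical held-out predictions: 1 = LRTI, 0 = IPC or CTRL.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

Here eight of the nine case-control pairs are ordered correctly, so the AUC is 8/9. An AUC of 0.87 to 0.89 can be read the same way: roughly 87 to 89 percent of LRTI and non-LRTI pairs are ranked in the right order.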
As sequencing technology becomes more economical and clinically available, the feasibility and cost effectiveness of performing metatranscriptomic analyses will continue to increase 55 – 57 . Inflammatory protein biomarkers (e.g., procalcitonin, C-reactive protein) are the most widely clinically available class of host-based infectious disease diagnostics, although they only have modest capability of diagnosing LRTI and have not been shown to effectively discriminate between LRTI and IPC. 58 , 59 Thus, we found it promising that FABP4 alone, when measured at the protein level, performed as well as our integrated host/microbe metatranscriptomic model (AUC = 0.88), highlighting the potential clinical utility of this single host biomarker for rapid and accurate diagnosis of infection in this cohort. Our study has several strengths including the incorporation of both host and microbial data using metatranscriptomics, a multicenter design, rigorous and comprehensive adjudication of LRTI and IPC, and a large sample size. Our study also has limitations. We focused exclusively on children because overall they have a higher prevalence of IPC, thus it remains unknown whether our findings are generalizable to adults, or to individuals with less severe respiratory illnesses. The infection and IPC groups differed in age, although we adjusted for this in our analyses. While our study is the largest to date to examine biological differences between infection and IPC, our sample size did limit sub-analyses at the individual pathogen level. Given the inherent limitations of an observational cohort and cross-sectional study design, our mediation analyses should be considered proof-of-concept and will require validation in a more controlled experimental setting. 
Longitudinal sampling could help determine whether diversity collapse precedes, accompanies, or follows infection onset, and studies in gnotobiotic mice could more effectively establish causal relationships between microbiome and host inflammatory responses in the setting of pathobiont challenge. Finally, future studies are needed to evaluate whether our findings at the host or microbiome level generalize to the upper respiratory tract. In sum, we find that LRTI and IPC are characterized by distinct biology with respect to both host and microbe, emphasizing that simply detecting a microbe with known pathogenicity in the respiratory tract is insufficient for clinical diagnosis of infection. It is not just the pathogen alone, but its dynamic relationship with the host immune response and airway microbiome, that determines disease. Our study provides fresh insight into the vexing and common challenge of interpreting positive respiratory tests in patients with acute respiratory illnesses and offers a new approach for improving LRTI diagnostic accuracy and limiting antimicrobial overuse.

METHODS

Study cohort

We studied a prospective multicenter cohort of 457 critically ill children with acute respiratory illnesses requiring mechanical ventilation who were admitted to eight U.S. intensive care units (ICUs) in the National Institute of Child Health and Human Development’s Collaborative Pediatric Critical Care Research Network (CPCCRN) between February 2015 and December 2017 27–29 .
Enrollment sites included: Children’s Hospital Colorado, Aurora, CO, USA; University of California San Francisco, San Francisco, CA, USA; Nationwide Children’s Hospital, Columbus, OH, USA; The Children’s Hospital of Philadelphia, Philadelphia, PA, USA; University of Pittsburgh, Pittsburgh, PA, USA; Children’s Hospital of Michigan, Detroit, MI, USA; University of California Los Angeles, Los Angeles, CA, USA; Children’s National Medical Center and George Washington School of Medicine and Health Sciences, Washington, DC, USA. Children aged 31 days to 17 years who were expected to require mechanical ventilation for at least 72 hours and had tracheal aspirate (TA) sampling performed within 24 hours of intubation were approached for enrollment. Exclusion criteria included TA sample collection > 24 hours after intubation, presence of a tracheostomy tube, any condition in which deep tracheal suctioning was contraindicated, prior mechanical ventilation during the hospitalization, goals of care dictating a do not resuscitate order and/or another request for limited support, or previous enrollment in the study. Eligible patients were identified, and their guardians were approached for consent by clinical research coordinator staff as soon as possible following intubation. Written informed consent for study participation was obtained from legal guardians. An initial waiver of consent was granted for TA samples to be obtained from standard of care suctioning of the endotracheal tube (ETT) and stored until the parents or other legal guardians could be approached for informed consent. Samples from unconsented subjects were subsequently destroyed. The study was approved by the University of Utah central IRB #00088656. 
Adjudication of infection status and definition of subgroups Adjudication of LRTI status was carried out retrospectively by study-site clinicians with access to all clinical, laboratory, microbiology, and radiology data available up to the end of admission, without knowledge of metagenomic next-generation sequencing (mNGS) results. Each patient was reviewed independently by two adjudicators with expertise in pediatric infectious disease and/or critical care to determine the presence or absence of clinical LRTI; disagreements were discussed and resolved by a panel. For this study, patients were classified into three groups: 1) LRTI if they were clinically adjudicated as having LRTI and had supportive microbiology, 2) IPC if they were clinically adjudicated as not having LRTI but had positive microbiology, and 3) CTRL if they were clinically adjudicated as not having LRTI and had negative microbiology. Microbiology included standard-of-care clinical microbiology (multiplex polymerase chain reaction (PCR) and semiquantitative bacterial respiratory cultures) and/or metagenomic detection for pathogenic bacteria and viruses (implementing a validated, rules-based computational model, described in detail below) 27 , 28 , 44 . RNA sequencing Tracheal aspirate (TA) collected within 24 hours of intubation was mixed equi-volume with DNA/RNA shield (Zymo Research, Cat. No R1100) and stored at -80°C. Following bead-bashing, RNA or negative control water samples underwent extraction using the Qiagen AllPrep Kit (Qiagen, Cat. No R2145), followed by DNAse treatment. Sequencing libraries were prepared from purified RNA using the NEBNext Ultra II Library Prep Kit (New England Biolabs, Cat. No E7770L) and dual index barcodes. Human ribosomal RNA depletion was carried out prior to library amplification and pooling using the Cas9-based Depletion of Abundant Sequences by Hybridization (DASH) method 60 . 
Libraries underwent 150-base pair paired-end sequencing on an Illumina NovaSeq 6000 sequencer. Measurement of FABP4 protein levels FABP4 was measured from TA specimens collected within 24 hours of intubation using the SomaScan 7k assay (SomaLogic) 61 – 63 in a subset of this cohort. Following collection, TA specimens underwent centrifugation at 4°C at 15,000 × g for 5 min, subsequently the supernatant was frozen at − 80°C within 30 minutes. Taxonomic mapping from RNA-seq data We employed the CZ ID Illumina mNGS pipeline (v7.1) for taxonomic mapping of microbial sequence data 64 , 65 . This incorporates initial removal of human reads using Kallisto 66 , adapter sequence trimming with fastp 67 , filtering low quality and low complexity reads using PriceSeq 68 and the Lempel-Ziv-Welch algorithm, respectively, and a final scrub of any residual human reads using Bowtie2 69 . Taxonomic classification was then performed on both short reads and assembled contigs using the NCBI nucleotide (NT) and nonredundant (NR) databases. Background and batch correction was performed on species level taxon matrices (see below). Identification and mitigation of background contaminants Negative water controls were processed and sequenced alongside the patient samples to enable characterization and subtraction of background contamination. A previously developed negative binomial model 51 was used to model the distribution of reads of microbial taxa in the negative controls. Mean and dispersion parameters were fit to the data and estimates of the mean were generated for each batch:taxon pair. A single dispersion parameter was generated across all taxa using the MASS package (R, v7.3.58.1). P values were adjusted for multiple testing using the Benjamini-Hochberg False Discovery Rate method. 
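The Benjamini-Hochberg adjustment mentioned above can be illustrated with a short Python sketch. This is not the authors' code (their analysis was run in R); the function name `benjamini_hochberg` and the example P values are hypothetical and chosen only for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original input order."""
    m = len(p_values)
    # Visit P values from largest to smallest so monotonicity can be enforced
    # with a running minimum.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # 1-based rank of this P value in ascending order
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P values for five batch:taxon tests.
raw_p = [0.001, 0.008, 0.039, 0.041, 0.60]
adj_p = benjamini_hochberg(raw_p)
```

In this toy example the third and fourth tests receive the same adjusted value, because the procedure never lets a smaller raw P value end up with a larger adjusted P value than the one ranked after it.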
Microbial taxa that were present at a significantly higher average abundance in participant samples than in negative controls (P adj = 1 hit to the NR database (3) a minimum alignment length of 70 bases to the NT database. Clinical detection of respiratory pathogens Standard-of-care clinical respiratory microbiologic testing was performed based on the discretion of the treating clinicians at each study site. Diagnostics included nasopharyngeal swab respiratory pathogen testing by multiplex PCR and/or TA bacterial semiquantitative cultures. Clinical diagnostic tests on samples obtained within 48 hours of intubation were included in the analyses. Microbes reported by the clinical laboratory as representing laboratory, skin, or environmental contaminants, or reported as mixed upper respiratory flora, were excluded. Detection of respiratory pathogens by metatranscriptomics For bacterial taxa that remained after background filtering, we applied an established rules-based model (RBM) to identify potential respiratory pathogens. In two prior studies, the RBM identified 82–96% of clinically-confirmed lower respiratory pathogens compared to standard of care clinical diagnostics, and permitted detection of otherwise missed potential pathogens in > 50% of patients with clinically adjudicated LRTI but negative standard testing 27 , 44 . The RBM operates by first retaining the most abundant species from each mapped genus, and any lower-abundance species within that genus with known pathogenicity in the respiratory tract based on a curated reference list from epidemiologic surveillance studies. 70 – 73 Species were then ranked by abundance (reads per million values aligned to NT database, sum NT RPM), limiting to the top 15. The largest drop in abundance among this ranked list was identified, and any species above the largest drop in abundance with known ability to cause LRTI as a potential pathogen were counted as bacterial hits. 
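The bacterial steps of the rules-based model described above can be sketched in Python as follows. This is a simplified illustration, not the published implementation: the function name, the dictionary layout, the toy abundances, and the two-species reference list are hypothetical, and measuring the "largest drop" on a log10 scale is an assumption that the text does not specify.

```python
import math


def rules_based_hits(taxa, known_pathogens, top_n=15):
    """Flag candidate bacterial pathogens from background-filtered taxa.

    taxa: list of dicts with keys 'species', 'genus', 'nt_rpm' (nt_rpm > 0).
    known_pathogens: set of species names from a curated reference list.
    """
    # Step 1: keep the most abundant species of each genus, plus any
    # lower-abundance species that appears on the reference list.
    top_by_genus = {}
    for taxon in taxa:
        best = top_by_genus.get(taxon["genus"])
        if best is None or taxon["nt_rpm"] > best["nt_rpm"]:
            top_by_genus[taxon["genus"]] = taxon
    retained = [t for t in taxa
                if top_by_genus[t["genus"]] is t or t["species"] in known_pathogens]

    # Step 2: rank by abundance and keep at most the top 15 species.
    ranked = sorted(retained, key=lambda t: t["nt_rpm"], reverse=True)[:top_n]
    if len(ranked) < 2:
        # Assumption: with a single species there is no drop to locate.
        return [t["species"] for t in ranked if t["species"] in known_pathogens]

    # Step 3: locate the largest drop between consecutive ranks
    # (assumption: computed on log10 abundance).
    drops = [math.log10(ranked[i]["nt_rpm"]) - math.log10(ranked[i + 1]["nt_rpm"])
             for i in range(len(ranked) - 1)]
    cut = drops.index(max(drops))

    # Step 4: species above the drop that are on the reference list are hits.
    return [t["species"] for t in ranked[:cut + 1] if t["species"] in known_pathogens]


# Hypothetical sample: abundances are invented for illustration.
sample_taxa = [
    {"species": "Haemophilus influenzae", "genus": "Haemophilus", "nt_rpm": 5000.0},
    {"species": "Haemophilus parainfluenzae", "genus": "Haemophilus", "nt_rpm": 40.0},
    {"species": "Streptococcus mitis", "genus": "Streptococcus", "nt_rpm": 900.0},
    {"species": "Streptococcus pneumoniae", "genus": "Streptococcus", "nt_rpm": 30.0},
    {"species": "Prevotella melaninogenica", "genus": "Prevotella", "nt_rpm": 20.0},
    {"species": "Veillonella parvula", "genus": "Veillonella", "nt_rpm": 10.0},
]
reference_list = {"Haemophilus influenzae", "Streptococcus pneumoniae"}
hits = rules_based_hits(sample_taxa, reference_list)
```

Here the largest drop falls between the second and third ranked species, so only Haemophilus influenzae is reported; Streptococcus pneumoniae is retained in step 1 but sits below the drop.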
Viruses detected with an abundance > 0.1 RPM with established human respiratory pathogenicity were subsequently identified. Microbial abundance calculations Microbial abundance/load was approximated for each sample by calculating the sum NT RPM. Statistical comparison of sum NT RPM across groups was performed using the wilcox_test() function (rstatix v0.7.2). Resulting P values were adjusted using the Benjamini-Hochberg False Discovery Rate algorithm via the p.adjust() function of the stats (v4.2.3) package. Generalized linear modeling of these relationships was performed using the glm() function of the stats package, specifying a Gaussian distribution and identity link function, and adjusted for both sex and age. These methods were applied to the entire microbial profile as well as subsets of the profile (e.g. bacterial, viral) based on NCBI lineage data. Differences in abundance (sum NT RPM) of individual species of interest between groups were also assessed using this approach. Microbiome diversity analyses Alpha diversity (Shannon Diversity Index, or SDI) was calculated using the diversity() function of the vegan package (v2.6-6.1). Beta diversity (Bray-Curtis dissimilarity) was calculated using the functions vegdist(), betadisper(), permutest() and adonis2() of the vegan package. Principal Coordinate Analysis (PCoA) was performed using the cmdscale() function of the stats package (v4.2.3). Differential microbial abundance analysis Differential abundance analysis was performed using the ANCOM-BC package 33 (v2.8.1) using a library filter of 0, prevalence filter of 10%, alpha level of 0.05 and a pseudo-count of 1. The analysis was adjusted for age and sex, and P values were adjusted for multiple testing using the Benjamini-Hochberg correction. Virulence factor screening Virulence factors were identified from transcriptomic data using the MetaVF toolkit 34 , its associated virulence factor database VFDB2.0, and BLAST (v2.16.0). 
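As a reference point for the diversity measures named in the microbiome diversity analyses, the following Python sketch computes the Shannon index (natural logarithm) and the Bray-Curtis dissimilarity from abundance vectors. It is a plain-Python illustration rather than the vegan code used in the study, and the function names and example vectors are hypothetical.

```python
import math


def shannon_index(abundances):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with non-zero abundance."""
    total = sum(abundances)
    if total == 0:
        return 0.0
    return -sum((a / total) * math.log(a / total) for a in abundances if a > 0)


def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two samples over the same ordered taxa."""
    numerator = sum(abs(x - y) for x, y in zip(sample_a, sample_b))
    denominator = sum(sample_a) + sum(sample_b)
    return numerator / denominator if denominator else 0.0


# Hypothetical abundance profiles over four taxa.
even_community = [25, 25, 25, 25]       # evenly distributed
dominated_community = [97, 1, 1, 1]     # dominated by a single taxon

h_even = shannon_index(even_community)
h_dominated = shannon_index(dominated_community)
dissimilarity = bray_curtis(even_community, dominated_community)
```

The dominated profile yields a much lower Shannon index than the even one, which is the kind of pattern the reduced alpha diversity reported for LRTI refers to.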
Called virulence factors with an associated e-value less than 1e-10 were retained for downstream analysis. Differentially expressed virulence factors were identified using ANCOM-BC 33 with a library filter of 0, prevalence filter of 5%, alpha level of 0.05 and a pseudo-count of 1. Generation of host gene counts RNA-seq reads were pseudoaligned using Kallisto 66 against an index consisting of all transcripts associated with human protein coding genes (GRCh38-based). We excluded samples with fewer than one million exon counts. Gene-level counts were generated using the tximport package, with the scaled TPM method 74 . Genes were retained for subsequent analysis if they had at least 10 counts in at least 20% of the samples in the cohort. The gene counts table underwent variance-stabilizing transformation (VST) using the R package DESeq2 75 , and VST-transformed counts were used in the principal component analysis, hierarchical clustering, assessment of individual genes, and classifier development. Principal component analysis Principal component analysis (PCA) was performed on the complete gene expression matrix using the prcomp function. For data visualization, we plotted PC1 versus PC3, which provided the greatest apparent separation in two dimensions, and ellipses depict 68% confidence regions around group centroids. PC1 versus PC2 and PC2 versus PC3 are shown in the supplement. To formally assess group separation, we ran pairwise PERMANOVA (adonis2, vegan package) on Euclidean distances computed from the full expression matrix (i.e., testing differences in group centroids). P values from the pairwise contrasts were adjusted by the Benjamini-Hochberg method, with P adj <0.05 considered significant. Heat map and hierarchical clustering For display, we selected the 20 most significantly differentially expressed genes in the LRTI versus CTRL comparison, ranked by P adj (see below for DE methods). Each gene was standardized across samples (z-score; mean = 0, SD = 1). 
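Two of the simpler steps above, the gene retention rule (at least 10 counts in at least 20% of samples) and the per-gene z-score standardization, can be sketched in Python. The function names, gene labels, and counts are hypothetical; the use of the sample standard deviation (n - 1 denominator) is an assumption, as the text does not state which estimator was used.

```python
import statistics


def keep_gene(counts, min_count=10, min_fraction=0.20):
    """Return True if the gene has >= min_count reads in >= min_fraction of samples."""
    n_passing = sum(1 for c in counts if c >= min_count)
    return n_passing >= min_fraction * len(counts)


def zscore(values):
    """Standardize one gene across samples to mean 0 and SD 1 (sample SD assumed)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical raw counts for two genes across ten samples.
raw_counts = {
    "GENE_A": [0, 3, 12, 50, 8, 0, 1, 2, 0, 4],   # 2 of 10 samples reach 10 counts
    "GENE_B": [0, 0, 11, 0, 0, 0, 0, 0, 0, 0],    # 1 of 10 samples reaches 10 counts
}
retained_genes = [gene for gene, counts in raw_counts.items() if keep_gene(counts)]

# Hypothetical transformed expression values for one gene across three samples.
standardized = zscore([2.0, 4.0, 6.0])
```

Note that the filter operates on raw counts, whereas in the workflow described above the standardization is applied to the VST-transformed values shown in the heat map.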
Genes were ordered by unsupervised hierarchical clustering using Euclidean distance and complete linkage (ComplexHeatmap defaults). Samples were clustered using correlation distance with Ward.D2 to emphasize similarity of gene expression profiles. Dendrograms were computed for ordering but omitted from the final visual panel. Differential expression and gene set enrichment analyses DE analyses were performed with the R package limma-voom on raw gene-level counts 76 . The design matrix included age and sex as covariates, and where noted, Shannon Diversity Index (SDI). Counts were transformed with voom (mean-variance modeling with precision weights) and quantile normalized across samples. Gene-wise statistics used empirical-Bayes moderated two-sided t-tests; multiple testing was accounted for by Benjamini-Hochberg with P adj <0.05 considered significant. Where individual genes are displayed (e.g., boxplots for FABP4 , IFIH1 , and GZMB ), the adjusted P values from the respective limma DE analyses are displayed. For pathway analysis, we performed pre-ranked gene set enrichment analysis (GSEA) using ReactomePA (gsePathway) on Reactome gene sets with a minimum pathway size of 10 genes and a maximum size of 1500 genes 77 . All genes from each limma DE comparison, ranked by the limma t-statistic, were included as input. For visualization, we displayed the top 20 DE pathways between LRTI and IPC (all statistically significant with P adj <0.05), then overlaid results from the IPC and CTRL comparison, only displaying the significant pathways. Regression and mediation analyses For both viral and bacterial subgroup analyses (run in parallel with an identical pipeline), we modeled gene expression using linear regression with log 10 pathogen abundance (sum NT RPM) and group (LRTI vs IPC) as predictors, including an interaction term (gene expression ~ log 10 pathogen abundance * group). 
Group differences were evaluated using a global F-test comparing this model to the null model (gene expression ~ log 10 pathogen abundance), and sensitivity models additionally adjusted for age. For visualization, raw data points are plotted alongside model-predicted regression lines. The same structure was applied for microbiome diversity (gene expression ~ SDI * group). SDI-adjusted associations were visualized by plotting residuals from the gene expression ~ SDI model against pathogen abundance with group-specific linear fits. To evaluate modulation of innate host gene expression by diversity, we used the mediation package with SDI as the mediator and included pathogen abundance as a covariate in both models (mediator: SDI ~ group + log 10 pathogen abundance; outcome: expression ~ group + SDI + log 10 pathogen abundance) and calculated average causal mediated effect (ACME) and average direct effect (ADE) with 1000 simulations 78 . The mediation models used simpler additive models to maintain the interpretability of the effect estimates and because the interaction terms were not significant in any of the regression models. Viral outcomes focused on interferon-stimulated genes ( IFIH1 , ISG15 , IFI44 ) and bacterial outcomes on canonical anti-bacterial defense genes ( GZMB , CD64 , TLR1 ). Classifier development Binary classifiers were developed to distinguish LRTI from non-LRTI (IPC + CTRL combined). We performed stratified five-fold cross-validation (the same folds were re-used across the different models, with minimum IPC and CTRL counts per fold to keep the folds balanced) and generated out-of-fold predictions for performance assessment. Single-feature models (i.e. FABP4 gene, FABP4 protein, SDI) used logistic regression, as did FABP4 + SDI. Multi-gene models used LASSO logistic regression on all genes (glmnet), with the regularization parameter lambda selected by internal cross-validation 79 . Non-zero coefficients selected by the LASSO model are provided in Table S1 . 
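The fold construction and the scoring of out-of-fold predictions can be illustrated with a minimal Python sketch. It is not the authors' R code: the helper names, the random seed, and the choice to stratify on the three adjudicated groups (so that IPC and CTRL are spread across folds) are assumptions made for illustration, and the AUC here is computed from its rank-based definition rather than with pROC.

```python
import random


def stratified_folds(groups, k=5, seed=0):
    """Assign each sample a fold index 0..k-1, spreading every group evenly across folds."""
    rng = random.Random(seed)
    fold_of = [None] * len(groups)
    for group in sorted(set(groups)):
        members = [i for i, g in enumerate(groups) if g == group]
        rng.shuffle(members)
        # Deal the shuffled members of this group round-robin into the folds.
        for position, sample_index in enumerate(members):
            fold_of[sample_index] = position % k
    return fold_of


def auc(scores, is_positive):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    positives = [s for s, p in zip(scores, is_positive) if p]
    negatives = [s for s, p in zip(scores, is_positive) if not p]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Group sizes taken from the cohort description: 207 LRTI, 70 IPC, 49 CTRL.
groups = ["LRTI"] * 207 + ["IPC"] * 70 + ["CTRL"] * 49
folds = stratified_folds(groups)
ipc_per_fold = [sum(1 for g, f in zip(groups, folds) if g == "IPC" and f == k)
                for k in range(5)]

# Hypothetical out-of-fold scores for four samples (two LRTI, two non-LRTI).
example_auc = auc([0.9, 0.3, 0.8, 0.2], [True, True, False, False])
```

Because the same `folds` list can be passed to every model, all classifiers are evaluated on identical held-out samples, which keeps their out-of-fold performance directly comparable.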
For performance metrics, the reported AUC reflects the mean AUC of each of the five folds computed with the pROC package 80 , and the confidence intervals were obtained by bootstrapping the out-of-fold predictions with 1000 resamples. Declarations Data and code availability Source data are provided with this paper. The raw fastq files with microbial sequencing reads are available under NCBI BioProject ID: PRJNA748764. Deidentified clinical metadata, host and microbial data, code, and source data files are available at: https://github.com/infectiousdisease-langelier-lab/Incidental_pathogen_carriage . References Man WH, de Steenhuijsen Piters WAA, Bogaert D (2017) The microbiota of the respiratory tract: gatekeeper to respiratory health. Nat Rev Microbiol 15:259–270 Robinson J (2004) Colonization and infection of the respiratory tract: What do we know? Paediatr Child Health 9:21–24 Vo P, Kharasch VS (2014) Respiratory Failure. Pediatr Rev 35:476–486 Panetti B et al (2024) Acute Respiratory Failure in Children: A Clinical Update on Diagnosis. Children 11:1232 Jeffrey M, Denny KJ, Lipman J, Conway Morris A (2023) Differentiating infection, colonisation, and sterile inflammation in critical illness: the emerging role of host-response profiling. Intensive Care Med 49:760–771 Lydon EC, Ko ER, Tsalik EL (2018) The host response as a tool for infectious disease diagnosis and management. Expert Rev Mol Diagn 18:723–738 Vaughn VM et al (2019) Excess Antibiotic Treatment Duration and Adverse Events in Patients Hospitalized With Pneumonia: A Multihospital Cohort Study. Ann Intern Med 171:153–163 Gupta AB et al (2024) Inappropriate Diagnosis of Pneumonia Among Hospitalized Adults. JAMA Intern Med 184:548–556 Anadol D, Aydin YZ, Göçmen A (2001) Overdiagnosis of pneumonia in children. 
Turk J Pediatr 43:205–209 Pan H, Cui B, Huang Y, Yang J, Ba-Thein W (2016) Nasal carriage of common bacterial pathogens among healthy kindergarten children in Chaoshan region, southern China: a cross-sectional study. BMC Pediatr 16:161 Regev-Yochay G et al (2004) Nasopharyngeal Carriage of Streptococcus pneumoniae by Adults and Children in Community and Family Settings. Clin Infect Dis 38:632–639 Nokso-Koivisto J, Kinnari TJ, Lindahl P, Hovi T, Pitkäranta A (2002) Human picornavirus and coronavirus RNA in nasopharynx of children without concurrent respiratory symptoms. J Med Virol 66:417–420 Parker AM et al (2024) Upper respiratory Streptococcus pneumoniae colonization among working-age adults with prevalent exposure to overcrowding. Microbiol Spectr 12:e00879-24 Desai AP et al (2015) Decline in Pneumococcal Nasopharyngeal Carriage of Vaccine Serotypes After the Introduction of the 13-Valent Pneumococcal Conjugate Vaccine in Children in Atlanta, Georgia. Pediatr Infect Dis J 34:1168–1174 Bogaert D, De Groot R, Hermans PWM (2004) Streptococcus pneumoniae colonisation: the key to pneumococcal disease. Lancet Infect Dis 4:144–154 Huang SS et al (2011) Healthcare utilization and cost of pneumococcal disease in the United States. Vaccine 29:3398–3412 Vaneechoutte M, Verschraegen G, Claeys G, Weise B, Van den Abeele AM (1990) Respiratory tract carrier rates of Moraxella (Branhamella) catarrhalis in adults and children and interpretation of the isolation of M. catarrhalis from sputum. J Clin Microbiol 28:2674–2680 Faden H, Harabuchi Y, Hong JJ (1994) Epidemiology of Moraxella catarrhalis in children during the first 2 years of life: relationship to otitis media. J Infect Dis 169:1312–1317 Ejlertsen T, Thisted E, Ebbesen F, Olesen B, Renneberg J (1994) Branhamella catarrhalis in children and adults. A study of prevalence, time of colonisation, and association with upper and lower respiratory tract infections. 
J Infect 29:23–31 Most ZM, Perl TM, Sebert M (2024) Respiratory virus infections in symptomatic and asymptomatic children upon hospital admission: new insights. Antimicrob Steward Healthc Epidemiol 4:e162 Jansen RR et al (2011) Frequent Detection of Respiratory Viruses without Symptoms: Toward Defining Clinically Relevant Cutoff Values ▿. J Clin Microbiol 49:2631–2636 Self WH et al (2016) Respiratory Viral Detection in Children and Adults: Comparing Asymptomatic Controls and Patients With Community-Acquired Pneumonia. J Infect Dis 213:584–591 Dickson RP (2016) The microbiome and critical illness. Lancet Respiratory Med 4:59–72 Mourani PM et al (2021) Temporal airway microbiome changes related to ventilator-associated pneumonia in children. Eur Respir J 57 Durairaj L et al (2009) Patterns and density of early tracheal colonization in intensive care unit patients. J Crit Care 24:114–121 Ewig S et al (1999) Bacterial colonization patterns in mechanically ventilated patients with traumatic and medical head injury. Incidence, risk factors, and association with ventilator-associated pneumonia. Am J Respir Crit Care Med 159:188–198 Mick E et al (2023) Integrated host/microbe metagenomics enables accurate lower respiratory tract infection diagnosis in critically ill children. J Clin Invest 133 Tsitsiklis A et al (2022) Lower respiratory tract infections in children requiring mechanical ventilation: a multicentre prospective surveillance study incorporating airway metagenomics. Lancet Microbe 3:e284–e293 Lydon E et al (2025) Proteomic profiling of the local and systemic immune response to pediatric respiratory viral infections. mSystems 10, e0133524 United States Centers for Disease Control and Prevention (2021) CDC/NHSN Surveillance Definitions for Specific Types of Infections. https://www.cdc.gov/nhsn/pdfs/pscmanual/pcsmanual_current.pdf Patel R et al (2023) Clinically Adjudicated Reference Standards for Evaluation of Infectious Diseases Diagnostics. 
Clin Infect Dis 76:938–943 Nitu ME, Eigen H (2009) Respiratory Failure. Pediatr Rev 30:470–478 Lin H, Peddada SD (2020) Analysis of compositions of microbiomes with bias correction. Nat Commun 11:3514 Dong W et al (2024) An expanded database and analytical toolkit for identifying bacterial virulence factors and their associations with chronic diseases. Nat Commun 15:8084 St Geme JW, Yeo H-J (2009) A prototype two-partner secretion pathway: the Haemophilus influenzae HMW1 and HMW2 adhesin systems. Trends Microbiol 17:355–360 Akhtar AA, Turner DPJ (2022) The role of bacterial ATP-binding cassette (ABC) transporters in pathogenesis and virulence: Therapeutic and vaccine potential. Microb Pathog 171:105734 Beghini F et al (2021) Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3. eLife 10, e65088 Furuhashi M, Saitoh S, Shimamoto K, Miura T (2014) Fatty Acid-Binding Protein 4 (FABP4): Pathophysiological Insights and Potent Clinical Biomarker of Metabolic and Cardiovascular Diseases. Clin Med Insights Cardiol 8:23–33 Schneider WM, Chevillotte MD, Rice CM (2014) Interferon-Stimulated Genes: A Complex Web of Host Defenses. Annu Rev Immunol 32:513–545 Hofer U (2017) Granzyme B’s roundhouse kick against bacteria. Nat Rev Microbiol 15:707–707 Icardi M et al (2009) CD64 Index Provides Simple and Predictive Testing for Detection and Monitoring of Sepsis and Bacterial Infection in Hospital Patients. J Clin Microbiol 47:3914–3919 Albiger B, Dahlberg S, Henriques-Normark B, Normark S (2007) Role of the innate immune system in host defence against bacterial infections: focus on the Toll-like receptors. J Intern Med 261:511–528 Lydon EC et al (2024) Pulmonary FABP4 Is an Inverse Biomarker of Pneumonia in Critically Ill Children and Adults. 
Am J Respir Crit Care Med 210:1480–1483 Langelier C et al (2018) Integrating host response and unbiased microbe detection for lower respiratory tract infection diagnosis in critically ill adults. Proc Natl Acad Sci U S A 115:E12353–E12362 Flanagan JL et al (2007) Loss of bacterial diversity during antibiotic treatment of intubated patients colonized with Pseudomonas aeruginosa. J Clin Microbiol 45:1954–1962 Wesolowska-Andersen A et al (2017) Dual RNA-seq reveals viral infections in asthmatic children without respiratory illness which are associated with changes in the airway transcriptome. Genome Biol 18:12 Wolsk HM et al (2016) Picornavirus-Induced Airway Mucosa Immune Profile in Asymptomatic Neonates. J Infect Dis 213:1262–1270 Følsgaard NV et al (2013) Pathogenic bacteria colonizing the airways in asymptomatic neonates stimulates topical inflammatory mediator release. Am J Respir Crit Care Med 187:589–595 Bisgaard H et al (2007) Childhood asthma after bacterial colonization of the airway in neonates. N Engl J Med 357:1487–1495 Cho H-J et al (2021) Differences and similarities between the upper and lower airway: focusing on innate immunity. Rhinology 59:441–450 Mick E et al (2020) Upper airway gene expression reveals suppressed immune responses to SARS-CoV-2 compared with other respiratory viruses. Nat Commun 11:5854 Mick E et al (2022) Upper airway gene expression shows a more robust adaptive immune response to SARS-CoV-2 in children. Nat Commun 13:3937 Ichinohe T et al (2011) Microbiota regulates immune defense against respiratory tract influenza A virus infection. Proc. Natl. Acad. Sci. U.S.A. 108, 5354–5359 Vanderweele TJ, Vansteelandt S (2009) Conceptual issues concerning mediation, interventions and composition. Stat Its Interface 2:457–468 Gaston DC (2023) Clinical Metagenomics for Infectious Diseases: Progress toward Operational Value. 
J Clin Microbiol 61:e01267-22 Benoit P et al (2024) Seven-year performance of a clinical metagenomic next-generation sequencing test for diagnosis of central nervous system infections. Nat Med 30:3522–3533 National Human Genome Research Institute (2024) DNA Sequencing Costs: Data. https://www.genome.gov/about-genomics/fact-sheets/DNA-Sequencing-Costs-Data Self WH et al (2017) Procalcitonin as a Marker of Etiology in Adults Hospitalized With Community-Acquired Pneumonia. Clin Infect Dis 65:183–190 van der Meer V, Neven AK, van den Broek PJ, Assendelft WJJ (2005) Diagnostic value of C reactive protein in infections of the lower respiratory tract: systematic review. BMJ 331:26 Gu W et al (2016) Depletion of Abundant Sequences by Hybridization (DASH): using Cas9 to remove unwanted high-abundance species in sequencing libraries and molecular counting applications. Genome Biol 17:1–13 Gold L et al (2010) Aptamer-based multiplexed proteomic technology for biomarker discovery. PLoS ONE 5:e15004 Kim CH et al (2018) Stability and reproducibility of proteomic profiles measured with an aptamer-based platform. Sci Rep 8:8382 Candia J et al (2024) Variability of 7K and 11K SomaScan Plasma Proteomics Assays. J Proteome Res 23:5531–5539 Kalantar KL et al (2020) IDseq-An open source cloud-based pipeline and analysis service for metagenomic pathogen detection and monitoring. Gigascience 9:giaa111 Lu D et al (2025) Simultaneous detection of pathogens and antimicrobial resistance genes with the open source, cloud-based, CZ ID platform. Genome Med 17:46 Bray NL, Pimentel H, Melsted P, Pachter L (2016) Near-optimal probabilistic RNA-seq quantification. Nat Biotechnol 34:525–527 Chen S, Zhou Y, Chen Y, Gu J (2018) fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34:i884–i890 Ruby JG, Bellare P, Derisi JL (2013) PRICE: software for the targeted assembly of components of (Meta) genomic sequence data. 
G3 (Bethesda) 3:865–880 Langmead B, Salzberg SL (2012) Fast gapped-read alignment with Bowtie 2. Nat Methods. 9 Jain S et al (2015) Community-Acquired Pneumonia Requiring Hospitalization among U.S. Adults. N Engl J Med 373:415–427 Jain S et al (2015) Community-acquired pneumonia requiring hospitalization among U.S. children. N Engl J Med 372:835–845 Iwai S et al (2014) The Lung Microbiome of Ugandan HIV-Infected Pneumonia Patients Is Compositionally and Functionally Distinct from That of San Franciscan Patients. PLoS ONE 9:e95726 Magill SS et al (2014) Multistate point-prevalence survey of health care-associated infections. N Engl J Med 370:1198–1208 Soneson C, Love MI, Robinson MD (2015) Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences. F1000Res 4, 1521 Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15:550 Ritchie ME et al (2015) limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res 43:e47 Croft D et al (2014) The Reactome pathway knowledgebase. Nucleic Acids Res 42:D472–D477 Tingley D, Yamamoto T, Hirose K, Keele L, Imai K (2014) mediation: R Package for Causal Mediation Analysis. J Stat Softw 59:1–38 Friedman J et al (2023) glmnet: Lasso and Elastic-Net Regularized Generalized Linear Models Robin X et al (2011) pROC: an open-source package for R and S + to analyze and compare ROC curves. BMC Bioinformatics 12:77 TABLES Table 1 Demographic and clinical characteristics of the LRTI, IPC, and CTRL groups. Race is indicated as “unknown” if patient declined or was unable to answer, if they selected “other” as an option, or if data were missing. P values compare LRTI and IPC groups. Fisher’s exact test used for categorical variables, and Kruskal-Wallis rank-sum test used for continuous variables. IQR, interquartile range; ICU, intensive care unit. 
* Indicates statistically significant (P < 0.05)
Characteristic | LRTI (n = 207) | IPC (n = 70) | P value | CTRL (n = 49)
Female, n (%) | 124 (59.9) | 38 (54.3) | 0.49 | 24 (49.0)
Male, n (%) | 83 (40.1) | 32 (45.7) | | 25 (51.0)
Age in years, median (IQR) | 0.6 (0.2–2.1) | 1.6 (0.9–6.4) | < 0.001* | 9.5 (1.3–14.4)
Race, n (%) | | | 0.53 |
White | 122 (58.9) | 39 (55.7) | | 27 (55.1)
Black/African American | 41 (19.8) | 10 (14.3) | | 11 (22.4)
Asian | 9 (4.3) | 4 (5.7) | | 5 (10.2)
Native Hawaiian/Pacific Islander | 1 (0.5) | 2 (2.9) | | 0 (0.0)
American Indian/Alaska Native | 4 (1.9) | 1 (1.4) | | 0 (0.0)
Multi-racial | 5 (2.4) | 3 (4.3) | | 1 (2.0)
Unknown | 25 (12.1) | 11 (15.7) | | 5 (10.2)
Hispanic/Latino ethnicity, n (%) | 41 (19.8) | 18 (25.7) | 0.38 | 3 (6.1)
Comorbidities, n (%) | 86 (41.5) | 37 (52.9) | 0.13 | 22 (44.9)
Admission category, n (%) | | | < 0.001* |
Medical | 206 (99.5) | 52 (74.3) | | 29 (59.2)
Surgical | 1 (0.5) | 10 (14.3) | | 10 (20.4)
Trauma | 0 (0.0) | 8 (11.4) | | 10 (20.4)
Antibiotics prior to intubation, n (%) | 72 (34.8) | 17 (24.3) | 0.14 | 17 (34.7)
Any antibiotic use, n (%) | 205 (99.0) | 64 (91.4) | 0.004* | 41 (83.7)
Ventilator days, median (IQR) | 7.0 (5.0–9.0) | 6.0 (4.0–7.0) | 0.003* | 6.0 (5.0–9.0)
ICU length of stay, median (IQR) | 11.0 (8.0–16.0) | 9.0 (7.0–12.0) | 0.009* | 10.0 (7.0–15.0)
Hospital length of stay, median (IQR) | 16.0 (11.0–24.0) | 17.0 (9.0–30.0) | 0.73 | 23.0 (14.0–43.0)
Mortality, n (%) | 6 (2.9) | 10 (14.3) | 0.001* | 2 (4.1)
Additional Declarations There is NO Competing Interest. 
Supplementary Files Supplementarymaterialclean11192025.docx Supplementary Material SuppData1.xlsx Supplementary Data File 1 SuppData2.xlsx Supplementary Data File 2 SuppData3.xlsx Supplementary Data File 3 SuppData4.xlsx Supplementary Data File 4 SuppData5.xlsx Supplementary Data File 5 SuppData6.xlsx Supplementary Data File 6 SuppData7.xlsx Supplementary Data File 7 sourcedatafileLRTIvIPC.xlsx Source Data File
08:42:53","extension":"json","order_by":1,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":13242,"visible":true,"origin":"","legend":"","description":"","filename":"NCOMMS2594320.json","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/28b3cde4bbdcc9b826fac2a6.json"},{"id":97131493,"identity":"833a27cb-0793-4891-8de7-4a4653180dcf","added_by":"auto","created_at":"2025-12-01 08:42:54","extension":"docx","order_by":2,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":2569711,"visible":true,"origin":"","legend":"","description":"","filename":"Supplementarymaterialclean11192025.docx","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/54d7952ece2d8d9c2a3c5a31.docx"},{"id":97141815,"identity":"d5e7bc4f-7c7b-4de3-b600-b9cd6308ea00","added_by":"auto","created_at":"2025-12-01 10:07:04","extension":"xml","order_by":3,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":217135,"visible":true,"origin":"","legend":"","description":"","filename":"NCOMMS25943200enriched.xml","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/d8f2f580b54d9d6d35750bab.xml"},{"id":97142569,"identity":"5649b948-4096-4473-ac9f-2cd650fa6ac2","added_by":"auto","created_at":"2025-12-01 10:07:43","extension":"jpeg","order_by":5,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":226214,"visible":true,"origin":"","legend":"","description":"","filename":"floatimage10.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/d67670caa85649a76ac0c971.jpeg"},{"id":97131521,"identity":"26c75766-e8de-4c0e-9315-bb4df1e08957","added_by":"auto","created_at":"2025-12-01 
08:42:54","extension":"png","order_by":6,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":365246,"visible":true,"origin":"","legend":"","description":"","filename":"floatimage11.png","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/8fad0257e0af3c34c514c870.png"},{"id":97142487,"identity":"55af2ecc-abff-4248-9c2c-177b9026e4e6","added_by":"auto","created_at":"2025-12-01 10:07:38","extension":"png","order_by":7,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":600873,"visible":true,"origin":"","legend":"","description":"","filename":"floatimage12.png","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/1e8d98fe92945a9860d3afbe.png"},{"id":97131531,"identity":"878f0765-0442-45f0-b507-839784b808ff","added_by":"auto","created_at":"2025-12-01 08:42:55","extension":"png","order_by":8,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":512345,"visible":true,"origin":"","legend":"","description":"","filename":"floatimage13.png","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/51a5effcd8aef84f283d5e26.png"},{"id":97131525,"identity":"8b31e0dd-a20a-432b-bca8-13b21ba96b8d","added_by":"auto","created_at":"2025-12-01 08:42:55","extension":"png","order_by":13,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":403678,"visible":true,"origin":"","legend":"","description":"","filename":"floatimage6.png","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/7e96d2056f81f89c4c540921.png"},{"id":97141584,"identity":"81436e29-37a8-4c03-9051-3db3b247af4c","added_by":"auto","created_at":"2025-12-01 
10:06:50","extension":"png","order_by":14,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":102902,"visible":true,"origin":"","legend":"","description":"","filename":"floatimage7.png","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/e2962498117c814f2c00dcdc.png"},{"id":97131507,"identity":"4a04eba3-c76e-4627-98bf-a44e6ccba210","added_by":"auto","created_at":"2025-12-01 08:42:54","extension":"jpeg","order_by":15,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":380288,"visible":true,"origin":"","legend":"","description":"","filename":"floatimage8.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/182e87a80885bc9863b397e0.jpeg"},{"id":97142924,"identity":"6a73d8ce-32f8-49f4-b99a-0f96c283b49f","added_by":"auto","created_at":"2025-12-01 10:08:07","extension":"jpeg","order_by":16,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":461012,"visible":true,"origin":"","legend":"","description":"","filename":"floatimage9.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/8344c50283d38602b369b36a.jpeg"},{"id":97131510,"identity":"df9fcbaa-62de-4f08-a1ea-468c373c4904","added_by":"auto","created_at":"2025-12-01 08:42:54","extension":"png","order_by":17,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":73230,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/a479188a0bd5ef0c1bece3a7.png"},{"id":97142648,"identity":"792acf8c-9ce6-45a4-b723-d6719ca8776b","added_by":"auto","created_at":"2025-12-01 
10:07:48","extension":"png","order_by":18,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":61573,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage10.png","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/f07ed9b541539706f10c64b9.png"},{"id":97131508,"identity":"a7393051-a6d6-46d4-8a7b-735c16ba3622","added_by":"auto","created_at":"2025-12-01 08:42:54","extension":"png","order_by":19,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":43318,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage11.png","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/886035716fbe5950d2dc9f18.png"},{"id":97131511,"identity":"3250756d-0a57-4a56-8d29-2cc0d748ca36","added_by":"auto","created_at":"2025-12-01 08:42:54","extension":"png","order_by":20,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":126880,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage12.png","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/faf4ee21759b2b0d8ee70f39.png"},{"id":97131513,"identity":"eec12ffc-30f7-4f2e-ac12-45969335e2cb","added_by":"auto","created_at":"2025-12-01 08:42:54","extension":"png","order_by":21,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":111981,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage13.png","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/60af99f8e1b64c809cca6b9f.png"},{"id":97142902,"identity":"5c25e0e8-d5e7-411e-ad35-940cf391e81b","added_by":"auto","created_at":"2025-12-01 
10:08:04","extension":"png","order_by":22,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":125370,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage2.png","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/31b0ab4e533420001d509782.png"},{"id":97131526,"identity":"68e64e64-c0dd-4aed-9d57-114ed9c4db99","added_by":"auto","created_at":"2025-12-01 08:42:55","extension":"png","order_by":23,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":103996,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage3.png","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/f37fc04f7f2a32c7af9959be.png"},{"id":97131504,"identity":"decae5a7-bcc2-4cd4-8f8f-41be550cc27f","added_by":"auto","created_at":"2025-12-01 08:42:54","extension":"png","order_by":24,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":122766,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage4.png","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/b91c36e8945f7a8f45a617a6.png"},{"id":97131522,"identity":"20f9c7a0-b384-4692-9d77-ee28f5bdded6","added_by":"auto","created_at":"2025-12-01 08:42:55","extension":"png","order_by":25,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":61154,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage5.png","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/0f105fb09593c5320519df0c.png"},{"id":97142710,"identity":"16343072-c2cb-4baf-94a6-ff56298a8674","added_by":"auto","created_at":"2025-12-01 
10:07:53","extension":"png","order_by":26,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":93000,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage6.png","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/545d5c4ddbd71b4160148c5b.png"},{"id":97141332,"identity":"09efc12e-510b-47f8-8151-b47bc896c1af","added_by":"auto","created_at":"2025-12-01 10:06:35","extension":"png","order_by":27,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":22235,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage7.png","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/edaae78813fc53216ed924a3.png"},{"id":97142496,"identity":"c1611205-3928-410b-b1f6-4beeab353023","added_by":"auto","created_at":"2025-12-01 10:07:39","extension":"png","order_by":28,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":110490,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage8.png","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/f5d90eab3f277fbad02d1abb.png"},{"id":97131520,"identity":"bf82dcbe-6957-4f5c-aa49-89c47ae10133","added_by":"auto","created_at":"2025-12-01 08:42:54","extension":"png","order_by":29,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":147284,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage9.png","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/8764e133826d21486f6575d2.png"},{"id":97131519,"identity":"7d89c974-96ec-4e39-9036-0265dc453f23","added_by":"auto","created_at":"2025-12-01 
08:42:54","extension":"xml","order_by":30,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":215093,"visible":true,"origin":"","legend":"","description":"","filename":"NCOMMS25943200structuring.xml","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/eb8cfdea181b064e801ae12c.xml"},{"id":97142785,"identity":"8b4b2dec-76a7-40f5-8e06-53d654c71e28","added_by":"auto","created_at":"2025-12-01 10:07:56","extension":"html","order_by":31,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":232796,"visible":true,"origin":"","legend":"","description":"","filename":"earlyproof.html","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/860b6721571e297eae061c3c.html"},{"id":97131490,"identity":"021c3ba4-d938-4dc7-af70-5d294f40572b","added_by":"auto","created_at":"2025-12-01 08:42:53","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":306103,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eDistribution of pathogens among patients with clinically adjudicated lower respiratory tract infection (LRTI) or incidental pathogen carriage (IPC).\u003c/strong\u003e Bar plot demonstrating the proportion of participants in the LRTI (n = 207, orange) and IPC (n = 70, teal) groups with each detected pathogen as determined by combined clinical testing and metatranscriptomics. Odds ratio (OR) with 95% confidence interval (CI) tabulated and plotted on the right. Filled circles represent statistically significant pathogens based on P\u0026lt;0.05. Arrows indicate where the lower or upper confidence interval exceeds the plotting area. Pathogens detected only once in the entire cohort were excluded from plotting. 
Statistical significance for between‑group differences was assessed with Fisher’s exact test.\u003c/p\u003e","description":"","filename":"floatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/bc1befa09da896d5e3536322.png"},{"id":97142736,"identity":"da5d8a0f-47aa-4052-b46f-9528cab044bb","added_by":"auto","created_at":"2025-12-01 10:07:55","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":678587,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eThe lower airway microbiome differs between infection states. a.\u003c/strong\u003e Shannon diversity index (SDI) of the lung microbiome in patients with LRTI (n=207, orange), IPC (n=70, teal), or CTRL (n=49, gray). \u003cstrong\u003eb.\u003c/strong\u003e Species richness of the lung microbiome by group. \u003cstrong\u003ec.\u003c/strong\u003e Principal‑coordinate analysis of Bray–Curtis distances demonstrating differences in community composition between groups; p value calculated by PERMANOVA. \u003cstrong\u003ed.\u003c/strong\u003e Lower airway total bacterial abundance measured in reads per million (RPM). \u003cstrong\u003ee.\u003c/strong\u003e Lower airway total viral abundance measured in RPM. \u003cstrong\u003ef.\u003c/strong\u003e Differentially abundant taxa in the lower airway microbiome identified by ANCOM‑BC. \u003cstrong\u003eg.\u003c/strong\u003e Proportion of pathogen‑assigned reads represented in the lower airway microbiome. \u003cstrong\u003eh.\u003c/strong\u003e Expression of \u003cem\u003eHaemophilus influenzae\u003c/em\u003e virulence factors \u003cem\u003eHMW1/2\u003c/em\u003e and \u003cem\u003eHxuABC\u003c/em\u003e across groups. 
\u003cstrong\u003ei.\u003c/strong\u003e Log₂‑fold change in HUMAnN3‑derived bacterial metabolic pathways found to be significantly different (P\u003csub\u003eadj\u003c/sub\u003e\u0026lt;0.05) between groups.\u003c/p\u003e","description":"","filename":"floatimage2.png","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/b59804c850ff17ad640f67ee.png"},{"id":97131494,"identity":"89b57fd2-1f34-471e-931f-04a130d577bf","added_by":"auto","created_at":"2025-12-01 08:42:54","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":572836,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eLower airway host transcriptional responses distinguish LRTI from IPC and controls.\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003ea.\u003c/strong\u003e Principal component analysis (PCA) of the lower respiratory tract transcriptome. Adjusted P value calculated by PERMANOVA between LRTI and non-LRTI groups. \u003cstrong\u003eb.\u003c/strong\u003e Heat map demonstrating hierarchical clustering of patients in each group (LRTI, IPC, CTRL) based on the top 20 differentially expressed (DE) genes between LRTI and CTRL groups. Color bar indicates normalized, Z-score scaled expression of each gene. \u003cstrong\u003ec.\u003c/strong\u003e Volcano plot of DE genes between LRTI and CTRL, with Benjamini-Hochberg (BH) adjusted P values \u0026lt;0.05 colored. \u003cstrong\u003ed.\u003c/strong\u003e Volcano plot highlighting DE genes between LRTI and IPC. \u003cstrong\u003ee.\u003c/strong\u003e Volcano plot of DE results comparing IPC versus CTRL. \u003cstrong\u003ef.\u003c/strong\u003e Normalized \u003cem\u003eFABP4\u003c/em\u003e expression across groups, with Benjamini-Hochberg adjusted P values from the DE analyses. \u003cstrong\u003eg.\u003c/strong\u003e Gene‑set enrichment analysis (GSEA) highlighting immune pathway enrichment in IPC compared to LRTI (purple) and IPC compared to CTRL (yellow). 
Top 20 DE pathways shown for the LRTI/IPC comparison, then overlaid with the IPC/CTRL comparison, showing only significant pathways (P\u003csub\u003eadj\u003c/sub\u003e\u0026lt;0.05). Point size scales inversely with P\u003csub\u003eadj\u003c/sub\u003e value.\u003c/p\u003e","description":"","filename":"floatimage3.png","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/60f0faf5af26ada87140e978.png"},{"id":97141200,"identity":"e2213fc6-7d3a-4995-ae73-8d9e3dc358ed","added_by":"auto","created_at":"2025-12-01 10:06:24","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":547740,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eInnate immune activation differs between LRTI and IPC and is modulated by pathogen type, abundance, and microbiome diversity. a. \u003c/strong\u003eNumber of differentially expressed (DE) genes between LRTI, IPC, and CTRL stratified by viral (purple) or bacterial (yellow) pathogen class. \u003cstrong\u003eb.\u003c/strong\u003e Volcano plot highlighting subtle interferon‑stimulated gene (ISG) expression signature in viral IPC versus CTRL. \u003cstrong\u003ec.\u003c/strong\u003e Boxplot of ISG \u003cem\u003eIFIH1\u003c/em\u003e expression by group. Benjamini-Hochberg (BH) adjusted P values, based on DE analyses, shown. \u003cstrong\u003ed.\u003c/strong\u003e Linear regression of viral abundance (reads per million, RPM) against \u003cem\u003eIFIH1\u003c/em\u003e expression for viral LRTI and IPC. Shaded area indicates 95% confidence intervals. \u003cstrong\u003ee.\u003c/strong\u003e Negative correlation between Shannon diversity index (SDI) and \u003cem\u003eIFIH1\u003c/em\u003e expression in viral subgroups. \u003cstrong\u003ef.\u003c/strong\u003e Linear regression between viral abundance and \u003cem\u003eIFIH1\u003c/em\u003e expression, after adjusting for SDI. 
\u003cstrong\u003eg.\u003c/strong\u003e Mediation analysis illustrating the proportion of group effect mediated by SDI in viral cases. \u003cstrong\u003eh.\u003c/strong\u003e Analogous mediation analysis for \u003cem\u003eGZMB\u003c/em\u003e in bacterial cases. \u003cstrong\u003ei.\u003c/strong\u003e Number of DE genes (P\u003csub\u003eadj\u003c/sub\u003e\u0026lt;0.05) between pairwise LRTI, IPC and CTRL comparisons, with and without adjustment for SDI.\u003c/p\u003e","description":"","filename":"floatimage4.png","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/0cf1697c8b4e3ea736927f77.png"},{"id":97141760,"identity":"31b8dc96-26b6-4090-bf1d-81c1712cc95f","added_by":"auto","created_at":"2025-12-01 10:06:59","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":365093,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eIntegration of host and microbial features enables accurate diagnosis of LRTI and differentiation from IPC. a. \u003c/strong\u003eComparative performance of classifiers for distinguishing LRTI from IPC or CTRL groups. Classifiers based on: Shannon diversity index (SDI), \u003cem\u003eFABP4, FABP4\u003c/em\u003e + SDI, a multi-gene model derived from LASSO regularized regression of the host transcriptome, or a multi-gene model + SDI. AUC reflects the mean across 5-fold cross validation, and the 95% CI was estimated by bootstrapping. \u003cstrong\u003eb.\u003c/strong\u003e ROC curve for multi-gene classifier with SDI. \u003cstrong\u003ec.\u003c/strong\u003e Probability of LRTI based on classifier scores broken down by clinical group. \u003cstrong\u003ed.\u003c/strong\u003e Lower respiratory FABP4 protein concentration differences between LRTI, IPC and CTRL groups. 
\u003cstrong\u003ee.\u003c/strong\u003e Receiver operating characteristic (ROC) curve for FABP4 protein biomarker.\u003c/p\u003e","description":"","filename":"floatimage5.png","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/d2315a54980edb8e0ce9a7f5.png"},{"id":97249526,"identity":"f3a0575f-0e7a-4309-a8d7-04960a941bc4","added_by":"auto","created_at":"2025-12-02 13:12:51","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":3850638,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/0874fc47-ca53-4647-b1fe-c5edaa87b175.pdf"},{"id":97142754,"identity":"75a94ad4-053f-4067-9859-92e487803320","added_by":"auto","created_at":"2025-12-01 10:07:56","extension":"docx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":2569711,"visible":true,"origin":"","legend":"Supplementary Material","description":"","filename":"Supplementarymaterialclean11192025.docx","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/64082846c4b2e4cc070a87b7.docx"},{"id":97131498,"identity":"00102955-68dc-4157-bfec-2d085336c2ea","added_by":"auto","created_at":"2025-12-01 08:42:54","extension":"xlsx","order_by":2,"title":"","display":"","copyAsset":false,"role":"supplement","size":31357,"visible":true,"origin":"","legend":"Supplementary Data File 1","description":"","filename":"SuppData1.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/cdd37965d989a955a64d43bf.xlsx"},{"id":97131492,"identity":"b5b8d415-392b-4ae8-a652-ffd1996b9df5","added_by":"auto","created_at":"2025-12-01 08:42:54","extension":"xlsx","order_by":3,"title":"","display":"","copyAsset":false,"role":"supplement","size":13701,"visible":true,"origin":"","legend":"Supplementary Data File 
2","description":"","filename":"SuppData2.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/7a045051ea5ec8c7985c5d68.xlsx"},{"id":97142787,"identity":"99f8b222-5e7e-44d7-8800-a92a2d55322a","added_by":"auto","created_at":"2025-12-01 10:07:56","extension":"xlsx","order_by":4,"title":"","display":"","copyAsset":false,"role":"supplement","size":18820,"visible":true,"origin":"","legend":"Supplementary Data File 3","description":"","filename":"SuppData3.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/39699806dddb01bc8332cd82.xlsx"},{"id":97142675,"identity":"1cbb1d94-ce29-4a65-9209-2c919150a23e","added_by":"auto","created_at":"2025-12-01 10:07:48","extension":"xlsx","order_by":5,"title":"","display":"","copyAsset":false,"role":"supplement","size":3128856,"visible":true,"origin":"","legend":"Supplementary Data File 4","description":"","filename":"SuppData4.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/03f54cdf94e772ea5cb16085.xlsx"},{"id":97142836,"identity":"2a35ca98-b9bd-4281-b2d0-24dd23010383","added_by":"auto","created_at":"2025-12-01 10:07:59","extension":"xlsx","order_by":6,"title":"","display":"","copyAsset":false,"role":"supplement","size":115236,"visible":true,"origin":"","legend":"Supplementary Data File 5","description":"","filename":"SuppData5.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/4014f3de08eb19ccc6b1e9a5.xlsx"},{"id":97131505,"identity":"c26a8205-96ef-4f3b-b541-825aec1097da","added_by":"auto","created_at":"2025-12-01 08:42:54","extension":"xlsx","order_by":7,"title":"","display":"","copyAsset":false,"role":"supplement","size":5983668,"visible":true,"origin":"","legend":"\u003cp\u003eSupplementary Data File 
6\u003c/p\u003e","description":"","filename":"SuppData6.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/c3a7d32d43de25b412a2f604.xlsx"},{"id":97131517,"identity":"5bfa7de1-7ffe-44a8-bf29-caa9c3c9076b","added_by":"auto","created_at":"2025-12-01 08:42:54","extension":"xlsx","order_by":8,"title":"","display":"","copyAsset":false,"role":"supplement","size":3117207,"visible":true,"origin":"","legend":"\u003cp\u003eSupplementary Data File 7\u003c/p\u003e","description":"","filename":"SuppData7.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/71c72c8e929d0958d76d4c6d.xlsx"},{"id":97142634,"identity":"e2141166-d811-4452-b811-2d2cee78120b","added_by":"auto","created_at":"2025-12-01 10:07:47","extension":"xlsx","order_by":9,"title":"","display":"","copyAsset":false,"role":"supplement","size":2300874,"visible":true,"origin":"","legend":"Source Data File","description":"","filename":"sourcedatafileLRTIvIPC.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-8171822/v1/5cc70f09fca81da79094cb0e.xlsx"}],"financialInterests":"There is \u003cb\u003eNO\u003c/b\u003e Competing Interest.","formattedTitle":"Host–Microbiome Archetypes Differentiate Infection from Pathogen Carriage in the Human Lower Airway","fulltext":[{"header":"INTRODUCTION","content":"\u003cp\u003eThe upper and lower airways harbor robust microbial communities which together represent the human respiratory microbiome\u003csup\u003e\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e\u003c/sup\u003e. While primarily comprised of commensal bacteria during states of health, it has been well established that respiratory pathobionts, or microbes with the potential to cause disease, can be incidentally carried within the airway microbiome without eliciting signs or symptoms of infection\u003csup\u003e\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e\u003c/sup\u003e. 
This phenomenon of \u0026lsquo;colonization\u0026rsquo; or incidental pathogen carriage (IPC), which involves a dynamic relationship between pathobiont, microbiome and host immune response, remains incompletely understood.\u003c/p\u003e\u003cp\u003eIPC frequently complicates the management of acute respiratory illness by confounding accurate lower respiratory tract infection (LRTI) diagnosis. For instance, in patients hospitalized for respiratory failure, the underlying cause often remains unclear for days, as both infectious and non-infectious conditions can present with overlapping clinical features. LRTI diagnostic tests that rely almost exclusively on pathogen detection cannot differentiate between true infection and IPC, nor discern the presence of a non-infectious etiology\u003csup\u003e\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e,\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e\u003c/sup\u003e. As a result, detection of any potential pathogen, particularly in the setting of respiratory failure, can lead to an LRTI diagnosis, even in the absence of true infection\u003csup\u003e\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e,\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e\u003c/sup\u003e. 
This contributes to unnecessary antimicrobial use and missed opportunities to diagnose and treat alternative causes of respiratory failure, such as cardiac conditions or autoinflammatory diseases\u003csup\u003e\u003cspan additionalcitationids=\"CR8\" citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003cp\u003eWhile IPC occurs across the age spectrum, the incidence is higher in children than in adults\u003csup\u003e\u003cspan additionalcitationids=\"CR11\" citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e\u003c/sup\u003e. For instance, an estimated 33\u0026ndash;90% of children incidentally carry \u003cem\u003eStreptococcus pneumoniae\u003c/em\u003e in the airway, compared to \u0026lt;\u0026thinsp;5% of adults\u003csup\u003e\u003cspan additionalcitationids=\"CR14 CR15\" citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e\u003c/sup\u003e. Similarly, \u003cem\u003eMoraxella catarrhalis\u003c/em\u003e colonizes the nasopharynx in 30\u0026ndash;100% of infants but only 1\u0026ndash;5% of adults\u003csup\u003e\u003cspan additionalcitationids=\"CR18\" citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e\u003c/sup\u003e. 
Viral IPC is also far more common among children; based on population surveillance studies, an estimated 25% of asymptomatic young children incidentally carry at least one viral pathogen, in contrast to only 2% of adults\u003csup\u003e\u003cspan additionalcitationids=\"CR21\" citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003cp\u003eThe high baseline rates of respiratory IPC in children may be further increased in the setting of hospitalization and critical illness. The physiological disturbances of critical illness, including disruption of epithelial barriers, immune dysregulation, and introduction of endotracheal tubes, can reshape the respiratory microenvironment, promoting shifts in microbial community composition and facilitating colonization by opportunistic pathogens\u003csup\u003e\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e,\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e\u003c/sup\u003e. Among patients who require mechanical ventilation for non-infectious indications, airway colonization with potentially pathogenic organisms occurs in nearly half within the first 24 hours\u003csup\u003e\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e,\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003cp\u003eDespite its clinical relevance and frequent occurrence, the host and microbial factors that distinguish IPC from LRTI remain incompletely understood. No studies have yet evaluated IPC in the context of the lower airway microbiome and host response, and tests capable of accurately distinguishing LRTI from IPC do not yet exist. 
To address these gaps, we studied a prospective cohort of critically ill children hospitalized for acute respiratory failure and performed metatranscriptomic RNA sequencing on lower respiratory samples to simultaneously profile both host and microbe. We identify striking host and microbial biosignatures that distinguish the two states, and then leverage findings to build accurate diagnostic classifiers with the potential to advance acute respiratory illness management.\u003c/p\u003e"},{"header":"RESULTS","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e\u003ch2\u003ePatient cohort, clinical adjudication, and pathogen detection\u003c/h2\u003e\u003cp\u003eCritically ill children with all-cause respiratory failure requiring mechanical ventilation (n\u0026thinsp;=\u0026thinsp;457) were prospectively enrolled at eight U.S. hospitals between 2/2015 and 12/2017 (\u003cb\u003eFigure \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e\u003c/b\u003e).\u003csup\u003e\u003cspan additionalcitationids=\"CR28\" citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e\u003c/sup\u003e Tracheal aspirate was obtained within 24 hours of intubation and stored in an RNA stabilizing agent. High-quality RNA sequencing data capturing both the respiratory microbiome and host transcriptome were generated on 343 patients.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eLRTI cases were identified by structured, retrospective clinical adjudication following ICU discharge performed by \u0026ge;\u0026thinsp;2 physicians trained in critical care medicine or infectious diseases using the CDC/NHSN PNU1 criteria\u003csup\u003e\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e\u003c/sup\u003e. 
Adjudicators, who had access to all clinical data in the electronic health record and were blinded to metatranscriptomic results\u003csup\u003e\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e\u003c/sup\u003e, identified 224 patients (65.3%) with LRTI. Alternative, non-infectious etiologies of respiratory failure were adjudicated in 119 (34.7%) patients, and included neurologic conditions, anatomic airway abnormalities, toxin exposures, cardiac conditions, trauma and autoimmune disease. To comprehensively identify respiratory pathogens in the lower airway, a combination of standard-of-care clinical microbiologic testing and respiratory metatranscriptomics was performed. Following this process, 207 patients received a clinical diagnosis of LRTI and had a respiratory pathogen detected (\u0026ldquo;LRTI\u0026rdquo; group). Of those with a clear non-infectious cause of respiratory failure, 70 had a pathogen detected (\u0026ldquo;IPC\u0026rdquo; group) and 49 patients did not (\u0026ldquo;CTRL\u0026rdquo; group). Seventeen patients with clinically adjudicated LRTI but negative microbiologic testing were excluded from the analysis.\u003c/p\u003e\u003cp\u003eWe found no differences in sex, race, ethnicity, or comorbidities between groups. Patients in the LRTI group had a younger median age of 0.6 years (interquartile range (IQR) 0.2\u0026ndash;2.1), compared to 1.6 years (IQR 0.9\u0026ndash;6.4) in the IPC group and 9.5 years (IQR 1.3\u0026ndash;14.4) in the CTRL group, reflecting the typical epidemiologic differences in pediatric respiratory failure.\u003csup\u003e\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e,\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e\u003c/sup\u003e Ventilator days and ICU length of stay were slightly longer in LRTI versus IPC, though hospital length of stay was similar and mortality was lower. 
Antibiotic usage prior to intubation was similar across all groups, and notably, most patients received antibiotics during their hospital course (LRTI 99.0%, IPC 91.4%, CTRL 83.7%).\u003c/p\u003e\u003cp\u003eRespiratory syncytial virus (RSV) was the most common respiratory pathogen in the cohort, identified in 52.7% of the LRTI group and 12.9% of the IPC group (P\u0026thinsp;\u0026lt;\u0026thinsp;0.001) (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e1\u003c/span\u003e). Other pathogens that statistically differed in prevalence between groups included human metapneumovirus (LRTI 6.8%, IPC 0.0%, P\u0026thinsp;=\u0026thinsp;0.02), \u003cem\u003eHaemophilus influenzae\u003c/em\u003e (LRTI 30.0%, IPC 17.1%, P\u0026thinsp;=\u0026thinsp;0.04), and \u003cem\u003ePseudomonas aeruginosa\u003c/em\u003e (LRTI 1.9%, IPC 10.0%, P\u0026thinsp;=\u0026thinsp;0.01). Several other pathogens were identified at similar rates between groups, including rhinovirus, \u003cem\u003eMoraxella catarrhalis\u003c/em\u003e, \u003cem\u003eStaphylococcus aureus\u003c/em\u003e, and \u003cem\u003eStreptococcus pneumoniae\u003c/em\u003e. Co-detection of both bacteria and viruses was common, comprising 59.4% of LRTI cases and 25.7% of IPC cases.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003c/div\u003e\n\u003ch3\u003eThe respiratory microbiome differs between LRTI, IPC, and controls\u003c/h3\u003e\n\u003cp\u003eWe first sought to compare the composition and function of the lung microbiome between LRTI and IPC, hypothesizing that both biologically relevant and diagnostically useful distinctions may exist. Alpha diversity demonstrated notable differences, with LRTI characterized by a lower Shannon Diversity Index (SDI) compared to either IPC (P\u003csub\u003eadj\u003c/sub\u003e=2.8e-8) or CTRL (P\u003csub\u003eadj\u003c/sub\u003e=2.6e-14). 
In contrast, SDI did not differ between IPC and CTRL groups (P\u003csub\u003eadj\u003c/sub\u003e=0.59) (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e2\u003c/span\u003ea). Similarly, we found that community richness (total number of unique species in the lung microbiome) was significantly higher in both IPC and CTRL groups compared to those with LRTI (P\u003csub\u003eadj\u003c/sub\u003e=1.2e-11 and 1.3e-9, respectively) (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e2\u003c/span\u003eb). Community composition also differed between LRTI, IPC, and CTRL groups based on the Bray-Curtis index (P\u0026thinsp;\u0026lt;\u0026thinsp;0.001) (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e2\u003c/span\u003ec). Despite these distinct microbiome archetypes, the detected pathogen was typically the most abundant microbe in the airway microbiome in both LRTI and IPC (\u003cb\u003eFigure S2\u003c/b\u003e).\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eWe next compared total bacterial abundance (measured in reads per million, RPM) between groups. Surprisingly, total bacterial abundance was highest in IPC compared to both LRTI (P\u003csub\u003eadj\u003c/sub\u003e=2.2e-3) and CTRL (P\u003csub\u003eadj\u003c/sub\u003e=2.7e-3) (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e2\u003c/span\u003ed). In contrast, total viral RPM was highest in LRTI, while the IPC group exhibited an intermediate state with greater viral abundance compared to the CTRL group (P\u003csub\u003eadj\u003c/sub\u003e=1.3e-9) (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e2\u003c/span\u003ee). 
Sensitivity analyses subsetting by viral-bacterial co-detection did not markedly change the observed relationships between LRTI, IPC, and CTRL groups (\u003cb\u003eFigure S3\u003c/b\u003e).\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eWe next examined pathogen abundance and found that rhinovirus RPM was significantly higher in LRTI compared to IPC (P\u0026thinsp;=\u0026thinsp;0.047), while mean RSV abundance trended higher in LRTI, but the difference did not reach statistical significance (P\u0026thinsp;=\u0026thinsp;0.060) (\u003cb\u003eFigure S4\u003c/b\u003e). \u003cem\u003eH. influenzae\u003c/em\u003e abundance did not differ between groups (P\u0026thinsp;=\u0026thinsp;0.84), while \u003cem\u003eM. catarrhalis\u003c/em\u003e RPM was unexpectedly higher in IPC compared to LRTI (P\u0026thinsp;=\u0026thinsp;9.1e-3).\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eTo further characterize the microbiome features of LRTI and IPC, we carried out differential taxonomic abundance analysis using ANCOM-BC\u003csup\u003e33\u003c/sup\u003e. While RSV was the only microbe with significantly greater abundance in LRTI, we found that IPC was enriched with classically commensal taxa including \u003cem\u003ePrevotella\u003c/em\u003e, \u003cem\u003eNeisseria\u003c/em\u003e, \u003cem\u003ePorphyromonas\u003c/em\u003e, and \u003cem\u003eStreptococcus\u003c/em\u003e species (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e2\u003c/span\u003ef, \u003cb\u003eSupp. Data 1\u003c/b\u003e). 
We subsequently evaluated pathogen burden, quantified as the proportion of implicated pathogen reads in the microbiome, and found that LRTI was characterized by a significantly greater pathogen burden compared to IPC (63.4% versus 36.6% mean pathogen proportion, P\u0026thinsp;=\u0026thinsp;5.9e-7) (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e2\u003c/span\u003eg).\u003c/p\u003e\u003cp\u003eWe hypothesized that virulence factor expression might differ between LRTI and IPC, and thus carried out an exploratory assessment using the MetaVF database\u003csup\u003e\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e\u003c/sup\u003e. We found that the expression of two \u003cem\u003eH. influenzae\u003c/em\u003e virulence factors, \u003cem\u003eHMW1/2\u003c/em\u003e and \u003cem\u003eHxuABC\u003c/em\u003e, was higher in LRTI (P\u0026thinsp;=\u0026thinsp;0.02 and P\u0026thinsp;=\u0026thinsp;0.03, respectively) (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e2\u003c/span\u003eh, \u003cb\u003eFigure S5, Supp. Data 2\u003c/b\u003e). \u003cem\u003eHMW1/2\u003c/em\u003e are adhesin proteins that facilitate \u003cem\u003eH. influenzae\u003c/em\u003e adherence to the respiratory epithelium\u003csup\u003e\u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e\u003c/sup\u003e, and \u003cem\u003eHxuABC\u003c/em\u003e is a specialized ATP-binding cassette transporter for iron acquisition from the host\u003csup\u003e\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eWe additionally assessed microbiome functional differences by profiling metabolic pathways using HUMAnN\u003csup\u003e\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e\u003c/sup\u003e. 
We found that compared to LRTI, IPC was characterized by higher expression of metabolic pathways essential for fatty acid beta-oxidation, citrulline biosynthesis, and arginine degradation (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e2\u003c/span\u003ei, \u003cb\u003eSupp. Data 3\u003c/b\u003e). Taken together, our findings suggested that IPC is characterized by a more diverse, taxonomically rich, abundant, and metabolically active respiratory microbiome compared to LRTI.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\n\u003ch3\u003eHost airway transcriptional responses distinguish LRTI from IPC and controls\u003c/h3\u003e\n\u003cp\u003eWe next tested the hypothesis that the pulmonary host response would differ between LRTI, IPC, and CTRL groups by evaluating the lower airway transcriptome. Principal component analysis demonstrated that LRTI was characterized by a distinct transcriptional signature compared to IPC or CTRL groups (PERMANOVA P\u003csub\u003eadj\u003c/sub\u003e=0.002), which did not differ from each other (P\u003csub\u003eadj\u003c/sub\u003e=0.25) (Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e3\u003c/span\u003ea, \u003cb\u003eFigure S6\u003c/b\u003e). This finding was underscored by hierarchical clustering of the top 20 most differentially expressed (DE) genes between LRTI and CTRL groups, which generally separated LRTI from non-LRTI cases, though several IPC cases clustered among LRTI cases (Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e3\u003c/span\u003eb). IPC and CTRL patients did not clearly separate based on hierarchical clustering.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eTo better understand host gene expression differences between the three groups at a more granular level, we performed pairwise differential expression analyses, adjusting for age and sex. 
We identified distinct host signatures that differentiated LRTI from IPC and CTRL groups, with 3517 and 2856 DE genes, respectively (Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e3\u003c/span\u003ec, Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e3\u003c/span\u003ed, \u003cb\u003eSupp. Data 4\u003c/b\u003e). In contrast, IPC and CTRL groups demonstrated minimal differences, with only 2 DE genes (Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e3\u003c/span\u003ee, \u003cb\u003eSupp. Data 4\u003c/b\u003e). Among the genes DE between LRTI and both CTRL and IPC groups, we noted that \u003cem\u003eFABP4\u003c/em\u003e, which is expressed in macrophages and encodes a lipid chaperone that modulates leukotriene stability\u003csup\u003e\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e\u003c/sup\u003e, was a clear outlier in both fold change and statistical significance (Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e3\u003c/span\u003ef\u003cb\u003e).\u003c/b\u003e\u003c/p\u003e\u003cp\u003eTo characterize biological pathways encompassing the DE genes, we carried out gene set enrichment analyses (GSEA) (Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e3\u003c/span\u003eg, \u003cb\u003eSupp. Data 5)\u003c/b\u003e. Canonical infection-related pathways, including interferon signaling, antigen presentation, adaptive immune signaling, and neutrophil degranulation, were all upregulated in LRTI versus IPC, as expected. However, we also noted that interferon signaling pathways were upregulated in IPC compared to CTRL patients, albeit to a lesser extent,\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003e\u003cb\u003ea.\u003c/b\u003e Principal component analysis (PCA) of the lower respiratory tract transcriptome. Adjusted P value calculated by PERMANOVA between LRTI and non-LRTI groups. 
\u003cb\u003eb.\u003c/b\u003e Heat map demonstrating hierarchical clustering of patients in each group (LRTI, IPC, CTRL) based on the top 20 differentially expressed (DE) genes between LRTI and CTRL groups. Color bar indicates normalized, Z-score scaled expression of each gene. \u003cb\u003ec.\u003c/b\u003e Volcano plot of DE genes between LRTI and CTRL, with Benjamini-Hochberg (BH) adjusted P values\u0026thinsp;\u0026lt;\u0026thinsp;0.05 colored. \u003cb\u003ed.\u003c/b\u003e Volcano plot highlighting DE genes between LRTI and IPC. \u003cb\u003ee.\u003c/b\u003e Volcano plot of DE results comparing IPC versus CTRL. \u003cb\u003ef.\u003c/b\u003e Normalized \u003cem\u003eFABP4\u003c/em\u003e expression across groups, with Benjamini-Hochberg adjusted P values from the DE analyses. \u003cb\u003eg.\u003c/b\u003e Gene‑set enrichment analysis (GSEA) highlighting immune pathway enrichment in IPC compared to LRTI (purple) and IPC compared to CTRL (yellow). Top 20 DE pathways shown for the LRTI/IPC comparison, then overlaid with the IPC/CTRL comparison, showing only significant pathways (P\u003csub\u003eadj\u003c/sub\u003e\u0026lt;0.05). Point size scales inversely with P\u003csub\u003eadj\u003c/sub\u003e value.\u003c/p\u003e\u003cp\u003eGiven that interferon signaling is a central feature of the anti-viral host immune response, we hypothesized that the immunologic features of IPC may differ between viral and bacterial pathogens. To investigate this, we performed differential expression analyses within patients who had viral (n\u0026thinsp;=\u0026thinsp;223) or bacterial (n\u0026thinsp;=\u0026thinsp;195) pathogens detected. Aligning with our primary composite analysis, both viral and bacterial LRTI were characterized by distinct airway transcriptional signatures with respect to the CTRL group (3424 and 3140 DE genes, respectively) (Fig.\u0026nbsp;\u003cspan refid=\"Fig12\" class=\"InternalRef\"\u003e4\u003c/span\u003ea, \u003cb\u003eSupp. Data 6\u003c/b\u003e). 
Compared to IPC, viral and bacterial LRTI also exhibited distinct host signatures, although with fewer DE genes (1860 and 1991, respectively). Few transcriptomic differences were observed between IPC and CTRL groups, although a subtle signature of 29 DE genes, primarily interferon-stimulated genes (ISGs, e.g., \u003cem\u003eISG15, IFIH1, OAS3\u003c/em\u003e) characterized viral IPC (Fig.\u0026nbsp;\u003cspan refid=\"Fig12\" class=\"InternalRef\"\u003e4\u003c/span\u003eb). In contrast, there were zero DE genes between bacterial IPC and CTRL groups, suggesting fundamental differences in immune activation between patients incidentally carrying viral versus bacterial pathogens.\u003c/p\u003e\u003cp\u003eWe next examined expression of individual genes classically associated with anti-viral and anti-bacterial defense. The viral subgroups demonstrated a gradation in ISG expression\u003csup\u003e\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e\u003c/sup\u003e, ranging from highest in viral LRTI to lowest in CTRL (Fig.\u0026nbsp;\u003cspan refid=\"Fig12\" class=\"InternalRef\"\u003e4\u003c/span\u003ec\u003cb\u003e)\u003c/b\u003e. Given the similar pattern with viral abundance in our microbiome analysis (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e2\u003c/span\u003ee), we hypothesized that the differences in interferon signaling might simply be related to viral load differences between groups. 
A regression of ISG expression against viral RPM showed that ISG expression did positively correlate with viral abundance, as expected (adjusted R\u003csup\u003e2\u003c/sup\u003e\u0026thinsp;=\u0026thinsp;0.45). Interestingly, when stratified by group, IPC patients demonstrated a proportionally attenuated response compared to those with LRTI for any given viral load (P\u0026thinsp;=\u0026thinsp;1.8e-3 for the ISG \u003cem\u003eIFIH1\u003c/em\u003e) (Fig.\u0026nbsp;\u003cspan refid=\"Fig12\" class=\"InternalRef\"\u003e4\u003c/span\u003ed). When age was added as a covariate, this finding did not change (P\u0026thinsp;=\u0026thinsp;2.0e-3), and other ISGs (\u003cem\u003eISG15\u003c/em\u003e, \u003cem\u003eIFI44\u003c/em\u003e) exhibited the same pattern (\u003cb\u003eFigure S7\u003c/b\u003e).\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eWhile these findings could be due entirely to intrinsic differences in innate immune response activation between individuals, we considered the possibility that the lung microbiome might modulate, at least to some extent, inflammatory gene expression in the setting of pathogen exposure. To investigate this, we evaluated the relationship between SDI and ISG expression and found that as lung microbiome diversity increased, ISG expression decreased (adjusted R\u003csup\u003e2\u003c/sup\u003e\u0026thinsp;=\u0026thinsp;0.29, regression P\u0026thinsp;=\u0026thinsp;1.3e-17) (Fig.\u0026nbsp;\u003cspan refid=\"Fig12\" class=\"InternalRef\"\u003e4\u003c/span\u003ee). Furthermore, we found that after adjusting for SDI, between-group differences in the relationship between viral load and ISG expression disappeared (P\u0026thinsp;=\u0026thinsp;0.50), suggesting that the lung microbiome modulates anti-viral host responses (Fig.\u0026nbsp;\u003cspan refid=\"Fig12\" class=\"InternalRef\"\u003e4\u003c/span\u003ef). 
Mediation analysis suggested that lower SDI was independently associated with higher \u003cem\u003eIFIH1\u003c/em\u003e expression, and that microbiome composition may partially mediate the relationship between viral pathogen presence and interferon signaling, explaining\u0026thinsp;~\u0026thinsp;43% of the group effect on \u003cem\u003eIFIH1\u003c/em\u003e expression (Fig.\u0026nbsp;\u003cspan refid=\"Fig12\" class=\"InternalRef\"\u003e4\u003c/span\u003eg).\u003c/p\u003e\u003cp\u003eWe performed a parallel analysis focused on bacterial LRTI and IPC (\u003cb\u003eFigure S8\u003c/b\u003e). In contrast to our observations with viral pathogens, the expression of canonical anti-bacterial innate immunity genes (\u003cem\u003eGZMB\u003c/em\u003e\u003csup\u003e\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e\u003c/sup\u003e, \u003cem\u003eCD64\u003c/em\u003e\u003csup\u003e41\u003c/sup\u003e, and \u003cem\u003eTLR1\u003c/em\u003e\u003csup\u003e42\u003c/sup\u003e) remained relatively constant across a range of bacterial pathogen loads (e.g., for \u003cem\u003eGZMB\u003c/em\u003e, adjusted R\u003csup\u003e2\u003c/sup\u003e\u0026thinsp;=\u0026thinsp;0.07). However, as with viral pathogens, the bacterial IPC group exhibited consistently lower innate immunity gene expression compared to the LRTI group (e.g. for \u003cem\u003eGZMB\u003c/em\u003e, P\u003csub\u003eadj\u003c/sub\u003e=3.0e-04). 
Applying the same mediation analysis to \u003cem\u003eGZMB\u003c/em\u003e demonstrated that lower lung microbiome alpha diversity was independently associated with higher innate immunity gene expression, explaining\u0026thinsp;~\u0026thinsp;31% of the group effect on \u003cem\u003eGZMB\u003c/em\u003e expression (Fig.\u0026nbsp;\u003cspan refid=\"Fig12\" class=\"InternalRef\"\u003e4\u003c/span\u003eh).\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eLastly, we assessed the impact of adjusting for SDI in our original pairwise differential expression analyses and found that doing so markedly reduced the lower airway transcriptional differences between LRTI and IPC (Fig.\u0026nbsp;\u003cspan refid=\"Fig12\" class=\"InternalRef\"\u003e4\u003c/span\u003ei, \u003cb\u003eSupp. Data 7\u003c/b\u003e).\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\n\u003ch3\u003eIntegration of host and microbial features enables accurate LRTI diagnosis\u003c/h3\u003e\n\u003cp\u003eHaving identified such distinct microbiome and host immune response differences between groups, we next sought to translate our findings into proof-of-concept diagnostic tests. Using LASSO regularized regression, we built diagnostic classifiers to distinguish true infection from the alternative clinically encountered states of IPC or non-infectious acute respiratory illness (Fig.\u0026nbsp;\u003cspan refid=\"Fig13\" class=\"InternalRef\"\u003e5\u003c/span\u003ea). Given prior work demonstrating the utility of \u003cem\u003eFABP4\u003c/em\u003e as a pneumonia diagnostic biomarker\u003csup\u003e\u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e43\u003c/span\u003e\u003c/sup\u003e, we evaluated its performance alone or in combination with alpha diversity. 
Both \u003cem\u003eFABP4\u003c/em\u003e and SDI performed well individually, although the combination achieved even better classification performance with an area under the receiver operating characteristic curve (AUC) of 0.87 (95% confidence interval (CI) 0.83\u0026ndash;0.91) based on 5-fold cross validation (CV). A multi-gene host transcriptional classifier in combination with SDI performed comparably with an AUC of 0.89 (95% CI 0.85\u0026ndash;0.92, Fig.\u0026nbsp;\u003cspan refid=\"Fig13\" class=\"InternalRef\"\u003e5\u003c/span\u003eb, \u003cb\u003eTable \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e\u003c/b\u003e) and yielded classifier scores that effectively distinguished LRTI from patients in either the IPC or CTRL groups (Fig.\u0026nbsp;\u003cspan refid=\"Fig13\" class=\"InternalRef\"\u003e5\u003c/span\u003ec).\u003c/p\u003e\u003cp\u003eConsidering that a single protein biomarker could have distinct practical utility as a clinical diagnostic, we tested whether protein levels of FABP4 could also effectively differentiate LRTI from IPC and CTRL groups in a subset of patients with FABP4 protein measurements from the lower airway (n\u0026thinsp;=\u0026thinsp;134). Indeed, the LRTI group had markedly different levels of FABP4 compared to both IPC (P\u0026thinsp;=\u0026thinsp;6.0e-4) and CTRL (P\u0026thinsp;=\u0026thinsp;2.3e-9) groups (Fig.\u0026nbsp;\u003cspan refid=\"Fig13\" class=\"InternalRef\"\u003e5\u003c/span\u003ed). Consistent with our findings at the transcriptional level, no difference in FABP4 levels between IPC and CTRL groups was observed (Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e3\u003c/span\u003ef). 
Notably, we found that respiratory FABP4 alone performed as well as the integrated host/microbe metatranscriptomic classifier (AUC\u0026thinsp;=\u0026thinsp;0.88, 95% CI 0.82\u0026ndash;0.93) (Fig.\u0026nbsp;\u003cspan refid=\"Fig13\" class=\"InternalRef\"\u003e5\u003c/span\u003ee), suggesting promise as a clinical biomarker for both LRTI diagnosis and distinguishing LRTI from IPC.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e"},{"header":"DISCUSSION","content":"\u003cp\u003eDifferentiating LRTI from IPC remains a frequent and unresolved challenge in the care of patients with acute respiratory illness. The resulting diagnostic uncertainty drives antimicrobial overuse and reflects a key gap in our understanding of host-microbe interactions in the lower airway. Here, we deploy metatranscriptomic profiling to holistically characterize the host and microbial features of LRTI and IPC, identifying distinct inflammatory signatures and respiratory microbiome ecology that distinguish these two states of pathobiont existence. Leveraging these findings, we develop host-microbe and practical single biomarker LRTI diagnostic classifiers, offering a path toward more precise, biologically informed diagnostics.\u003c/p\u003e\u003cp\u003eAlthough the implicated pathogens were frequently the most abundant microbes in the lower airway microbiome in both patients with LRTI and IPC, microbiome alpha and beta diversity, and taxonomic richness were strikingly different. LRTI was marked by a collapse of alpha diversity, reflecting ecologic disruption that is characteristic of infection\u003csup\u003e\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e,\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e,\u003cspan citationid=\"CR45\" class=\"CitationRef\"\u003e45\u003c/span\u003e\u003c/sup\u003e, whereas IPC resembled uninfected controls, with a diverse and taxonomically rich community composition. 
Unexpectedly, total bacterial abundance was greater in IPC than in LRTI or controls, which may be explained by the enrichment of commensal taxa such as \u003cem\u003ePrevotella\u003c/em\u003e, \u003cem\u003eNeisseria\u003c/em\u003e, and \u003cem\u003ePorphyromonas\u003c/em\u003e and reflect a more resilient and balanced barrier microbiota. The IPC state was further characterized by increased expression of diverse metabolic programs (e.g., energy production, fatty‑acid β‑oxidation, and amino‑acid biosynthesis) and lower virulence factor expression. Globally, these findings suggest that IPC is characterized by a more robust, diverse, and metabolically active microbiome, tolerant of carriage but resilient to pathogen invasion.\u003c/p\u003e\u003cp\u003eHost inflammatory gene expression in the lower airway also differed markedly between LRTI and IPC. We found that LRTI elicited a distinct transcriptional signature compared to either IPC or controls, comprised of thousands of DE genes related to innate and adaptive immune signaling. In contrast, the lower airway transcriptome of IPC largely resembled that of controls. Sensitivity analyses demonstrate that while no detectable transcriptional differences existed between bacterial IPC and controls, a subtle signal of interferon-stimulated genes distinguished viral IPC from controls. This pattern is consistent with prior reports of interferon activation during asymptomatic viral carriage\u003csup\u003e\u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e46\u003c/span\u003e,\u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e47\u003c/span\u003e\u003c/sup\u003e, but contrasts with studies in neonates demonstrating that nasopharyngeal colonization with \u003cem\u003eM. catarrhalis\u003c/em\u003e, \u003cem\u003eH. influenzae\u003c/em\u003e, and \u003cem\u003eS. 
pneumoniae\u003c/em\u003e correlates with mucosal immune shifts and the future development of asthma\u003csup\u003e\u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e48\u003c/span\u003e,\u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e49\u003c/span\u003e\u003c/sup\u003e. The discrepancy may be due to developmental stage (neonates were not included in our study and may be uniquely susceptible as their respiratory microbiomes are being established) or differences in upper versus lower respiratory tract biology\u003csup\u003e\u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e50\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003cp\u003eRegression analyses demonstrated that ISG expression is induced in a viral load-dependent manner, consistent with prior studies\u003csup\u003e\u003cspan citationid=\"CR51\" class=\"CitationRef\"\u003e51\u003c/span\u003e,\u003cspan citationid=\"CR52\" class=\"CitationRef\"\u003e52\u003c/span\u003e\u003c/sup\u003e. Intriguingly, however, ISG activation was consistently diminished in the setting of IPC across a range of viral loads. This suggested that IPC may be characterized by a global attenuation of the pathogen recognition-innate immune activation axis.\u003c/p\u003e\u003cp\u003eSimilar regression analyses involving bacterial cases did not demonstrate a dose-dependent relationship with respect to innate immune gene expression, perhaps reflecting fundamental differences in the coupling of bacterial antigens to the transcriptional activation of host innate immune responses. 
That said, we observed consistently higher expression of anti-bacterial immune genes (e.g., \u003cem\u003eGZMB\u003c/em\u003e, \u003cem\u003eCD64\u003c/em\u003e, \u003cem\u003eTLR1\u003c/em\u003e) in LRTI compared to IPC, across a broad range of pathogen abundance, suggesting a fundamental set point difference between the two states, agnostic to pathogen class.\u003c/p\u003e\u003cp\u003eOur mediation analysis supports a role for the lung microbiome in moderating the intensity of inflammatory responses to potential invading pathogens. These findings, while proof-of-concept in nature, align with murine models in which microbiome disruption amplifies innate inflammatory responses and influenza-associated lung injury\u003csup\u003e\u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e53\u003c/span\u003e\u003c/sup\u003e. We estimated that microbiome factors explained\u0026thinsp;~\u0026thinsp;43% of the relationship between group and inflammatory gene expression in viral cases, and ~\u0026thinsp;31% in bacterial cases. While an important contribution, this suggests that other mediators (e.g., host genetics, epigenetic modifications, immune memory to related pathogens) primarily account for host response differences between LRTI and IPC. Regardless, our findings underscore the complex, bi-directional relationship between host and microbe that determines whether a pathobiont will cause invasive disease or co-exist innocuously in a microbial community. We acknowledge that a full mediation analysis and determination of causality and directionality is not possible given the inherent limitations of an observational cohort\u003csup\u003e\u003cspan citationid=\"CR54\" class=\"CitationRef\"\u003e54\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003cp\u003eDistinguishing LRTI from IPC and non-infectious acute respiratory illnesses remains a clinical challenge and underscores the need for better diagnostic tests to guide antimicrobial therapy and patient care. 
We illustrate that simple host and microbial biomarkers can be used independently, or combined, to build clinically translatable diagnostic tests to address this need. For instance, \u003cem\u003eFABP4\u003c/em\u003e in combination with SDI accurately distinguished children with proven LRTI from those with other causes of acute respiratory failure, including those with IPC, achieving an AUC of 0.87. As sequencing technology becomes more economical and clinically available, the feasibility and cost effectiveness of performing metatranscriptomic analyses will continue to increase\u003csup\u003e\u003cspan additionalcitationids=\"CR56\" citationid=\"CR55\" class=\"CitationRef\"\u003e55\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR57\" class=\"CitationRef\"\u003e57\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003cp\u003eInflammatory protein biomarkers (e.g., procalcitonin, C-reactive protein) are the most widely clinically available class of host-based infectious disease diagnostics, although they only have modest capability of diagnosing LRTI and have not been shown to effectively discriminate between LRTI and IPC.\u003csup\u003e\u003cspan citationid=\"CR58\" class=\"CitationRef\"\u003e58\u003c/span\u003e,\u003cspan citationid=\"CR59\" class=\"CitationRef\"\u003e59\u003c/span\u003e\u003c/sup\u003e Thus, we found it promising that FABP4 alone, when measured at the protein level, performed as well as our integrated host/microbe metatranscriptomic model (AUC\u0026thinsp;=\u0026thinsp;0.88), highlighting the potential clinical utility of this single host biomarker for rapid and accurate diagnosis of infection in this cohort.\u003c/p\u003e\u003cp\u003eOur study has several strengths including the incorporation of both host and microbial data using metatranscriptomics, a multicenter design, rigorous and comprehensive adjudication of LRTI and IPC, and a large sample size. Our study also has limitations. 
We focused exclusively on children because they have a higher overall prevalence of IPC; it therefore remains unknown whether our findings are generalizable to adults, or to individuals with less severe respiratory illnesses. The infection and IPC groups differed in age, although we adjusted for this in our analyses. While our study is the largest to date to examine biological differences between infection and IPC, our sample size did limit sub-analyses at the individual pathogen level.\u003c/p\u003e\u003cp\u003eGiven the inherent limitations of an observational cohort and cross-sectional study design, our mediation analyses should be considered proof-of-concept and will require validation in a more controlled experimental setting. Longitudinal sampling could help determine whether diversity collapse precedes, accompanies, or follows infection onset, and studies in gnotobiotic mice could more effectively establish causal relationships between microbiome and host inflammatory responses in the setting of pathobiont challenge. Finally, future studies are needed to evaluate whether our findings at the host or microbiome level generalize to the upper respiratory tract.\u003c/p\u003e\u003cp\u003eIn sum, we find that LRTI and IPC are characterized by distinct biology with respect to both host and microbe, emphasizing that simply detecting a microbe with known pathogenicity in the respiratory tract is insufficient for clinical diagnosis of infection. It is not just the pathogen alone, but its dynamic relationship with the host immune response and airway microbiome, that determines disease. 
Our study provides fresh insight into the vexing and common challenge of interpreting positive respiratory tests in patients with acute respiratory illnesses and offers a new approach for improving LRTI diagnostic accuracy and limiting antimicrobial overuse.\u003c/p\u003e"},{"header":"METHODS","content":"\u003cdiv id=\"Sec8\" class=\"Section2\"\u003e\u003ch2\u003eStudy cohort\u003c/h2\u003e\u003cp\u003eWe studied a prospective multicenter cohort of 457 critically ill children with acute respiratory illnesses requiring mechanical ventilation who were admitted to eight U.S. intensive care units (ICUs) in the National Institute of Child Health and Human Development’s Collaborative Pediatric Critical Care Research Network (CPCCRN) between February 2015 and December 2017\u003csup\u003e27–29\u003c/sup\u003e. Enrollment sites included: Children’s Hospital Colorado, Aurora, CO, USA; University of California San Francisco, San Francisco, CA, USA; Nationwide Children’s Hospital, Columbus, OH, USA; The Children’s Hospital of Philadelphia, Philadelphia, PA, USA; University of Pittsburgh, Pittsburgh, PA, USA; Children’s Hospital of Michigan, Detroit, MI, USA; University of California Los Angeles, Los Angeles, CA, USA; Children’s National Medical Center and George Washington School of Medicine and Health Sciences, Washington, DC, USA.\u003c/p\u003e\u003cp\u003eChildren aged 31 days to 17 years who were expected to require mechanical ventilation for at least 72 hours and had tracheal aspirate (TA) sampling performed within 24 hours of intubation were approached for enrollment. 
Exclusion criteria included TA sample collection \u0026gt; 24 hours after intubation, presence of a tracheostomy tube, any condition in which deep tracheal suctioning was contraindicated, prior mechanical ventilation during the hospitalization, goals of care dictating a do-not-resuscitate order and/or another request for limited support, or previous enrollment in the study.\u003c/p\u003e\u003cp\u003eEligible patients were identified, and their guardians were approached for consent by clinical research coordinator staff as soon as possible following intubation. Written informed consent for study participation was obtained from legal guardians. An initial waiver of consent was granted for TA samples to be obtained from standard of care suctioning of the endotracheal tube (ETT) and stored until the parents or other legal guardians could be approached for informed consent. Samples from unconsented subjects were subsequently destroyed. The study was approved by the University of Utah central IRB #00088656.\u003c/p\u003e\u003c/div\u003e\n\u003ch3\u003eAdjudication of infection status and definition of subgroups\u003c/h3\u003e\n\u003cp\u003eAdjudication of LRTI status was carried out retrospectively by study-site clinicians with access to all clinical, laboratory, microbiology, and radiology data available up to the end of admission, without knowledge of metagenomic next-generation sequencing (mNGS) results. Each patient was reviewed independently by two adjudicators with expertise in pediatric infectious disease and/or critical care to determine the presence or absence of clinical LRTI; disagreements were discussed and resolved by a panel. For this study, patients were classified into three groups: 1) LRTI if they were clinically adjudicated as having LRTI and had supportive microbiology, 2) IPC if they were clinically adjudicated as not having LRTI but had positive microbiology, and 3) CTRL if they were clinically adjudicated as not having LRTI and had negative microbiology. 
Microbiology included standard-of-care clinical microbiology (multiplex polymerase chain reaction (PCR) and semiquantitative bacterial respiratory cultures) and/or metagenomic detection for pathogenic bacteria and viruses (implementing a validated, rules-based computational model, described in detail below)\u003csup\u003e\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e,\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e,\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\n\u003ch3\u003eRNA sequencing\u003c/h3\u003e\n\u003cp\u003eTracheal aspirate (TA) collected within 24 hours of intubation was mixed equi-volume with DNA/RNA shield (Zymo Research, Cat. No R1100) and stored at -80°C. Following bead-bashing, RNA or negative control water samples underwent extraction using the Qiagen AllPrep Kit (Qiagen, Cat. No R2145), followed by DNAse treatment. Sequencing libraries were prepared from purified RNA using the NEBNext Ultra II Library Prep Kit (New England Biolabs, Cat. No E7770L) and dual index barcodes. Human ribosomal RNA depletion was carried out prior to library amplification and pooling using the Cas9-based Depletion of Abundant Sequences by Hybridization (DASH) method\u003csup\u003e\u003cspan citationid=\"CR60\" class=\"CitationRef\"\u003e60\u003c/span\u003e\u003c/sup\u003e. 
Libraries underwent 150-base pair paired-end sequencing on an Illumina NovaSeq 6000 sequencer.\u003c/p\u003e\u003cdiv id=\"Sec11\" class=\"Section2\"\u003e\u003ch2\u003eMeasurement of FABP4 protein levels\u003c/h2\u003e\u003cp\u003eFABP4 was measured from TA specimens collected within 24 hours of intubation using the SomaScan 7k assay (SomaLogic)\u003csup\u003e\u003cspan additionalcitationids=\"CR62\" citationid=\"CR61\" class=\"CitationRef\"\u003e61\u003c/span\u003e–\u003cspan citationid=\"CR63\" class=\"CitationRef\"\u003e63\u003c/span\u003e\u003c/sup\u003e in a subset of this cohort. Following collection, TA specimens underwent centrifugation at 4°C at 15,000 × \u003cem\u003eg\u003c/em\u003e for 5 min, and the supernatant was then frozen at −80°C within 30 minutes.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec12\" class=\"Section2\"\u003e\u003ch2\u003eTaxonomic mapping from RNA-seq data\u003c/h2\u003e\u003cp\u003eWe employed the CZ ID Illumina mNGS pipeline (v7.1) for taxonomic mapping of microbial sequence data\u003csup\u003e\u003cspan citationid=\"CR64\" class=\"CitationRef\"\u003e64\u003c/span\u003e,\u003cspan citationid=\"CR65\" class=\"CitationRef\"\u003e65\u003c/span\u003e\u003c/sup\u003e. This pipeline incorporates initial removal of human reads using Kallisto\u003csup\u003e\u003cspan citationid=\"CR66\" class=\"CitationRef\"\u003e66\u003c/span\u003e\u003c/sup\u003e, adapter sequence trimming with fastp\u003csup\u003e\u003cspan citationid=\"CR67\" class=\"CitationRef\"\u003e67\u003c/span\u003e\u003c/sup\u003e, filtering of low quality and low complexity reads using PriceSeq\u003csup\u003e\u003cspan citationid=\"CR68\" class=\"CitationRef\"\u003e68\u003c/span\u003e\u003c/sup\u003e and the Lempel-Ziv-Welch algorithm, respectively, and a final scrub of any residual human reads using Bowtie2\u003csup\u003e69\u003c/sup\u003e. 
Taxonomic classification was then performed on both short reads and assembled contigs using the NCBI nucleotide (NT) and nonredundant (NR) databases. Background and batch correction was performed on species level taxon matrices (see below).\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec13\" class=\"Section2\"\u003e\u003ch2\u003eIdentification and mitigation of background contaminants\u003c/h2\u003e\u003cp\u003eNegative water controls were processed and sequenced alongside the patient samples to enable characterization and subtraction of background contamination. A previously developed negative binomial model\u003csup\u003e\u003cspan citationid=\"CR51\" class=\"CitationRef\"\u003e51\u003c/span\u003e\u003c/sup\u003e was used to model the distribution of reads of microbial taxa in the negative controls. Mean and dispersion parameters were fit to the data and estimates of the mean were generated for each batch:taxon pair. A single dispersion parameter was generated across all taxa using the MASS package (R, v7.3.58.1). P values were adjusted for multiple testing using the Benjamini-Hochberg False Discovery Rate method. Microbial taxa that were present at a significantly higher average abundance in participant samples than in negative controls (P\u003csub\u003eadj\u003c/sub\u003e\u0026lt;0.05) were retained for downstream analyses. Microbial taxa were included in downstream analysis if they met these criteria: (1) ≥ 1 hit to the NT database, (2) ≥ 1 hit to the NR database, and (3) a minimum alignment length of 70 bases to the NT database.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec14\" class=\"Section2\"\u003e\u003ch2\u003eClinical detection of respiratory pathogens\u003c/h2\u003e\u003cp\u003eStandard-of-care clinical respiratory microbiologic testing was performed based on the discretion of the treating clinicians at each study site. 
Diagnostics included nasopharyngeal swab respiratory pathogen testing by multiplex PCR and/or TA bacterial semiquantitative cultures. Clinical diagnostic tests on samples obtained within 48 hours of intubation were included in the analyses. Microbes reported by the clinical laboratory as representing laboratory, skin, or environmental contaminants, or reported as mixed upper respiratory flora, were excluded.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec15\" class=\"Section2\"\u003e\u003ch2\u003eDetection of respiratory pathogens by metatranscriptomics\u003c/h2\u003e\u003cp\u003eFor bacterial taxa that remained after background filtering, we applied an established rules-based model (RBM) to identify potential respiratory pathogens. In two prior studies, the RBM identified 82–96% of clinically-confirmed lower respiratory pathogens compared to standard of care clinical diagnostics, and permitted detection of otherwise missed potential pathogens in \u0026gt; 50% of patients with clinically adjudicated LRTI but negative standard testing\u003csup\u003e\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e,\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e\u003c/sup\u003e. The RBM operates by first retaining the most abundant species from each mapped genus, and any lower-abundance species within that genus with known pathogenicity in the respiratory tract based on a curated reference list from epidemiologic surveillance studies.\u003csup\u003e\u003cspan additionalcitationids=\"CR71 CR72\" citationid=\"CR70\" class=\"CitationRef\"\u003e70\u003c/span\u003e–\u003cspan citationid=\"CR73\" class=\"CitationRef\"\u003e73\u003c/span\u003e\u003c/sup\u003e Species were then ranked by abundance (reads per million values aligned to NT database, sum NT RPM), limiting to the top 15. 
The largest drop in abundance within this ranked list was identified, and any species above that drop with known ability to cause LRTI was counted as a bacterial hit. Viruses detected with an abundance \u0026gt; 0.1 RPM with established human respiratory pathogenicity were subsequently identified.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec16\" class=\"Section2\"\u003e\u003ch2\u003eMicrobial abundance calculations\u003c/h2\u003e\u003cp\u003eMicrobial abundance/load was approximated for each sample by calculating the sum NT RPM. Statistical comparison of sum NT RPM across groups was performed using the wilcox_test() function (rstatix v0.7.2). Resulting P values were adjusted using the Benjamini-Hochberg False Discovery Rate algorithm via the p.adjust() function of the stats (v4.2.3) package. Generalized linear modeling of these relationships was performed using the glm() function of the stats package, specifying a Gaussian distribution and identity link function, and adjusted for both sex and age. These methods were applied to the entire microbial profile as well as subsets of the profile (e.g., bacterial, viral) based on NCBI lineage data. Differences in abundance (sum NT RPM) of individual species of interest between groups were also assessed using this approach.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec17\" class=\"Section2\"\u003e\u003ch2\u003eMicrobiome diversity analyses\u003c/h2\u003e\u003cp\u003eAlpha diversity (Shannon Diversity Index, or SDI) was calculated using the diversity() function of the vegan package (v2.6-6.1). Beta diversity (Bray-Curtis dissimilarity) was calculated using vegdist() and compared across groups using betadisper(), permutest() and adonis2() of the vegan package. 
Principal Coordinate Analysis (PCoA) was performed using the cmdscale() function of the stats package (v4.2.3).\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec18\" class=\"Section2\"\u003e\u003ch2\u003eDifferential microbial abundance analysis\u003c/h2\u003e\u003cp\u003eDifferential abundance analysis was performed using the ANCOM-BC package\u003csup\u003e\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e\u003c/sup\u003e (v2.8.1) using a library filter of 0, prevalence filter of 10%, alpha level of 0.05 and a pseudo-count of 1. The analysis was adjusted for age and sex, and P values were adjusted for multiple testing using the Benjamini-Hochberg correction.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec19\" class=\"Section2\"\u003e\u003ch2\u003eVirulence factor screening\u003c/h2\u003e\u003cp\u003eVirulence factors were identified from transcriptomic data using the MetaVF toolkit\u003csup\u003e\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e\u003c/sup\u003e, its associated virulence factor database VFDB2.0, and BLAST (v2.16.0). Called virulence factors with an associated e-value less than 1e-10 were retained for downstream analysis. Differentially expressed virulence factors were identified using ANCOM-BC\u003csup\u003e33\u003c/sup\u003e with a library filter of 0, prevalence filter of 5%, alpha level of 0.05 and a pseudo-count of 1.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec20\" class=\"Section2\"\u003e\u003ch2\u003eGeneration of host gene counts\u003c/h2\u003e\u003cp\u003eRNA-seq reads were pseudoaligned using Kallisto\u003csup\u003e\u003cspan citationid=\"CR66\" class=\"CitationRef\"\u003e66\u003c/span\u003e\u003c/sup\u003e against an index consisting of all transcripts associated with human protein coding genes (GRCh38-based). We excluded samples with less than one million exon counts. 
Gene-level counts were generated using the tximport package, with the scaled TPM method\u003csup\u003e\u003cspan citationid=\"CR74\" class=\"CitationRef\"\u003e74\u003c/span\u003e\u003c/sup\u003e. Genes were retained for subsequent analysis if they had at least 10 counts in at least 20% of the samples in the cohort. The gene counts table underwent variance-stabilizing transformation (VST) using the R package DESeq2\u003csup\u003e75\u003c/sup\u003e, and VST-transformed counts were used in the principal component analysis, hierarchical clustering, assessment of individual genes, and classifier development.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec21\" class=\"Section2\"\u003e\u003ch2\u003ePrincipal component analysis\u003c/h2\u003e\u003cp\u003ePrincipal component analysis (PCA) was performed on the complete gene expression matrix using the prcomp function. For data visualization, we plotted PC1 versus PC3, which provided the greatest apparent separation in two dimensions, and ellipses depict 68% confidence regions around group centroids. PC1 versus PC2 and PC2 versus PC3 are shown in the supplement. To formally assess group separation, we ran pairwise PERMANOVA (adonis2, vegan package) on Euclidean distances computed from the full expression matrix (i.e., testing differences in group centroids). P values from the pairwise contrasts were adjusted by the Benjamini-Hochberg method, with P\u003csub\u003eadj\u003c/sub\u003e\u0026lt;0.05 considered significant.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec22\" class=\"Section2\"\u003e\u003ch2\u003eHeat map and hierarchical clustering\u003c/h2\u003e\u003cp\u003eFor display, we selected the 20 most significantly differentially expressed genes in the LRTI versus CTRL comparison, ranked by P\u003csub\u003eadj\u003c/sub\u003e (see below for DE methods). Each gene was standardized across samples (z-score; mean = 0, SD = 1). 
Genes were ordered by unsupervised hierarchical clustering using Euclidean distance and complete linkage (ComplexHeatmap defaults). Samples were clustered using correlation distance with Ward.D2 to emphasize similarity of gene expression profiles. Dendrograms were computed for ordering but omitted from the final figure panel.\u003c/p\u003e\u003cdiv id=\"Sec23\" class=\"Section3\"\u003e\u003ch2\u003eDifferential expression and gene set enrichment analyses\u003c/h2\u003e\u003cp\u003eDifferential expression (DE) analyses were performed with the R package limma-voom on raw gene-level counts\u003csup\u003e\u003cspan citationid=\"CR76\" class=\"CitationRef\"\u003e76\u003c/span\u003e\u003c/sup\u003e. The design matrix included age and sex as covariates, and where noted, Shannon Diversity Index (SDI). Counts were transformed with voom (mean-variance modeling with precision weights) and quantile normalized across samples. Gene-wise statistics used empirical-Bayes moderated two-sided t-tests; multiple testing was accounted for by Benjamini-Hochberg with P\u003csub\u003eadj\u003c/sub\u003e\u0026lt;0.05 considered significant. Where individual genes are displayed (e.g., boxplots for \u003cem\u003eFABP4\u003c/em\u003e, \u003cem\u003eIFIH1\u003c/em\u003e, and \u003cem\u003eGZMB\u003c/em\u003e), the adjusted P values from the respective limma DE analyses are displayed. For pathway analysis, we performed pre-ranked gene set enrichment analysis (GSEA) using ReactomePA (gsePathway) on Reactome gene sets with a minimum pathway size of 10 genes and a maximum size of 1500 genes\u003csup\u003e\u003cspan citationid=\"CR77\" class=\"CitationRef\"\u003e77\u003c/span\u003e\u003c/sup\u003e. All genes from each limma DE comparison, ranked by the limma t-statistic, were included as input. 
For visualization, we displayed the top 20 DE pathways between LRTI and IPC (all statistically significant with P\u003csub\u003eadj\u003c/sub\u003e\u0026lt;0.05), then overlaid results from the IPC and CTRL comparison, only displaying the significant pathways.\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv id=\"Sec24\" class=\"Section2\"\u003e\u003ch2\u003eRegression and mediation analyses\u003c/h2\u003e\u003cp\u003eFor both viral and bacterial subgroup analyses (run in parallel with identical pipeline), we modeled gene expression using linear regression with log\u003csub\u003e10\u003c/sub\u003e pathogen abundance (sum NT RPM) and group (LRTI vs IPC) as predictors, including an interaction term (gene expression ~ log\u003csub\u003e10\u003c/sub\u003e pathogen abundance * group). Group differences were evaluated using a global F-test comparing this model to the null model (gene expression ~ log\u003csub\u003e10\u003c/sub\u003e pathogen abundance), and sensitivity models additionally adjusted for age. For visualization, raw data points are plotted alongside model-predicted regression lines. The same structure was applied for microbiome diversity (gene expression ~ SDI * group). SDI-adjusted associations were visualized by plotting residuals from the gene expression ~ SDI model against pathogen abundance with group-specific linear fits. To evaluate modulation of innate host gene expression by diversity, we used the mediation package with SDI as the mediator and included pathogen abundance as a covariate in both models (mediator: SDI ~ group + log10 pathogen abundance; outcome: expression ~ group + SDI + log10 pathogen abundance) and calculated average causal mediated effect (ACME) and average direct effect (ADE) with 1000 simulations\u003csup\u003e\u003cspan citationid=\"CR78\" class=\"CitationRef\"\u003e78\u003c/span\u003e\u003c/sup\u003e. 
The mediation models used simpler additive models to maintain the interpretability of the effect estimates and because the interaction terms were not significant in any of the regression models. Viral outcomes focused on interferon-stimulated genes (\u003cem\u003eIFIH1\u003c/em\u003e, \u003cem\u003eISG15\u003c/em\u003e, \u003cem\u003eIFI44\u003c/em\u003e) and bacterial outcomes on canonical anti-bacterial defense genes (\u003cem\u003eGZMB\u003c/em\u003e, \u003cem\u003eCD64\u003c/em\u003e, \u003cem\u003eTLR1\u003c/em\u003e).\u003c/p\u003e\u003cdiv id=\"Sec25\" class=\"Section3\"\u003e\u003ch2\u003eClassifier development\u003c/h2\u003e\u003cp\u003eBinary classifiers were developed to distinguish LRTI from non-LRTI (IPC + CTRL combined). We performed stratified five-fold cross-validation (the same folds were reused across models, with minimum IPC and CTRL counts per fold to keep the folds balanced) and generated out-of-fold predictions for performance assessment. Single-feature models (i.e., \u003cem\u003eFABP4\u003c/em\u003e gene, FABP4 protein, SDI) used logistic regression, as did FABP4 + SDI. Multi-gene models used LASSO logistic regression on all genes (glmnet), with the regularization parameter lambda selected by internal cross-validation\u003csup\u003e\u003cspan citationid=\"CR79\" class=\"CitationRef\"\u003e79\u003c/span\u003e\u003c/sup\u003e. Non-zero coefficients selected by the LASSO model are provided in Table \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e. 
For performance metrics, the reported AUC reflects the mean AUC of each of the five folds computed with the pROC package\u003csup\u003e\u003cspan citationid=\"CR80\" class=\"CitationRef\"\u003e80\u003c/span\u003e\u003c/sup\u003e, and the confidence intervals were obtained by bootstrapping the out-of-fold predictions with 1000 resamples.\u003c/p\u003e\u003c/div\u003e"},{"header":"Declarations","content":"\u003cdiv id=\"Sec26\" class=\"Section3\"\u003e\u003ch2\u003eData and code availability\u003c/h2\u003e\u003cp\u003eSource data are provided with this paper. The raw fastq files with microbial sequencing reads are available under NCBI BioProject ID: PRJNA748764. Deidentified clinical metadata, host and microbial data, code, and source data files are available at: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://github.com/infectiousdisease-langelier-lab/Incidental_pathogen_carriage\u003c/span\u003e\u003cspan address=\"https://github.com/infectiousdisease-langelier-lab/Incidental_pathogen_carriage\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eMan WH, de Steenhuijsen Piters WAA, Bogaert D (2017) The microbiota of the respiratory tract: gatekeeper to respiratory health. Nat Rev Microbiol 15:259\u0026ndash;270\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRobinson J (2004) Colonization and infection of the respiratory tract: What do we know? Paediatr Child Health 9:21\u0026ndash;24\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eVo P, Kharasch VS (2014) Respiratory Failure. Pediatr Rev 35:476\u0026ndash;486\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePanetti B et al (2024) Acute Respiratory Failure in Children: A Clinical Update on Diagnosis. 
Children 11:1232\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eJeffrey M, Denny KJ, Lipman J, Conway Morris A (2023) Differentiating infection, colonisation, and sterile inflammation in critical illness: the emerging role of host-response profiling. Intensive Care Med 49:760\u0026ndash;771\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLydon EC, Ko ER, Tsalik EL (2018) The host response as a tool for infectious disease diagnosis and management. Expert Rev Mol Diagn 18:723\u0026ndash;738\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eVaughn VM et al (2019) Excess Antibiotic Treatment Duration and Adverse Events in Patients Hospitalized With Pneumonia: A Multihospital Cohort Study. Ann Intern Med 171:153\u0026ndash;163\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGupta AB et al (2024) Inappropriate Diagnosis of Pneumonia Among Hospitalized Adults. JAMA Intern Med 184:548\u0026ndash;556\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eAnadol D, Aydin YZ, G\u0026ouml;\u0026ccedil;men A (2001) Overdiagnosis of pneumonia in children. Turk J Pediatr 43:205\u0026ndash;209\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePan H, Cui B, Huang Y, Yang J, Ba-Thein W (2016) Nasal carriage of common bacterial pathogens among healthy kindergarten children in Chaoshan region, southern China: a cross-sectional study. BMC Pediatr 16:161\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRegev-Yochay G et al (2004) Nasopharyngeal Carriage of \u003cem\u003eStreptococcus pneumoniae\u003c/em\u003e by Adults and Children in Community and Family Settings. CLIN INFECT DIS 38:632\u0026ndash;639\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eNokso-Koivisto J, Kinnari TJ, Lindahl P, Hovi T, Pitk\u0026auml;ranta A (2002) Human picornavirus and coronavirus RNA in nasopharynx of children without concurrent respiratory symptoms. 
J Med Virol 66:417\u0026ndash;420\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eParker AM et al (2024) Upper respiratory \u003cem\u003eStreptococcus pneumoniae\u003c/em\u003e colonization among working-age adults with prevalent exposure to overcrowding. Microbiol Spectr 12:e00879\u0026ndash;e00824\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDesai AP et al (2015) Decline in Pneumococcal Nasopharyngeal Carriage of Vaccine Serotypes After the Introduction of the 13-Valent Pneumococcal Conjugate Vaccine in Children in Atlanta, Georgia. Pediatr Infect Dis J 34:1168\u0026ndash;1174\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBogaert D, De Groot R, Hermans PWM (2004) Streptococcus pneumoniae colonisation: the key to pneumococcal disease. Lancet Infect Dis 4:144\u0026ndash;154\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHuang SS et al (2011) Healthcare utilization and cost of pneumococcal disease in the United States. Vaccine 29:3398\u0026ndash;3412\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eVaneechoutte M, Verschraegen G, Claeys G, Weise B, Van den Abeele AM (1990) Respiratory tract carrier rates of Moraxella (Branhamella) catarrhalis in adults and children and interpretation of the isolation of M. catarrhalis from sputum. J Clin Microbiol 28:2674\u0026ndash;2680\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eFaden H, Harabuchi Y, Hong JJ (1994) Epidemiology of Moraxella catarrhalis in children during the first 2 years of life: relationship to otitis media. J Infect Dis 169:1312\u0026ndash;1317\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eEjlertsen T, Thisted E, Ebbesen F, Olesen B, Renneberg J (1994) Branhamella catarrhalis in children and adults. A study of prevalence, time of colonisation, and association with upper and lower respiratory tract infections. 
J Infect 29:23\u0026ndash;31\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMost ZM, Perl TM, Sebert M (2024) Respiratory virus infections in symptomatic and asymptomatic children upon hospital admission: new insights. Antimicrob Steward Healthc Epidemiol 4:e162\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eJansen RR et al (2011) Frequent Detection of Respiratory Viruses without Symptoms: Toward Defining Clinically Relevant Cutoff Values ▿. J Clin Microbiol 49:2631\u0026ndash;2636\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSelf WH et al (2016) Respiratory Viral Detection in Children and Adults: Comparing Asymptomatic Controls and Patients With Community-Acquired Pneumonia. J Infect Dis 213:584\u0026ndash;591\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDickson RP (2016) The microbiome and critical illness. Lancet Respiratory Med 4:59\u0026ndash;72\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMourani PM et al (2021) Temporal airway microbiome changes related to ventilator-associated pneumonia in children. Eur Respir J 57\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDurairaj L et al (2009) Patterns and density of early tracheal colonization in intensive care unit patients. J Crit Care 24:114\u0026ndash;121\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eEwig S et al (1999) Bacterial colonization patterns in mechanically ventilated patients with traumatic and medical head injury. Incidence, risk factors, and association with ventilator-associated pneumonia. Am J Respir Crit Care Med 159:188\u0026ndash;198\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMick E et al (2023) Integrated host/microbe metagenomics enables accurate lower respiratory tract infection diagnosis in critically ill children. 
J Clin Invest 133\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eTsitsiklis A et al (2022) Lower respiratory tract infections in children requiring mechanical ventilation: a multicentre prospective surveillance study incorporating airway metagenomics. Lancet Microbe 3:e284\u0026ndash;e293\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLydon E et al (2025) Proteomic profiling of the local and systemic immune response to pediatric respiratory viral infections. \u003cem\u003emSystems\u003c/em\u003e 10, e0133524\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eUnited States Centers for Disease Control and Prevention (2021) CDC/NHSN Surveillance Definitions for Specific Types of Infections. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.cdc.gov/nhsn/pdfs/pscmanual/pcsmanual_current.pdf\u003c/span\u003e\u003cspan address=\"https://www.cdc.gov/nhsn/pdfs/pscmanual/pcsmanual_current.pdf\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePatel R et al (2023) Clinically Adjudicated Reference Standards for Evaluation of Infectious Diseases Diagnostics. Clin Infect Dis 76:938\u0026ndash;943\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eNitu ME, Eigen H (2009) Respiratory Failure. Pediatr Rev 30:470\u0026ndash;478\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLin H, Peddada SD (2020) Analysis of compositions of microbiomes with bias correction. Nat Commun 11:3514\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDong W et al (2024) An expanded database and analytical toolkit for identifying bacterial virulence factors and their associations with chronic diseases. Nat Commun 15:8084\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSt Geme JW, Yeo H-J (2009) A prototype two-partner secretion pathway: the Haemophilus influenzae HMW1 and HMW2 adhesin systems. 
Trends Microbiol 17:355\u0026ndash;360\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eAkhtar AA, Turner DPJ (2022) The role of bacterial ATP-binding cassette (ABC) transporters in pathogenesis and virulence: Therapeutic and vaccine potential. Microb Pathog 171:105734\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBeghini F et al (2021) Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3. \u003cem\u003eeLife\u003c/em\u003e 10, e65088\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eFuruhashi M, Saitoh S, Shimamoto K, Miura T (2014) Fatty Acid-Binding Protein 4 (FABP4): Pathophysiological Insights and Potent Clinical Biomarker of Metabolic and Cardiovascular Diseases. Clin Med Insights Cardiol 8:23\u0026ndash;33\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSchneider WM, Chevillotte MD, Rice CM (2014) Interferon-Stimulated Genes: A Complex Web of Host Defenses. Annu Rev Immunol 32:513\u0026ndash;545\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHofer U (2017) Granzyme B\u0026rsquo;s roundhouse kick against bacteria. Nat Rev Microbiol 15:707\u0026ndash;707\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eIcardi M et al (2009) CD64 Index Provides Simple and Predictive Testing for Detection and Monitoring of Sepsis and Bacterial Infection in Hospital Patients. J Clin Microbiol 47:3914\u0026ndash;3919\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eAlbiger B, Dahlberg S, Henriques-Normark B, Normark S (2007) Role of the innate immune system in host defence against bacterial infections: focus on the Toll-like receptors. J Intern Med 261:511\u0026ndash;528\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLydon EC et al (2024) Pulmonary \u003cem\u003eFABP4\u003c/em\u003e Is an Inverse Biomarker of Pneumonia in Critically Ill Children and Adults. 
Am J Respir Crit Care Med 210:1480\u0026ndash;1483\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLangelier C et al (2018) Integrating host response and unbiased microbe detection for lower respiratory tract infection diagnosis in critically ill adults. Proc Natl Acad Sci U S A 115:E12353\u0026ndash;E12362\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eFlanagan JL et al (2007) Loss of bacterial diversity during antibiotic treatment of intubated patients colonized with Pseudomonas aeruginosa. J Clin Microbiol 45:1954\u0026ndash;1962\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWesolowska-Andersen A et al (2017) Dual RNA-seq reveals viral infections in asthmatic children without respiratory illness which are associated with changes in the airway transcriptome. Genome Biol 18:12\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWolsk HM et al (2016) Picornavirus-Induced Airway Mucosa Immune Profile in Asymptomatic Neonates. J Infect Dis 213:1262\u0026ndash;1270\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eF\u0026oslash;lsgaard NV et al (2013) Pathogenic bacteria colonizing the airways in asymptomatic neonates stimulates topical inflammatory mediator release. Am J Respir Crit Care Med 187:589\u0026ndash;595\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBisgaard H et al (2007) Childhood asthma after bacterial colonization of the airway in neonates. N Engl J Med 357:1487\u0026ndash;1495\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eCho H-J et al (2021) Differences and similarities between the upper and lower airway: focusing on innate immunity. Rhinology 59:441\u0026ndash;450\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMick E et al (2020) Upper airway gene expression reveals suppressed immune responses to SARS-CoV-2 compared with other respiratory viruses. 
Nat Commun 11:5854\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMick E et al (2022) Upper airway gene expression shows a more robust adaptive immune response to SARS-CoV-2 in children. Nat Commun 13:3937\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eIchinohe T et al (2011) Microbiota regulates immune defense against respiratory tract influenza A virus infection. Proc Natl Acad Sci U S A 108:5354\u0026ndash;5359\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eVanderweele TJ, Vansteelandt S (2009) Conceptual issues concerning mediation, interventions and composition. Stat Its Interface 2:457\u0026ndash;468\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGaston DC (2023) Clinical Metagenomics for Infectious Diseases: Progress toward Operational Value. J Clin Microbiol 61:e01267-22\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBenoit P et al (2024) Seven-year performance of a clinical metagenomic next-generation sequencing test for diagnosis of central nervous system infections. Nat Med 30:3522\u0026ndash;3533\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eNational Human Genome Research Institute (2024) DNA Sequencing Costs: Data. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.genome.gov/about-genomics/fact-sheets/DNA-Sequencing-Costs-Data\u003c/span\u003e\u003cspan address=\"https://www.genome.gov/about-genomics/fact-sheets/DNA-Sequencing-Costs-Data\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSelf WH et al (2017) Procalcitonin as a Marker of Etiology in Adults Hospitalized With Community-Acquired Pneumonia. Clin Infect Dis 65:183\u0026ndash;190\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003evan der Meer V, Neven AK, van den Broek PJ, Assendelft WJJ (2005) 
Diagnostic value of C reactive protein in infections of the lower respiratory tract: systematic review. BMJ 331:26\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGu W et al (2016) Depletion of Abundant Sequences by Hybridization (DASH): using Cas9 to remove unwanted high-abundance species in sequencing libraries and molecular counting applications. Genome Biol 17:1\u0026ndash;13\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGold L et al (2010) Aptamer-based multiplexed proteomic technology for biomarker discovery. PLoS ONE 5:e15004\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKim CH et al (2018) Stability and reproducibility of proteomic profiles measured with an aptamer-based platform. Sci Rep 8:8382\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eCandia J et al (2024) Variability of 7K and 11K SomaScan Plasma Proteomics Assays. J Proteome Res 23:5531\u0026ndash;5539\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKalantar KL et al (2020) IDseq-An open source cloud-based pipeline and analysis service for metagenomic pathogen detection and monitoring. Gigascience 9:giaa111\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLu D et al (2025) Simultaneous detection of pathogens and antimicrobial resistance genes with the open source, cloud-based, CZ ID platform. Genome Med 17:46\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBray NL, Pimentel H, Melsted P, Pachter L (2016) Near-optimal probabilistic RNA-seq quantification. Nat Biotechnol 34:525\u0026ndash;527\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eChen S, Zhou Y, Chen Y, Gu J (2018) fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34:i884\u0026ndash;i890\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRuby JG, Bellare P, Derisi JL (2013) PRICE: software for the targeted assembly of components of (Meta) genomic sequence data. 
G3 (Bethesda) 3:865\u0026ndash;880\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLangmead B, Salzberg SL (2012) Fast gapped-read alignment with Bowtie 2. Nat Methods. 9\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eJain S et al (2015) Community-Acquired Pneumonia Requiring Hospitalization among U.S. Adults. N Engl J Med 373:415\u0026ndash;427\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eJain S et al (2015) Community-acquired pneumonia requiring hospitalization among U.S. children. N Engl J Med 372:835\u0026ndash;845\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eIwai S et al (2014) The Lung Microbiome of Ugandan HIV-Infected Pneumonia Patients Is Compositionally and Functionally Distinct from That of San Franciscan Patients. PLoS ONE 9:e95726\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMagill SS et al (2014) Multistate point-prevalence survey of health care-associated infections. N Engl J Med 370:1198\u0026ndash;1208\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSoneson C, Love MI, Robinson MD (2015) Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences. F1000Res 4, 1521\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLove MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15:550\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRitchie ME et al (2015) limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res 43:e47\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eCroft D et al (2014) The Reactome pathway knowledgebase. Nucleic Acids Res 42:D472\u0026ndash;D477\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eTingley D, Yamamoto T, Hirose K, Keele L, Imai K (2014) mediation: R Package for Causal Mediation Analysis. 
J Stat Softw 59:1\u0026ndash;38\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eFriedman J et al (2023) glmnet: Lasso and Elastic-Net Regularized Generalized Linear Models\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRobin X et al (2011) pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics 12:77\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"},{"header":"TABLES","content":"\u003cp\u003e\u003c/p\u003e\u003cdiv class=\"gridtable\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003e\u003cb\u003eDemographic and clinical characteristics of the LRTI, IPC, and CTRL groups.\u003c/b\u003e Race is indicated as “unknown” if the patient declined or was unable to answer, selected “other” as an option, or if data were missing. P values compare the LRTI and IPC groups. Fisher’s exact test was used for categorical variables and the Kruskal-Wallis rank-sum test for continuous variables. IQR, interquartile range; ICU, intensive care unit. 
* Indicates statistically significant (P \u0026lt; 0.05)\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"5\"\u003e\u003c/colgroup\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eLRTI (n = 207)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003eIPC (n = 70)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003eP value\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003eCTRL (n = 49)\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eFemale, n (%)\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e124 (59.9)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e38 (54.3)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.49\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e24 (49.0)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eMale, n (%)\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e83 (40.1)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e32 (45.7)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e25 (51.0)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eAge in years, median (IQR)\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" 
char=\".\" colname=\"c2\"\u003e\u003cp\u003e0.6 (0.2–2.1)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e1.6 (0.9–6.4)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e\u0026lt; 0.001*\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e9.5 (1.3–14.4)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eRace, n (%)\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.53\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eWhite\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e122 (58.9)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e39 (55.7)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e27 (55.1)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eBlack/African American\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e41 (19.8)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e10 (14.3)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e11 
(22.4)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eAsian\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e9 (4.3)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e4 (5.7)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e5 (10.2)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eNative Hawaiian/Pacific Islander\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e1 (0.5)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e2 (2.9)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0 (0.0)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eAmerican Indian/Alaska Native\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e4 (1.9)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e1 (1.4)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e0 (0.0)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eMulti-racial\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e5 (2.4)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" 
colname=\"c3\"\u003e\u003cp\u003e3 (4.3)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e1 (2.0)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eUnknown\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e25 (12.1)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e11 (15.7)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e5 (10.2)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eHispanic/Latino ethnicity, n (%)\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e41 (19.8)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e18 (25.7)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.38\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e3 (6.1)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eComorbidities, n (%)\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e86 (41.5)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e37 (52.9)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.13\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e22 (44.9)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd 
align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eAdmission category, n (%)\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e\u0026lt; 0.001*\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eMedical\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e206 (99.5)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e52 (74.3)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e29 (59.2)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eSurgical\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e1 (0.5)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e10 (14.3)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e10 (20.4)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eTrauma\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e0 (0.0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e8 (11.4)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"char\" char=\".\" 
colname=\"c5\"\u003e\u003cp\u003e10 (20.4)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eAntibiotics prior to intubation, n (%)\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e72 (34.8)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e17 (24.3)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.14\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e17 (34.7)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eAny antibiotic use, n (%)\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e205 (99.0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e64 (91.4)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.004*\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e41 (83.7)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eVentilator days, median (IQR)\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e7.0 (5.0–9.0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e6.0 (4.0–7.0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.003*\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e6.0 (5.0–9.0)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eICU length of stay, median 
(IQR)\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e11.0 (8.0–16.0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e9.0 (7.0–12.0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.009*\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e10.0 (7.0–15.0)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eHospital length of stay, median (IQR)\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e16.0 (11.0–24.0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e17.0 (9.0–30.0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.73\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e23.0 (14.0–43.0)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eMortality, n (%)\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e6 (2.9)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e10 (14.3)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.001*\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e2 
(4.1)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/table\u003e\u003c/div\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":true,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"nature-portfolio","isNatureJournal":true,"hasQc":false,"allowDirectSubmit":false,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"","title":"Nature Portfolio","twitterHandle":"","acdcEnabled":false,"dfaEnabled":false,"editorialSystem":"ejp","reportingPortfolio":"","inReviewEnabled":true,"inReviewRevisionsEnabled":false},"keywords":"","lastPublishedDoi":"10.21203/rs.3.rs-8171822/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-8171822/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eAccurately distinguishing lower respiratory tract infection (LRTI) from incidental pathogen carriage (IPC) is clinically challenging. The host immunologic and microbial factors that define the states of LRTI and IPC are poorly understood. We performed host-microbe metatranscriptomic profiling of tracheal aspirate from 326 mechanically ventilated children with clinically adjudicated LRTI (n\u0026thinsp;=\u0026thinsp;207), IPC (n\u0026thinsp;=\u0026thinsp;70), or non-infectious acute respiratory illnesses (n\u0026thinsp;=\u0026thinsp;49). In the airway microbiome, LRTI was characterized by reduced alpha diversity and taxonomic richness, while IPC was characterized greater total bacterial abundance, enrichment in respiratory anaerobes and increased metabolic activity. In terms of host response, patients with LRTI exhibited a distinct lower airway transcriptional signature of innate and adaptive immune activation compared to those with IPC, who had similar transcriptional profiles as uninfected controls. Mediation analyses suggested that the airway microbiome influences the host response to pathogens. An integrated host-microbe metatranscriptomic classifier discriminated LRTI from IPC and controls with an AUC\u0026thinsp;=\u0026thinsp;0.89 (95% confidence interval (CI) 0.85\u0026ndash;0.92). 
The single gene \u003cem\u003eFABP4\u003c/em\u003e, when combined with alpha diversity, performed similarly, and FABP4 protein alone achieved an AUC\u0026thinsp;=\u0026thinsp;0.88 (95% CI 0.82\u0026ndash;0.93). Together, our findings reveal distinct ecological and immunologic archetypes that define LRTI and IPC, and support data-driven, biology-informed LRTI diagnostics that incorporate host and microbe.\u003c/p\u003e","manuscriptTitle":"Host–Microbiome Archetypes Differentiate Infection from Pathogen Carriage in the Human Lower Airway","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-12-01 08:42:48","doi":"10.21203/rs.3.rs-8171822/v1","editorialEvents":[],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"nature-communications","isNatureJournal":true,"hasQc":false,"allowDirectSubmit":false,"externalIdentity":"NCOMMS","sideBox":"Learn more about [Nature Communications](http://www.nature.com/ncomms/)","snPcode":"","submissionUrl":"https://mts-ncomms.nature.com/","title":"Nature Communications","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"ejp","reportingPortfolio":"Nature Communications","inReviewEnabled":true,"inReviewRevisionsEnabled":false}}],"origin":"","ownerIdentity":"9a384e93-a3aa-480f-8459-d225ebb01c3b","owner":[],"postedDate":"December 1st, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"under-review","subjectAreas":[{"id":58662832,"name":"Biological sciences/Microbiology/Microbial communities/Microbiome"},{"id":58662833,"name":"Health sciences/Molecular medicine"},{"id":58662834,"name":"Health sciences/Diseases/Infectious diseases"},{"id":58662835,"name":"Biological sciences/Microbiology/Microbial communities/Metagenomics"}],"tags":[],"updatedAt":"2026-03-31T09:04:18+00:00","versionOfRecord":[],"versionCreatedAt":"2025-12-01 08:42:48","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-8171822","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-8171822","identity":"rs-8171822","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}