Nasal IFN and UPR Transcriptomic Signatures Complement Clinical Predictors of Frequent Exacerbations in Pediatric Asthma

preprint OA: closed

Abstract

Background: Frequent exacerbations in pediatric asthma cause substantial morbidity, yet tools for prospective risk stratification remain limited. We aimed to identify nasal transcriptomic signatures associated with frequent exacerbations (FE), test their virus responsiveness, and compare their predictive performance with a model based on routine clinical variables.

Methods: Nasal RNA-seq data from GSE211158 were analyzed to compare FE and non-frequent exacerbators (NFE) and to derive an Elastic Net transcriptomic model, evaluated by stratified 5-fold cross-validation and an internal hold-out test set. External validation of interferon (IFN) and unfolded protein response/endoplasmic reticulum (UPR/ER) stress modules was performed in the experimental human rhinovirus (HRV-A16) cohort GSE97668. An independent hospital cohort of 134 children with asthma was used to build a multivariable “clinical-only” logistic regression model using age, sex, lung function, inhaled corticosteroid dose, blood eosinophils, total IgE and bedside variables; prior exacerbation count defined FE but was not entered as a predictor.

Results: FE children showed upregulation of IFN-stimulated and UPR/ER stress gene sets and attenuation of epithelial barrier programs versus NFE. An Elastic Net–derived transcriptomic signature showed modest discrimination, with cross-validated AUCs clustered around ~0.47 and a hold-out ROC AUC of 0.628. IFN/IED modules were directionally increased in the HRV challenge dataset, with a significant increase in the IFN module (t-test p=0.00474) and a similar but non-significant trend for the IED module (p=0.077). In contrast, the clinical-only model performed poorly (repeated 5-fold CV mean/median AUC = 0.44; hold-out AUC = 0.38).

Conclusion: Nasal IFN and UPR transcriptomic signatures define a frequent exacerbator endotype and provide moderate discrimination where clinical-only models perform poorly, suggesting that airway transcriptomics may complement conventional clinical assessment for identifying high-risk children.

Authors:
1. Yu-xiang Zhang, PhD, Affiliated Children’s Hospital, Soochow University, Suzhou, China
2. Yu Miao, PhD, co-first author, Affiliated Children’s Hospital, Soochow University, Suzhou, China
3. Jiating Xue, PhD, Affiliated Children’s Hospital, Soochow University, Suzhou, China
4. Nana Wang, PhD, co-corresponding author, Affiliated Children’s Hospital, Soochow University, Suzhou, China
5. Chuangli Hao, PhD, Affiliated Children’s Hospital, Soochow University, Suzhou, China

Corresponding authors: Chuangli Hao; Nana Wang
Mailing address: No. 92 Zhongnan Street, Suzhou Industrial Park, Suzhou, China
E-mail: [email protected]; [email protected]


Keywords

pediatric asthma; frequent exacerbation; nasal transcriptomics; interferon; unfolded protein response; risk stratification

Key message

Despite their substantial contribution to pediatric asthma morbidity and healthcare utilization, frequent severe exacerbations remain challenging to predict using conventional clinical data. By integrating nasal epithelial transcriptomics with statistical learning methodologies, we identified a consistent enrichment of interferon signaling and endoplasmic reticulum stress pathways among children prone to frequent exacerbations. In contrast, a multivariable model based solely on clinical variables demonstrated poor predictive performance, suggesting that critical risk information is encapsulated within airway molecular programs. These IFN and ER-stress gene sets offer biologically plausible candidates for the development of parsimonious transcriptomic risk signatures and warrant further investigation in mechanistic studies.

Background

Asthma exacerbations are a major cause of morbidity in children, and a subset of pediatric asthmatics experiences frequent severe exacerbations requiring repeated hospitalizations 1,2. These “frequent exacerbators” represent a distinct clinical endotype, often defined by ≥2 hospitalizations per year, and they contribute disproportionately to healthcare costs and long-term risks 1,2. Yet clinicians still lack robust tools to prospectively identify which children will evolve into this high-risk group. Current assessment relies heavily on traditional risk factors, most notably prior exacerbation history, and a small set of candidate biomarkers such as serum IgE, blood eosinophils, and fractional exhaled nitric oxide (FeNO), all of which have shown only modest and inconsistent performance for predicting future attacks 3. As a result, many children with apparently similar clinical profiles follow divergent trajectories, and a reliable biomarker of severe, frequent exacerbations remains elusive, underscoring the need for novel predictive markers and deeper mechanistic insight into the airway biology that underpins exacerbation-prone asthma 4,5.

In this context, high-dimensional airway transcriptomic profiling offers a promising avenue to move beyond single systemic biomarkers and directly interrogate the epithelial immune networks that may drive exacerbation risk 5. Nasal airway samples, in particular, provide a minimally invasive window into upper-airway immune activity and have been used to define molecular endotypes of pediatric asthma 5-7. A study published by Zhang et al. profiled nasal epithelial gene expression in children with asthma, stratified by exacerbation frequency, thereby offering a valuable resource to unravel transcriptomic programs associated with the “frequent exacerbator” phenotype 6.
Recent network-based analyses of airway transcriptomes have highlighted intricate crosstalk among type 2 inflammation, interferon signaling, and stress-response pathways in asthma, supporting the idea that composite gene programs, rather than single markers, may more faithfully capture exacerbation susceptibility 8. However, it remains unclear to what extent such airway molecular signatures add prognostic information beyond conventional clinical predictors routinely available at the bedside, such as lung function measures and standard inflammatory indices.

Here, an integrative bioinformatics approach was applied. Elastic Net (EN) regularization was used to build a predictive gene model; a WGCNA-like co-expression analysis was used to identify gene modules associated with exacerbation status; differential expression and pathway enrichment analyses were performed to pinpoint dysregulated pathways; and an independent cohort from a human rhinovirus (HRV-A16) challenge transcriptomic study was used for directional validation of key modules. This cohort is highly relevant because respiratory viruses are the primary triggers of pediatric asthma exacerbations 8. Using this external dataset to test whether FE-associated modules are engaged during experimental viral infection adds support for the robustness and translational relevance of the findings.

Immune-inflammatory pathways emerged as central. In particular, genes related to the unfolded protein response (UPR) and endoplasmic reticulum (ER) stress were enriched in children prone to severe exacerbations 9. This aligns with emerging evidence implicating maladaptive UPR/ER stress in airway epithelial dysfunction and remodeling in asthma 9-11. Likewise, an interferon (IFN)-associated module was prominent, consistent with known deficits or dysregulation of antiviral IFN responses that can predispose asthmatic children to virus-induced attacks 10-12.
These molecular aberrations dovetail with the concept of a weakened airway epithelial barrier in asthma—impaired tight junctions and innate defenses that render airways hyper-susceptible to allergens and infections 11 . To place these transcriptomic findings in a clinical context, we also assembled an independent hospital cohort of children with asthma and constructed a “clinical-only” multivariable logistic regression model for FE using routine demographic, treatment, lung function, and inflammatory markers. This allowed us to benchmark the prognostic value of complex transcriptomic signatures against standard bedside variables and to assess whether molecular profiling offers meaningful incremental information for risk stratification. Taken together, our analyses support the notion that frequent exacerbators harbor distinct airway immune-network perturbations (UPR/ER stress activation, IFN dysregulation, and epithelial barrier frailty) that are not fully captured by conventional clinical indices.

Methods

Study design and data sources

This was a retrospective, hypothesis-generating study based on publicly available, de-identified transcriptomic datasets from the Gene Expression Omnibus (GEO) and an independent hospital-based clinical cohort. The primary discovery cohort was GSE211158, which includes nasal epithelial RNA-sequencing profiles from children with physician-diagnosed asthma classified according to exacerbation frequency. The external transcriptomic validation cohort was GSE97668, a human rhinovirus (HRV-A16) experimental infection study with longitudinal airway samples from asthmatic subjects. In addition, a real-world clinical cohort of children with asthma was assembled from a single tertiary pediatric center to construct a “clinical-only” risk model using routinely collected variables. Because only anonymized GEO data and retrospectively collected, de-identified hospital records were analyzed, no additional institutional review board approval or informed consent was required, in accordance with local regulations on secondary analyses of de-identified data.

Definition of frequent exacerbations in GEO cohorts

In GSE211158, clinical metadata were used to define frequent exacerbators (FE) and non-frequent exacerbators (NFE). Children experiencing ≥2 severe asthma exacerbations requiring hospital admission within a 12-month period were categorized as FE, whereas those with fewer or no hospital-requiring exacerbations were classified as NFE. Samples with incomplete or ambiguous exacerbation history were excluded from downstream analyses to avoid misclassification. Because exacerbation frequency over the prior year was used to define the FE phenotype itself, this variable was not included as a predictor in any multivariable models to avoid circularity.

RNA-seq preprocessing and normalization

For both GSE211158 and GSE97668, raw gene-level count matrices were downloaded from GEO.
Only protein-coding genes were retained based on the GRCh38.p13 human genome annotation. Lowly expressed genes were filtered out by requiring counts per million (CPM) > 1 in at least 20% of samples in the respective dataset. Library size normalization was first performed at the sample level. Normalized expression values were then transformed to log₂(CPM + 1) to stabilize variance across the dynamic range. When multiple transcripts mapped to the same gene symbol, counts were summed prior to normalization. All downstream analyses were conducted on the log₂-transformed, gene-level expression matrices.

Differential expression and gene set enrichment

To identify differentially expressed genes (DEGs) between FE and NFE children in GSE211158, Welch’s t-test (unequal variances) was applied to each gene’s log₂(CPM + 1) expression values. P-values were adjusted for multiple testing using the Benjamini–Hochberg false discovery rate (FDR) procedure, and FDR-controlled results were used for primary inference. For the exploratory volcano visualization (Figure 4A), we highlighted genes meeting nominal P < 0.01 and |log₂FC| ≥ 0.3 (FE vs NFE).

For pathway-level analysis, a pre-ranked Gene Set Enrichment Analysis (GSEA) was conducted using the genome-wide t-statistics from the FE versus NFE comparison. Curated gene sets were assembled a priori to represent key biological processes of interest, including (i) unfolded protein response (UPR) and ER stress–related genes, (ii) interferon (IFN)-stimulated genes, (iii) generic inflammatory response genes, and (iv) epithelial barrier and junctional complex genes. Genes were ranked by decreasing t-statistic, and enrichment scores were calculated using a running-sum statistic with permutation of gene labels to estimate the null distribution. Normalized enrichment scores (NES) and nominal P-values were reported for each pathway.
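The running-sum enrichment statistic and gene-label permutation described above can be illustrated with a minimal standard-library Python sketch. It is not the authors' implementation: the function names, the weighting exponent `weight=1.0`, the same-sign permutation P-value, and the omission of NES normalization are all assumptions made for illustration.

```python
import random


def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Maximum-deviation running-sum enrichment score for one gene set.

    ranked_genes: list of gene names sorted by decreasing statistic.
    scores: dict mapping gene name -> ranking statistic (e.g. a t-statistic).
    gene_set: set of gene names; weight is an assumed exponent, not from the paper.
    """
    gene_set = set(gene_set)
    hits = [g in gene_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranking without covering it")
    hit_weights = [abs(scores[g]) ** weight if h else 0.0
                   for g, h in zip(ranked_genes, hits)]
    total = sum(hit_weights)
    running, best = 0.0, 0.0
    for w, h in zip(hit_weights, hits):
        if h:
            # Step up at a set member; fall back to equal steps if all weights are 0.
            running += w / total if total > 0 else 1.0 / n_hit
        else:
            # Step down by a constant at every gene outside the set.
            running -= 1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def permutation_p(ranked_genes, scores, gene_set, n_perm=1000, seed=0):
    """Nominal P-value from random gene sets of the same size (gene-label permutation)."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked_genes, scores, gene_set)
    size = len(set(gene_set) & set(ranked_genes))
    null = [enrichment_score(ranked_genes, scores, set(rng.sample(ranked_genes, size)))
            for _ in range(n_perm)]
    # Compare only against null scores with the same sign as the observed score.
    same_sign = [e for e in null if (e >= 0) == (observed >= 0)]
    extreme = sum(abs(e) >= abs(observed) for e in same_sign)
    return observed, (extreme + 1) / (len(same_sign) + 1)
```

A normalized enrichment score would additionally divide the observed score by the mean of the same-sign null scores; that step is left out here to keep the sketch short.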
Co-expression module analysis

To identify coordinated gene expression programs associated with exacerbation status, a WGCNA-like co-expression analysis was performed on GSE211158. For computational stability and to focus on the most informative variation, the top 400 most variable genes (by variance across samples) were selected. Gene expression values were standardized to z-scores across samples. An unsigned gene–gene correlation matrix was computed using Pearson correlation coefficients. Instead of full scale-free topology fitting, genes were partitioned into co-expression modules using k-means clustering on the correlation-based distance matrix, approximating the module detection concept of WGCNA. Each module’s eigengene (first principal component of the standardized expression matrix for that module) was calculated and correlated with the binary FE versus NFE phenotype using Pearson correlation. The module with the strongest absolute correlation with exacerbation status was designated the “index module.” Within the index module, intra-modular connectivity was defined as the sum of absolute correlation coefficients between a given gene and all other genes in the module. Genes in the top tier of intra-modular connectivity were considered hub genes and further examined for biological plausibility and overlap with curated IFN/UPR pathways.

Predictive modeling with Elastic Net in the transcriptomic cohort

To derive a predictive gene signature for FE, Elastic Net (EN) logistic regression was applied to the discovery cohort (GSE211158). Prior to modeling, the top 1,000 most variable genes (by variance) were selected to reduce dimensionality while retaining sufficient information. Expression values were standardized to zero mean and unit variance for each gene. The dataset was split into training and hold-out test sets using an 80:20 stratified random split to preserve the FE/NFE proportion in each subset.
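The feature-selection, scaling, and stratified-split steps just described can be sketched in a few lines of standard-library Python. This is an illustrative sketch only: the helper names, the gene-to-values dictionary layout, and the seed are hypothetical, and the study itself reports using scikit-learn.

```python
import random
from statistics import mean, pstdev, pvariance


def top_variable_genes(expr, k):
    """Return the k gene names with the highest variance across samples.

    expr: dict mapping gene name -> list of per-sample expression values.
    """
    return sorted(expr, key=lambda g: pvariance(expr[g]), reverse=True)[:k]


def zscore(values):
    """Standardize one gene to zero mean and unit variance (zeros if constant)."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd if sd > 0 else 0.0 for v in values]


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split sample indices so each class contributes ~test_fraction to the test set."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = max(1, round(len(idx) * test_fraction))
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)
```

One design point worth noting: the text describes variance ranking and standardization as happening before the split. A more leakage-averse variant would compute the variance ranking and the scaling parameters on the training indices only and then apply them to the hold-out samples.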
On the training set, an EN logistic regression model with a binomial link was fitted, combining L1 and L2 penalties to achieve both variable selection and coefficient shrinkage. The mixing parameter (α) and regularization strength (λ) were tuned using nested cross-validation within the training data, optimizing for area under the receiver operating characteristic curve (AUC). Model performance was evaluated in two ways. First, stratified 5-fold cross-validation was performed to characterize the distribution of AUC across folds. Second, the final model trained on the training set was applied to an independent hold-out set, and test AUC was reported. Calibration was assessed by grouping predicted probabilities into deciles and comparing observed versus predicted event rates. Decision curve analysis was performed over a range of threshold probabilities to estimate net benefit relative to “treat all” and “treat none” strategies. Genes with non-zero EN coefficients in the final model were considered members of the FE prediction panel. The direction and magnitude of coefficients were inspected to ensure biological plausibility, particularly in relation to IFN/UPR and epithelial barrier genes.

External validation in the HRV challenge cohort

To test whether the FE-associated modules were recapitulated in an independent viral exacerbation context, external validation was carried out in GSE97668. For this dataset, log₂(CPM + 1) expression values were computed as described above. Samples were annotated according to condition (baseline/control versus post–HRV-A16 infection), and module scores were compared in the direction of HRV exposure. Module activity scores for the UPR/ER stress and IFN gene sets were calculated as the mean standardized expression (z-score) across all genes in the respective gene set for each sample. These module scores were then compared between HRV-infected and baseline conditions using Welch’s t-test.
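As a concrete illustration of the module-score comparison described above, the sketch below computes a per-sample mean z-score for a gene set and the Welch t-statistic with its Welch–Satterthwaite degrees of freedom. The function names and the gene-to-values dictionary layout are assumptions for illustration, and converting the statistic to a P-value is deliberately left out because it needs a t-distribution CDF.

```python
from math import sqrt
from statistics import mean, stdev, variance


def zscore(values):
    """Standardize one gene across samples (zeros if the gene is constant)."""
    mu, sd = mean(values), stdev(values)
    return [(v - mu) / sd if sd > 0 else 0.0 for v in values]


def module_scores(expr, gene_set):
    """Per-sample module score: mean z-score over the gene-set members found in expr.

    expr: dict mapping gene name -> list of per-sample log2(CPM + 1) values.
    """
    z = [zscore(expr[g]) for g in gene_set if g in expr]
    if not z:
        raise ValueError("no gene-set members present in the expression matrix")
    n_samples = len(z[0])
    return [mean(row[i] for row in z) for i in range(n_samples)]


def welch_t(a, b):
    """Welch's t-statistic and Welch-Satterthwaite degrees of freedom for two groups."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df
```

In practice the two-sided P-value would typically come from a statistics library, for example `scipy.stats.ttest_ind(a, b, equal_var=False)`, with `a` and `b` holding the module scores of the HRV-exposed and baseline samples.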
The primary validation criterion was directional consistency: FE-associated modules that were upregulated in FE versus NFE children in GSE211158 were expected to be similarly upregulated after HRV infection in GSE97668.

Hospital cohort and clinical-only prediction model

To benchmark the transcriptomic model against routine clinical variables, an independent hospital cohort of children with asthma was retrospectively identified from the Children’s Hospital of Soochow University between January 2025 and November 2025. Inclusion criteria were physician-diagnosed asthma according to guideline-based criteria, age within the pediatric range, and availability of baseline clinical data and 12-month exacerbation history. FE was defined analogously to the GEO cohort as ≥2 severe exacerbations requiring emergency visit or hospitalization within the previous 12 months; all other children were classified as NFE.

For each child, the following baseline variables were extracted from the medical record: age (years), sex, body mass index (BMI), age at asthma onset (years), disease duration (months), daily inhaled corticosteroid (ICS) dose (categorized as low vs. medium according to pediatric guideline-based dose ranges), baseline FEV₁ % predicted, FEV₁/FVC ratio, blood eosinophil count, and total serum IgE. To avoid circularity, the number of severe exacerbations in the past 12 months was used only to define FE versus NFE and was not included as a predictor in the primary multivariable model. A sensitivity model that included exacerbation count as a covariate was constructed as a positive control but is not emphasized as a practical predictive tool.

A “clinical-only” multivariable logistic regression model was fitted with FE status as the outcome and the above baseline variables as candidate predictors. Continuous variables were standardized to z-scores (mean 0, SD 1) prior to modeling so that odds ratios (ORs) reflected the change in odds of FE per 1 SD increase in each predictor.
ICS dose was entered as a binary factor (low vs. medium). The dataset was randomly split into an 80% training set and a 20% hold-out test set using stratified sampling to maintain the FE/NFE ratio. Within the training set, repeated 5-fold stratified cross-validation was used to summarize AUC variability. The final model was then trained on the entire training set and evaluated on the hold-out set, reporting test AUC, calibration, and decision curve analysis. Regression coefficients, ORs, 95% confidence intervals (CIs), and P-values were summarized in a three-line table. The sensitivity model including exacerbation count was evaluated to illustrate the expected near-perfect discrimination when the defining variable is explicitly modeled.

Statistical analysis

Unless otherwise specified, continuous variables are summarized as mean ± standard deviation. For two-group comparisons of gene expression or module scores, Welch’s t-test (two-sided) was used. Multiple testing correction for genome-wide differential expression was performed using the Benjamini–Hochberg FDR procedure. A two-sided P < 0.05 or FDR < 0.05 was considered statistically significant. All analyses were conducted using a complete-case approach; missing values were not imputed.

All data preprocessing, differential expression, and enrichment analyses were performed using standard statistical environments in R (version 4.2.3) and Python (version 3.12). Predictive modeling and cross-validation were implemented with the scikit-learn library. Logistic regression in the hospital cohort was fitted using both scikit-learn and statsmodels to obtain AUC estimates and regression coefficients with 95% CIs. Plots for ROC curves, calibration, decision curve analysis, volcano plots, and module–trait relationships were generated using base R and matplotlib.
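For readers unfamiliar with the Benjamini–Hochberg adjustment named in the statistical analysis, a minimal standard-library Python sketch of the step-up procedure follows. The function name is hypothetical and this is not the authors' code; it only shows how adjusted values are obtained from a list of raw P-values.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the original input order."""
    m = len(pvalues)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity with a running minimum.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A gene would then be called significant at FDR < 0.05 when its adjusted value falls below 0.05, which matches the threshold stated above.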
Ethics approval and informed consent

This retrospective study of the hospital cohort was conducted in accordance with the Declaration of Helsinki and was approved by the Medical Ethics Committee of the Children’s Hospital of Soochow University. According to the committee’s policy for retrospective chart-review studies, no specific approval number was issued. Written informed consent was obtained from the parents or legal guardians of all participating children, with age-appropriate assent from the children when applicable. Analyses of the GEO transcriptomic datasets (GSE211158 and GSE97668) were performed on publicly available, de-identified data and therefore did not require additional ethics approval or informed consent.

Results

Characteristics of the transcriptomic datasets

After quality control of the GSE211158 nasal RNA-seq dataset, a high-quality expression matrix was obtained for children classified as FE or NFE. A large number of protein-coding genes with sufficient read counts remained for downstream analyses, providing adequate power to explore gene-level and module-level differences between FE and NFE groups. The external validation dataset GSE97668, an HRV-A16 challenge cohort, included baseline and post-infection nasal samples, enabling assessment of whether FE-associated modules were concordantly regulated during experimental viral infection.

Performance of the Elastic Net predictive model in GSE211158

Using the discovery cohort (GSE211158), an EN logistic regression model was trained on the most variable genes to distinguish FE from NFE children. The fitted model yielded a sparse set of non-zero coefficients; the largest positive coefficients included TRIM31 and SULT1A3, whereas prominent negative coefficients included LSP1 and ERAP2 (Figure 1A), highlighting genes spanning antiviral signaling and epithelial/immune programs. The predictive performance of the EN model is summarized in Figure 2. In 5-fold cross-validation, AUC values were centered around ~0.47. When applied to the internal hold-out set, the model yielded a ROC AUC of 0.628 (Figure 2B). Calibration on the hold-out set, based on deciles of predicted risk, showed substantial variability across deciles and appreciable scatter around the ideal line rather than a smooth monotonic increase (Figure 1C), consistent with the modest discriminative performance. Decision curve analysis indicated net benefit that was close to the “treat none” baseline across most threshold probabilities, with little separation from default strategies (Figure 1D). The composition and stability of the EN-derived gene signature are shown in Figure 2.

Differential expression and pathway enrichment in frequent exacerbators

Comparative expression analysis between FE and NFE children highlighted only modest gene-level differences. In the exploratory volcano plot (P < 0.01 and |log₂FC| ≥ 0.3), PROC and CXorf22 were among the upregulated genes in FE, whereas ZDHHC19 and ZNF683 were among the downregulated genes (Figure 3A). Over-representation testing of the exploratory DEG set did not show clear enrichment for selected pathway groupings (Figure 3B), whereas pre-ranked GSEA using the genome-wide t-statistic ranking suggested positive enrichment of IFN response and UPR/IED-related signatures in FE versus NFE (Figure 3C). Among the displayed gene sets, IFN response showed the highest NES (≈2.3), followed by UPR/IED (≈2.1) and inflammatory response (≈1.7), while the epithelial barrier set showed a smaller positive NES (≈0.85).

Co-expression modules and hub genes associated with exacerbation status

To capture coordinated gene programs rather than individual markers, a WGCNA-like co-expression analysis was performed on the top variance genes in GSE211158. The structure and phenotype associations of co-expression modules are summarized in Figure 4. K-means clustering of the correlation matrix identified six distinct co-expression modules. Module–phenotype correlations showed that M1 had the strongest positive association with FE status, although the effect size was modest (Pearson r ≈ 0.17). Several other modules showed weak or inverse correlations (e.g., M2/M6 with r around −0.13 to −0.15). Functional inspection of M1 indicated overlap with IFN- and UPR/ER stress–related genes.
Within M1, intra-modular connectivity analysis highlighted several hub genes, including XBP1, HSPA5, and ISG1 (Figure 4B). XBP1 is a key transcription factor in the IRE1 arm of the UPR, and HSPA5 (BiP/GRP78) is a central ER chaperone. ISG1 represents an interferon-stimulated gene signal within the same co-expression neighborhood, supporting partial coupling of ER stress and antiviral programs in FE-associated nasal epithelium. Other modules showed weaker or inverse associations with FE status, including modules enriched for generic epithelial structural genes and metabolic pathways. Overall, module–trait correlations were small in magnitude, underscoring that transcriptomic differences between FE and NFE were subtle at the cohort level.

External validation of UPR/IFN modules in the HRV challenge cohort

To test whether the FE-associated transcriptional programs were reproducibly engaged during viral infection, UPR and IFN module activity scores were evaluated in the HRV-A16 challenge dataset GSE97668. Module scores were calculated as the mean standardized expression of genes within the curated UPR/IED and IFN gene sets for each sample. These validation analyses are illustrated in Figure 5. The IFN module score was higher in HRV-exposed samples than in baseline/control samples (t-test p=0.00474; Figure 5B). The IED module score showed a similar direction of change but did not reach conventional significance (p=0.077; Figure 5A). Together with the discovery-cohort analyses, these results indicate that the IFN/IED axis identified in FE children is directionally engaged during experimental rhinovirus infection in an independent experimental context, with statistical support for the IFN module only. This cross-cohort concordance supports a virus-responsive transcriptional program as a component of the FE-associated molecular phenotype.
Clinical-only risk model in the independent hospital cohort

The independent hospital cohort comprised 134 children with asthma, of whom 72 (53.7%) fulfilled the FE definition (≥2 severe exacerbations requiring emergency visit or hospitalization in the previous 12 months), and 62 (46.3%) were classified as NFE. Baseline characteristics, including age, sex, BMI, age at asthma onset, disease duration, ICS dose category, lung function indices, blood eosinophil count, and total IgE, are summarized in the corresponding tables. The performance of the clinical-only model is shown in Figure 6. Repeated 5-fold cross-validation yielded a mean and median AUC of 0.44 (Figure 6A). When applied to the 20% hold-out test set, the ROC AUC was 0.38 (Figure 6B), indicating discrimination close to (or worse than) chance. The hold-out calibration curve deviated from the ideal line (Figure 6C), and decision curve analysis showed limited net benefit, with the clinical-only model overlapping default strategies at low thresholds and approaching zero or negative net benefit at moderate thresholds (Figure 6D). These findings suggest that routinely measured baseline clinical variables are insufficient to reliably identify children at high risk of frequent severe exacerbations in this cohort.

Discussion

In this study, we integrated nasal epithelial transcriptomic data from a public pediatric asthma cohort with an experimental human rhinovirus (HRV) challenge dataset and an independent hospital-based clinical cohort to interrogate mechanisms underlying frequent exacerbations and to benchmark molecular versus clinical predictors. This work provides three main findings. First, we delineate a frequent exacerbator endotype characterized by coordinated upregulation of interferon- and UPR/ER stress–related programs in the nasal epithelium, accompanied by relatively weaker epithelial barrier signatures. Second, an Elastic Net transcriptomic model achieved a hold-out AUC of 0.628, but cross-validated AUCs were modest and calibration/decision-analytic curves suggested limited incremental utility, indicating that further refinement and external validation would be needed before clinical translation. Third, a model based solely on routine clinical variables performed poorly (mean/median CV AUC 0.44; hold-out AUC 0.38), supporting the idea that airway transcriptomic profiling captures information not reflected in standard bedside indices.

The co-expression module most positively associated with FE status (M1) contained hub genes such as XBP1, HSPA5, and ISG1, linking ER stress and interferon-associated signals within a shared network neighborhood. In the HRV challenge dataset, IFN module activity increased significantly in HRV-exposed samples (p=0.00474), with a similar but non-significant trend for the IED module (p=0.077).
Taken together, these findings suggest that virus-responsive interferon programs and ER stress signatures are components of the pediatric frequent exacerbator phenotype, although the magnitude of transcriptomic separation and predictive performance in this dataset were modest. Our results are consistent with, but also extend, prior work on immune dysregulation in asthma. Several studies have reported that airway epithelial cells in asthma display complex inflammatory programs in which type 2 cytokines, interferon-stimulated genes, and stress-response pathways can be co-activated rather than neatly segregated into distinct “endotypes” 12-13. The enrichment of UPR and ER stress genes in frequent exacerbators is concordant with experimental data showing that ER stress contributes to epithelial dysfunction, mucus hypersecretion, and remodeling in asthma models 14-15. XBP1, one of the hub genes in our co-expression module, is a key transcription factor in the IRE1 arm of the UPR and has been implicated in both inflammatory cytokine production and airway structural changes 14. Its central position in the frequent exacerbator module supports a mechanistic link between chronic epithelial stress and a propensity to severe attacks. The prominence of interferon-associated genes in frequent exacerbators aligns with the central role of respiratory viruses in precipitating pediatric asthma exacerbations. Deficient or dysregulated interferon responses have been associated with increased susceptibility to virus-induced exacerbations in both in vitro and in vivo models 16-18. In our network analysis, an interferon-stimulated signal (represented by ISG1) co-occurred with UPR hub genes such as XBP1 and HSPA5 within the FE-associated module, suggesting that antiviral and ER-stress programs may be coupled in exacerbation-prone children. 
The directional concordance between the frequent exacerbator transcriptomic signature and the response to experimental HRV infection in an independent nasal challenge cohort strengthens this interpretation. Modules enriched for UPR and interferon genes were directionally upregulated both in frequent exacerbators (compared with non-frequent exacerbators) and in HRV-infected airway samples (compared with baseline), suggesting that the same gene networks engaged during acute viral infection may be tonically activated or more easily induced in the frequent exacerbator endotype. This is consistent with prior observations that virus-induced exacerbations in asthma are characterized by distinct transcriptional networks centered on interferon regulatory factors and antiviral genes 16,20. Our data extend these findings by suggesting that similar interferon/UPR co-expression programs are already present in the nasal epithelium of children who clinically manifest frequent severe exacerbations.
A key strength of this study is the inclusion of a real-world hospital cohort that allowed us to compare molecular signatures with conventional clinical risk markers. In this cohort, a multivariable logistic model constructed solely from baseline demographic, treatment, lung function, and inflammatory indices (variables that are readily available in routine practice) showed poor discrimination for FE (mean/median CV AUC 0.44; hold-out AUC 0.38), despite a reasonable sample size and event rate. This suggests that, once prior exacerbation history is excluded to avoid tautological prediction, commonly measured clinical parameters contain limited independent information about who will go on to experience frequent severe attacks. By contrast, the Elastic Net–based transcriptomic model in GSE211158 reached a hold-out AUC of 0.628, although its cross-validated AUCs were modest.
Taken together, these observations suggest that the airway molecular state, particularly the balance between IFN/UPR activation and epithelial barrier integrity, may capture latent susceptibility that is not well reflected by standard markers such as FEV₁, blood eosinophils, or IgE. From a translational standpoint, this supports the concept of combining molecular profiling with clinical assessment to enrich for high-risk children in trials of preventive therapies, rather than relying solely on clinical criteria.
From a clinical standpoint, our study has two main implications. First, the feasibility of a nasal swab–based gene panel, obtained through a relatively non-invasive procedure, offers a practical avenue for risk stratification in pediatric asthma. Although the Elastic Net model achieved only modest discrimination (hold-out AUC 0.628), such a panel could be useful as part of a composite risk assessment that also considers clinical variables such as prior exacerbation history, lung function, and atopic status. Nasal transcriptomic profiling has already been proposed as a tool to define asthma endotypes and monitor disease activity, and our results suggest it may also help identify children at highest risk of frequent severe exacerbations. Second, the identification of interferon and UPR pathways as central features of the frequent exacerbator endotype raises the possibility of targeted interventions. On the one hand, insufficient or mistimed interferon responses have been implicated in severe viral exacerbations, and therapeutic modulation of interferon signaling has been explored, for example with inhaled interferon-β in adults with asthma and viral respiratory infections 17. On the other hand, exaggerated or chronic interferon pathway activation, particularly in the context of unresolved ER stress, may also be harmful by perpetuating epithelial injury and inflammation.
Similarly, several preclinical studies have shown that chemical chaperones or other ER stress–modulating agents can attenuate airway hyperresponsiveness and inflammation in asthma models 14,21. Taken together, our findings suggest that a subset of children with asthma and frequent exacerbations may benefit from therapies aimed at correcting ER stress and fine-tuning antiviral responses, possibly in combination with standard anti-inflammatory and biologic treatments.
Several limitations should be considered. First, the transcriptomic analyses are based on a single discovery cohort with a relatively small sample size, which may limit statistical power and generalizability. Although cross-validation and a hold-out set were used to reduce overfitting, the performance of the Elastic Net model and the stability of the selected genes require confirmation in independent pediatric cohorts. In addition, the cross-sectional design, with nasal profiling at a single time point, precludes conclusions about causality or temporal sequence between IFN/UPR activation and frequent exacerbations. Second, the hospital cohort was derived from a single center and reflects routine clinical data, which may not capture all relevant risk factors or be representative of other settings; important markers such as FeNO, detailed allergen sensitization profiles, or composite symptom scores were not systematically available and could potentially improve prediction. Third, the WGCNA-like analysis used a simplified k-means–based approach rather than a full scale-free network optimization, so the identified modules should be interpreted as approximate co-expression clusters. Finally, the external validation cohort involved experimental HRV infection in a different population and tissue context; while the directional concordance of IFN/UPR modules is reassuring, replication in additional pediatric frequent-exacerbator cohorts is still needed.
Future studies should aim to validate and refine the frequent exacerbator transcriptomic signature in larger, prospective pediatric cohorts with standardized phenotyping and longitudinal follow-up. Repeated sampling before, during, and after exacerbations would help to disentangle stable, baseline risk signatures from dynamic responses to acute triggers. Integrating nasal transcriptomics with additional omics layers, such as blood cytokine profiles, genomic data, or microbiome composition, may yield a more comprehensive and robust risk prediction model. Mechanistic work is also needed to clarify the roles of specific hub genes and pathways highlighted here. For example, functional studies of XBP1 and other interferon-/stress-associated hub genes (including HSPA5 and ISG1) in airway epithelial cells and immune cells from children with asthma could elucidate how perturbations in UPR and interferon signaling contribute to epithelial barrier dysfunction, mucus production, and exaggerated responses to viral infections. Finally, early-phase interventional studies targeting ER stress or antiviral response pathways in carefully selected high-risk children may help determine whether modulation of these pathways can reduce the frequency or severity of exacerbations.
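The module-level comparison discussed above (for example, IFN module activity in HRV-exposed versus baseline samples) can be illustrated with a short sketch. The paper does not specify how module activity was scored or which t-test variant was used, so everything here is an assumption: the score is taken as the mean per-gene z-score across module genes, the comparison uses Welch's unequal-variance statistic, and the gene names and expression values are hypothetical placeholders.

```python
from math import sqrt
from statistics import mean, stdev, variance


def module_scores(expr, module_genes):
    """Per-sample module activity: mean z-score across the module's genes.
    expr maps gene name -> list of expression values (one per sample)."""
    present = [g for g in module_genes if g in expr]
    if not present:
        raise ValueError("no module genes found in the expression table")
    z = {}
    for gene in present:
        values = expr[gene]
        mu, sd = mean(values), stdev(values)
        # Genes with zero variance carry no information and are scored as 0.
        z[gene] = [(v - mu) / sd if sd > 0 else 0.0 for v in values]
    n_samples = len(expr[present[0]])
    return [mean(z[g][i] for g in present) for i in range(n_samples)]


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two groups."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical toy expression table: samples 0-2 are baseline, 3-5 are virus-exposed.
toy_expr = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "GENE_B": [2.0, 1.0, 4.0, 3.0, 6.0, 5.0],
}
scores = module_scores(toy_expr, ["GENE_A", "GENE_B", "GENE_MISSING"])
baseline, exposed = scores[:3], scores[3:]
t_stat, dof = welch_t(exposed, baseline)
print(t_stat, dof)
```

A p-value would then be obtained from the t distribution with `dof` degrees of freedom using a statistics library; that step is omitted here to keep the sketch dependency-free.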

Conclusion

In this study, we combined nasal epithelial transcriptomics, machine learning and network analysis to characterize a frequent exacerbator endotype in pediatric asthma and to compare it with a clinical-only risk model from an independent hospital cohort. An Elastic Net transcriptomic model achieved a hold-out AUC of 0.628 and highlighted an FE-associated network module containing ER stress and interferon-related hub genes (e.g., XBP1, HSPA5, ISG1), with directional support from an independent HRV-A16 challenge dataset. By contrast, a model based solely on routine clinical variables performed poorly (mean/median CV AUC 0.44; hold-out AUC 0.38), suggesting that airway transcriptomic profiles capture exacerbation risk not reflected in standard bedside indices. These findings support nasal transcriptomics as a potential complement to conventional clinical assessment for identifying high-risk children, while also underscoring the need for larger external validations to improve predictive performance and clinical utility.

Funding

This work was supported by the National Natural Science Foundation of China (NSFC) General Program (Grant No. 82570009), project entitled “ZFP91-mediated K48-linked polyubiquitination of M2-1 protein and its role in inhibiting respiratory syncytial virus replication”.

Conflict of interest

The authors declare that they have no conflicts of interest and no competing financial or personal relationships that could have influenced the work reported in this manuscript.

Data availability statement

The nasal epithelial RNA-sequencing datasets analysed in this study (GSE211158 and GSE97668) are publicly available from the NCBI Gene Expression Omnibus. The de-identified clinical dataset from the hospital pediatric asthma cohort contains potentially sensitive patient information and, in accordance with institutional and national data protection regulations, cannot be deposited in a public data repository and therefore does not have a DOI. In line with the journal’s “share upon reasonable request” policy, these data are available from the corresponding author (H.CL) upon reasonable request and subject to approval by the institutional ethics committee.

Author contributions

Zhang Yuxiang was responsible for data analysis, software and model construction, and drafting the initial version of the manuscript. Miao Yu, Wang Nana, and Xue Jiating contributed to data collection and organization. Haochuang Li, as the corresponding author, supervised the study, contributed to the overall conceptual framework, and critically reviewed and revised the manuscript. All authors read and approved the final manuscript.

References

1. Bacharier LB, Guilbert TW, Mauger DT, et al. Early administration of azithromycin and prevention of severe lower respiratory tract illnesses in preschool children with a history of such illnesses: a randomized clinical trial. JAMA (2015) 314(19):2034–2044. doi:10.1001/jama.2015.13896
2. Fitzpatrick AM, Teague WG. Severe asthma in children: insights from the National Heart, Lung, and Blood Institute Severe Asthma Research Program. Pediatr Allergy Immunol Pulmonol (2010) 23(2):131–138. doi:10.1089/ped.2010.0021
3. Kim H, Ellis AK, Fischer D, et al. Asthma biomarkers in the age of biologics. Allergy Asthma Clin Immunol (2017) 13:48. doi:10.1186/s13223-017-0219-4
4. Fahy JV. Type 2 inflammation in asthma: present in most, absent in many. Nat Rev Immunol (2015) 15(1):57–65. doi:10.1038/nri3786
5. Altman MC, Gill MA, Whalen E, et al. Transcriptome networks identify mechanisms of viral and nonviral asthma exacerbations in children. Nat Immunol (2019) 20(5):637–651. doi:10.1038/s41590-019-0374-2
6. Edgar R, Domrachev M, Lash AE. Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res (2002) 30(1):207–210. doi:10.1093/nar/30.1.207
7. Denney L, Byrne AJ, Shea TJ, et al. Nasal epithelial cells from asthmatic children differentially express innate immune genes and may be more susceptible to viral infection. Clin Exp Allergy (2018) 48(2):165–174. doi:10.1111/cea.13063
8. Message SD, Laza-Stanca V, Mallia P, et al. Rhinovirus-induced lower respiratory illness is increased in asthma and related to virus load and Th1/Th2 cytokine and IL-10 production. Proc Natl Acad Sci USA (2008) 105(36):13562–13567. doi:10.1073/pnas.0804181105
9. Pathinayake PS, Hsu AC, Waters DW, et al. Understanding the unfolded protein response in the pathogenesis of asthma. Front Immunol (2018) 9:175. doi:10.3389/fimmu.2018.00175
10. Dastghaib S, Li H, Miao Y, et al. Mechanisms targeting the unfolded protein response in asthma. Am J Respir Cell Mol Biol (2021) 65(2):127–139. doi:10.1165/rcmb.2019-0235TR
11. Wark PA, Johnston SL, Bucchieri F, et al. Asthmatic bronchial epithelial cells have a deficient innate immune response to infection with rhinovirus. J Exp Med (2005) 201(6):937–947. doi:10.1084/jem.20041901
12. Contoli M, Message SD, Laza-Stanca V, et al. Role of deficient type III interferon-λ production in asthma exacerbations. Nat Med (2006) 12(9):1023–1026. doi:10.1038/nm1442
13. Heijink IH, Kuchibhotla VNS, Roffel MP, et al. Epithelial cell dysfunction, a major driver of asthma development. Allergy (2020) 75(8):1902–1917. doi:10.1111/all.14381
14. Bhakta NR, Christenson SA, Nerella S, et al. IFN-stimulated gene expression, type 2 inflammation, and endoplasmic reticulum stress in asthma. Am J Respir Crit Care Med (2018) 197(3):313–324. doi:10.1164/rccm.201706-1070OC
15. Bosco A, Ehteshami S, Panyala S, Martinez FD. Interferon regulatory factor 7 is a major hub connecting interferon-mediated responses in virus-induced asthma exacerbations in vivo. J Allergy Clin Immunol (2012) 129(1):88–94. doi:10.1016/j.jaci.2011.10.038
16. Zaheer RS, Wiehler S, Hudy MH, et al. Human rhinovirus-induced ISG15 selectively modulates epithelial antiviral immunity. Mucosal Immunol (2014) 7(5):1127–1138. doi:10.1038/mi.2013.128
17. Djukanović R, Harrison T, Johnston SL, et al. The effect of inhaled IFN-β on worsening of asthma symptoms caused by viral infections: a randomized trial. Am J Respir Crit Care Med (2014) 190(2):145–154. doi:10.1164/rccm.201312-2235OC
18. Makhija L, Krishnan V, Rehman R, et al. Chemical chaperones mitigate experimental asthma by attenuating endoplasmic reticulum stress. Am J Respir Cell Mol Biol (2014) 50(5):923–931. doi:10.1165/rcmb.2013-0320OC
19. Kim SR, Lee YC. Endoplasmic reticulum stress and the related signaling networks in severe asthma. Allergy Asthma Immunol Res (2015) 7(2):106–117. doi:10.4168/aair.2015.7.2.106
20. Janssen-Heininger YMW, Poynter ME, Budd RC. Endoplasmic reticulum stress and glutathione therapeutics in chronic lung disease. Redox Biol (2020) 33:101516. doi:10.1016/j.redox.2020.101516
21. Brightling CE, Gupta A, Gonem S, Porsbjerg CM. The epithelial era of asthma research: knowledge gaps and future directions for patient care. Eur Respir Rev (2024) 33(172):240021. doi:10.1183/16000617.0021-2024

This work is licensed under a Non Exclusive No Reuse License.

Yuxiang Zhang, Yu Miao, Jiating Xue, et al. Nasal IFN and UPR Transcriptomic Signatures Complement Clinical Predictors of Frequent Exacerbations in Pediatric Asthma. Authorea. 05 February 2026. DOI: https://doi.org/10.22541/au.177028044.40839935/v1
