Ciliary-Enriched and Keratinization-Dominant Endotypes Drive Molecular Heterogeneity in AECOPD

preprint OA: closed
Full text JSON View at publisher
Full text 138,150 characters · extracted from preprint-html · click to expand
Ciliary-Enriched and Keratinization-Dominant Endotypes Drive Molecular Heterogeneity in AECOPD | Research Square window.SnipcartSettings = { analytics: { enabled: false } }; (function() { var accessVector = localStorage.getItem('access_vector') || ''; window.dataLayer = window.dataLayer || []; if (accessVector) { window.dataLayer.push({ user: { profile: { profileInfo: { snid: accessVector } } } }); } })(); (function(w,d,s,l,i){w[l]=w[l]||[];w[l].push({'gtm.start':new Date().getTime(),event:'gtm.js'});var f=d.getElementsByTagName(s)[0],j=d.createElement(s),dl=l!='dataLayer'?'&l='+l:'';j.async=true;j.src='https://www.googletagmanager.com/gtm.js?id='+i+dl;f.parentNode.insertBefore(j,f);})(window,document,'script','dataLayer','GTM-K279D39R'); Browse Preprints In Review Journals COVID-19 Preprints AJE Video Bytes Research Tools Research Promotion AJE Professional Editing AJE Rubriq About Preprint Platform In Review Editorial Policies Our Team Advisory Board Help Center Sign In Submit a Preprint Cite Share Download PDF Research Article Ciliary-Enriched and Keratinization-Dominant Endotypes Drive Molecular Heterogeneity in AECOPD Jiacheng Zhong, Zhen Wang, Hanting Li, Kai Yang, Yu Meng, Shanze Chen, and 1 more This is a preprint; it has not been peer reviewed by a journal. https://doi.org/ 10.21203/rs.3.rs-9347057/v1 This work is licensed under a CC BY 4.0 License Status: Under Review Version 1 posted 5 You are reading this latest preprint version Abstract Background: Acute exacerbations of COPD (AECOPD) involve poorly characterized airway host responses. Integrating sputum proteomics with radiomics may reveal underlying molecular heterogeneity and identify clinically relevant biomarkers. Methods: We performed deep proteomic profiling on induced sputum from patients with AECOPD (n=35), stable COPD (n=19), and healthy controls (n=15). Weighted gene co-expression network analysis (WGCNA) and consensus clustering were used to define molecular endotypes. Proteomic modules were correlated with quantitative chest CT radiomic features including ground-glass opacity (GGO), consolidation, and parenchymal heterogeneity. Results: We identified two major endotypes in AECOPD. The ciliary-enriched endotype showed coordinated upregulation of ciliary proteins (e.g., PCM1, CFAP58, TMEM231) and was associated with elevated MUC5AC. In contrast, the keratinization-dominant endotype was characterized by epithelial repair programs. Radiomic analysis revealed that the keratinization-dominant module positively correlated with GGO volume, consolidation burden, and parenchymal heterogeneity, while the ciliary module was linked to more homogeneous lung parenchyma. Across the AECOPD cohort, mucosal immunoglobulins, particularly IgA-related proteins, were markedly depleted and served as strong independent predictors of protection against future severe exacerbations, outperforming traditional clinical indicators. Conclusion: AECOPD is molecularly heterogeneous, featuring a ciliary-enriched endotype with paradoxical ciliary activation and a keratinization-dominant endotype associated with greater inflammatory burden on imaging. Baseline immunoglobulin levels provide powerful prognostic information. These findings offer a new framework for endotype-driven precision management in COPD. AECOPD ciliary proteins immunoglobulins keratins MUC5AC Figures Figure 1 Figure 2 Figure 3 Figure 4 Figure 5 Figure 6 Figure 7 Figure 8 Figure 9 1. Introduction Chronic Obstructive Pulmonary Disease (COPD) is a complex condition characterized by persistent respiratory symptoms and airflow limitation. Acute exacerbations of COPD (AECOPD) are common throughout the disease course. Large-scale multicenter studies have shown that COPD patients experience an average of 0.5 to 3.5 acute exacerbations per year, significantly impacting lung function, quality of life, and accelerating disease progression, while also increasing mortality rates [ 1 ] . Understanding the pathogenesis of AECOPD through altered protein levels could aid in identifying these sporadic episodes. Existing diagnostic markers for AECOPD, such as serum C-reactive protein (CRP) and Serum Amyloid A (SAA) [ 2 ] , are systemic markers that may not directly reflect the local airway environment. Recent pilot studies have started to focus on AECOPD [ 3 ] , but they did not include biomarkers from airway secretions, such as induced sputum or bronchoalveolar lavage fluid. While some studies have evaluated the microbiome profiles of AECOPD sputum, they lack critical information on the proteome, which delineate the host response to these disease episodes [ 4 ] . Current clinical assessments of AECOPD rely on lung function measurements, symptom scores, acute exacerbation histories, and imaging data. Although these approaches provide valuable information, they lack the precision needed to develop personalized treatment strategies and have had limited success in altering the disease course of COPD [ 5 ] . A systematic review identified 27 prediction models for acute exacerbations of COPD from 1382 international studies [ 6 ] , but concluded that none effectively predicts the risk of exacerbations or offers guidance for individualized treatment. Similarly, Keene JD et al. incorporated peripheral blood biomarkers into two large COPD cohorts (SPIROMICS and COPDGene) but found the predictive value of these biomarkers to be low—often lower than that of traditional clinical indicators [ 7 ] . Historically, the strongest predictor of future AECOPD has been a patient's history of exacerbations [ 8 ] . However, since this indicator can fluctuate from year to year, it is crucial to identify novel biomarkers that can predict AECOPD independently of previous exacerbation history. In this study, we used in-depth sputum proteomics to characterize the host response in AECOPD. We hypothesized that the airway molecular landscape during an exacerbation is distinct from that of stable disease. Our primary goals were threefold: first, to profile the global changes in the sputum proteome during AECOPD; second, to investigate the unexpected interplay between ciliary function and mucosal immunity; and finally, to determine whether these molecular patterns can define subgroups of patients with different underlying biology or clinical outcomes. To achieve this, we compared the sputum proteomic profiles of stable COPD patients, AECOPD patients, and healthy controls. We employed network analysis and unsupervised clustering to identify coherent biological programs and patient subgroups. Additionally, we developed a prognostic model to assess the ability of protein biomarkers to predict the risk of future exacerbations. 2. Methods 2.1. Study Design and Ethics This study was approved by the Ethics Committee of Shenzhen People's Hospital (reference number: KY-LL-2020294-01) and was conducted in strict accordance with the principles of the Helsinki Declaration. All participants provided informed consent. This retrospective study includes both a case-control comparison (COPD patients vs. healthy controls) and a case-case comparison (patients in stable condition vs. those experiencing AECOPD), based on data collected from our hospital. The primary aim is to identify sputum proteomic biomarkers that can detect COPD in the general population, differentiate between AECOPD and stable periods, and serve as prognostic markers for predicting further AECOPD episodes after the baseline AECOPD. AECOPD in this study was defined as an increase in respiratory symptoms, following Anthonisen's criteria [ 9 ] , along with the need for additional pharmacological treatment. Moderate AECOPD was defined as a worsening of symptoms for more than 2 consecutive days, requiring treatment with systemic glucocorticoids, antibiotics, or both. Severe AECOPD was defined as moderate AECOPD that also required (enhanced) oxygen therapy, intensification of long-term home non-invasive ventilation (NIV), or escalation to hospital care. Exclusion criteria for all participants: Confirmed diagnosis of cystic fibrosis, pneumonia risk factors, or other respiratory disorders (e.g., asthma, lung cancer, pulmonary embolism, interstitial pulmonary disease, active pulmonary tuberculosis, severe bronchiectasis, etc.). Coexisting renal insufficiency, hematological diseases, rheumatic immune diseases, immunodeficiencies, or hyperthyroidism. History of lung surgery. Long-term corticosteroid or antibiotic therapy. Inclusion criteria for healthy controls: Male or female aged 18–70 years. Normal lung function test results. Inclusion criteria for COPD patients: Male or female aged 40–85 years. Confirmed diagnosis of COPD based on post-bronchodilator spirometry with FEV1 ≤ 80% of predicted normal and FEV1/FVC ratio. No moderate or severe COPD exacerbation unresolved for at least one month prior to enrollment. 2.2. Participants enrollment and sample collection Each COPD patient's history of AECOPD was carefully documented. We defined episodes of AECOPD in the year prior to baseline as any exacerbation characterized by increased respiratory symptoms and the need for additional pharmacological treatment. This number was referred to as "Episode_of_AECOPD" in our prognostic analysis. Additionally, we recorded the number of severe AECOPD episodes requiring hospitalization during the same period, which was labeled "Hospitalized_AECOPD" in the analysis. Stable COPD patients were consecutively recruited from community health centers (CHCs), while AECOPD patients were consecutively enrolled either from our outpatient clinic (for moderate AECOPD) or from the respiratory ward (for severe AECOPD). To reduce potential variability, sputum samples were collected from moderate AECOPD patients within 48 hours of symptom onset; for AECOPD patients requiring inpatient care, sputum samples were collected within 48 hours of admission, and the length of hospital stay was recorded to assess short-term prognosis. In total, we recruited 35 AECOPD patients and 19 stable COPD patients. To monitor long-term prognosis of AECOPD patients, we conducted monthly remote follow-ups over a 36-month period, recording each patient’s first episode of severe AECOPD requiring hospitalization. Pulmonary carcinogenesis, mortality from non-respiratory diseases, and loss to follow-up were considered censored events. To further validate that our findings were disease-specific, we recruited an additional 15 healthy volunteers. This included 12 older healthy controls aged 50–70 (AgedHC Group) and 3 younger healthy controls aged under 35 (YoungHC Group). All participants provided written informed consent after receiving comprehensive information about the study's objectives and procedures. 2.3. Proteomic Proteomic analyses were conducted in two separate batches with slight procedural variations, employing both data dependent acquisition (DDA) and data independent acquisition (DIA) mass spectrometry approaches. Sample preparation adhered to standard operating procedures across both batches. Below, we present the details of the mass spectrometry (MS) assay methods and parameters used in batch 1, which were similar to those applied in batch 2. 2.3.1. DDA Mass Spectrometry Assay All fraction samples were analyzed using a TIMSTOF mass spectrometer (Bruker) coupled to an Evosep One liquid chromatography system. The MS was operated in data-dependent acquisition mode to generate an ion mobility-enhanced spectral library. Accumulation and ramp times were each set to 100 ms, and mass spectra were recorded within an m/z range of 100–1700 in positive electrospray mode, with a dynamic exclusion time of 24.0 seconds. The ion source voltage was set to 1500 V, temperature to 180°C, and dry gas flow rate to 3 L/min. Ion mobility was scanned over a range of 0.75 to 1.35 Vs/cm², followed by 8 cycles of PASEF MS/MS. 2.3.2. Mass Spectrometry Assay for DIA Peptides from each sample were analyzed on a TIMSTOF mass spectrometer (Bruker) coupled to an Evosep One liquid chromatography system (Denmark) in data-independent acquisition (DIA) mode. The mass spectrometer collected ion mobility MS spectra across a mass range of m/z 100–1700, with up to 4 windows defined for individual 100 ms TIMS scans based on the m/z-ion mobility plane. During PASEF MS/MS scanning, collision energy was linearly ramped as a function of mobility, from 20 eV at 1/K0 = 0.85 Vs/cm² to 59 eV at 1/K0 = 1.30 Vs/cm². 2.3.3. Mass spectrometry data analysis For DDA library data, the FASTA sequence database was searched using Spectronaut™ 14.4.200727.47784 (Biognosys) software. The database was downloaded from the website: http://www.uniprot.org . iRT peptide sequences (Biognosys|iRT Kit|) were added. The parameters were set as follows: enzyme is trypsin, with a maximum of 1 missed cleavage; fixed modification is carbamidomethylation (C), and dynamic modifications are oxidation (M) and N-terminal acetylation (Protein N-term). All reported data are based on 99% confidence for protein identification, with a false discovery rate (FDR) ≤ 1%. DIA data was analyzed in Spectronaut™ 14.4.200727.47784 using the above-constructed spectral library. The main software parameters were set as follows: retention time prediction type is dynamic iRT, interference correction at the MS2 level is enabled, and cross-run normalization is enabled. All results were filtered with a Q-value cutoff of 0.01 (equivalent to FDR < 1%). 2.4. Radiomics Retrospective chest CT scans from 35 AECOPD patients were acquired and anonymized prior to analysis. All DICOM series were preprocessed using SimpleITK: images were resampled to an isotropic resolution of 1×1×1 mm³ using linear interpolation and converted to NIfTI format. Anatomical segmentation was performed automatically with MEDPSeg [ 10 ] , a deep learning framework that generated binary masks for the whole lung, airway tree, pulmonary vessels, and five lung lobes without manual correction. Ground-glass opacity (GGO) volume and GGO percentage of involvement (POI), as well as consolidation volume, were also calculated from the segmented masks. Quantitative biomarkers were extracted from the segmented regions. Lung parenchyma was defined as the lung mask excluding the airways. Emphysema severity was quantified as the percentage of low-attenuation areas (LAA%) within the parenchyma at thresholds of -910, -950, and − 970 HU, both globally and per lobe. First-order radiomic features, including mean, standard deviation (std_hu), skewness, kurtosis, and 15th percentile of Hounsfield units (hu), were extracted from the intensity distributions of the lung parenchyma, airways, and vessels. 2.5. Bioinformatic analyses 2.5.1. Preprocessing We performed bioinformatic analyses based on protein expression matrices generated by the DIA software tools. First, we converted UniProt codes into their corresponding gene symbols. For proteins without corresponding gene symbols, the UniProt codes were retained. Protein expressions labeled as "filtered" or "NA" were converted to 0, and the expression matrices were subsequently log2 transformed. To address batch effects, we employed the specialized tool harmonizR [ 11 ] , which is designed for this purpose. The effectiveness of batch effect removal was validated using PCA plots. 2.5.2. Differential expression analysis We assessed associations between protein expression and different sample groups, including AECOPD, stable COPD, and healthy controls, by using a batch-corrected protein expression matrix and a general linear model (GLM) in R. Covariates included demographic variables (age, gender, BMI, smoking status, and pack-years), with gender, smoking status, and medication use treated as binary variables, and age, BMI, and pack-years as continuous variables. For intra-COPD comparisons (e.g., AECOPD vs. stable COPD, frequent exacerbators vs. infrequent exacerbators), additional clinical metadata on maintenance medications (LAMA+LABA, LABA + ICS, and LABA+LAMA + ICS) were incorporated. To account for potential oral contamination, a set of oral-related protein markers was selected, including salivary enzymes (AMY1B), salivary mucins (MUC7), proline-rich salivary proteins (PRB1, PRB2, PRB3, PRB4), antimicrobial peptides (HTN1, PIP), salivary immune-related proteins (BPIFA2), and oral epithelial keratins (KRT1, KRT4, KRT6A, KRT10, KRT13, KRT16). An oral contamination score was calculated for each sample using single-sample gene set enrichment analysis (ssGSEA). Other clinical data, such as comorbidities, lung function, and blood test results, were excluded due to a high proportion of missing values. After adjusting for covariates, proteins with a p-value < 0.05 were considered statistically significant differentially expressed proteins (DEPs). The log2 fold-change (log2FC) of DEPs was calculated as follows: $$\:log2FC=\frac{\left(\sum\:_{i=1}^{m}log2\left({G1}_{i}\right)\right)}{m}-\frac{\left(\sum\:_{j=1}^{n}log2\left({G2}_{j}\right)\right)}{n}$$ where G1𝑖 and G2j represent the batch-corrected expression of the protein in the 𝑖th sample of group 1 and the jth sample of group 2, respectively, and m and n correspond to the total number of samples in group 1 and group 2. 2.5.3. Gene set enrichment analysis (GSEA) To associate gene expression with biological functions, we used the differentially expressed proteins (DEPs) and their corresponding log2 fold-change (log2FC) values as input for Gene Set Enrichment Analysis (GSEA). The analysis was performed by querying pre-defined gene sets from the Gene Ontology (GO) database using the clusterProfiler R package [ 12 ] . The GSEA algorithm traversed the ranked protein list, adjusting the running-sum statistic based on whether a protein belonged to a specific gene set. When a protein in the list was part of a gene set, the running-sum statistic increased in proportion to the protein's expression level; otherwise, it decreased at a constant rate. The enrichment score (ES) was defined as the maximum deviation of the running-sum statistic from zero. Gene sets with a p-value < 0.05 were considered statistically significant. 2.5.4. Prognostic analysis To predict the prognosis of AECOPD following baseline onset, we first evaluated all proteins and clinical metadata using univariate Cox regression analysis based on the batch-corrected proteomic matrix. Variables with a z-test p-value 0.05 were included in a forest plot. For proteins that passed this filtering process, we performed GO enrichment analysis and selected clusters of proteins associated with relevant terms to construct biofunction-specific AECOPD prognostic models. The selected variables, identified based on enriched pathways, were subsequently used to develop a multivariate Cox regression model to predict future AECOPD. To address multicollinearity, we applied an L1-penalized Cox model to remove redundant protein predictors with less contribution to predicting future AECOPD. The optimal penalty parameter (λ) was tuned using cross-validation. Variables with non-zero coefficients were defined as predictor genes and entered into the first round of multivariate Cox regression, where variables that violated the PH assumption were removed. In the second round of multivariate Cox regression, we identified predictor variables for future AECOPD and constructed a linear risk score model for each patient: $$\:Risk\:Score={\beta\:}_{0}+{\sum\:}_{i=1}^{n}{{\beta\:}_{i}x}_{i}$$ where 𝑥𝑖 represents the feature values and 𝛽𝑖 the corresponding coefficients. After calculating the risk scores for patients in batch 1 and batch 2, event-free survival curves for high-risk and low-risk groups (stratified based on the median risk score) were generated using the Kaplan-Meier method and compared using the log-rank test. The performance of the prognostic model, along with the known predictor, history of AECOPD, was evaluated using the concordance index (C-index). 2.5.5. Consensus clustering To dissect the inherent heterogeneity among AECOPD patients, we employed consensus clustering using the ConsensusClusterPlus R package [ 13 ] . We subsetted the batch-corrected matrix to include only AECOPD patients and first selected the most informative genes by removing those with zero variance (i.e., where all values are the same). Genes with zero Median Absolute Deviation (MAD) are considered uninformative for clustering and were excluded from further analysis. We set the maximum number of clusters (maxK) to 6 and used consensus clustering to evaluate the stability of the clusters by repeatedly clustering subsets of the data and measuring the consistency of the results. This approach helps determine the optimal number of clusters (K) in an unsupervised manner. A smaller Proportion of Ambiguously Clustered (PAC) value indicates more definitive and stable clustering. By calculating the PAC for each number of clusters (from K = 2 to maxK = 6), we selected the K that minimized PAC to ensure that the chosen clustering solution provided the most stable and least ambiguous grouping of the samples. The assignment of each AECOPD patient was added as clinical metadata and used in WGCNA to identify endotype-specific protein clusters. 2.5.6. Weighted correlation network analysis (WGCNA) To associate protein expression patterns with clinical metadata and explain the heterogeneity of AECOPD, we used the AECOPD-specific matrix mentioned above to perform WGCNA [ 14 ] . A signed co-expression network was constructed using a stepwise approach. First, the pickSoftThreshold function was employed to determine the optimal soft power β from a series of candidates. This power was used to calculate the adjacency between two arbitrary genes m and n as follows: $$\:\text{A}\text{d}\text{j}\text{a}\text{c}\text{e}\text{n}\text{c}\text{y}=\:{\left|cor(m,\:n)\right|}^{\beta\:}$$ A scale-free topological overlap matrix was then built based on the adjacency matrix. Next, with the minimum module size set to 30, we identified modules containing highly co-expressed proteins. These modules were represented by the module eigengene (ME), which refers to the first principal component (PC1) of the inter-module gene expression profile. Based on hierarchical clustering of MEs, modules with high similarity were merged using the mergeCloseModules function with a manually defined threshold. MEs were recalculated as merged module eigengenes (mergedMEs). Finally, a module-trait correlation heatmap was generated based on the Pearson correlation between clinical traits and mergedMEs. Modules with the highest association with a clinical trait of interest were selected for subsequent analyses. Proteins in the selected modules were referred to as module proteins. 3. Results 3.1. Batch effect correction Two batches of sputum samples were initially preprocessed in R and visualized with principal component analysis (PCA), revealing a strong batch effect (Fig. 1 A). This effect was corrected using harmonizR (Fig. 1 B). After correction, no clear separation was observed among the three groups (AECOPD, AgedHC, and Stable), except for three YoungHC samples. These findings suggest that COPD and AECOPD biomarkers are subtle and require more detailed analyses to identify robust diagnostic markers. 3.2. Comparison of demographic data We consecutively enrolled 54 patients with AECOPD or stable COPD, along with 15 healthy controls, for demographic and clinical comparisons. Significant group differences were observed in age and gender. AECOPD patients were older than both stable COPD patients and controls, with a median age of 70 years (IQR 63–72), compared to 62 years (IQR 54.5–67) and 54 years (IQR 51–56), respectively (p < 0.01). Gender distribution also differed significantly: 94% of AECOPD patients were male, compared to 74% of stable COPD and 40% of controls (p < 0.01). No significant differences were found in smoking status or BMI across the three groups. Within the COPD cohort, AECOPD patients were older than those with stable disease (p = 0.01) and more frequently received triple therapy (31% vs. 5%, p = 0.04). No significant differences were observed in gender, BMI, or smoking history. Oral contamination scores were not significantly different among groups and were therefore not included as covariates. Based on these findings, age, gender, BMI, smoking status, and treatment regimen (for intra-COPD comparisons) were included as covariates in subsequent analyses. Table 1 Demographic Comparison Among AECOPD, Stable COPD, and Healthy Controls Gender Male (%) AECOPD (n = 35) Control (n = 15) Stable (n = 19) p 0.94 0.4 0.74 # 0* Active_Smoker (%) 0.26 0.2 0.32 # 0.82 Age 70 (63–72) 54 (51–56) 62 (54.5–67) 0* BMI 22.5 (20.07–23.45) 23.7 (22.85–25.9) 23 (21.7–25.3) 0.08 Oral Contamination Score (ssGSEA) 1.37 (1.28–1.55) 1.4 (1.3–1.52) 1.39 (1.23–1.54) 0.92 Table 2 Demographic Comparison of AECOPD vs Stable COPD Gender Male (%) AECOPD (n = 35) Stable (n = 19) p 0.94 0.74 # 0.08 Active_Smoker (%) 0.26 0.32 0.89 LAMA_LABA (%) 0.09 0.05 # 1 LABA_ICS (%) 0.26 0.11 # 0.29 Triple_therapy (%) 0.31 0.05 # 0.04* Age 70 (63–72) 62 (54.5–67) 0.01* BMI 22.5 (20.07–23.45) 23 (21.7–25.3) 0.17 All binary data are presented as percentages (%), while quantitative data are expressed as the median (Q1-Q3). P-values were calculated using the Wilcoxon rank-sum test or Kruskal-Wallis test, with Fisher’s exact test used where indicated by the nature of the data (denoted by #). A p-value < 0.05 was considered statistically significant (*). 3.3. Differential expression analyses and GSEA Differential expression analysis was performed among AECOPD, stable COPD, and healthy controls (HC, combining YoungHC and AgedHC). Differentially expressed proteins were subjected to GSEA. Compared with HC, AECOPD showed decreased levels of proteins related to cytosolic ribosomes and macroautophagy (Fig. 1 C). Cilium-associated proteins were upregulated in AECOPD compared with stable COPD, while proteins linked to cell death, antigen binding, and immunoglobulin production showed the opposite trend (Fig. 1 D). In stable COPD vs. HC, wound healing and cilium-associated proteins were coordinately downregulated (Fig. 1 E). Volcano plots below each GSEA panel display the -log10 transformed p-values (GLM test) and log2 fold changes for proteins involved in these pathways, highlighting immunoglobulin isotype heavy chains. In Fig. 1 D, upregulated proteins associated with immunoglobulin production in AECOPD (e.g., IGKV1-12, IGLV1-47, IGKV2-28, IGKV4-1, IGKV3D-11, IGKV1D-33, IGLV5-45, IGLV5-52) did not include isotype-related proteins (IGHG2, IGHG4, IGHM, IGHA1, IGHA2). 3.4. Association of cilium function and mucins and apoptotic markers To determine whether these cilium-associated changes were related to apoptosis-induced epithelial detachment, we examined canonical apoptosis markers (BCL2L13, BAX, CASP3). None of these markers were differentially expressed between the AECOPD and stable COPD groups, suggesting that this phenomenon is more likely due to ciliogenesis rather than epithelial shedding caused by apoptosis. Cilia mediate mucus clearance [ 15 ] , suggesting that alterations in cilium-associated proteins may influence mucin levels. Although mucin-related pathways were not differentially enriched across groups (Fig. 2 A), across the entire cohort we observed a negative association between MUC5B and several cilium-associated proteins (PCM1, ARFGEF2, CFAP58, ATP2B4, HVCN1). In contrast, MUC5AC, a gel-forming mucin, was exclusively and significantly associated with the cilium-associated protein TMEM231, which was upregulated in AECOPD compared to stable COPD. Several immunoglobulin variable regions downregulated in AECOPD (e.g., IGHV3-15, IGKV1-12, IGLV1-47, IGKV2-28, IGKV4-1, IGKV3D-11, IGKV1D-33, IGLV5-45, IGLV5-52) correlated strongly with MUC5B (Fig. 2 A). Immunoglobulin heavy chains, particularly IGHA1 and IGHA2, were also significantly associated with their corresponding variable regions. As key components of secretory IgA, their presence highlights the epithelial and secretory nature of the local immune repertoire. The Venn diagram inset in (Fig. 2 A) shows the overlap of cilium-associated proteins across comparisons. Notably, PCM1 and TMEM231 were downregulated in stable COPD compared to AECOPD or healthy controls (HC). Among cilium-associated proteins upregulated in AECOPD, ARFGEF2 was elevated in frequent exacerbators versus stable COPD, whereas ATP2B4 and TMEM231 were elevated in infrequent exacerbators versus stable COPD. Heatmaps in Fig. 2 B display protein levels across functional groups: cilium (gold), mucins (blue), apoptosis regulators (red), immunoglobulins (turquoise), wound healing (purple), and cytosolic ribosome/macroautophagy (black). Top annotations indicate group origin and the metadata "NumOfAE." 3.5. Construction of AECOPD prognostic model During the 36-month follow-up, 18 of 39 AECOPD patients (46%) experienced at least one hospitalization due to exacerbations. To identify prognostic biomarkers for future AECOPD events, we conducted univariate Cox regression analysis. Enrichment analysis indicated that these biomarkers were predominantly associated with mucin-type O-glycan biosynthesis, lysosome, membrane lipid metabolic processes, and the immunoglobulin complex (Fig. 3 B). With the exception of IGHV3-23 and IGLV3-12, most immunoglobulin-related biomarkers were linked to reduced risk (hazard ratio < 1; Fig. 3 A). Considering the heterogeneity of AECOPD and its complex pathophysiology, we applied LASSO feature selection (Fig. 3 C–E) and constructed a multivariate Cox prognostic model (Fig. 4 ). Patients were stratified into high- and low-risk groups based on the derived risk score. Kaplan–Meier survival analysis demonstrated significant separation between groups, with high-risk patients showing poorer survival probabilities (log-rank p < 0.05). Importantly, the composite risk score outperformed the traditional clinical predictor “Episode_of_AECOPD” across the full cohort and in both Batch 1 and Batch 2. We also highlight an example immunoglobulin marker, IGKV1D-13, which showed a significant protective association with future exacerbations in Batch 2 (log-rank p < 0.05) and a less pronounced protective trend in Batch 1 (log-rank p = 0.16), likely due to the relatively small sample size. 3.6. Identification of AECOPD endotypes Consensus clustering identified six clusters (k = 6), of which clusters 1 and 6 (each containing only one patient) were excluded (Fig. 5 A–B). Co-expression analysis identified modules of highly correlated proteins, and modules with eigengene similarity above 0.6 (height cutoff = 0.4) were merged (Fig. 5 C–D). Module–trait correlations were then calculated using Pearson correlation (Fig. 5 E). The neutrophilic module (khaki) was associated with neutrophil activation and migration (Fig. 7 A), consensus cluster 4, longer length of hospital stay, and former smoking status. It showed both positive and negative associations with blood neutrophil and eosinophil counts and was positively linked to CRP levels (Fig. 7 B), likely representing neutrophilic AECOPD. Consensus cluster 5 displayed the opposite trend to cluster 4 and showed the highest (though modest) correlation with the immunoglobulin module (mediumvioletred; cor = 0.25, p = 0.1). The cilium module (navyblue) was positively associated with ciliary proteins (Fig. 9 B), MUC5AC expression (Fig. 7 A), and improved airflow (FEV1/FVC%; Fig. 6 A). The keratinized epithelial module (darkseagreen) was enriched for cytoplasmic translation, epidermis development, and keratinocyte differentiation (Fig. 9 A). It was negatively correlated with MUC5AC (r = -0.54, p = 8e-04; Fig. 6 B) and CRP, but positively correlated with blood sodium levels (Fig. 7 B). This module was strongly associated with consensus cluster 3 (cilium-depleted AECOPD) and inversely associated with cluster 2 (cilium-enriched AECOPD). The immunoglobulin module (mediumvioletred) was negatively associated with the prognostic risk score (cor = -0.41, p = 0.01; Fig. 5 E) but showed no significant association with any consensus cluster. It exhibited strong correlations with multiple immunoglobulin genes, including IGLV3-19, IGHV3-73, IGKV1-12, IGLV1-47, IGKV2-28, IGKV4-1, IGKV3D-11, and IGKV1D-33, which were downregulated in AECOPD compared to stable COPD (Fig. 6 B). The strongest associations were observed for IGHA1 and IGHA2, whereas correlations with IGHM were weaker. 3.7. Integration of proteomic modules with radiomic phenotypes To connect WGCNA-derived proteomic modules with lung structural changes, we correlated module eigengenes with quantitative radiomic features extracted from automated chest CT segmentation. The keratinized epithelial module was positively correlated with ground-glass opacity (GGO) volume, percentage of opacity involvement (POI), consolidation volume, overall lung findings volume, and parenchymal heterogeneity (std_hu), with representative CT images of the corresponding consensus cluster 3 shown in (Fig. S2). In contrast, the ciliary and immunoglobulin modules showed inverse correlations with GGO volume, consolidation, POI, and std_hu, indicating more homogeneous lung parenchyma and reduced inflammatory opacities, with representative CT images of consensus clusters 2 and 5 presented in (Figs. S1 and S4). Emphysema indices (LAA% at − 910, −950, and − 970 HU) showed the strongest associations with the neutrophilic module, with representative CT images of consensus cluster 4 shown in (Fig. S3). These radiomic patterns highlight distinct structural phenotypes across endotypes. The keratinization-dominant profile is linked to greater parenchymal heterogeneity and inflammatory burden, whereas the ciliary- and immunoglobulin-enriched profiles correspond to more homogeneous lung attenuation and lower inflammatory opacity. The neutrophilic module, in turn, was most strongly associated with emphysematous changes, suggesting that neutrophilic inflammation may preferentially contribute to parenchymal destruction rather than acute consolidative or ground-glass opacities. 3.8. Summary of proteomic results As shown in (Fig. 9 D), we analyzed the relationship between gene significance (GS) for oral contamination score and module membership (MM) in the darkseagreen module (Keratinized epithelial module) to verify its potential origin. The significant negative correlation (r = -0.3, p = 5.1e-06) excluded oral origin, indicating instead an airway epithelium source for this module. To refine significant proteomic markers for cluster 2, we applied dual thresholds to both GS and MM, thereby focusing subsequent analyses on proteins demonstrating both strong biological significance and high module connectivity. The selected cilium proteins, along with those in the Keratinized epithelial module, are displayed in the Circlize heatmap (Fig. 9 C), which shows protein expression (outer ring) and STRING-based interaction networks (inner ring). Strong intra- and inter-module interactions were observed among ciliary and keratinized epithelial proteins. Specifically, MUC5B interacted with KRT5, whereas MUC5AC interacted with KRT4, KRT13, KRT14, and KRT17. Cluster 3 patients exhibited higher expression of keratinized epithelial proteins and lower levels of MUC5AC and cilium-associated proteins, whereas Cluster 2 showed the opposite pattern (Fig. 9 E). The circlize heatmap demonstrated consistent module assignments across the entire cohort, including both Batch 1 and Batch 2 (Fig. 9 E). The heatmap in Fig. 9 D highlights associations within the immunoglobulin module, particularly with IGHA2 and IGHM (but not IGHG2), as well as correlations between the cilium and immunoglobulin modules with the FEV1/FVC actual/predicted ratio. Prognostic proteins (indicated in blue font), including IGHV3-15, IGHV3-23, IGLV3-12, IGKV1D-13, IGKV1-39, and others, were associated with future exacerbations. Notably, IGHV3-15 was both downregulated in AECOPD and shown to be protective in survival models (Fig. 3 A, Fig. 2 B). 4. Discussion In this study, we observed a coordinated upregulation of cilium-associated proteins in AECOPD compared with stable COPD, including PCM1, CFAP58, and TMEM231. These proteins play critical roles in centriolar satellite organization, ciliary motility, and transition-zone integrity, respectively [ 16 – 19 ] . While reduced ciliary function has been well documented in stable COPD [ 19 ] , to our knowledge, this is the first study to report increased sputum levels of cilium-associated proteins during acute exacerbations. We initially hypothesized that this elevation might reflect apoptosis-induced epithelial shedding. However, canonical apoptosis markers (BCL2L13, BAX, CASP3) showed no significant difference between AECOPD and stable COPD, suggesting an active ciliogenesis process rather than passive cell detachment. This interpretation is supported by the positive correlation between the ciliary functional module and lung function parameters such as FEV1/FVC. Mucin analysis revealed a clear functional dichotomy. MUC5B was negatively correlated with most upregulated ciliary proteins, consistent with its essential role in baseline mucociliary clearance and immune homeostasis [ 20 – 23 ] . In contrast, MUC5AC showed strong positive correlations with both the ciliary and immunoglobulin functional modules, while displaying a strong negative correlation with the keratinized epithelial module. An excessive increase in MUC5AC is known to thicken mucus and impair clearance [ 24 ] , and higher baseline MUC5AC predicts accelerated lung function decline in COPD [ 25 ] . We interpret the elevated MUC5AC in the ciliary-enriched endotype as reflecting enhanced mucus mobilization and expectoration driven by preserved ciliary function and humoral immunity. Conversely, lower MUC5AC in the keratinization-dominant endotype likely results from impaired mucociliary clearance due to ongoing epithelial repair and mucus retention within the airways. An intriguing finding of our study is that GSEA revealed increased expression of cilium-related proteins but decreased immunoglobulin abundance compared with stable COPD, suggesting an imbalance between mechanical (cilia-driven) and humoral (antibody-mediated) mucosal defenses. Consistent with this, prognostic analyses showed that higher immunoglobulin levels were generally protective against future exacerbations, underscoring their critical role in airway defense. In WGCNA, the positive correlations between immunoglobulin modules and mucus components (MUC5AC and MUC5B) indicate that humoral responses during exacerbation are closely associated with mucus hypersecretion and microbial stimulation. Although reductions in total IgG and specific IgG subclasses are established risk factors for COPD exacerbations and hospitalizations [ 26 ] , the contribution of IgA subclasses has received less attention. In our study, the immunoglobulin module was strongly linked to IGHA1 and IGHA2, implying that airway IgA may facilitate pathogen trapping and clearance within mucus during AECOPD, this role is analogous to that of secretory IgA in the intestinal mucosa [ 27 ] . Collectively, these findings indicate that a major airway defense deficit in COPD lies in IgA, instead of other isotypes. Using WGCNA and consensus clustering, we identified two major molecular endotypes. The ciliary-enriched endotype was characterized by active ciliogenesis and elevated MUC5AC, whereas the keratinization-dominant endotype featured epithelial repair programs, including keratinocyte differentiation and increased expression of KRT5, KRT14, and KRT15 [ 28 , 29 ] . Integration with quantitative chest CT radiomics further delineated these endotypes. The keratinized epithelial module was positively associated with ground-glass opacity (GGO) volume, consolidation burden, and parenchymal heterogeneity, consistent with greater inflammatory infiltration and tissue remodeling. In contrast, both the ciliary and immunoglobulin modules showed inverse correlations with GGO volume, consolidation, and parenchymal heterogeneity, corresponding to more homogeneous lung attenuation and better-preserved lung function. These radiomic associations suggest that the ciliary-enriched and immunoglobulin-intact profile corresponds to a state with fewer inflammatory foci and relatively preserved airway homeostasis, whereas the keratinization-dominant profile reflects more severe ongoing inflammation and structural disorganization. Collectively, our results demonstrate that AECOPD is molecularly heterogeneous. One subset of patients exhibits ciliary activation that supports mucus expectoration, while another is dominated by epithelial repair responses accompanied by greater imaging-evident inflammatory burden. Superimposed on this heterogeneity is a broad defect in humoral immunity, particularly involving secretory IgA, which carries independent prognostic significance. This study provides new insight into the pathogenesis of AECOPD and demonstrates the value of integrating sputum proteomics with radiomics for defining biologically and clinically meaningful endotypes. Larger multicenter cohorts and functional studies are warranted to validate these findings and translate them into precision therapeutic strategies for COPD. Declarations Clinical trial number: not applicable. Ethics, Consent to Participate, and Consent to Publish declarations This study was approved by the Ethics Committee of Shenzhen People's Hospital (reference number: KY-LL-2020294-01) and was conducted in strict accordance with the principles of the Helsinki Declaration. All participants provided informed consent, including an agreement to publication. Data Availability The data reported in this paper have been deposited in the OMIX, China National Center for Bioinformation/Beijing Institute of Genomics, Chinese Academy of Sciences (https://ngdc.cncb.ac.cn/omix: accession no. OMIX007953). Acknowledgement LW conceived the manuscript. JZ performed the analysis and wrote the manuscript. HL, FZ, YM, and SC collected the samples. LW and KY revised the manuscript. Funding Declaration This work was supported by the grant from the National Key R&D Program of China (2022YFF0710802; 2022YFF0710800), Shenzhen Clinical Research Center for Respiratory Disease (LCYSSQ20220823091203007), Shenzhen Key Laboratory of Respiratory Diseases (SYSPG20241211173920041), National Natural Science Foundation of China (82470033; 82103941) Shenzhen Medical Academy of Research and Translation (C2302001) The Guangdong Basic and Applied Basic Research Foundation (2022A1515011966; 2024A1515012923). Conflict of Interest Statement Jiacheng Zhong, Zhen Wang, Hanting Li, Kai Yang, Yu Meng, Shanze Chen, and Lingwei Wang declare that they have no conflicts of interest relevant to the content of this manuscript. References Seemungal TA, Hurst JR, Wedzicha JA. Exacerbation rate, health status and mortality in COPD–a review of potential interventions. Int J Chronic Obstr Pulm Dis. 2009;4:203–23. Bozinovski S, Hutchinson A, Thompson M, et al. Serum amyloid a is a biomarker of acute exacerbations of chronic obstructive pulmonary disease. Am J Respir Crit Care Med. 2008;177:269–78. Kim SH, Ahn HS, Park JS, et al. A Proteomics-Based Analysis of Blood Biomarkers for the Diagnosis of COPD Acute Exacerbation. Int J Chronic Obstr Pulm Dis. 2021;16:1497–508. Mayhew D, Devos N, Lambert C, et al. Longitudinal profiling of the lung microbiome in the AERIS study demonstrates repeatability of bacterial and eosinophilic COPD exacerbations. Thorax. 2018;73:422–30. Aaron SD. Reaching for the Holy Grail of Chronic Obstructive Pulmonary Disease Outcomes. Can Medications Modify Lung Function Decline? Am J Respir Crit Care Med. 2018;197:2–4. Guerra B, Gaveikaite V, Bianchi C et al. Prediction models for exacerbations in patients with COPD. European respiratory review: an official journal of the European Respiratory Society. 2017, 26 . Keene JD, Jacobson S, Kechris K, et al. Biomarkers Predictive of Exacerbations in the SPIROMICS and COPDGene Cohorts. Am J Respir Crit Care Med. 2017;195:473–81. Waeijen-Smit K, DiGiandomenico A, Bonnell J, et al. Early diagnostic BioMARKers in exacerbations of chronic obstructive pulmonary disease: protocol of the exploratory, prospective, longitudinal, single-centre, observational MARKED study. BMJ open. 2023;13:e068787. Anthonisen NR, Manfreda J, Warren CP, et al. Antibiotic therapy in exacerbations of chronic obstructive pulmonary disease. Ann Intern Med. 1987;106:196–204. Diedre S, Carmo JAR, Alejandro P, Comellas JM, Reinhardt SE, Gerard. Letícia Rittner, Roberto A. Lotufo MEDPSeg: hierarchical polymorphic multitask learning for the segmentation of ground-glass opacities, consolidation, and pulmonary structures on computed tomography. https://doi.org/10.48550/arXiv.2312.02365 Voß H, Schlumbohm S, Barwikowski P, et al. HarmonizR enables data harmonization across independent proteomic datasets with appropriate handling of missing values. Nat Commun. 2022;13:3523. Wu T, Hu E, Xu S et al. clusterProfiler 4.0: A universal enrichment tool for interpreting omics data. Innovation (Cambridge (Mass)). 2021, 2 , 100141. Wilkerson MD, Hayes DN. ConsensusClusterPlus: a class discovery tool with confidence assessments and item tracking. Bioinf (Oxford England). 2010;26:1572–3. Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics. 2008;9:559. Bakshani CR, Morales-Garcia AL, Althaus M, et al. Evolutionary conservation of the antimicrobial function of mucus: a first defence against infection. NPJ biofilms microbiomes. 2018;4:14. Hall EA, Kumar D, Prosser SL et al. Centriolar satellites expedite mother centriole remodeling to promote ciliogenesis. eLife. 2023, 12 . Li ZZ, Zhao WL, Wang GS et al. The novel testicular enrichment protein Cfap58 is required for Notch-associated ciliogenesis. Biosci Rep. 2020, 40 . Zhang Q, Yang S, Chen X, et al. Identification of novel TMEM231 gene splice variants and pathological findings in a fetus with Meckel Syndrome. Front Genet. 2023;14:1252873. Ancel J, Belgacemi R, Diabasana Z et al. Impaired Ciliary Beat Frequency and Ciliogenesis Alteration during Airway Epithelial Cell Differentiation in COPD. Diagnostics (Basel, Switzerland). 2021, 11 . Song D, Iverson E, Kaler L, et al. MUC5B mobilizes and MUC5AC spatially aligns mucociliary transport on human airway epithelium. Sci Adv. 2022;8:eabq5049. Kesimer M, Ford AA, Ceppe A, et al. Airway Mucin Concentration as a Marker of Chronic Bronchitis. N Engl J Med. 2017;377:911–22. Roy MG, Livraghi-Butrico A, Fletcher AA, et al. Muc5b is required for airway defence. Nature. 2014;505:412–6. Carpenter J, Wang Y, Gupta R et al. Assembly and organization of the N-terminal region of mucin MUC5AC: Indications for structural and functional distinction from MUC5B. Proceedings of the National Academy of Sciences of the United States of America. 2021, 118 . Raclawska DS, Ttofali F, Fletcher AA, et al. Mucins and Their Sugars. Critical Mediators of Hyperreactivity and Inflammation. Annals Am Thorac Soc. 2016;13(Suppl 1):S98–99. Radicioni G, Ceppe A, Ford AA, et al. Airway mucin MUC5AC and MUC5B concentrations and the initiation and progression of chronic obstructive pulmonary disease: an analysis of the SPIROMICS cohort. Lancet Respiratory Med. 2021;9:1241–54. Leitao Filho FS, Ra SW, Mattman A, et al. Serum IgG subclass levels and risk of exacerbations and hospitalizations in patients with COPD. Respir Res. 2018;19:30. Mantis NJ, Rol N, Corthésy B. Secretory IgA's complex roles in immunity and mucosal homeostasis in the gut. Mucosal Immunol. 2011;4:603–11. Ievlev V, Lynch TJ, Freischlag KW et al. Krt14 and Krt15 differentially regulate regenerative properties and differentiation potential of airway basal cells. JCI insight. 2023, 8 . Zhu W, Han L, Wu Y, et al. Keratin 15 protects against cigarette smoke-induced epithelial mesenchymal transformation by MMP-9. Respir Res. 2023;24:297. Additional Declarations No competing interests reported. Supplementary Files supplementaryfigures.pdf (A) Principal component analysis (PCA) showing clear separation between the two batches of sputum proteome profiles. (B) After batch correction, no significant separation is observed between batches or among AECOPD, age-matched healthy controls (HC), and stable COPD patients. (C–E) Top panels: GSEA plots highlighting enriched (positive enrichment score) or depleted (negative enrichment score) biological pathways in Group 1 vs. Group 2. Bottom panels: Volcano plots showing log fold change (logFC) and p-values for leading-edge proteins from the corresponding GSEA. Positive logFC indicates higher expression in Group 1. Cite Share Download PDF Status: Under Review Version 1 posted Reviewers invited by journal 13 Apr, 2026 Editor invited by journal 10 Apr, 2026 Editor assigned by journal 10 Apr, 2026 Submission checks completed at journal 10 Apr, 2026 First submitted to journal 07 Apr, 2026 You are reading this latest preprint version Research Square lets you share your work early, gain feedback from the community, and start making changes to your manuscript prior to peer review in a journal. As a division of Research Square Company, we’re committed to making research communication faster, fairer, and more useful. We do this by developing innovative software and high quality services for the global research community. Our growing team is made up of researchers and industry professionals working together to solve the most critical problems facing scientific publishing. Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-9347057","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":622141013,"identity":"0e759dc5-18bb-4e29-8c1b-78337a0d3f61","order_by":0,"name":"Jiacheng Zhong","email":"","orcid":"","institution":"ShenZhen People’s Hospital","correspondingAuthor":false,"prefix":"","firstName":"Jiacheng","middleName":"","lastName":"Zhong","suffix":""},{"id":622141015,"identity":"0394bb3b-5010-4d91-8fd6-33795c2004eb","order_by":1,"name":"Zhen Wang","email":"","orcid":"","institution":"ShenZhen People’s Hospital","correspondingAuthor":false,"prefix":"","firstName":"Zhen","middleName":"","lastName":"Wang","suffix":""},{"id":622141017,"identity":"5a8772fe-f578-48c1-8560-c2f586378125","order_by":2,"name":"Hanting Li","email":"","orcid":"","institution":"ShenZhen People’s Hospital","correspondingAuthor":false,"prefix":"","firstName":"Hanting","middleName":"","lastName":"Li","suffix":""},{"id":622141020,"identity":"bdd0a9f9-182a-4c79-aa4a-8781910d313b","order_by":3,"name":"Kai Yang","email":"","orcid":"","institution":"ShenZhen People’s Hospital","correspondingAuthor":false,"prefix":"","firstName":"Kai","middleName":"","lastName":"Yang","suffix":""},{"id":622141021,"identity":"636fcb24-3ae3-452e-ade7-69c505fbb2b1","order_by":4,"name":"Yu Meng","email":"","orcid":"","institution":"ShenZhen People’s Hospital","correspondingAuthor":false,"prefix":"","firstName":"Yu","middleName":"","lastName":"Meng","suffix":""},{"id":622141022,"identity":"6b64b7be-391a-4f45-9662-ef7a4682cf3f","order_by":5,"name":"Shanze Chen","email":"","orcid":"","institution":"ShenZhen People’s Hospital","correspondingAuthor":false,"prefix":"","firstName":"Shanze","middleName":"","lastName":"Chen","suffix":""},{"id":622141023,"identity":"45c3ef97-ec5b-40e3-ae72-2ad99f4fcb1e","order_by":6,"name":"Lingwei Wang","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAABHElEQVRIie3QsUoDMRjA8Q8KNx1kTRG8V4gUjhsKvkbHBKFT7hS6dJCj03XQ2rXi0Fc43yBHQJcLXW8UBHHokEkcDjHXeiIlbVeH/IcvIfCDJAAu17/tCm8WoQHoZgMQHSFkS4pFSwTgY2S7dPwfAocIeVbyHUiUBIOLQvbrNEFIFVJfY0DTG2olZTKMzMVGZ9WQyjjzRt1FQoV4woBLldtIKHhIDGGPD5zIeOKzvPKJEB42h7GdrNYtudQyqjHLV6UhXwdIxXsvDVmecJDgEZYLTkSR7Sfn1ToEakgevJFillF237xF3WF/31u6c97Tepyy5a181Z91yuZIST3+6J+i6cxKTB6mzdeJ3wP/z7TX0c0MJjvE5XK5XG3fy29sORi1Ff4AAAAASUVORK5CYII=","orcid":"","institution":"ShenZhen People’s Hospital","correspondingAuthor":true,"prefix":"","firstName":"Lingwei","middleName":"","lastName":"Wang","suffix":""}],"badges":[],"createdAt":"2026-04-07 15:23:29","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-9347057/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-9347057/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":107356551,"identity":"484495af-6aca-455d-b96d-d06288559d93","added_by":"auto","created_at":"2026-04-20 17:01:43","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":207952,"visible":true,"origin":"","legend":"\u003cp\u003e(A) Principal component analysis (PCA) showing clear separation between the two batches of sputum proteome profiles. (B) After batch correction, no significant separation is observed between batches or among AECOPD, age-matched healthy controls (HC), and stable COPD patients. (C–E) Top panels: GSEA plots highlighting enriched (positive enrichment score) or depleted (negative enrichment score) biological pathways in Group 1 vs. Group 2. Bottom panels: Volcano plots showing log fold change (logFC) and p-values for leading-edge proteins from the corresponding GSEA. Positive logFC indicates higher expression in Group 1.\u003c/p\u003e","description":"","filename":"figure1.png","url":"https://assets-eu.researchsquare.com/files/rs-9347057/v1/1aa80502b7ea76ef07ca34f4.png"},{"id":107356553,"identity":"907f8b73-703b-4fe1-8791-079ddfd6b6fa","added_by":"auto","created_at":"2026-04-20 17:01:43","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":186867,"visible":true,"origin":"","legend":"\u003cp\u003e(A) Correlation matrix of apoptosis markers, AE signature proteins (including cilia and immunoglobulins), immunoglobulin heavy chains, and mucins. Significant correlations (p \u0026lt; 0.05) are represented by colored circles. The inset shows intersections between comparisons. (B) Heatmap of key proteins from the GSEA analysis in Fig. 1, along with immunoglobulin heavy chains and mucins. No significant differences in oral contamination scores were observed across patient groups. Proteins differentially expressed between AECOPD and stable COPD are highlighted in bold. Expression values are adjusted for age, sex, BMI, and medication status. Color coding: gold, cilium-related genes (upregulated in AECOPD vs. stable COPD, or downregulated in stable COPD vs. healthy controls); blue, mucin genes (downregulated in AECOPD vs. stable COPD); brown, apoptosis genes; turquoise, immunoglobulin production/isotype determination genes; purple, wound healing genes (downregulated in stable COPD vs. healthy controls); black, genes downregulated in AECOPD vs. controls.\u003c/p\u003e","description":"","filename":"figure2.png","url":"https://assets-eu.researchsquare.com/files/rs-9347057/v1/bba431dade2d8875e8415c62.png"},{"id":107484914,"identity":"bc3a2d49-adfd-4731-9596-605a9bf3343c","added_by":"auto","created_at":"2026-04-22 02:33:17","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":88289,"visible":true,"origin":"","legend":"\u003cp\u003e(A) Forest plot showing the independent prognostic value of features identified by univariate Cox regression. Solid boxes and lines represent hazard ratios (HRs) and 95% confidence intervals (CIs). (B) Bubble plot showing GO/KEGG functional enrichment of prognostic genes. The color gradient indicates enrichment significance. (C) LASSO regression coefficients for predictor features. (D) Partial likelihood deviation curve for LASSO coefficients. (E) Trace plot of LASSO coefficients, showing coefficient shrinkage with increasing penalty parameter λ.\u003c/p\u003e","description":"","filename":"figure3.png","url":"https://assets-eu.researchsquare.com/files/rs-9347057/v1/b87de7e70192d6bc23b61bf3.png"},{"id":107486279,"identity":"ee74043e-2509-4649-b9cf-e843a51b7762","added_by":"auto","created_at":"2026-04-22 02:37:58","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":143689,"visible":true,"origin":"","legend":"\u003cp\u003eLeft: Kaplan–Meier survival curves for Batch 1, Batch 2, and the entire cohort, stratified by history of AECOPD and by risk score. Right: Kaplan–Meier survival curves of IGKV1D-13 in Batch 1 and Batch 2. Vertical and horizontal dashed lines indicate the 50% survival probability and the corresponding time point. Shaded areas represent 95% confidence intervals (CIs) for each group. Bottom left: Forest plot showing hazard ratios (HRs), 95% CIs, and p-values for key prognostic features (*p \u0026lt; 0.05).\u003c/p\u003e","description":"","filename":"figure4.png","url":"https://assets-eu.researchsquare.com/files/rs-9347057/v1/19e87b35900da45c7665c103.png"},{"id":107356555,"identity":"6e8fa6b1-1c3d-4605-a6d5-34f2586aecc4","added_by":"auto","created_at":"2026-04-20 17:01:43","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":207323,"visible":true,"origin":"","legend":"\u003cp\u003e(A) Consensus matrix heatmap with consensus values shown on a white-to-blue scale. (B) Cumulative distribution function (CDF) plot of consensus distributions for each k. (C) Module eigengenes, with a red line indicating the cutoff for merging dynamic modules. (D) Clustering dendrogram of proteins with module colors assigned before and after merging. (E) Heatmap of correlations between protein modules and demographic metadata, color-coded by correlation values and p-values.\u003c/p\u003e","description":"","filename":"figure5.png","url":"https://assets-eu.researchsquare.com/files/rs-9347057/v1/2d5de234b84c54a382233305.png"},{"id":107356556,"identity":"70c64e44-568b-4a53-9323-1e0296dcc324","added_by":"auto","created_at":"2026-04-20 17:01:43","extension":"png","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":263420,"visible":true,"origin":"","legend":"\u003cp\u003e(A–B) Heatmaps showing correlations between protein modules and lung function parameters. (C) Bubble plot illustrating GO/KEGG enrichment of proteins in the lavender module (actin filament–related). Color represents enrichment significance. Sky-blue rectangles indicate proteins incorporated into the multivariate Cox regression model for survival analysis.\u003c/p\u003e","description":"","filename":"figure6.png","url":"https://assets-eu.researchsquare.com/files/rs-9347057/v1/f4f71ba6cbbdb03f419da51c.png"},{"id":107356560,"identity":"d93ff7a8-8355-4057-b448-75a19305617b","added_by":"auto","created_at":"2026-04-20 17:01:43","extension":"png","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":204150,"visible":true,"origin":"","legend":"\u003cp\u003e(A–B) Bubble plots of GO/KEGG enrichment for proteins in the khaki and mediumvioletred modules, associated with neutrophil and immunoglobulin pathways, respectively. Color indicates enrichment significance. (C) Heatmap of correlations between protein modules and blood test parameters.\u003c/p\u003e","description":"","filename":"figure7.png","url":"https://assets-eu.researchsquare.com/files/rs-9347057/v1/5b767a9a0a62f51e994da3db.png"},{"id":107356558,"identity":"21e3cc0c-12e2-4c04-b3eb-c534902b1407","added_by":"auto","created_at":"2026-04-20 17:01:43","extension":"png","order_by":8,"title":"Figure 8","display":"","copyAsset":false,"role":"figure","size":312080,"visible":true,"origin":"","legend":"\u003cp\u003e(A–B) Heatmap of correlations between protein modules and radiological results. LAA:low attenuation area; hu: Hounsfield units, LUL: Left Upper Lobe, LLL: Left Lower Lobe, RUL: Right Upper Lobe, RML: Right Middle Lobe, RLL: Right Lower Lobe; Findings Volume: extent of consolidation and GGO.\u003c/p\u003e","description":"","filename":"figure8.png","url":"https://assets-eu.researchsquare.com/files/rs-9347057/v1/49929fab4cd47c77ac8185f3.png"},{"id":107356559,"identity":"a0b78468-28ab-4fa5-a083-564d0936b0ca","added_by":"auto","created_at":"2026-04-20 17:01:43","extension":"png","order_by":9,"title":"Figure 9","display":"","copyAsset":false,"role":"figure","size":296276,"visible":true,"origin":"","legend":"\u003cp\u003e(A–B) Bubble plots of GO/KEGG enrichment for proteins in the darkseagreen (keratinized epithelial) and navyblue (cilium) modules. (C) Heatmap of immunoglobulins in the mediumvioletred module. Bold italic font indicates immunoglobulin components downregulated in AECOPD compared to stable COPD. Sky-blue shading highlights proteins significant in both the forest plot (Fig. 3A) and Cox regression analysis. Top annotations include risk score, FEV1/FVC (% predicted), MUC5AC, MUC5B, IGHG2, IGHM, and IGHA2 levels. (D) Scatter plots showing correlations between gene significance (GS) for consensus module 2 and module membership (MM) in the navyblue module, along with relationships between GS and MM for the oral contamination score in the darkseagreen module. Vertical and horizontal lines indicate the thresholds for GS and MM, respectively. (E) Circular heatmap displaying scaled, batch-corrected protein expression data (derived from proteins meeting the GS and MM thresholds in panel D), overlaid with a STRING-based protein–protein interaction network. The network illustrates protein interactions, while the heatmap represents protein expression across samples. Rectangles indicate consensus clusters 2–5, and “0” marks the two isolated samples in consensus clusters 1 and 6. Protein names on the left and right correspond to keratinized epithelial- and cilium-associated proteins, respectively.\u003c/p\u003e","description":"","filename":"figure9.png","url":"https://assets-eu.researchsquare.com/files/rs-9347057/v1/d6e9f403c4e11005fa7169b7.png"},{"id":107488755,"identity":"0de5be7a-02a6-4767-85bd-f3284801f3b9","added_by":"auto","created_at":"2026-04-22 02:45:46","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":2194728,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-9347057/v1/c3f8d26a-38e3-453b-8577-1ceded4a42e4.pdf"},{"id":107487345,"identity":"8bda5644-a188-42ad-a2e4-663a38a2812e","added_by":"auto","created_at":"2026-04-22 02:41:02","extension":"pdf","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":109589,"visible":true,"origin":"","legend":"(A) Principal component analysis (PCA) showing clear separation between the two batches of sputum proteome profiles. (B) After batch correction, no significant separation is observed between batches or among AECOPD, age-matched healthy controls (HC), and stable COPD patients. (C\u0026ndash;E) Top panels: GSEA plots highlighting enriched (positive enrichment score) or depleted (negative enrichment score) biological pathways in Group 1 vs. Group 2. Bottom panels: Volcano plots showing log fold change (logFC) and p-values for leading-edge proteins from the corresponding GSEA. Positive logFC indicates higher expression in Group 1.","description":"","filename":"supplementaryfigures.pdf","url":"https://assets-eu.researchsquare.com/files/rs-9347057/v1/4133d7d90f9aec7317e94f4c.pdf"}],"financialInterests":"No competing interests reported.","formattedTitle":"Ciliary-Enriched and Keratinization-Dominant Endotypes Drive Molecular Heterogeneity in AECOPD","fulltext":[{"header":"1. Introduction","content":"\u003cp\u003eChronic Obstructive Pulmonary Disease (COPD) is a complex condition characterized by persistent respiratory symptoms and airflow limitation. Acute exacerbations of COPD (AECOPD) are common throughout the disease course. Large-scale multicenter studies have shown that COPD patients experience an average of 0.5 to 3.5 acute exacerbations per year, significantly impacting lung function, quality of life, and accelerating disease progression, while also increasing mortality rates \u003csup\u003e[\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e]\u003c/sup\u003e. Understanding the pathogenesis of AECOPD through altered protein levels could aid in identifying these sporadic episodes. Existing diagnostic markers for AECOPD, such as serum C-reactive protein (CRP) and Serum Amyloid A (SAA) \u003csup\u003e[\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e]\u003c/sup\u003e, are systemic markers that may not directly reflect the local airway environment. Recent pilot studies have started to focus on AECOPD \u003csup\u003e[\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e]\u003c/sup\u003e, but they did not include biomarkers from airway secretions, such as induced sputum or bronchoalveolar lavage fluid. While some studies have evaluated the microbiome profiles of AECOPD sputum, they lack critical information on the proteome, which delineate the host response to these disease episodes \u003csup\u003e[\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e]\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eCurrent clinical assessments of AECOPD rely on lung function measurements, symptom scores, acute exacerbation histories, and imaging data. Although these approaches provide valuable information, they lack the precision needed to develop personalized treatment strategies and have had limited success in altering the disease course of COPD \u003csup\u003e[\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e]\u003c/sup\u003e. A systematic review identified 27 prediction models for acute exacerbations of COPD from 1382 international studies \u003csup\u003e[\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]\u003c/sup\u003e, but concluded that none effectively predicts the risk of exacerbations or offers guidance for individualized treatment. Similarly, Keene JD et al. incorporated peripheral blood biomarkers into two large COPD cohorts (SPIROMICS and COPDGene) but found the predictive value of these biomarkers to be low\u0026mdash;often lower than that of traditional clinical indicators \u003csup\u003e[\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e]\u003c/sup\u003e. Historically, the strongest predictor of future AECOPD has been a patient's history of exacerbations \u003csup\u003e[\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e]\u003c/sup\u003e. However, since this indicator can fluctuate from year to year, it is crucial to identify novel biomarkers that can predict AECOPD independently of previous exacerbation history.\u003c/p\u003e \u003cp\u003eIn this study, we used in-depth sputum proteomics to characterize the host response in AECOPD. We hypothesized that the airway molecular landscape during an exacerbation is distinct from that of stable disease. Our primary goals were threefold: first, to profile the global changes in the sputum proteome during AECOPD; second, to investigate the unexpected interplay between ciliary function and mucosal immunity; and finally, to determine whether these molecular patterns can define subgroups of patients with different underlying biology or clinical outcomes. To achieve this, we compared the sputum proteomic profiles of stable COPD patients, AECOPD patients, and healthy controls. We employed network analysis and unsupervised clustering to identify coherent biological programs and patient subgroups. Additionally, we developed a prognostic model to assess the ability of protein biomarkers to predict the risk of future exacerbations.\u003c/p\u003e"},{"header":"2. Methods","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003e2.1. Study Design and Ethics\u003c/h2\u003e \u003cp\u003e This study was approved by the Ethics Committee of Shenzhen People's Hospital (reference number: KY-LL-2020294-01) and was conducted in strict accordance with the principles of the Helsinki Declaration. All participants provided informed consent.\u003c/p\u003e \u003cp\u003eThis retrospective study includes both a case-control comparison (COPD patients vs. healthy controls) and a case-case comparison (patients in stable condition vs. those experiencing AECOPD), based on data collected from our hospital. The primary aim is to identify sputum proteomic biomarkers that can detect COPD in the general population, differentiate between AECOPD and stable periods, and serve as prognostic markers for predicting further AECOPD episodes after the baseline AECOPD.\u003c/p\u003e \u003cp\u003eAECOPD in this study was defined as an increase in respiratory symptoms, following Anthonisen's criteria \u003csup\u003e[\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e]\u003c/sup\u003e, along with the need for additional pharmacological treatment. Moderate AECOPD was defined as a worsening of symptoms for more than 2 consecutive days, requiring treatment with systemic glucocorticoids, antibiotics, or both. Severe AECOPD was defined as moderate AECOPD that also required (enhanced) oxygen therapy, intensification of long-term home non-invasive ventilation (NIV), or escalation to hospital care.\u003c/p\u003e \u003cp\u003eExclusion criteria for all participants:\u003c/p\u003e \u003cp\u003e \u003col\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eConfirmed diagnosis of cystic fibrosis, pneumonia risk factors, or other respiratory disorders (e.g., asthma, lung cancer, pulmonary embolism, interstitial pulmonary disease, active pulmonary tuberculosis, severe bronchiectasis, etc.).\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eCoexisting renal insufficiency, hematological diseases, rheumatic immune diseases, immunodeficiencies, or hyperthyroidism.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eHistory of lung surgery.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eLong-term corticosteroid or antibiotic therapy.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003c/ol\u003e \u003c/p\u003e \u003cp\u003eInclusion criteria for healthy controls:\u003c/p\u003e \u003cp\u003e \u003col\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eMale or female aged 18\u0026ndash;70 years.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eNormal lung function test results.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003c/ol\u003e \u003c/p\u003e \u003cp\u003eInclusion criteria for COPD patients:\u003c/p\u003e \u003cp\u003e \u003col\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eMale or female aged 40\u0026ndash;85 years.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eConfirmed diagnosis of COPD based on post-bronchodilator spirometry with FEV1\u0026thinsp;\u0026le;\u0026thinsp;80% of predicted normal and FEV1/FVC ratio.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eNo moderate or severe COPD exacerbation unresolved for at least one month prior to enrollment.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003c/ol\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec4\" class=\"Section2\"\u003e \u003ch2\u003e2.2. Participants enrollment and sample collection\u003c/h2\u003e \u003cp\u003eEach COPD patient's history of AECOPD was carefully documented. We defined episodes of AECOPD in the year prior to baseline as any exacerbation characterized by increased respiratory symptoms and the need for additional pharmacological treatment. This number was referred to as \"Episode_of_AECOPD\" in our prognostic analysis. Additionally, we recorded the number of severe AECOPD episodes requiring hospitalization during the same period, which was labeled \"Hospitalized_AECOPD\" in the analysis.\u003c/p\u003e \u003cp\u003eStable COPD patients were consecutively recruited from community health centers (CHCs), while AECOPD patients were consecutively enrolled either from our outpatient clinic (for moderate AECOPD) or from the respiratory ward (for severe AECOPD). To reduce potential variability, sputum samples were collected from moderate AECOPD patients within 48 hours of symptom onset; for AECOPD patients requiring inpatient care, sputum samples were collected within 48 hours of admission, and the length of hospital stay was recorded to assess short-term prognosis. In total, we recruited 35 AECOPD patients and 19 stable COPD patients.\u003c/p\u003e \u003cp\u003eTo monitor long-term prognosis of AECOPD patients, we conducted monthly remote follow-ups over a 36-month period, recording each patient\u0026rsquo;s first episode of severe AECOPD requiring hospitalization. Pulmonary carcinogenesis, mortality from non-respiratory diseases, and loss to follow-up were considered censored events.\u003c/p\u003e \u003cp\u003eTo further validate that our findings were disease-specific, we recruited an additional 15 healthy volunteers. This included 12 older healthy controls aged 50\u0026ndash;70 (AgedHC Group) and 3 younger healthy controls aged under 35 (YoungHC Group).\u003c/p\u003e \u003cp\u003e All participants provided written informed consent after receiving comprehensive information about the study's objectives and procedures.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec5\" class=\"Section2\"\u003e \u003ch2\u003e2.3. Proteomic\u003c/h2\u003e \u003cp\u003eProteomic analyses were conducted in two separate batches with slight procedural variations, employing both data dependent acquisition (DDA) and data independent acquisition (DIA) mass spectrometry approaches. Sample preparation adhered to standard operating procedures across both batches. Below, we present the details of the mass spectrometry (MS) assay methods and parameters used in batch 1, which were similar to those applied in batch 2.\u003c/p\u003e \u003cdiv id=\"Sec6\" class=\"Section3\"\u003e \u003ch2\u003e2.3.1. DDA Mass Spectrometry Assay\u003c/h2\u003e \u003cp\u003eAll fraction samples were analyzed using a TIMSTOF mass spectrometer (Bruker) coupled to an Evosep One liquid chromatography system. The MS was operated in data-dependent acquisition mode to generate an ion mobility-enhanced spectral library. Accumulation and ramp times were each set to 100 ms, and mass spectra were recorded within an m/z range of 100\u0026ndash;1700 in positive electrospray mode, with a dynamic exclusion time of 24.0 seconds. The ion source voltage was set to 1500 V, temperature to 180\u0026deg;C, and dry gas flow rate to 3 L/min. Ion mobility was scanned over a range of 0.75 to 1.35 Vs/cm\u0026sup2;, followed by 8 cycles of PASEF MS/MS.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec7\" class=\"Section3\"\u003e \u003ch2\u003e2.3.2. Mass Spectrometry Assay for DIA\u003c/h2\u003e \u003cp\u003ePeptides from each sample were analyzed on a TIMSTOF mass spectrometer (Bruker) coupled to an Evosep One liquid chromatography system (Denmark) in data-independent acquisition (DIA) mode. The mass spectrometer collected ion mobility MS spectra across a mass range of m/z 100\u0026ndash;1700, with up to 4 windows defined for individual 100 ms TIMS scans based on the m/z-ion mobility plane. During PASEF MS/MS scanning, collision energy was linearly ramped as a function of mobility, from 20 eV at 1/K0\u0026thinsp;=\u0026thinsp;0.85 Vs/cm\u0026sup2; to 59 eV at 1/K0\u0026thinsp;=\u0026thinsp;1.30 Vs/cm\u0026sup2;.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec8\" class=\"Section3\"\u003e \u003ch2\u003e2.3.3. Mass spectrometry data analysis\u003c/h2\u003e \u003cp\u003eFor DDA library data, the FASTA sequence database was searched using Spectronaut\u0026trade; 14.4.200727.47784 (Biognosys) software. The database was downloaded from the website: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://www.uniprot.org\u003c/span\u003e\u003cspan address=\"http://www.uniprot.org\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. iRT peptide sequences (Biognosys|iRT Kit|) were added. The parameters were set as follows: enzyme is trypsin, with a maximum of 1 missed cleavage; fixed modification is carbamidomethylation (C), and dynamic modifications are oxidation (M) and N-terminal acetylation (Protein N-term). All reported data are based on 99% confidence for protein identification, with a false discovery rate (FDR)\u0026thinsp;\u0026le;\u0026thinsp;1%. DIA data was analyzed in Spectronaut\u0026trade; 14.4.200727.47784 using the above-constructed spectral library. The main software parameters were set as follows: retention time prediction type is dynamic iRT, interference correction at the MS2 level is enabled, and cross-run normalization is enabled. All results were filtered with a Q-value cutoff of 0.01 (equivalent to FDR\u0026thinsp;\u0026lt;\u0026thinsp;1%).\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv id=\"Sec9\" class=\"Section2\"\u003e \u003ch2\u003e2.4. Radiomics\u003c/h2\u003e \u003cp\u003eRetrospective chest CT scans from 35 AECOPD patients were acquired and anonymized prior to analysis. All DICOM series were preprocessed using SimpleITK: images were resampled to an isotropic resolution of 1\u0026times;1\u0026times;1 mm\u0026sup3; using linear interpolation and converted to NIfTI format. Anatomical segmentation was performed automatically with MEDPSeg \u003csup\u003e[\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e]\u003c/sup\u003e, a deep learning framework that generated binary masks for the whole lung, airway tree, pulmonary vessels, and five lung lobes without manual correction. Ground-glass opacity (GGO) volume and GGO percentage of involvement (POI), as well as consolidation volume, were also calculated from the segmented masks.\u003c/p\u003e \u003cp\u003eQuantitative biomarkers were extracted from the segmented regions. Lung parenchyma was defined as the lung mask excluding the airways. Emphysema severity was quantified as the percentage of low-attenuation areas (LAA%) within the parenchyma at thresholds of -910, -950, and \u0026minus;\u0026thinsp;970 HU, both globally and per lobe. First-order radiomic features, including mean, standard deviation (std_hu), skewness, kurtosis, and 15th percentile of Hounsfield units (hu), were extracted from the intensity distributions of the lung parenchyma, airways, and vessels.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec10\" class=\"Section2\"\u003e \u003ch2\u003e2.5. Bioinformatic analyses\u003c/h2\u003e \u003cdiv id=\"Sec11\" class=\"Section3\"\u003e \u003ch2\u003e2.5.1. Preprocessing\u003c/h2\u003e \u003cp\u003eWe performed bioinformatic analyses based on protein expression matrices generated by the DIA software tools. First, we converted UniProt codes into their corresponding gene symbols. For proteins without corresponding gene symbols, the UniProt codes were retained. Protein expressions labeled as \"filtered\" or \"NA\" were converted to 0, and the expression matrices were subsequently log2 transformed. To address batch effects, we employed the specialized tool harmonizR \u003csup\u003e[\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e]\u003c/sup\u003e, which is designed for this purpose. The effectiveness of batch effect removal was validated using PCA plots.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec12\" class=\"Section3\"\u003e \u003ch2\u003e2.5.2. Differential expression analysis\u003c/h2\u003e \u003cp\u003eWe assessed associations between protein expression and different sample groups, including AECOPD, stable COPD, and healthy controls, by using a batch-corrected protein expression matrix and a general linear model (GLM) in R. Covariates included demographic variables (age, gender, BMI, smoking status, and pack-years), with gender, smoking status, and medication use treated as binary variables, and age, BMI, and pack-years as continuous variables. For intra-COPD comparisons (e.g., AECOPD vs. stable COPD, frequent exacerbators vs. infrequent exacerbators), additional clinical metadata on maintenance medications (LAMA+LABA, LABA\u0026thinsp;+\u0026thinsp;ICS, and LABA+LAMA\u0026thinsp;+\u0026thinsp;ICS) were incorporated. To account for potential oral contamination, a set of oral-related protein markers was selected, including salivary enzymes (AMY1B), salivary mucins (MUC7), proline-rich salivary proteins (PRB1, PRB2, PRB3, PRB4), antimicrobial peptides (HTN1, PIP), salivary immune-related proteins (BPIFA2), and oral epithelial keratins (KRT1, KRT4, KRT6A, KRT10, KRT13, KRT16). An oral contamination score was calculated for each sample using single-sample gene set enrichment analysis (ssGSEA). Other clinical data, such as comorbidities, lung function, and blood test results, were excluded due to a high proportion of missing values.\u003c/p\u003e \u003cp\u003eAfter adjusting for covariates, proteins with a p-value\u0026thinsp;\u0026lt;\u0026thinsp;0.05 were considered statistically significant differentially expressed proteins (DEPs). The log2 fold-change (log2FC) of DEPs was calculated as follows:\u003cdiv id=\"Equa\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equa\" name=\"EquationSource\"\u003e\n$$\\:log2FC=\\frac{\\left(\\sum\\:_{i=1}^{m}log2\\left({G1}_{i}\\right)\\right)}{m}-\\frac{\\left(\\sum\\:_{j=1}^{n}log2\\left({G2}_{j}\\right)\\right)}{n}$$\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003ewhere G1\u0026#119894; and G2j represent the batch-corrected expression of the protein in the \u0026#119894;th sample of group 1 and the jth sample of group 2, respectively, and \u003cem\u003em\u003c/em\u003e and \u003cem\u003en\u003c/em\u003e correspond to the total number of samples in group 1 and group 2.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec13\" class=\"Section3\"\u003e \u003ch2\u003e2.5.3. Gene set enrichment analysis (GSEA)\u003c/h2\u003e \u003cp\u003eTo associate gene expression with biological functions, we used the differentially expressed proteins (DEPs) and their corresponding log2 fold-change (log2FC) values as input for Gene Set Enrichment Analysis (GSEA). The analysis was performed by querying pre-defined gene sets from the Gene Ontology (GO) database using the clusterProfiler R package \u003csup\u003e[\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e]\u003c/sup\u003e. The GSEA algorithm traversed the ranked protein list, adjusting the running-sum statistic based on whether a protein belonged to a specific gene set. When a protein in the list was part of a gene set, the running-sum statistic increased in proportion to the protein's expression level; otherwise, it decreased at a constant rate. The enrichment score (ES) was defined as the maximum deviation of the running-sum statistic from zero. Gene sets with a p-value\u0026thinsp;\u0026lt;\u0026thinsp;0.05 were considered statistically significant.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec14\" class=\"Section3\"\u003e \u003ch2\u003e2.5.4. Prognostic analysis\u003c/h2\u003e \u003cp\u003eTo predict the prognosis of AECOPD following baseline onset, we first evaluated all proteins and clinical metadata using univariate Cox regression analysis based on the batch-corrected proteomic matrix. Variables with a z-test p-value\u0026thinsp;\u0026lt;\u0026thinsp;0.05 and a proportional hazards (PH) assumption p-value\u0026thinsp;\u0026gt;\u0026thinsp;0.05 were included in a forest plot. For proteins that passed this filtering process, we performed GO enrichment analysis and selected clusters of proteins associated with relevant terms to construct biofunction-specific AECOPD prognostic models. The selected variables, identified based on enriched pathways, were subsequently used to develop a multivariate Cox regression model to predict future AECOPD. To address multicollinearity, we applied an L1-penalized Cox model to remove redundant protein predictors with less contribution to predicting future AECOPD. The optimal penalty parameter (λ) was tuned using cross-validation. Variables with non-zero coefficients were defined as predictor genes and entered into the first round of multivariate Cox regression, where variables that violated the PH assumption were removed. In the second round of multivariate Cox regression, we identified predictor variables for future AECOPD and constructed a linear risk score model for each patient:\u003cdiv id=\"Equb\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equb\" name=\"EquationSource\"\u003e\n$$\\:Risk\\:Score={\\beta\\:}_{0}+{\\sum\\:}_{i=1}^{n}{{\\beta\\:}_{i}x}_{i}$$\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003ewhere \u0026#119909;\u0026#119894; represents the feature values and \u0026#120573;\u0026#119894; the corresponding coefficients. After calculating the risk scores for patients in batch 1 and batch 2, event-free survival curves for high-risk and low-risk groups (stratified based on the median risk score) were generated using the Kaplan-Meier method and compared using the log-rank test. The performance of the prognostic model, along with the known predictor, history of AECOPD, was evaluated using the concordance index (C-index).\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec15\" class=\"Section3\"\u003e \u003ch2\u003e2.5.5. Consensus clustering\u003c/h2\u003e \u003cp\u003eTo dissect the inherent heterogeneity among AECOPD patients, we employed consensus clustering using the ConsensusClusterPlus R package \u003csup\u003e[\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e]\u003c/sup\u003e. We subsetted the batch-corrected matrix to include only AECOPD patients and first selected the most informative genes by removing those with zero variance (i.e., where all values are the same). Genes with zero Median Absolute Deviation (MAD) are considered uninformative for clustering and were excluded from further analysis. We set the maximum number of clusters (maxK) to 6 and used consensus clustering to evaluate the stability of the clusters by repeatedly clustering subsets of the data and measuring the consistency of the results. This approach helps determine the optimal number of clusters (K) in an unsupervised manner. A smaller Proportion of Ambiguously Clustered (PAC) value indicates more definitive and stable clustering. By calculating the PAC for each number of clusters (from K\u0026thinsp;=\u0026thinsp;2 to maxK\u0026thinsp;=\u0026thinsp;6), we selected the K that minimized PAC to ensure that the chosen clustering solution provided the most stable and least ambiguous grouping of the samples. The assignment of each AECOPD patient was added as clinical metadata and used in WGCNA to identify endotype-specific protein clusters.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec16\" class=\"Section3\"\u003e \u003ch2\u003e2.5.6. Weighted correlation network analysis (WGCNA)\u003c/h2\u003e \u003cp\u003eTo associate protein expression patterns with clinical metadata and explain the heterogeneity of AECOPD, we used the AECOPD-specific matrix mentioned above to perform WGCNA \u003csup\u003e[\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e]\u003c/sup\u003e. A signed co-expression network was constructed using a stepwise approach. First, the pickSoftThreshold function was employed to determine the optimal soft power \u003cem\u003eβ\u003c/em\u003e from a series of candidates. This power was used to calculate the adjacency between two arbitrary genes \u003cem\u003em\u003c/em\u003e and \u003cem\u003en\u003c/em\u003e as follows:\u003cdiv id=\"Equc\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equc\" name=\"EquationSource\"\u003e\n$$\\:\\text{A}\\text{d}\\text{j}\\text{a}\\text{c}\\text{e}\\text{n}\\text{c}\\text{y}=\\:{\\left|cor(m,\\:n)\\right|}^{\\beta\\:}$$\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003eA scale-free topological overlap matrix was then built based on the adjacency matrix. Next, with the minimum module size set to 30, we identified modules containing highly co-expressed proteins. These modules were represented by the module eigengene (ME), which refers to the first principal component (PC1) of the inter-module gene expression profile. Based on hierarchical clustering of MEs, modules with high similarity were merged using the mergeCloseModules function with a manually defined threshold. MEs were recalculated as merged module eigengenes (mergedMEs). Finally, a module-trait correlation heatmap was generated based on the Pearson correlation between clinical traits and mergedMEs. Modules with the highest association with a clinical trait of interest were selected for subsequent analyses. Proteins in the selected modules were referred to as module proteins.\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e"},{"header":"3. Results","content":"\u003cdiv id=\"Sec18\" class=\"Section2\"\u003e \u003ch2\u003e3.1. Batch effect correction\u003c/h2\u003e \u003cp\u003eTwo batches of sputum samples were initially preprocessed in R and visualized with principal component analysis (PCA), revealing a strong batch effect (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eA). This effect was corrected using harmonizR (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eB). After correction, no clear separation was observed among the three groups (AECOPD, AgedHC, and Stable), except for three YoungHC samples. These findings suggest that COPD and AECOPD biomarkers are subtle and require more detailed analyses to identify robust diagnostic markers.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec19\" class=\"Section2\"\u003e \u003ch2\u003e3.2. Comparison of demographic data\u003c/h2\u003e \u003cp\u003eWe consecutively enrolled 54 patients with AECOPD or stable COPD, along with 15 healthy controls, for demographic and clinical comparisons. Significant group differences were observed in age and gender. AECOPD patients were older than both stable COPD patients and controls, with a median age of 70 years (IQR 63\u0026ndash;72), compared to 62 years (IQR 54.5\u0026ndash;67) and 54 years (IQR 51\u0026ndash;56), respectively (p\u0026thinsp;\u0026lt;\u0026thinsp;0.01). Gender distribution also differed significantly: 94% of AECOPD patients were male, compared to 74% of stable COPD and 40% of controls (p\u0026thinsp;\u0026lt;\u0026thinsp;0.01). No significant differences were found in smoking status or BMI across the three groups.\u003c/p\u003e \u003cp\u003eWithin the COPD cohort, AECOPD patients were older than those with stable disease (p\u0026thinsp;=\u0026thinsp;0.01) and more frequently received triple therapy (31% vs. 5%, p\u0026thinsp;=\u0026thinsp;0.04). No significant differences were observed in gender, BMI, or smoking history. Oral contamination scores were not significantly different among groups and were therefore not included as covariates. Based on these findings, age, gender, BMI, smoking status, and treatment regimen (for intra-COPD comparisons) were included as covariates in subsequent analyses.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eDemographic Comparison Among AECOPD, Stable COPD, and Healthy Controls\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"5\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eGender Male (%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAECOPD (n\u0026thinsp;=\u0026thinsp;35)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eControl (n\u0026thinsp;=\u0026thinsp;15)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eStable (n\u0026thinsp;=\u0026thinsp;19)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003ep\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0.94\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003e0.4\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0.74\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003e\u003csup\u003e#\u003c/sup\u003e0*\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eActive_Smoker (%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0.26\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e0.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.32\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e\u003csup\u003e#\u003c/sup\u003e0.82\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAge\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e70 (63\u0026ndash;72)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e54 (51\u0026ndash;56)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e62 (54.5\u0026ndash;67)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e0*\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eBMI\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e22.5 (20.07\u0026ndash;23.45)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e23.7 (22.85\u0026ndash;25.9)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e23 (21.7\u0026ndash;25.3)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e0.08\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eOral Contamination Score (ssGSEA)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e1.37 (1.28\u0026ndash;1.55)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e1.4 (1.3\u0026ndash;1.52)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e1.39 (1.23\u0026ndash;1.54)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e0.92\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eDemographic Comparison of AECOPD vs Stable COPD\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"4\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eGender Male (%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAECOPD (n\u0026thinsp;=\u0026thinsp;35)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eStable (n\u0026thinsp;=\u0026thinsp;19)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003ep\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0.94\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003e0.74\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003e\u003csup\u003e#\u003c/sup\u003e0.08\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eActive_Smoker (%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0.26\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.32\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0.89\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eLAMA_LABA (%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0.09\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.05\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e\u003csup\u003e#\u003c/sup\u003e1\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eLABA_ICS (%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0.26\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.11\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e\u003csup\u003e#\u003c/sup\u003e0.29\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eTriple_therapy (%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0.31\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.05\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e\u003csup\u003e#\u003c/sup\u003e0.04*\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAge\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e70 (63\u0026ndash;72)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e62 (54.5\u0026ndash;67)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0.01*\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eBMI\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e22.5 (20.07\u0026ndash;23.45)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e23 (21.7\u0026ndash;25.3)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0.17\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eAll binary data are presented as percentages (%), while quantitative data are expressed as the median (Q1-Q3). P-values were calculated using the Wilcoxon rank-sum test or Kruskal-Wallis test, with Fisher\u0026rsquo;s exact test used where indicated by the nature of the data (denoted by #). A p-value\u0026thinsp;\u0026lt;\u0026thinsp;0.05 was considered statistically significant (*).\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec20\" class=\"Section2\"\u003e \u003ch2\u003e3.3. Differential expression analyses and GSEA\u003c/h2\u003e \u003cp\u003eDifferential expression analysis was performed among AECOPD, stable COPD, and healthy controls (HC, combining YoungHC and AgedHC). Differentially expressed proteins were subjected to GSEA. Compared with HC, AECOPD showed decreased levels of proteins related to cytosolic ribosomes and macroautophagy (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eC). Cilium-associated proteins were upregulated in AECOPD compared with stable COPD, while proteins linked to cell death, antigen binding, and immunoglobulin production showed the opposite trend (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eD). In stable COPD vs. HC, wound healing and cilium-associated proteins were coordinately downregulated (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eE). Volcano plots below each GSEA panel display the -log10 transformed p-values (GLM test) and log2 fold changes for proteins involved in these pathways, highlighting immunoglobulin isotype heavy chains. In Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eD, upregulated proteins associated with immunoglobulin production in AECOPD (e.g., IGKV1-12, IGLV1-47, IGKV2-28, IGKV4-1, IGKV3D-11, IGKV1D-33, IGLV5-45, IGLV5-52) did not include isotype-related proteins (IGHG2, IGHG4, IGHM, IGHA1, IGHA2).\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec21\" class=\"Section2\"\u003e \u003ch2\u003e3.4. Association of cilium function and mucins and apoptotic markers\u003c/h2\u003e \u003cp\u003eTo determine whether these cilium-associated changes were related to apoptosis-induced epithelial detachment, we examined canonical apoptosis markers (BCL2L13, BAX, CASP3). None of these markers were differentially expressed between the AECOPD and stable COPD groups, suggesting that this phenomenon is more likely due to ciliogenesis rather than epithelial shedding caused by apoptosis.\u003c/p\u003e \u003cp\u003eCilia mediate mucus clearance \u003csup\u003e[\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e]\u003c/sup\u003e, suggesting that alterations in cilium-associated proteins may influence mucin levels. Although mucin-related pathways were not differentially enriched across groups (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eA), across the entire cohort we observed a negative association between MUC5B and several cilium-associated proteins (PCM1, ARFGEF2, CFAP58, ATP2B4, HVCN1). In contrast, MUC5AC, a gel-forming mucin, was exclusively and significantly associated with the cilium-associated protein TMEM231, which was upregulated in AECOPD compared to stable COPD.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eSeveral immunoglobulin variable regions downregulated in AECOPD (e.g., IGHV3-15, IGKV1-12, IGLV1-47, IGKV2-28, IGKV4-1, IGKV3D-11, IGKV1D-33, IGLV5-45, IGLV5-52) correlated strongly with MUC5B (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eA). Immunoglobulin heavy chains, particularly IGHA1 and IGHA2, were also significantly associated with their corresponding variable regions. As key components of secretory IgA, their presence highlights the epithelial and secretory nature of the local immune repertoire.\u003c/p\u003e \u003cp\u003eThe Venn diagram inset in (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eA) shows the overlap of cilium-associated proteins across comparisons. Notably, PCM1 and TMEM231 were downregulated in stable COPD compared to AECOPD or healthy controls (HC). Among cilium-associated proteins upregulated in AECOPD, ARFGEF2 was elevated in frequent exacerbators versus stable COPD, whereas ATP2B4 and TMEM231 were elevated in infrequent exacerbators versus stable COPD. Heatmaps in Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eB display protein levels across functional groups: cilium (gold), mucins (blue), apoptosis regulators (red), immunoglobulins (turquoise), wound healing (purple), and cytosolic ribosome/macroautophagy (black). Top annotations indicate group origin and the metadata \"NumOfAE.\"\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec22\" class=\"Section2\"\u003e \u003ch2\u003e3.5. Construction of AECOPD prognostic model\u003c/h2\u003e \u003cp\u003eDuring the 36-month follow-up, 18 of 39 AECOPD patients (46%) experienced at least one hospitalization due to exacerbations. To identify prognostic biomarkers for future AECOPD events, we conducted univariate Cox regression analysis. Enrichment analysis indicated that these biomarkers were predominantly associated with mucin-type O-glycan biosynthesis, lysosome, membrane lipid metabolic processes, and the immunoglobulin complex (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eB). With the exception of IGHV3-23 and IGLV3-12, most immunoglobulin-related biomarkers were linked to reduced risk (hazard ratio\u0026thinsp;\u0026lt;\u0026thinsp;1; Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eA). Considering the heterogeneity of AECOPD and its complex pathophysiology, we applied LASSO feature selection (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eC\u0026ndash;E) and constructed a multivariate Cox prognostic model (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e). Patients were stratified into high- and low-risk groups based on the derived risk score. Kaplan\u0026ndash;Meier survival analysis demonstrated significant separation between groups, with high-risk patients showing poorer survival probabilities (log-rank p\u0026thinsp;\u0026lt;\u0026thinsp;0.05). Importantly, the composite risk score outperformed the traditional clinical predictor \u0026ldquo;Episode_of_AECOPD\u0026rdquo; across the full cohort and in both Batch 1 and Batch 2. We also highlight an example immunoglobulin marker, IGKV1D-13, which showed a significant protective association with future exacerbations in Batch 2 (log-rank p\u0026thinsp;\u0026lt;\u0026thinsp;0.05) and a less pronounced protective trend in Batch 1 (log-rank p\u0026thinsp;=\u0026thinsp;0.16), likely due to the relatively small sample size.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec23\" class=\"Section2\"\u003e \u003ch2\u003e3.6. Identification of AECOPD endotypes\u003c/h2\u003e \u003cp\u003eConsensus clustering identified six clusters (k\u0026thinsp;=\u0026thinsp;6), of which clusters 1 and 6 (each containing only one patient) were excluded (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eA\u0026ndash;B). Co-expression analysis identified modules of highly correlated proteins, and modules with eigengene similarity above 0.6 (height cutoff\u0026thinsp;=\u0026thinsp;0.4) were merged (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eC\u0026ndash;D). Module\u0026ndash;trait correlations were then calculated using Pearson correlation (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eE).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe neutrophilic module (khaki) was associated with neutrophil activation and migration (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e7\u003c/span\u003eA), consensus cluster 4, longer length of hospital stay, and former smoking status. It showed both positive and negative associations with blood neutrophil and eosinophil counts and was positively linked to CRP levels (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e7\u003c/span\u003eB), likely representing neutrophilic AECOPD. Consensus cluster 5 displayed the opposite trend to cluster 4 and showed the highest (though modest) correlation with the immunoglobulin module (mediumvioletred; cor\u0026thinsp;=\u0026thinsp;0.25, p\u0026thinsp;=\u0026thinsp;0.1).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe cilium module (navyblue) was positively associated with ciliary proteins (Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e9\u003c/span\u003eB), MUC5AC expression (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e7\u003c/span\u003eA), and improved airflow (FEV1/FVC%; Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e6\u003c/span\u003eA).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe keratinized epithelial module (darkseagreen) was enriched for cytoplasmic translation, epidermis development, and keratinocyte differentiation (Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e9\u003c/span\u003eA). It was negatively correlated with MUC5AC (r = -0.54, p\u0026thinsp;=\u0026thinsp;8e-04; Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e6\u003c/span\u003eB) and CRP, but positively correlated with blood sodium levels (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e7\u003c/span\u003eB). This module was strongly associated with consensus cluster 3 (cilium-depleted AECOPD) and inversely associated with cluster 2 (cilium-enriched AECOPD).\u003c/p\u003e \u003cp\u003eThe immunoglobulin module (mediumvioletred) was negatively associated with the prognostic risk score (cor = -0.41, p\u0026thinsp;=\u0026thinsp;0.01; Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eE) but showed no significant association with any consensus cluster. It exhibited strong correlations with multiple immunoglobulin genes, including IGLV3-19, IGHV3-73, IGKV1-12, IGLV1-47, IGKV2-28, IGKV4-1, IGKV3D-11, and IGKV1D-33, which were downregulated in AECOPD compared to stable COPD (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e6\u003c/span\u003eB). The strongest associations were observed for IGHA1 and IGHA2, whereas correlations with IGHM were weaker.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec24\" class=\"Section2\"\u003e \u003ch2\u003e3.7. Integration of proteomic modules with radiomic phenotypes\u003c/h2\u003e \u003cp\u003eTo connect WGCNA-derived proteomic modules with lung structural changes, we correlated module eigengenes with quantitative radiomic features extracted from automated chest CT segmentation. The keratinized epithelial module was positively correlated with ground-glass opacity (GGO) volume, percentage of opacity involvement (POI), consolidation volume, overall lung findings volume, and parenchymal heterogeneity (std_hu), with representative CT images of the corresponding consensus cluster 3 shown in (Fig. S2). In contrast, the ciliary and immunoglobulin modules showed inverse correlations with GGO volume, consolidation, POI, and std_hu, indicating more homogeneous lung parenchyma and reduced inflammatory opacities, with representative CT images of consensus clusters 2 and 5 presented in (Figs. S1 and S4). Emphysema indices (LAA% at \u0026minus;\u0026thinsp;910, \u0026minus;950, and \u0026minus;\u0026thinsp;970 HU) showed the strongest associations with the neutrophilic module, with representative CT images of consensus cluster 4 shown in (Fig. S3). These radiomic patterns highlight distinct structural phenotypes across endotypes. The keratinization-dominant profile is linked to greater parenchymal heterogeneity and inflammatory burden, whereas the ciliary- and immunoglobulin-enriched profiles correspond to more homogeneous lung attenuation and lower inflammatory opacity. The neutrophilic module, in turn, was most strongly associated with emphysematous changes, suggesting that neutrophilic inflammation may preferentially contribute to parenchymal destruction rather than acute consolidative or ground-glass opacities.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec25\" class=\"Section2\"\u003e \u003ch2\u003e3.8. Summary of proteomic results\u003c/h2\u003e \u003cp\u003eAs shown in (Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e9\u003c/span\u003eD), we analyzed the relationship between gene significance (GS) for oral contamination score and module membership (MM) in the darkseagreen module (Keratinized epithelial module) to verify its potential origin. The significant negative correlation (r = -0.3, p\u0026thinsp;=\u0026thinsp;5.1e-06) excluded oral origin, indicating instead an airway epithelium source for this module. To refine significant proteomic markers for cluster 2, we applied dual thresholds to both GS and MM, thereby focusing subsequent analyses on proteins demonstrating both strong biological significance and high module connectivity. The selected cilium proteins, along with those in the Keratinized epithelial module, are displayed in the Circlize heatmap (Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e9\u003c/span\u003eC), which shows protein expression (outer ring) and STRING-based interaction networks (inner ring). Strong intra- and inter-module interactions were observed among ciliary and keratinized epithelial proteins. Specifically, MUC5B interacted with KRT5, whereas MUC5AC interacted with KRT4, KRT13, KRT14, and KRT17. Cluster 3 patients exhibited higher expression of keratinized epithelial proteins and lower levels of MUC5AC and cilium-associated proteins, whereas Cluster 2 showed the opposite pattern (Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e9\u003c/span\u003eE). The circlize heatmap demonstrated consistent module assignments across the entire cohort, including both Batch 1 and Batch 2 (Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e9\u003c/span\u003eE). The heatmap in Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e9\u003c/span\u003eD highlights associations within the immunoglobulin module, particularly with IGHA2 and IGHM (but not IGHG2), as well as correlations between the cilium and immunoglobulin modules with the FEV1/FVC actual/predicted ratio. Prognostic proteins (indicated in blue font), including IGHV3-15, IGHV3-23, IGLV3-12, IGKV1D-13, IGKV1-39, and others, were associated with future exacerbations. Notably, IGHV3-15 was both downregulated in AECOPD and shown to be protective in survival models (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eA, Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eB).\u003c/p\u003e \u003c/div\u003e"},{"header":"4. Discussion","content":"\u003cp\u003eIn this study, we observed a coordinated upregulation of cilium-associated proteins in AECOPD compared with stable COPD, including PCM1, CFAP58, and TMEM231. These proteins play critical roles in centriolar satellite organization, ciliary motility, and transition-zone integrity, respectively \u003csup\u003e[\u003cspan additionalcitationids=\"CR17 CR18\" citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e]\u003c/sup\u003e. While reduced ciliary function has been well documented in stable COPD \u003csup\u003e[\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e]\u003c/sup\u003e, to our knowledge, this is the first study to report increased sputum levels of cilium-associated proteins during acute exacerbations. We initially hypothesized that this elevation might reflect apoptosis-induced epithelial shedding. However, canonical apoptosis markers (BCL2L13, BAX, CASP3) showed no significant difference between AECOPD and stable COPD, suggesting an active ciliogenesis process rather than passive cell detachment. This interpretation is supported by the positive correlation between the ciliary functional module and lung function parameters such as FEV1/FVC.\u003c/p\u003e \u003cp\u003eMucin analysis revealed a clear functional dichotomy. MUC5B was negatively correlated with most upregulated ciliary proteins, consistent with its essential role in baseline mucociliary clearance and immune homeostasis \u003csup\u003e[\u003cspan additionalcitationids=\"CR21 CR22\" citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e]\u003c/sup\u003e. In contrast, MUC5AC showed strong positive correlations with both the ciliary and immunoglobulin functional modules, while displaying a strong negative correlation with the keratinized epithelial module. An excessive increase in MUC5AC is known to thicken mucus and impair clearance \u003csup\u003e[\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e]\u003c/sup\u003e, and higher baseline MUC5AC predicts accelerated lung function decline in COPD \u003csup\u003e[\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e]\u003c/sup\u003e. We interpret the elevated MUC5AC in the ciliary-enriched endotype as reflecting enhanced mucus mobilization and expectoration driven by preserved ciliary function and humoral immunity. Conversely, lower MUC5AC in the keratinization-dominant endotype likely results from impaired mucociliary clearance due to ongoing epithelial repair and mucus retention within the airways.\u003c/p\u003e \u003cp\u003eAn intriguing finding of our study is that GSEA revealed increased expression of cilium-related proteins but decreased immunoglobulin abundance compared with stable COPD, suggesting an imbalance between mechanical (cilia-driven) and humoral (antibody-mediated) mucosal defenses. Consistent with this, prognostic analyses showed that higher immunoglobulin levels were generally protective against future exacerbations, underscoring their critical role in airway defense. In WGCNA, the positive correlations between immunoglobulin modules and mucus components (MUC5AC and MUC5B) indicate that humoral responses during exacerbation are closely associated with mucus hypersecretion and microbial stimulation. Although reductions in total IgG and specific IgG subclasses are established risk factors for COPD exacerbations and hospitalizations \u003csup\u003e[\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e]\u003c/sup\u003e, the contribution of IgA subclasses has received less attention. In our study, the immunoglobulin module was strongly linked to IGHA1 and IGHA2, implying that airway IgA may facilitate pathogen trapping and clearance within mucus during AECOPD, this role is analogous to that of secretory IgA in the intestinal mucosa \u003csup\u003e[\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e]\u003c/sup\u003e. Collectively, these findings indicate that a major airway defense deficit in COPD lies in IgA, instead of other isotypes.\u003c/p\u003e \u003cp\u003eUsing WGCNA and consensus clustering, we identified two major molecular endotypes. The ciliary-enriched endotype was characterized by active ciliogenesis and elevated MUC5AC, whereas the keratinization-dominant endotype featured epithelial repair programs, including keratinocyte differentiation and increased expression of KRT5, KRT14, and KRT15 \u003csup\u003e[\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e, \u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e]\u003c/sup\u003e. Integration with quantitative chest CT radiomics further delineated these endotypes. The keratinized epithelial module was positively associated with ground-glass opacity (GGO) volume, consolidation burden, and parenchymal heterogeneity, consistent with greater inflammatory infiltration and tissue remodeling. In contrast, both the ciliary and immunoglobulin modules showed inverse correlations with GGO volume, consolidation, and parenchymal heterogeneity, corresponding to more homogeneous lung attenuation and better-preserved lung function. These radiomic associations suggest that the ciliary-enriched and immunoglobulin-intact profile corresponds to a state with fewer inflammatory foci and relatively preserved airway homeostasis, whereas the keratinization-dominant profile reflects more severe ongoing inflammation and structural disorganization.\u003c/p\u003e \u003cp\u003eCollectively, our results demonstrate that AECOPD is molecularly heterogeneous. One subset of patients exhibits ciliary activation that supports mucus expectoration, while another is dominated by epithelial repair responses accompanied by greater imaging-evident inflammatory burden. Superimposed on this heterogeneity is a broad defect in humoral immunity, particularly involving secretory IgA, which carries independent prognostic significance.\u003c/p\u003e \u003cp\u003eThis study provides new insight into the pathogenesis of AECOPD and demonstrates the value of integrating sputum proteomics with radiomics for defining biologically and clinically meaningful endotypes. Larger multicenter cohorts and functional studies are warranted to validate these findings and translate them into precision therapeutic strategies for COPD.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003eClinical trial number: not applicable.\u003c/p\u003e\n\u003cp\u003eEthics, Consent to Participate, and Consent to Publish declarations\u003c/p\u003e\n\u003cp\u003eThis study was approved by the Ethics Committee of Shenzhen People's Hospital (reference number: KY-LL-2020294-01) and was conducted in strict accordance with the principles of the Helsinki Declaration. All participants provided informed consent, including an agreement to publication.\u003c/p\u003e\n\u003cp\u003eData Availability\u003c/p\u003e\n\u003cp\u003eThe data reported in this paper have been deposited in the OMIX, China National Center for Bioinformation/Beijing Institute of Genomics, Chinese Academy of Sciences (https://ngdc.cncb.ac.cn/omix: accession no. OMIX007953).\u003c/p\u003e\n\u003cp\u003eAcknowledgement\u003c/p\u003e\n\u003cp\u003eLW conceived the manuscript. JZ performed the analysis and wrote the manuscript. HL, FZ, YM, and SC collected the samples. LW and KY revised the manuscript.\u003c/p\u003e\n\u003cp\u003eFunding Declaration\u003c/p\u003e\n\u003cp\u003eThis work was supported by the grant from the National Key R\u0026amp;D Program of China (2022YFF0710802; 2022YFF0710800), Shenzhen Clinical Research Center for Respiratory Disease (LCYSSQ20220823091203007), Shenzhen Key Laboratory of Respiratory Diseases (SYSPG20241211173920041), National Natural Science Foundation of China (82470033; 82103941) Shenzhen Medical Academy of Research and Translation (C2302001) The Guangdong Basic and Applied Basic Research Foundation (2022A1515011966; 2024A1515012923).\u003c/p\u003e\n\u003cp\u003eConflict of Interest Statement\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eJiacheng Zhong, Zhen Wang, Hanting Li, Kai Yang, Yu Meng, Shanze Chen, and Lingwei Wang declare that they have no conflicts of interest relevant to the content of this manuscript.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eSeemungal TA, Hurst JR, Wedzicha JA. Exacerbation rate, health status and mortality in COPD\u0026ndash;a review of potential interventions. Int J Chronic Obstr Pulm Dis. 2009;4:203\u0026ndash;23.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBozinovski S, Hutchinson A, Thompson M, et al. Serum amyloid a is a biomarker of acute exacerbations of chronic obstructive pulmonary disease. Am J Respir Crit Care Med. 2008;177:269\u0026ndash;78.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKim SH, Ahn HS, Park JS, et al. A Proteomics-Based Analysis of Blood Biomarkers for the Diagnosis of COPD Acute Exacerbation. Int J Chronic Obstr Pulm Dis. 2021;16:1497\u0026ndash;508.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMayhew D, Devos N, Lambert C, et al. Longitudinal profiling of the lung microbiome in the AERIS study demonstrates repeatability of bacterial and eosinophilic COPD exacerbations. Thorax. 2018;73:422\u0026ndash;30.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAaron SD. Reaching for the Holy Grail of Chronic Obstructive Pulmonary Disease Outcomes. Can Medications Modify Lung Function Decline? Am J Respir Crit Care Med. 2018;197:2\u0026ndash;4.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGuerra B, Gaveikaite V, Bianchi C et al. Prediction models for exacerbations in patients with COPD. \u003cem\u003eEuropean respiratory review: an official journal of the European Respiratory Society.\u003c/em\u003e 2017, \u003cem\u003e26\u003c/em\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKeene JD, Jacobson S, Kechris K, et al. Biomarkers Predictive of Exacerbations in the SPIROMICS and COPDGene Cohorts. Am J Respir Crit Care Med. 2017;195:473\u0026ndash;81.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWaeijen-Smit K, DiGiandomenico A, Bonnell J, et al. Early diagnostic BioMARKers in exacerbations of chronic obstructive pulmonary disease: protocol of the exploratory, prospective, longitudinal, single-centre, observational MARKED study. BMJ open. 2023;13:e068787.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAnthonisen NR, Manfreda J, Warren CP, et al. Antibiotic therapy in exacerbations of chronic obstructive pulmonary disease. Ann Intern Med. 1987;106:196\u0026ndash;204.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDiedre S, Carmo JAR, Alejandro P, Comellas JM, Reinhardt SE, Gerard. Let\u0026iacute;cia Rittner, Roberto A. Lotufo MEDPSeg: hierarchical polymorphic multitask learning for the segmentation of ground-glass opacities, consolidation, and pulmonary structures on computed tomography. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.48550/arXiv.2312.02365\u003c/span\u003e\u003cspan address=\"10.48550/arXiv.2312.02365\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eVo\u0026szlig; H, Schlumbohm S, Barwikowski P, et al. HarmonizR enables data harmonization across independent proteomic datasets with appropriate handling of missing values. Nat Commun. 2022;13:3523.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWu T, Hu E, Xu S et al. clusterProfiler 4.0: A universal enrichment tool for interpreting omics data. \u003cem\u003eInnovation (Cambridge (Mass)).\u003c/em\u003e 2021, \u003cem\u003e2\u003c/em\u003e, 100141.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWilkerson MD, Hayes DN. ConsensusClusterPlus: a class discovery tool with confidence assessments and item tracking. Bioinf (Oxford England). 2010;26:1572\u0026ndash;3.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLangfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics. 2008;9:559.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBakshani CR, Morales-Garcia AL, Althaus M, et al. Evolutionary conservation of the antimicrobial function of mucus: a first defence against infection. NPJ biofilms microbiomes. 2018;4:14.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHall EA, Kumar D, Prosser SL et al. Centriolar satellites expedite mother centriole remodeling to promote ciliogenesis. \u003cem\u003eeLife.\u003c/em\u003e 2023, \u003cem\u003e12\u003c/em\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLi ZZ, Zhao WL, Wang GS et al. The novel testicular enrichment protein Cfap58 is required for Notch-associated ciliogenesis. \u003cem\u003eBiosci Rep.\u003c/em\u003e 2020, \u003cem\u003e40\u003c/em\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang Q, Yang S, Chen X, et al. Identification of novel TMEM231 gene splice variants and pathological findings in a fetus with Meckel Syndrome. Front Genet. 2023;14:1252873.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAncel J, Belgacemi R, Diabasana Z et al. Impaired Ciliary Beat Frequency and Ciliogenesis Alteration during Airway Epithelial Cell Differentiation in COPD. \u003cem\u003eDiagnostics (Basel, Switzerland).\u003c/em\u003e 2021, \u003cem\u003e11\u003c/em\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSong D, Iverson E, Kaler L, et al. MUC5B mobilizes and MUC5AC spatially aligns mucociliary transport on human airway epithelium. Sci Adv. 2022;8:eabq5049.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKesimer M, Ford AA, Ceppe A, et al. Airway Mucin Concentration as a Marker of Chronic Bronchitis. N Engl J Med. 2017;377:911\u0026ndash;22.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRoy MG, Livraghi-Butrico A, Fletcher AA, et al. Muc5b is required for airway defence. Nature. 2014;505:412\u0026ndash;6.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCarpenter J, Wang Y, Gupta R et al. Assembly and organization of the N-terminal region of mucin MUC5AC: Indications for structural and functional distinction from MUC5B. \u003cem\u003eProceedings of the National Academy of Sciences of the United States of America.\u003c/em\u003e 2021, \u003cem\u003e118\u003c/em\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRaclawska DS, Ttofali F, Fletcher AA, et al. Mucins and Their Sugars. Critical Mediators of Hyperreactivity and Inflammation. Annals Am Thorac Soc. 2016;13(Suppl 1):S98\u0026ndash;99.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRadicioni G, Ceppe A, Ford AA, et al. Airway mucin MUC5AC and MUC5B concentrations and the initiation and progression of chronic obstructive pulmonary disease: an analysis of the SPIROMICS cohort. Lancet Respiratory Med. 2021;9:1241\u0026ndash;54.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLeitao Filho FS, Ra SW, Mattman A, et al. Serum IgG subclass levels and risk of exacerbations and hospitalizations in patients with COPD. Respir Res. 2018;19:30.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMantis NJ, Rol N, Corth\u0026eacute;sy B. Secretory IgA's complex roles in immunity and mucosal homeostasis in the gut. Mucosal Immunol. 2011;4:603\u0026ndash;11.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eIevlev V, Lynch TJ, Freischlag KW et al. Krt14 and Krt15 differentially regulate regenerative properties and differentiation potential of airway basal cells. \u003cem\u003eJCI insight.\u003c/em\u003e 2023, \u003cem\u003e8\u003c/em\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhu W, Han L, Wu Y, et al. Keratin 15 protects against cigarette smoke-induced epithelial mesenchymal transformation by MMP-9. Respir Res. 2023;24:297.\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"bmc-pulmonary-medicine","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"pulm","sideBox":"Learn more about [BMC Pulmonary Medicine](http://bmcpulmmed.biomedcentral.com/)","snPcode":"","submissionUrl":"https://www.editorialmanager.com/pulm/default.aspx","title":"BMC Pulmonary Medicine","twitterHandle":"BMC_series","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"em","reportingPortfolio":"BMC Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"AECOPD, ciliary proteins, immunoglobulins, keratins, MUC5AC","lastPublishedDoi":"10.21203/rs.3.rs-9347057/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-9347057/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003e\u003cstrong\u003eBackground:\u003c/strong\u003e Acute exacerbations of COPD (AECOPD) involve poorly characterized airway host responses. Integrating sputum proteomics with radiomics may reveal underlying molecular heterogeneity and identify clinically relevant biomarkers.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eMethods:\u003c/strong\u003e We performed deep proteomic profiling on induced sputum from patients with AECOPD (n=35), stable COPD (n=19), and healthy controls (n=15). Weighted gene co-expression network analysis (WGCNA) and consensus clustering were used to define molecular endotypes. Proteomic modules were correlated with quantitative chest CT radiomic features including ground-glass opacity (GGO), consolidation, and parenchymal heterogeneity.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eResults: \u003c/strong\u003eWe identified two major endotypes in AECOPD. The ciliary-enriched endotype showed coordinated upregulation of ciliary proteins (e.g., PCM1, CFAP58, TMEM231) and was associated with elevated MUC5AC. In contrast, the keratinization-dominant endotype was characterized by epithelial repair programs. Radiomic analysis revealed that the keratinization-dominant module positively correlated with GGO volume, consolidation burden, and parenchymal heterogeneity, while the ciliary module was linked to more homogeneous lung parenchyma. Across the AECOPD cohort, mucosal immunoglobulins, particularly IgA-related proteins, were markedly depleted and served as strong independent predictors of protection against future severe exacerbations, outperforming traditional clinical indicators.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConclusion:\u003c/strong\u003e AECOPD is molecularly heterogeneous, featuring a ciliary-enriched endotype with paradoxical ciliary activation and a keratinization-dominant endotype associated with greater inflammatory burden on imaging. Baseline immunoglobulin levels provide powerful prognostic information. These findings offer a new framework for endotype-driven precision management in COPD.\u003c/p\u003e","manuscriptTitle":"Ciliary-Enriched and Keratinization-Dominant Endotypes Drive Molecular Heterogeneity in AECOPD","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2026-04-20 17:01:38","doi":"10.21203/rs.3.rs-9347057/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"reviewersInvited","content":"","date":"2026-04-13T09:37:07+00:00","index":"","fulltext":""},{"type":"editorInvited","content":"","date":"2026-04-10T07:49:35+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2026-04-10T05:37:34+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2026-04-10T05:36:35+00:00","index":"","fulltext":""},{"type":"submitted","content":"BMC Pulmonary Medicine","date":"2026-04-07T15:06:21+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"bmc-pulmonary-medicine","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"pulm","sideBox":"Learn more about [BMC Pulmonary Medicine](http://bmcpulmmed.biomedcentral.com/)","snPcode":"","submissionUrl":"https://www.editorialmanager.com/pulm/default.aspx","title":"BMC Pulmonary Medicine","twitterHandle":"BMC_series","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"em","reportingPortfolio":"BMC Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"9f27c44d-4d8a-48b8-9725-1c61cba0fd96","owner":[],"postedDate":"April 20th, 2026","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"under-review","subjectAreas":[],"tags":[],"updatedAt":"2026-04-20T17:01:38+00:00","versionOfRecord":[],"versionCreatedAt":"2026-04-20 17:01:38","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-9347057","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-9347057","identity":"rs-9347057","version":["v1"]},"buildId":"XKTyCvWXoU3ODBz1xrDgd","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}

Text is read by the "Ask this paper" AI Q&A widget below. Extraction quality varies by source — PMC NXML preserves structure cleanly, OA-HTML may include some navigation residue, and OA-PDF can have broken hyphenation. The publisher copy (via DOI) is the canonical version.

My notes (saved in your browser only)

Ask this paper AI returns verbatim quotes from the full text · source: preprint-html

Answers must be backed by verbatim quotes from this paper's full text. Hallucinated quotes are dropped automatically; if no verbatim passage answers the question, we say so. How this works

Citation neighborhood (no data yet)

We don't have any in-corpus citations linked to this paper yet. This is a recent paper (2026) — citers typically take a year or two to land, and the OpenAlex reference graph may still be filling in.

Source provenance

europepmc
last seen: 2026-05-20T01:45:00.602351+00:00