Identification and validation of programmed cell death-related biomarkers in severe pneumonia: A diagnostic observational study using bioinformatics analysis.

Zeng G; Chen J; Zhang C; Tu S; Fan X; Huang J; Cai L

doi:10.1097/md.0000000000046923

Medicine · 2026 · doi:10.1097/md.0000000000046923 · PMID:41496038 · PMC12778093

OA: gold CC-BY-NC-4.0


⚙ AI-generated summary by claude@2026-08, 2026-08-01 ⓘ

This study identified and validated BCL2, CDKN2D, DYRK2, and S100A8 as programmed cell death-related biomarkers for severe pneumonia using bioinformatics and RT-qPCR, revealing their involvement in immune responses and cellular processes.

One-sentence paraphrase of the abstract; not a substitute for reading it. No clinical advice.

⚙ AI-generated deep summary by claude@2026-06, 2026-06-17 · read from full text ⓘ

This diagnostic observational study used bioinformatics to identify and validate programmed cell death (PCD)-related biomarkers in blood from severe pneumonia patients, leveraging transcriptome datasets GSE65682 (760 cases vs 42 controls) and GSE171110 (44 cases vs 10 controls), with PCD-related genes compiled from the literature. Differential expression analysis, weighted gene co-expression network analysis (WGCNA), and pathway/functional enrichment (GO/KEGG) were used to derive candidate genes, then LASSO and SVM-RFE machine-learning approaches selected biomarker candidates, which were evaluated by Wilcoxon tests and ROC curves (AUC > 0.7, consistent direction across datasets). Functional similarity ranking, Spearman correlations, and GSEA further characterized associated signaling and biological processes, and immune cell proportions were estimated with CIBERSORT and correlated with biomarker expression. A limitation is that validation appears to rely primarily on statistical replication across datasets plus a small RT-qPCR sample set (5 severe pneumonia and 5 controls), rather than broader independent clinical cohorts or mechanistic experiments. This paper does not explicitly discuss endometriosis or adenomyosis; it was included in the corpus via a keyword match in the upstream search index.

Read from the paper's body, not the abstract. Not a substitute for reading the paper. No clinical advice.

Abstract

Severe pneumonia is the leading cause of mortality in individuals with lung disease. Programmed cell death (PCD) is implicated in various pathologies; however, the precise relationship between severe pneumonia and PCD remains unclear. This study aimed to identify PCD-related biomarkers in severe pneumonia. Gene expression data were sourced from publicly available databases. Biomarkers were identified through weighted gene co-expression network analysis (WGCNA), machine learning, expression validation, and receiver operating characteristic analyses. Potential mechanisms underlying these biomarkers in severe pneumonia were explored using functional similarity analysis, enrichment analysis, and immune microenvironment analysis. Expression levels of the biomarkers were further validated by reverse transcription-quantitative polymerase chain reaction. Four biomarkers - BCL2, CDKN2D, DYRK2, and S100A8 - were identified. Functional similarity analysis highlighted strong functional parallels among these biomarkers. Notably, these 4 biomarkers were involved in key processes such as graft-versus-host disease, complement and coagulation cascades, ribosome activity, and spliceosome function. Immune microenvironment analysis revealed 12 differential immune cell types with significant negative correlations, with neutrophils exhibiting the strongest inverse correlation with monocytes. In contrast, M1 macrophages showed the strongest positive correlation with M0 macrophages. Reverse transcription-quantitative polymerase chain reaction validation demonstrated that CDKN2D and S100A8 were significantly upregulated, whereas DYRK2 was markedly downregulated in severe pneumonia. This study identified and validated 4 biomarkers associated with severe pneumonia, offering critical insights for the development of personalized therapeutic strategies for affected patients.


Section 5

This study identified 4 PCD-related biomarkers (CDKN2D, S100A8, BCL2, and DYRK2) in severe pneumonia and uncovered their potential mechanisms, offering a new theoretical foundation and direction for the early diagnosis and treatment of severe pneumonia.

Intro

Severe pneumonia is a critical lung infection characterized by inflammation and fluid accumulation in the alveoli, posing a significant risk of respiratory failure and death. [ 1 ] A prominent example is COVID-19-related pneumonia, triggered by the novel SARS-CoV-2 coronavirus, which has severely impacted global health. [ 2 ] In addition to the acute lung damage caused by severe pneumonia, patients may face prolonged pulmonary complications that persist even after recovery. [ 3 ] Several factors contribute to the severity of pneumonia, including male sex, advanced age, smoking history, obesity, and comorbidities such as hypertension, diabetes, and chronic respiratory conditions. [ 4 ] Following severe pneumonia, pulmonary immune homeostasis may be disrupted, and many patients experience neurological syndromes post-recovery, significantly diminishing their quality of life. [ 5 ] Despite advances in research, the molecular mechanisms underlying severe pneumonia remain incompletely understood, [ 6 ] necessitating further investigation. A comprehensive understanding of these mechanisms, along with the identification of novel biomarkers, is crucial for advancing our knowledge of the disease’s pathogenesis and developing targeted therapeutic strategies. Cell death is a fundamental biological process (BP) that occurs throughout growth, development, aging, disease, and other life phenomena. [ 1 ] Programmed cell death (PCD), also referred to as regulated cell death, is an essential mechanism in organismal development. Unlike accidental cell death, PCD is a controlled process that plays a pivotal role in maintaining tissue homeostasis and eliminating damaged or unnecessary cells. [ 7 , 8 ] PCD can occur via various mechanisms, categorized based on triggering stress, morphological features, regulatory signaling pathways, and effector molecules. 
These include apoptosis, anoikis, autophagy, cell death associated with alkali and copper deposition, immunogenic cell death, iron deposition, lysosome-dependent cell death, macrovesicular death, necroptosis, neutrophil death, oxygen deposition, pyroptosis, PARP-1-mediated apoptosis, and latent death. [ 8 , 9 ] Increasing evidence highlights the significant role of PCD in severe pneumonia. Studies indicate that, in addition to initiating inflammation and immune responses, many viral infections induce PCD in infected cells. [ 10 ] COVID-19 exacerbates disease severity by activating various cell death pathways, such as apoptosis, necroptosis, and pyroptosis. The activation of these pathways may contribute to the inflammatory response and tissue damage triggered by the virus, leading to the severe clinical manifestations of the disease. Consequently, programmed cell death-1 (PD-1) and its ligand (PD-L1) have been proposed as prognostic markers for COVID-19 severity. Further research has shown elevated copper levels in the lung tissue of rats with chronic obstructive pulmonary disease, suggesting its involvement in copper deposition. Studies also indicate significant changes in the immune landscape of mice recovering from severe pneumonia. [ 1 ] Therefore, targeting PCD may offer novel therapeutic approaches for severe pneumonia. However, the role of PCD-related genes (PCD-RGs) in severe pneumonia remains underexplored, and their specific mechanisms are not yet fully understood. Further research is needed to elucidate the involvement of these genes in severe pneumonia. In conclusion, this study aims to identify key PCD-related biomarkers using bioinformatics, with the goal of exploring their roles in severe pneumonia. The findings are expected to provide valuable insights for clinical treatment research, ultimately improving patient outcomes.

Author

Formal analysis: Jiawei Chen, Chuanlin Zhang, Siyi Tu, Xianghui Fan, Jin Huang, Lirong Cai. Investigation: Jiawei Chen, Chuanlin Zhang, Siyi Tu, Xianghui Fan, Jin Huang, Lirong Cai. Writing – original draft: Guozhi Zeng.

Methods

Raw transcriptome data for GSE65682 and GSE171110 were retrieved from the Gene Expression Omnibus (GEO) database. The GSE65682 dataset (platform GPL13667 ) comprised blood samples from 42 healthy individuals (normal group) and 760 cases of severe pneumonia (disease group). The GSE171110 dataset (platform GPL16791 ) included blood samples from 10 healthy individuals (normal group) and 44 cases of severe pneumonia (disease group). Detailed clinical information is provided in Table S1, Supplemental Digital Content, https://links.lww.com/MD/R73 . Additionally, PCD-RGs were compiled from the literature. After removing duplicates, 1548 unique PCD-RGs remained (Table S1, Supplemental Digital Content, https://links.lww.com/MD/R73 ). To assess whether the GSE65682 dataset had sufficient statistical power and minimize the risk of type II errors, a power analysis was conducted at a significance level of α = 0.05. The “limma” package (v 3.54.0) was then used to identify DEGs between the disease and normal groups. The selection criteria were set as |log 2 fold change (FC)| > 0.5 and adjusted P -value ( P .adj) < .05. The “ggplot2” (v 3.3.6) and “pheatmap” (v 1.0.12) packages were applied to visualize the DEGs in volcano plots and heatmaps, respectively. The top 10 up- and down-regulated genes were labeled based on log 2 FC rankings. For the GSE65682 dataset, using severe pneumonia as the phenotype, the “WGCNA” package (v 1.71) was utilized to perform WGCNA to identify genes associated with severe pneumonia. Hierarchical clustering was performed on all samples, and outlier samples were identified and removed. Before constructing the co-expression network, the optimal soft threshold (power) was determined by selecting a threshold that achieved a scale-free fit index ( R 2 ) close to 0.85, with the mean connectivity approaching 0. 
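The differential-expression filter described above (|log2 FC| > 0.5 and adjusted P < .05) can be sketched in a few lines. This is a minimal Python illustration, not the authors' limma workflow: the Benjamini-Hochberg adjustment is assumed here (the text does not name the adjustment method), and the gene names, fold changes, and P-values are hypothetical toy values.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, log2fc, pvalues, fc_cut=0.5, alpha=0.05):
    """Split genes into up- and down-regulated sets using both thresholds."""
    padj = benjamini_hochberg(pvalues)
    up, down = [], []
    for gene, fc, q in zip(genes, log2fc, padj):
        if q < alpha and fc > fc_cut:
            up.append(gene)
        elif q < alpha and fc < -fc_cut:
            down.append(gene)
    return up, down


# Hypothetical toy input (not data from the study).
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
log2fc = [1.2, -0.9, 0.3, 0.8]
pvalues = [0.001, 0.004, 0.0005, 0.2]

up, down = select_degs(genes, log2fc, pvalues)
print("up:", up, "down:", down)
```

In the toy input, GENE_C is significant but fails the fold-change cutoff, and GENE_D passes the fold-change cutoff but is not significant, so neither is retained. Both conditions must hold at once.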
A dynamic hybrid cutting method was then employed to construct a cluster dendrogram of different gene modules, setting the minimum number of genes per module to 100. The correlation between severe pneumonia and gene modules was assessed using the Pearson function in WGCNA, and correlation heatmaps were plotted. Modules most strongly correlated (either positively or negatively) with the phenotype ( P < .05) were defined as key modules, and genes within these key modules were identified as WGCNA-genes for further analysis. The DEGs, PCD-RGs, and WGCNA-genes were intersected using “ggvenn” (v 0.1.9) to identify candidate genes. Gene Ontology (GO) functional annotation, covering BPs, molecular functions (MFs), and cellular components (CCs), as well as Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, were performed using “clusterProfiler” (v 4.6.2; P < .05). The top 5 functions in BPs, MFs, and CCs, along with the top 10 KEGG pathways, were displayed in order of significance based on −log10(p) ranking. To investigate the interactions among candidate genes, a protein–protein interaction network was constructed using the Search Tool for Retrieval of Interacting Genes (STRING, https://cn.string-db.org/ ) database (confidence = 0.4), and the network was visualized using Cytoscape (v 3.7.2). Machine learning models, including least absolute shrinkage and selection operator (LASSO) and support vector machines-recursive feature elimination (SVM-RFE), were applied to identify candidate biomarkers from the aforementioned genes. In the GSE65682 dataset, “glmnet” (v 4.1.4) was used to perform LASSO on candidate genes, selecting the penalty parameter (λ) that minimized the bias in likelihood estimation. The LASSO signature genes were determined when λ was at its minimum. SVM-RFE was performed using the “caret” package (v 6.0-93). Redundant variables were removed by eliminating the feature vectors generated by the SVM, followed by 10-fold cross-validation. 
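The Venn-style step described above, in which DEGs, PCD-RGs, and WGCNA-genes are intersected to obtain candidate genes, amounts to a plain set intersection. The same operation applies to any pair of feature lists, such as the outputs of two selection methods. The sketch below is illustrative Python rather than the "ggvenn" call used in the study, and all gene identifiers in it are hypothetical placeholders.

```python
def intersect_gene_sets(*gene_lists):
    """Return the sorted genes that appear in every input list."""
    if not gene_lists:
        return []
    common = set(gene_lists[0])
    for genes in gene_lists[1:]:
        common &= set(genes)
    return sorted(common)


# Hypothetical placeholder identifiers (not genes from the study).
degs = ["G1", "G2", "G3", "G4", "G5"]
pcd_rgs = ["G2", "G3", "G4", "G9"]
wgcna_genes = ["G3", "G4", "G2", "G7"]

candidate_genes = intersect_gene_sets(degs, pcd_rgs, wgcna_genes)

# The same helper works for two feature-selection outputs.
lasso_genes = ["G2", "G3"]
svm_rfe_genes = ["G3", "G4"]
candidate_biomarkers = intersect_gene_sets(lasso_genes, svm_rfe_genes)

print(candidate_genes, candidate_biomarkers)
```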
The SVM signature genes were identified when the model error rate was at its lowest. The LASSO and SVM signature genes were then intersected using “ggvenn,” and the intersected genes were recorded as candidate biomarkers. In both the GSE65682 and GSE171110 datasets, the expression levels of candidate biomarkers in the disease and normal groups were compared using the Wilcoxon test (P < .05) via “ggpubr” (v 0.6.0) and visualized with “ggplot2.” Additionally, receiver operating characteristic (ROC) curves for candidate biomarkers were generated using “pROC” (v 1.18.0) in both datasets. Candidate biomarkers with an area under the curve (AUC) > 0.7 and significant inter-group differences (P < .05) with consistent trends in both datasets were retained as biomarkers. The “pacman” (v 0.5.0) package was used to calculate the geometric mean of functional similarities between the proteins corresponding to the biomarkers, to assess the functional similarity among them. Biomarkers were ranked according to their average functional similarity. To further explore the altered BPs associated with the biomarkers in GSE65682, Spearman correlation analysis was performed between each biomarker and all other genes using the “psych” package (v 2.2.9). For each biomarker, the genes were then ranked by correlation coefficient. The “c2.cp.kegg.v2023.1.Hs.symbols” reference gene set was exported from the Molecular Signatures Database. GSEA was then performed using “GSEABase” (v 1.60.0) with a significance threshold of P 1. The top 10 signaling pathways were visualized using “GseaVis” (v 0.0.8) based on P-value ranking. In the GSE65682 dataset, the proportions of 22 immune cell types in the disease and normal groups were estimated using the CIBERSORT algorithm (P < .05). The Wilcoxon test was then employed to evaluate differences in the abundance of immune cells between the 2 groups, using the “ggpubr” package (P < .05).
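The Wilcoxon rank-sum comparisons and the ROC criterion (AUC > 0.7) used in this section rest on the same pairwise ranking: the AUC equals the Mann-Whitney U statistic divided by the number of case-control pairs. The following Python sketch illustrates that relationship on hypothetical expression values. It is not the "pROC" implementation, and scoring a down-regulated marker as max(AUC, 1 − AUC) is an assumption made here so that markers in either direction can pass the filter.

```python
def auc_from_scores(case_scores, control_scores):
    """AUC = P(case > control) + 0.5 * P(tie) = U / (n_case * n_control)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def passes_auc_filter(case_scores, control_scores, cutoff=0.7):
    """Apply the AUC cutoff, treating up- and down-regulation symmetrically."""
    auc = auc_from_scores(case_scores, control_scores)
    # Assumption: a marker that is lower in cases is scored as 1 - AUC.
    return max(auc, 1.0 - auc) > cutoff


# Hypothetical expression values (not data from the study).
cases = [5.1, 6.3, 7.0, 4.8]
controls = [4.0, 4.9, 3.5]

auc = auc_from_scores(cases, controls)
print(round(auc, 3), passes_auc_filter(cases, controls))
```

Here 11 of the 12 case-control pairs are ordered correctly, so the AUC is 11/12. Swapping the two groups gives 1/12, which passes the same filter once direction is ignored.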
The results were visualized as box plots generated with the “ggplot2” package (P < .05). Additionally, Spearman’s correlation analysis was performed using the “psych” package to assess associations among the differential immune cells and between the differential immune cells and the biomarkers. The correlation heatmap was generated using the “corrplot” package (v 0.92), with a threshold set at |correlation coefficient (cor)| > 0.3 and P < .05.

RNA was isolated with TRIzol reagent (Ambion, Austin) from blood samples of 5 patients with severe pneumonia and 5 healthy controls. The samples were collected at Fujian University of Traditional Chinese Medicine Affiliated Second People’s Hospital, and the study was approved by the Ethics Committee of the 900th Hospital of PLA Joint Logistic Support Force (2025-066). All patients provided informed consent. After collection, RNA integrity was confirmed by electrophoresis, and purity was assessed using a NanoPhotometer N50. The isolated RNA was subsequently used for cDNA synthesis with the SureScript first-strand cDNA synthesis kit (Servicebio, Wuhan, China). Reverse transcription-quantitative polymerase chain reaction (RT-qPCR) was performed using universal blue SYBR Green qPCR Master Mix (Servicebio, China) on a CFX Connect RT-qPCR detection system (BIO-RAD, Hercules). Primers for the biomarkers and the internal reference gene (GAPDH) are listed in Table S2, Supplemental Digital Content, https://links.lww.com/MD/R74, while the reaction system and RT-qPCR program are detailed in Tables S3–S6, Supplemental Digital Content, https://links.lww.com/MD/R74. Relative expression levels were determined using the 2^−ΔΔCT method. The t-test (P < .05) was applied to evaluate group differences, and data visualization was conducted using GraphPad Prism 5 software (v 8.0; San Diego).
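As a worked illustration of the 2^−ΔΔCT calculation named above: ΔCT is the target-gene CT minus the reference-gene (GAPDH) CT for each sample, ΔΔCT is that value minus the mean ΔCT of the control group, and the relative expression is 2 raised to −ΔΔCT. The Python sketch below uses hypothetical CT values. The choice of the control-group mean as calibrator is the common convention and is assumed here, since the text does not specify it.

```python
from statistics import mean


def delta_ct(ct_target, ct_reference):
    """Per-sample delta CT: target gene CT minus reference gene CT."""
    return [t - r for t, r in zip(ct_target, ct_reference)]


def relative_expression(sample_dct, calibrator_dct_mean):
    """Per-sample 2^-(delta-delta CT) relative to the calibrator mean."""
    return [2 ** -(d - calibrator_dct_mean) for d in sample_dct]


# Hypothetical CT values (not measurements from the study).
control_target = [25.0, 25.5, 24.5]
control_gapdh = [18.0, 18.5, 17.5]
disease_target = [23.0, 23.5]
disease_gapdh = [18.0, 18.5]

control_dct = delta_ct(control_target, control_gapdh)
disease_dct = delta_ct(disease_target, disease_gapdh)
calibrator = mean(control_dct)  # assumed calibrator: control-group mean

control_fold = relative_expression(control_dct, calibrator)
disease_fold = relative_expression(disease_dct, calibrator)
print(control_fold, disease_fold)
```

In this toy input the disease samples reach threshold two cycles earlier than controls after GAPDH normalization, which corresponds to a fourfold higher relative expression.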
Ethical approval for this study was granted by the Ethics Committee of the Second People’s Hospital Affiliated to Fujian University of Traditional Chinese Medicine. All data analyses were performed using R software (v 4.2.2; Vienna, Austria ). The Wilcoxon test or t -test was used for group comparisons, with a significance threshold set at P < .05.
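The power analysis mentioned in this section can be approximated by hand. The sketch below uses a two-sided, two-sample normal approximation, which is an assumption (the text does not state which test or software the authors used), together with the group sizes given above (760 cases, 42 controls) and the effect size reported in the Results (Cohen's d = 1.1927).

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_sample(d, n1, n2, alpha=0.05):
    """Normal-approximation power of a two-sided two-sample comparison."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    # Noncentrality: effect size scaled by the effective sample size.
    ncp = d * sqrt(n1 * n2 / (n1 + n2))
    return norm.cdf(ncp - z_crit) + norm.cdf(-ncp - z_crit)


# Group sizes and effect size as reported for GSE65682.
power = approx_power_two_sample(d=1.1927, n1=760, n2=42)
print(f"approximate power: {power:.6f}")
```

Under this approximation the noncentrality is roughly 7.5, far above the critical value of about 1.96, which is consistent with the reported power approaching 1.0. With d = 0 the function returns alpha, as expected for a test with no true effect.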

Results

The power analysis revealed that at α = 0.05, the effect size index (Cohen’s d ) reached 1.1927, indicating a large practical effect size between the disease and normal groups. The statistical power approached 1.0, confirming sufficient power to detect true differential expression, which validated the adequacy of the sample size and effect size in GSE65682 for reliable DEG identification. A total of 1397 DEGs were identified between the disease and normal groups in GSE65682 , consisting of 466 upregulated genes (e.g., MCEMP1, ARG1, and CLEC4D) and 931 downregulated genes (e.g., LRRN3, BCL11B, and FCER1A; Fig. 1 A, B). WGCNA was subsequently applied to identify genes associated with severe pneumonia. After confirming no outlier samples, a soft threshold (power) of 8 ( R 2 = 0.85) was selected for network topology analysis (Fig. 1 C, D). Using the dynamic hybrid cutting method, 11 modules were identified (including the gray module; Fig. 1 E). The MEred module (cor = 0.42, P < .05) and MEyellow module (cor = −0.55, P < .05) showed the strongest positive and negative correlations with severe pneumonia, respectively, and were selected as key modules (Fig. 1 F). From these, 1900 WGCNA-genes were obtained. Identification of DEGs and WGCNA-genes. (A) Volcano plot of DEGs between disease and normal groups in the GSE65682 dataset. The x -axis represents log 2 fold change, and the y -axis represents −log 10 P -value. Blue dots represent downregulated genes (down), red dots represent upregulated genes (up), and gray dots represent genes with no significant difference (not). (B) Heatmap of DEGs. The upper part shows the density distribution of gene expression levels, with a color-coded scale on the right indicating density values. The lower part displays the clustered heatmap of DEGs, where rows represent genes and columns represent samples. (C) Sample clustering tree. 
The upper part shows the sample clustering results, where the branches represent samples and the y -axis represents the height of hierarchical clustering. The lower part displays the traits corresponding to the branches, with red indicating the grouped samples. (D) Identification of soft threshold. The x -axes of both plots represent the weight parameter (power value). In the left plot, the y -axis shows the scale-free fit index (signed R 2 ), where a higher R 2 indicates a better approximation to a scale-free distribution. In the right plot, the y -axis represents the mean adjacency function value for all genes in the corresponding gene module. (E) Cluster dendrogram. Different colors represent different modules, with gray indicating genes that cannot be classified into any module. (F) Relationships between modules and traits (disease and normal samples). DEG = differentially expressed gene, WGCNA = weighted gene co-expression network analysis. Next, 1548 PCD-RGs, 1397 DEGs, and 1900 WGCNA-genes were intersected, yielding 80 candidate genes (Fig. 2 A). These 80 candidate genes were enriched in 957 GO pathways, including 828 BPs, such as regulation of apoptotic signaling pathway, 57 CCs, such as immunological synapse, and 72 MFs, such as protease binding (Fig. 2 B and Table S7, Supplemental Digital Content, https://links.lww.com/MD/R73 ). Additionally, the candidate genes were enriched in 59 KEGG pathways, including the T cell receptor signaling pathway, NF-κB signaling pathway, and Th17 cell differentiation (Fig. 2 C and Table S8, Supplemental Digital Content, https://links.lww.com/MD/R73 ). These results suggest that the candidate genes are involved in regulating cell death, immune system activation, vesicle-mediated transport, and enzyme activity regulation. The protein–protein interaction network revealed stronger interactions between MMP9, CD8A, LCK, PRF1, CD28, GZMB, BCL2, and ITGAM compared to other proteins in the network (Fig. 2 D). 
Identification and function analysis of candidate genes. (A) Venn diagrams showing the intersection of DEGs, WGCNA-genes, and programmed cell death-related genes (PCD-RGs). (B) Gene Ontology (GO) enrichment analysis of the candidate genes. Red, blue, and gray represent biological process (BP), cellular component (CC), and molecular function (MF), respectively. The first layer shows GO functions, the second layer shows the number of up- and down-regulated genes, the third layer shows the total number of enriched genes, and the fourth layer shows pathway IDs. (C) Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis of the candidate genes. Red represents KEGG pathways. The first layer shows KEGG pathways, the second layer shows the number of upregulated and downregulated genes, the third layer shows the total number of enriched genes, and the fourth layer shows KEGG pathway IDs. (D) Protein–protein interaction (PPI) network for candidate genes. Darker colors indicate a more central position in the network, with each edge representing a protein–protein interaction relationship. BP = biological process, CC = cellular component, DEG = differentially expressed gene, KEGG = Kyoto Encyclopedia of Genes and Genomes, MF = molecular function, PCD-RG = programmed cell death-related gene, PPI = protein–protein interaction, WGCNA = weighted gene co-expression network analysis To further refine candidate biomarkers, machine learning models were applied. LASSO regression analysis with lambda.min (0.0001371) identified 11 LASSO signature genes, including S100A8, PTGDS, and PTGDR (Fig. 3 A, B). Additionally, 23 SVM-RFE signature genes were identified through the SVM-RFE model, including S100A8, PRKCQ, and VNN1 (Fig. 3 C). The results of both algorithms were intersected, resulting in the identification of 5 candidate biomarkers: S100A8, DYRK2, CX3CR1, CDKN2D, and BCL2 (Fig. 3 D). 
In both the GSE65682 and GSE171110 datasets, BCL2, CDKN2D, DYRK2, and S100A8 exhibited consistent expression trends. Specifically, CDKN2D and S100A8 were significantly upregulated in the disease group, while BCL2 and DYRK2 showed the opposite trend (P < .05; Fig. 3 E). In both datasets, the AUC values for BCL2, CDKN2D, DYRK2, and S100A8 exceeded 0.7, supporting their potential as biomarkers (Fig. 3 F). Determination of biomarkers. (A, B) Identification of 11 LASSO signature genes using the least absolute shrinkage and selection operator (LASSO) algorithm. Panel A shows the LASSO coefficient penalty plot, where the x-axis represents log λ and the y-axis represents the variable coefficients. As λ increases, the variable coefficients approach 0. When the optimal λ is reached, variables with a coefficient of 0 are eliminated. Panel B displays the λ selection plot in the LASSO model, illustrating 10-fold cross-validation for the tuning parameter. The x-axis represents log λ, and the y-axis represents the model error. (C) Identification of 23 SVM-RFE signature genes using the support vector machine recursive feature elimination (SVM-RFE) algorithm. The blue line indicates the accuracy for different numbers of features. The x-axis shows the number of genes, and the y-axis shows the corresponding accuracy. (D) Venn diagram showing the overlap of genes from the LASSO and SVM-RFE analyses, which identifies 5 candidate biomarkers. (E) Expression analysis of the candidate biomarkers in the GSE65682 and GSE171110 datasets. Red represents the disease group, and blue represents the normal group. The y-axis shows the expression levels of the corresponding genes in each group. (F) Receiver operating characteristic (ROC) curves for the identified biomarkers. The x-axis represents the false positive rate (1 − specificity), the y-axis represents the true positive rate (sensitivity), and AUC refers to the area under the ROC curve. 
AUC = area under the curve, LASSO = least absolute shrinkage and selection operator, ROC = receiver operating characteristic, SVM-RFE = support vector machine recursive feature elimination. Functional similarity analysis revealed significant functional overlap among S100A8, DYRK2, CDKN2D, and BCL2, with similarity scores exceeding 0.5 (Fig. 4 A). In the GSEA results for these biomarkers, a total of 20, 23, 18, and 23 pathways were enriched by BCL2, CDKN2D, DYRK2, and S100A8, respectively (Table S9, Supplemental Digital Content, https://links.lww.com/MD/R73 ). Pathways such as graft-versus-host disease, complement and coagulation cascades, ribosome, and spliceosome were notably associated with the biomarkers (Fig. 4 B). These associations suggest that the biomarkers may serve as indicators or predictors for diseases and conditions involving immune dysregulation, blood coagulation, protein synthesis, and RNA processing. Functional exploration of biomarkers. (A) Functional similarity analysis among the identified biomarkers. The y -axis represents the 4 biomarkers, and the x -axis represents the functional similarity score of the biomarkers. (B) Gene set enrichment analysis (GSEA) of key biomarkers, including BCL2, CDKN2D, DYRK2, and S100A8. The GSEA plot is divided into 2 parts. Part 1: The top 10 lines represent the enrichment score (ES) curves of genes. The y -axis denotes the corresponding running enrichment score (running ES), with peaks indicating the enrichment score for the gene set. The genes before the peak represent the core genes of the set. The x -axis represents each gene in the gene set, corresponding to the barcode-like vertical lines in part 2. Part 2: The barcode-like section represents “Hits,” where each vertical line corresponds to a gene in this gene set. ES = enrichment score, GSEA = gene set enrichment analysis. In GSE65682 , immune cell infiltration abundances, including M0 macrophages, were compared between disease and normal groups (Fig. 
5 A). Neutrophils exhibited the highest abundance. Wilcoxon test analysis identified 12 immune cell types (e.g., activated CD4 memory T cells) with significant differences ( P < .05; Fig. 5 B). Spearman correlation analysis revealed mostly negative correlations among the differential immune cells, with neutrophils showing the strongest negative correlation with monocytes (cor = −0.53, P < .05). Conversely, M1 macrophages exhibited the strongest positive correlation with M0 macrophages (cor = 0.52, P < .05; Fig. 5 C). However, no significant correlations were observed between the 4 biomarkers and the differential immune cells (Fig. 5 D). Immune infiltration analysis of biomarkers. (A) Stacked diagram showing the distribution of immune cell infiltration in the GSE65682 dataset. The x -axis represents samples from different groups, the y -axis represents the proportion of immune cells in the samples, and different colors indicate different types of immune cells. (B) Boxplot comparing immune cell infiltration levels between disease and normal samples. The x -axis represents immune-infiltrating cells, and the y -axis represents the abundance value of immune-infiltrating cells. **** indicates P < .0001, *** indicates P < .001, * indicates P < .05, and ns indicates no significance. (C) Correlation heatmap depicting the relationships among 12 differentially infiltrated immune cells. Red and blue represent positive and negative correlations, respectively. The darker the color, the stronger the correlation. (D) Correlation bubble diagrams illustrating the associations between 4 biomarkers and 12 differentially infiltrated immune cells. The x -axis represents the correlation coefficient, and the y -axis represents the type of immune cells. The size of the circles corresponds to the magnitude of the correlation coefficient, and the color corresponds to the P -value. 
Compared to the normal group, CDKN2D and S100A8 were significantly upregulated in severe pneumonia, while DYRK2 showed a marked downregulation ( P < .05; Fig. 6 ). These findings were consistent with the earlier expression analysis, validating the study’s accuracy. Although BCL2 expression showed a downward trend, this result was not significant, possibly due to the small sample size. Notably, the differential expression of these biomarkers underscores their potential diagnostic value in severe pneumonia. RT-qPCR validation of biomarkers (BCL2, CDKN2D, DYRK2, and S100A8). The x -axis represents groups, and the y-axis quantifies the expression level of the biomarker. * indicates P < .05, and ns indicates no significance. RT-qPCR = reverse transcription-quantitative polymerase chain reaction.

Discussion

PCD is recognized as a fundamental process in the growth and development of living organisms, with accumulating evidence indicating its critical role in severe pneumonia. This study identified 4 PCD-related biomarkers (BCL2, CDKN2D, DYRK2, and S100A8) based on RNA sequencing data through various bioinformatics analyses, exploring their potential functional mechanisms, biological effects, and immune microenvironment characteristics. These findings provide valuable insights into the PCD-related pathological mechanisms and the development of new therapeutic targets for severe pneumonia. BCL2 is a key regulator of mitochondrial apoptosis, functioning as an antiapoptotic protein that promotes cell survival. [ 11 – 13 ] In patients with severe COVID-19, BCL2 expression is upregulated, aiding cell survival by inhibiting apoptosis. However, in patients with severe SARS-CoV-2 infection, the expression of BCL2 family members in T cells decreases, while the transcription of pro-apoptotic proteins Bax and Bak increases, resulting in reduced BCL2 expression. This suggests impaired T cell functionality in patients with severe COVID-19. [ 14 – 16 ] IFNg enhances BAX and BAK signaling pathways, promoting cell death by decreasing BCL2 transcription and increasing iNOS transcription. EGCG has been shown to exert antiapoptotic effects by upregulating BCL2 expression, improving lipopolysaccharide-induced pneumonia, enhancing cell survival, and alleviating pneumonia symptoms. [ 17 , 18 ] Additionally, mitochondrial pyruvate kinase M2 isoform (PKM2) phosphorylates BCL2, interacting with it to directly inhibit apoptosis. [ 18 ] Antiapoptotic members of the BCL2 family are involved in various malignancies, often showing overexpression in tumors. [ 19 ] This suggests that inflammatory responses may alter kinase or phosphatase activity, thereby influencing BCL2 function. CDKN2D (cyclin dependent kinase inhibitor 2D) is a member of the INK4 family of cyclin dependent kinase inhibitors. 
It forms stable complexes with CDK4 or CDK6, preventing the activation of CDK kinases and thereby regulating cell growth by controlling the G1 phase of the cell cycle. According to GSEA results, the CDKN2D gene is enriched in pathways such as ribosome and spliceosome. Reports indicate that CDKN2D is predicted to be a major contributor to the primary aging characteristic gene, Alzheimer’s disease (AD). Neurons expressing CDKN2D/p19 in AD are considered a distinct cell population with a senescent phenotype. [ 20 ] Increased CDKN2D expression has also been linked to renal tubular cell senescence, resulting in renal dysfunction and injury in patients with chronic ischemia. In the context of endometriosis, reduced miR-451 in normal endometrium may promote cell proliferation and reduce apoptosis by regulating CDKN2D, contributing to the disease’s pathogenesis. [ 21 ] CDKN2D exhibits strong correlations with various immune cells (B cells, CD8+ T cells, CD4+ T cells, macrophages, neutrophils, and dendritic cells) and immune checkpoints (CTLA-4, PD1, and PD-L1; correlation coefficient > 0.3), suggesting its potential as an immunotherapy target. [ 22 ] This suggests that during the inflammatory response, CDKN2D may help preserve lung cell homeostasis by inhibiting excessive cell proliferation, thereby preventing tissue damage caused by uncontrolled cell growth. DYRK2 (dual specificity tyrosine phosphorylation regulated kinase 2) is a member of the dual specificity tyrosine-regulated kinase (DYRK) family, which includes 5 members: DYRK1A, DYRK1B, DYRK2, DYRK3, and DYRK4. DYRK2 has been implicated in tumorigenesis and progression, emerging as a potential target for pharmacological intervention. [ 23 ] According to GSEA results, DYRK2 is enriched in pathways such as the tricarboxylic acid (TCA) cycle and others. 
Studies have demonstrated that DYRK2 plays a role in regulating neuroinflammation, and its upregulation or abnormal activity may exacerbate the inflammatory response in the nervous system. This has been linked to various neurodegenerative conditions. [24] In epilepsy, increased DYRK2 expression may contribute to nerve cell survival and pathological changes, while in AD, DYRK2 may affect neuronal health and function, influencing disease progression. Furthermore, DYRK2 is involved in signaling pathways associated with Parkinson’s disease pathology, where its expression or activity may influence neuronal survival and recovery following injury. DYRK2 has been identified as a tumor suppressor in multiple cancer types, including breast, colon, liver, ovarian, brain, and lung cancers. It initiates critical antitumor and pro-apoptotic pathways, and lower DYRK2 expression levels are correlated with poor patient prognosis. DYRK2 regulates proteins that either inhibit or promote the development and survival of specific cancers. [25] In xenograft models, DYRK2 overexpression has been shown to inhibit liver cancer cell growth by inducing apoptosis and suppressing cell proliferation, suggesting that stabilization, maintenance, or forced expression of DYRK2 could have potential for novel anticancer gene therapies. [26] Additionally, DYRK2 has been identified as an actionable checkpoint in a key leukemia network, where it controls the activation of p53 and the proteasomal degradation of c-MYC, leading to impaired survival and self-renewal of leukemia stem cells. [27] Thus, it is plausible that DYRK2 similarly regulates protein activity in lung cells, influencing their proliferation through comparable mechanisms.

S100A8 (S100 calcium-binding protein A8) is a member of the S100 family of calcium-binding proteins and is primarily expressed in neutrophils and monocytes. It plays a pivotal role in regulating various inflammatory responses and associated diseases.
According to GSEA results, S100A8 is enriched in pathways such as antigen processing and presentation and allograft rejection. Studies have shown that S100A8/A9 levels increase significantly during COVID-19, particularly in severe cases, and correlate closely with the severity of inflammation. [28, 29] During SARS-CoV-2 viremia, megakaryocytes (MKs) did not induce type I interferon signaling or calprotectin (S100A8/A9) under any conditions. [30] Paquinimod, a specific inhibitor of S100A8/A9, significantly reduces viral load and alleviates pneumonia in SARS-CoV-2-infected mice. Furthermore, rhubarb in Xuanbai Chengqi Decoction mitigates severe viral pneumonia caused by influenza A by enhancing intestinal barrier function and promoting lung barrier repair, accompanied by downregulation of S100A8 expression. [31] Thus, S100A8 serves as an inflammatory mediator and plays an integral role in the inflammatory response in patients with severe pneumonia.

The GSEA results indicated that the 4 biomarkers (BCL2, CDKN2D, DYRK2, and S100A8) are primarily enriched in 4 key pathways: graft-versus-host disease, complement and coagulation cascades, ribosome, and spliceosome. The complement and coagulation cascades are critical in the pathogenesis of COVID-19 and carry significant prognostic implications. Overactivation of these pathways exacerbates the inflammatory response and promotes thrombosis, because complement activation induces coagulation and inflammatory mediators released during coagulation further activate the complement system. This creates a vicious cycle of inflammation and thrombosis that accelerates disease progression.

Immune infiltration analysis revealed that neutrophils comprised the highest proportion of immune cells across the sample groups. Neutrophils play a critical role in the pathogenesis of severe COVID-19, exacerbating the condition by promoting inflammation, tissue damage, and thrombosis.
In acute respiratory distress syndrome, neutrophil hyperactivation increases vascular permeability through the release of defensins and neutrophil elastase (NE). Furthermore, the formation of neutrophil extracellular traps (NETs) not only activates alveolar macrophages for pathogen clearance but also significantly enhances endothelial cell activation, thereby intensifying the inflammatory cycle. NETs can actively damage endothelial tissue. [32, 33] Excessive neutrophil activation, elevated neutrophil-to-lymphocyte ratio, and the formation of NETs position neutrophils as a promising therapeutic target. [34] Elevated levels of C-reactive protein, IL-6, IL-10, and neutrophils further suggest the critical involvement of innate immune cells in the pathogenesis of severe disease. [35] NETs, formed from the release of decondensed chromatin to trap pathogens, can also trigger immune thrombosis. [36] Dexamethasone alters circulating neutrophils by modulating interferon responses, downregulating interferon-stimulated genes, and activating IL-1R2 on neutrophils during severe COVID-19. Additionally, dexamethasone induces the expansion of immunosuppressive immature neutrophils, transforming neutrophils from information receivers to active participants in the immune response. [37] In COVID-19, heparin and low-molecular-weight heparins neutralize extracellular cytotoxic histones, accelerate the degradation of DNA-EI-mediated NETs, and prevent the aggregation of NETs into nano- and microparticles. Activated complement in the bloodstream of patients with COVID-19 contributes to thrombotic NET formation, potentially exacerbating inflammation and thrombosis. Anticomplement therapies, such as eculizumab or AMY-101, have shown efficacy in a limited cohort of patients with severe or critical COVID-19, including those who are intubated.
[38] Macrophage-mediated autophagy, a conserved self-degradation process of neutrophils, plays a pivotal role in neutrophil phagocytosis of pathogens. In COVID-19, the excessive release of NETs leads to lung tissue damage due to alveolar microcirculation disorder, potentially resulting in acute respiratory distress syndrome. [39] Elevated levels of cell-free DNA, myeloperoxidase-DNA (MPO-DNA), and citrullinated histone H3 (Cit-H3) in the serum of patients with COVID-19 further confirm NET formation, with the latter 2 serving as specific markers of NETs. [40] Clinical outcomes in patients with COVID-19 are often associated with decreased platelet, lymphocyte, hemoglobin, eosinophil, and basophil counts, alongside increased neutrophil counts and elevated neutrophil-to-lymphocyte and platelet-to-lymphocyte ratios. [41]

This study identified 4 PCD-related biomarkers (BCL2, CDKN2D, DYRK2, and S100A8) with multidimensional translational value for the clinical management of severe pneumonia. Currently, the diagnosis of severe pneumonia relies on imaging and traditional inflammatory markers, which are subject to delays and lack specificity. The significantly upregulated S100A8 and CDKN2D (confirmed by RT-qPCR, P < .05, with ROC AUC values exceeding 0.7 in both cohorts) can serve as combined indicators for early diagnosis. The downregulated BCL2, associated with lung cell apoptosis, can further assist in disease stratification and guide earlier lung protection interventions. Regarding prognosis and personalized management, severe pneumonia outcomes vary substantially, and reliable predictive indicators are scarce. The 4 biomarkers are collectively enriched in complement and coagulation pathways, which are central to thrombosis and disease progression in patients with COVID-19. Monitoring their levels can help predict thrombosis risk.
For example, patients with elevated S100A8 expression and decreased DYRK2 expression may require earlier initiation of anticoagulant therapy to optimize clinical resource allocation.

However, this study has certain limitations. First, the small sample size limits the ability to detect true differences, which may explain the lack of statistically significant differences in BCL2 expression between the disease and control groups. Second, the precise molecular regulatory mechanisms of these biomarkers in severe pneumonia remain unclear, and further experimental data are needed to validate these conclusions. Future research will involve expanding the sample size and incorporating additional clinical phenotypic data, such as disease progression and drug response, to better assess the clinical significance of these biomarkers. Furthermore, through cell experiments and animal models, a comprehensive investigation into the molecular mechanisms of these biomarkers in severe pneumonia will be undertaken to provide a more solid theoretical foundation for precise diagnosis and treatment of the disease.
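
The screening criterion referred to in this discussion (ROC AUC above 0.7 in both cohorts, with a consistent direction of change) can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes the AUC is computed with the rank-based (Mann-Whitney) formulation, the function names `auc_mann_whitney` and `passes_screen` are hypothetical, and the expression values are invented toy numbers, not data from the study.

```python
from typing import List, Sequence, Tuple

Cohort = Tuple[Sequence[float], Sequence[float]]  # (case values, control values)


def auc_mann_whitney(cases: Sequence[float], controls: Sequence[float]) -> float:
    """AUC as the fraction of (case, control) pairs in which the case has the
    higher expression value; ties count as half a pair."""
    if not cases or not controls:
        raise ValueError("both groups must contain at least one sample")
    wins = 0.0
    for case_value in cases:
        for control_value in controls:
            if case_value > control_value:
                wins += 1.0
            elif case_value == control_value:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def passes_screen(cohorts: List[Cohort], threshold: float = 0.7) -> bool:
    """Return True when the gene discriminates cases from controls with
    AUC > threshold in every cohort AND in the same direction.

    An AUC below (1 - threshold) corresponds to a downregulated gene whose
    flipped AUC exceeds the threshold.
    """
    aucs = [auc_mann_whitney(cases, controls) for cases, controls in cohorts]
    consistently_up = all(a > threshold for a in aucs)
    consistently_down = all(a < 1.0 - threshold for a in aucs)
    return consistently_up or consistently_down


# Toy (hypothetical) expression values for one gene in two cohorts.
cohort_a: Cohort = ([5.1, 6.0, 5.8, 6.3], [4.0, 4.2, 5.5])
cohort_b: Cohort = ([7.0, 6.5, 6.9], [6.0, 6.6])

print(auc_mann_whitney(*cohort_a))          # AUC in the first cohort
print(auc_mann_whitney(*cohort_b))          # AUC in the second cohort
print(passes_screen([cohort_a, cohort_b]))  # upregulated in both cohorts
```

Requiring the same direction in every cohort is the design choice that matters here: a gene that is upregulated in one dataset and downregulated in the other would be rejected even if it separated the groups well in each dataset taken alone.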

Source provenance

europepmc: last seen: 2026-08-01T06:07:04.264727+00:00
scilite: last seen: 2026-06-21T06:47:03.627287+00:00
unpaywall: last seen: 2026-05-21T05:10:58.409756+00:00

License: CC-BY-NC-4.0