Construction and validation of a histone-related gene signature for the diagnosis of endometriosis

Ginekologia Polska · 2025 · vol. 96(1), pp. 22–34 · doi:10.5603/gpl.96199 · PMID:39411815
other OA: gold CC-BY-NC-ND-4.0
AI-generated summary by claude@2026-06, 2026-06-09

This study identified and validated a histone-related four-gene signature (JUNB, FRY, LMNB1, SPAG1) for diagnosing endometriosis and found increased plasma cell infiltration in endometriosis samples.

One-sentence paraphrase of the abstract; not a substitute for reading it. No clinical advice.

Abstract

OBJECTIVES: Endometriosis is a common chronic disease in childbearing women and a major cause of infertility. Our study aimed to identify and validate a novel gene signature for diagnosing endometriosis based on histone-related genes (HRGs), and to investigate their biological functions in endometriosis.

MATERIAL AND METHODS: RNA sequence data were downloaded from the Gene Expression Omnibus database, and HRGs were retrieved from the GeneCards database. We identified differentially expressed genes using the limma package, and constructed a diagnostic model using the rms package. Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) enrichment analyses were performed for visualization, annotation, and integrated discovery. Subsequently, we validated the model using the recall and decision curve analysis (DCA). Additionally, we analyzed the immune microenvironment features using CIBERSORT.

RESULTS: A total of 18 differentially expressed HRGs were identified in patients with endometriosis compared with controls. GO and KEGG enrichment was mainly in spindle organization, positive regulation of the cell cycle process, progesterone-mediated oocyte maturation, and cellular senescence and cell cycle. We obtained a signature of four HRGs (JUNB, FRY, LMNB1, and SPAG1). DCA revealed that the diagnostic model benefits patients with endometriosis, regardless of the incidence. CIBERSORT analysis showed that the number of plasma cells increased significantly in endometriosis samples from all four datasets.

CONCLUSIONS: Our findings provide novel insights into the function of HRGs in the development of endometriosis and identify a new signature of four HRGs that may serve as valuable diagnostic markers and therapeutic targets for this disease.

Introduction

Endometriosis is a hormone-dependent chronic disease occurring in childbearing women and characterized by the presence of endometrial-like tissue outside the uterus [1]. Endometriosis is a common cause of chronic pelvic pain and infertility, and both symptoms critically affect the quality of life and health [2]. Histological examination of lesions and direct visualization are the current gold standard for the diagnosis of endometriosis [3]. As invasive methods are essential, a definitive diagnosis of endometriosis usually requires a long time after symptom onset [4]. Therefore, further studies are necessary to explore the potential mechanisms and detect novel targets for endometriosis diagnosis and therapy. Histone proteins wrap and package DNA inside eukaryotic nuclei [5]. Post-translational histone modifications play a significant role in the regulation of gene expression and chromatin conformations, which are closely related to normal development and disease processes [6]. Histone modifications include methylation, ubiquitylation, acetylation, and phosphorylation, and they represent a universal set of epigenetic marks that regulate various biological processes [7]. Histone variants are important epigenetic regulators of the genome [5]. According to recent reports, histones are strongly correlated with the development of metabolic diseases, infertility, neuropsychiatric disorders, and nephropathy [8]. Furthermore, histone mutations are associated with multiple tumors such as sarcomas, carcinosarcomas, head and neck cancers, and gliomas [9]. The importance of histones highlights the need to further investigate the association between histones and various diseases, which can provide not only a new diagnostic approach, but also a novel therapeutic target. However, the relation between histone-related genes (HRGs) and endometriosis remains unclear. 
Many studies have utilized bioinformatic analyses to increase our understanding of the molecular mechanisms underlying endometriosis. However, few of them have addressed immunomodulatory mechanisms, so the pathogenesis of endometriosis remains poorly understood, and small sample sizes may have impaired the credibility of the results. Hence, further analyses involving a large sample size are essential to identify more reliable and accurate diagnostic biomarkers and to comprehensively explore the potential molecular mechanisms underpinning endometriosis.

Objectives

Our study aimed to utilize an integrative strategy to identify and validate a novel gene signature for diagnosing endometriosis based on HRGs, and to investigate their biological functions in endometriosis. Our study may provide dependable biomarkers for noninvasive diagnosis and new therapeutic targets for treatment of endometriosis.

Material and methods

RNA sequence data and bioinformatics analysis

Four gene expression datasets with clinical information, including GSE7305 [10], GSE7307, GSE25628 [11] and GSE51981 [12], were downloaded from the Gene Expression Omnibus (GEO) database. All datasets were based on the GPL570 platform [HG-U133_Plus_2] and contained a total of 120 endometriosis samples and 111 controls. The GSE7305 dataset included 10 endometriosis samples and 10 controls. The GSE7307 dataset included 18 endometriosis samples and 23 controls. The GSE25628 dataset included 15 endometriosis samples and seven controls. The GSE51981 dataset included 77 endometriosis samples and 71 controls. All data were normalized using log2 (x + 1) for further analyses.

Differentially expressed HRGs analysis

To identify differentially expressed genes (DEGs), the limma package was used to compare the expression profiles of endometriosis and normal samples from the four datasets. Genes with p 1 were regarded as DEGs. In addition, 10751 HRGs were obtained from the GeneCards database and from previous studies. We obtained differentially expressed HRGs by taking the intersection of the four datasets of DEGs and HRGs. The results are shown in a volcano plot, heatmap, and Venn diagram.

Functional and pathway enrichment analyses

Gene Ontology (GO) analysis consists of three annotations: cellular component (CC), molecular function (MF), and biological process (BP). GO and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were performed to identify biological functions using the clusterProfiler package [13]. The results of the enrichment analysis were visualized using the GOplot package [14] and statistical significance was set at p < 0.05.

Chromosome location analysis

To examine the correlation of histone-related DEGs, we performed Spearman’s correlation analysis with the cowplot package and plotted the heatmap, scatter plots, and correlation curves, with a significance threshold of p < 0.05.
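The intersection step described in this section (DEGs shared by all four datasets, then restricted to the HRG list) can be sketched as follows. This is a minimal illustration in Python, not the authors' R workflow; the function name `histone_related_degs` and the gene labels `G1` to `G6` are hypothetical placeholders, and the toy gene lists are made up.

```python
def histone_related_degs(degs_by_dataset, hrgs):
    """Return genes that are DEGs in every dataset and also appear in the HRG list."""
    deg_sets = [set(genes) for genes in degs_by_dataset.values()]
    if not deg_sets:
        return set()
    # Genes called differentially expressed in all datasets.
    shared_degs = set.intersection(*deg_sets)
    # Keep only those that are also histone-related.
    return shared_degs & set(hrgs)


# Toy DEG lists keyed by the GEO accessions named in the text (contents are invented).
degs_by_dataset = {
    "GSE7305": {"G1", "G2", "G3", "G4"},
    "GSE7307": {"G1", "G2", "G3", "G5"},
    "GSE25628": {"G1", "G2", "G3", "G4"},
    "GSE51981": {"G1", "G2", "G3", "G6"},
}
hrgs = {"G1", "G3", "G4", "G6"}  # hypothetical HRG list

result = histone_related_degs(degs_by_dataset, hrgs)
print(sorted(result))  # ['G1', 'G3']
```

A gene such as `G4` is dropped even though it is histone-related, because it is not a DEG in every dataset; this is why the final list (18 genes in the study) is much smaller than any single DEG list.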
The RCircos package [15] was used to plot the histone-related DEG chromosome location diagrams, based on the information of chromosome gene location from the ENSEMBL database.

Construction of protein-protein interaction (PPI) network

To identify and predict protein-protein interactions, a PPI network was constructed utilizing the Search Tool for the Retrieval of Interacting Genes (STRING) database. Subsequently, the PPI network was visualized using Cytoscape (v3.7.2). The clustering coefficient algorithm was applied to detect the 10 hub genes in the PPI network.

Least absolute shrinkage and selection operator (LASSO) regression analysis

To screen the appropriate diagnostic biomarkers for endometriosis, we performed a LASSO-logistic analysis (iteration = 1000) using the glmnet package. LASSO regression is widely used to determine the most suitable biomarkers and prevent overfitting.

Construction of a diagnostic model for endometriosis based on the HRGs signature

In our study, we performed LASSO regression and multivariate Cox regression analyses to select appropriate variables. Additionally, we built and visualized a predictive nomogram model using the rms package with the selected diagnostic signature. Discrimination of the diagnostic model was evaluated using the concordance index (c-index) [16]. A c-index > 0.7 illustrated that the nomogram model had a higher diagnostic value. Calibration of the model was tested using the Hosmer–Lemeshow test. To assess the clinical value of the model, a decision curve analysis (DCA) curve was drawn using the ggDCA R package.

Analysis of immune cell infiltration

CIBERSORT (https://cibersort.stanford.edu/), an online tool, was applied for immune cell subtype deconvolution of a gene expression matrix via linear support vector regression. The tool assessed immune cell infiltration levels in samples based on 22 immune cell subtype gene expression signatures.
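The c-index used above to judge discrimination can be illustrated with a short sketch. For a binary outcome such as endometriosis versus control, the c-index is the fraction of case-control pairs in which the case receives the higher predicted score (ties count as one half), which equals the area under the ROC curve. The code below is a minimal Python illustration with invented scores, not the rms implementation used in the study; `c_index` is a hypothetical helper name.

```python
def c_index(scores, labels):
    """Concordance index for a binary outcome (1 = case, 0 = control)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0  # case ranked above control
            elif case_score == control_score:
                concordant += 0.5  # tie counts as half
    return concordant / (len(cases) * len(controls))


# Invented predicted probabilities for three cases and three controls.
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
print(round(c_index(scores, labels), 3))  # 0.889 (8 of 9 pairs concordant)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the 0.7 threshold mentioned above should be read.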
Construction of a competing endogenous RNA (ceRNA) network

In our study, the miRanda, miRDB, and TargetScan databases were used to predict the microRNAs (miRNAs) associated with key mRNAs. Next, the targeted miRNAs were obtained from the intersection of the results from the three databases and are shown with a Venn diagram. In addition, these miRNAs were used to predict long non-coding RNAs (lncRNAs) using the lncBase [17] and Starbase [18] databases, and the key lncRNAs were acquired from the intersection of the two database results. Finally, a network of these results and the key mRNAs was constructed and visualized using Cytoscape software.

Statistical analysis

All data processing and analyses were performed using R software (version 4.1.1). To assess the statistical significance of two groups of continuous variables, normally distributed variables were analyzed with independent t-tests and non-normally distributed variables were analyzed using the Mann–Whitney U test. The chi-square test or Fisher’s exact test was used for categorical variables. Pearson correlation analysis was used to calculate the correlation coefficients between genes. Statistical significance was set at p < 0.05.

Results

Identification of DEGs and functional enrichment analysis

In our study, we performed differential expression analysis between endometriosis and normal samples from the GSE51981, GSE25628, GSE7307, and GSE7305 datasets. A total of 194 DEGs, 25 upregulated and 169 downregulated, were identified in GSE51981 (Fig. 1A, Tab. S1). A total of 717 DEGs, 388 upregulated and 329 downregulated, were identified in GSE25628 (Fig. 1B, Tab. S2). A total of 1878 DEGs, 1303 upregulated and 575 downregulated, were identified in GSE7307 (Fig. 1C, Tab. S3). A total of 1575 DEGs, 880 upregulated and 695 downregulated, were identified in GSE7305 (Fig. 1D, Tab. S4). Next, the four groups of DEGs were intersected with the HRGs to obtain 18 histone-related DEGs (Fig. 1E). In addition, we showed the chromosome locations of the 18 genes using the RCircos R package (Fig. 1F) and explored the correlation of the 18 genes in the four datasets through Spearman’s correlation analysis (Fig. S1–4).

To explore the biological function of the 18 histone-related DEGs, functional annotation was performed using GO and KEGG enrichment analyses (Fig. 1G–I, Tab. S5–6). These genes were mainly enriched in the BP terms spindle organization, positive regulation of cell cycle process, and positive regulation of cell cycle phase transition; the CC terms spindle pole centrosome, spindle pole, and spindle microtubule; and the MF terms tubulin binding, transcription corepressor binding, protein serine, and phospholipase binding. The enriched KEGG pathways included the p53 signaling pathway, progesterone-mediated oocyte maturation, cellular senescence, and cell cycle. To compare the differential expression of the 18 genes, we performed Wilcoxon analysis on the four datasets and show the results as box plots (Fig. 2A–D).
These findings indicate that the biological function of histone-related DEGs may be correlated with the regulation of the cell cycle, and these genes mainly participate in cellular maturation, senescence, and the p53 pathway.

Construction of a PPI network of histone-related DEGs

Through interactions, proteins participate in biological processes, including signal transduction, regulation of gene expression, and adjustment of the cell cycle. As there is a close interaction between genes that regulate the same biological process, we constructed a PPI network using the STRING database to analyze the correlation of histone-related DEGs (Fig. 3A). The PPI network comprised 18 nodes and 75 edges. Subsequently, the cluster coefficient algorithm of cytoHubba was used to calculate the weight of each node. As shown in Figure 3B, LMNB1, CENPU, and SHCBP1 had the highest relative weights and the highest correlation with other DEGs.

Construction of a diagnostic model for endometriosis

Histones may play a significant role in endometriosis progression, and the differential expression of histones observed between endometriosis and normal samples may perform different biological functions. Hence, it should be possible to build a diagnostic model for endometriosis based on histone-related DEGs. In our study, GSE51981 was used as the training set and the others were used as validation sets. We performed a LASSO-logistic analysis (iteration = 1000) to explore the association between the 18 genes and endometriosis. As shown in Figure 4A, the signature, including JUNB, FRY, LMNB1, and SPAG1, occurred 976 times during 1000 cycles of analysis. An endometriosis diagnostic model was constructed based on the above-mentioned four genes, and discrimination of the model was evaluated in the training and validation sets.
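The "976 times during 1000 cycles" figure above is a selection frequency: each cycle yields a set of genes with non-zero LASSO coefficients, and the most frequent set is taken as the signature. The source does not state how each cycle was resampled or tuned, so the sketch below only shows the counting step, in Python, on invented gene sets; `signature_frequencies` and the labels `G1` to `G3` are hypothetical.

```python
from collections import Counter


def signature_frequencies(selected_per_run):
    """Count how often each exact gene set is selected across repeated runs."""
    # frozenset makes the gene order within a run irrelevant.
    counts = Counter(frozenset(genes) for genes in selected_per_run)
    return counts.most_common()


# Invented selections from four runs; a real analysis would have one entry per cycle.
runs = [
    ["G1", "G2"],
    ["G2", "G1"],
    ["G1", "G3"],
    ["G1", "G2"],
]
top_signature, n_selected = signature_frequencies(runs)[0]
print(sorted(top_signature), n_selected, n_selected / len(runs))
# ['G1', 'G2'] 3 0.75
```

Under this reading, the reported signature would correspond to a selection frequency of 976/1000 = 0.976.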
The c-index of the diagnostic model in the training set was 0.72 and the c-indices in the validation sets were 0.82 (GSE25628), 0.91 (GSE7307), and 0.85 (GSE7305) (Fig. 4A and 4B). These results illustrate that the model has high diagnostic value in patients with endometriosis. The recall curve shows the calibration of the model, which was tested with the Hosmer–Lemeshow test. The result was not statistically significant (p = 0.051), indicating that the predictions of the model did not deviate significantly from the actual data (Fig. 4C). Furthermore, DCA revealed that the diagnostic model would always benefit endometriosis patients, regardless of the incidence (Fig. 4D). Overall, these results indicate that this model may provide a reliable diagnostic approach for patients with endometriosis.

Validation of key diagnostic genes and functional correlation analyses

To illustrate the diagnostic value of key genes, we performed univariate logistic regression analysis using GSE25628 and GSE7307. As shown in Figure 5A and 5B, JUNB and FRY were risk factors for endometriosis, and LMNB1 and SPAG1 were protective factors. Functional correlation analyses showed that JUNB and LMNB1 had a higher correlation with other genes, whereas SPAG1 had a poor correlation (Fig. 5C and 5D).

Immune infiltration analysis

Previous studies have shown that the immune microenvironment plays a significant role in disease initiation and development. Therefore, identifying changes in the immune microenvironment between endometriosis and controls is important for diagnosis and treatment. We analyzed the differences in immune cell infiltration levels between the endometriosis and control groups. The infiltration levels of 22 immune cell types in the four datasets were evaluated using CIBERSORT, and the abundances of infiltrating immune cells were compared using the Wilcoxon test.
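The decision curve analysis reported earlier in this section rests on the net benefit of a model at a given threshold probability: true positives per sample minus false positives per sample, the latter weighted by the odds of the threshold. The sketch below is a Python illustration of that standard formula on invented predictions, not the ggDCA output from the study; `net_benefit` and `net_benefit_treat_all` are hypothetical helper names.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of classifying as positive when predicted probability >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(labels)
    true_pos = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    false_pos = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    odds = threshold / (1.0 - threshold)  # harm-to-benefit weight
    return true_pos / n - (false_pos / n) * odds


def net_benefit_treat_all(labels, threshold):
    """Reference strategy: classify every sample as positive."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Invented predicted probabilities for three cases and three controls.
probs = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
print(round(net_benefit(probs, labels, 0.5), 4))        # 0.1667
print(round(net_benefit_treat_all(labels, 0.5), 4))     # 0.0
```

A decision curve plots these quantities over a range of thresholds; a model is considered useful at thresholds where its net benefit exceeds both the treat-all line and zero (the treat-none strategy).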
Our results show that, compared with normal samples, plasma cells increased significantly in endometriosis samples from all four datasets, and these differences were statistically significant (Fig. 6).

Correlation analysis of key diagnostic genes and immune cell infiltration

In this study, the relationship between key diagnostic gene expression and multiple immune cell infiltration was investigated using correlation analysis. As shown in Figure 7A–E, JUNB expression negatively correlated with plasmocytes, M2 macrophages, and resting mast cells, and positively correlated with monocytes. As shown in Figure 7F–I, LMNB1 was positively correlated with regulatory T cells (Tregs) and negatively correlated with follicular helper T cells, activated NK cells, and resting mast cells. FRY expression positively correlated with resting memory CD4+ T cells, activated dendritic cells, and M0 macrophages, and negatively correlated with CD8+ T cells, plasmocytes, activated NK cells, and follicular helper T cells (Fig. 7O–U). Finally, integration analysis of the correlation between key diagnostic genes and immune cells is shown with a dot plot (Fig. 7V).

Construction of a ceRNA network

To construct an endometriosis ceRNA network, we predicted miRNAs related to the key diagnostic genes using the miRanda, miRDB, and TargetScan databases. We obtained 587, 188, and 1795 miRNAs from miRanda, miRDB, and TargetScan databases, respectively. In addition, we examined the intersection of these miRNAs and identified 84 key miRNAs (Fig. 8A, Tab. S7). Furthermore, to detect potential lncRNAs, we predicted lncRNAs related to the 84 miRNAs using the lncBase and Starbase databases. A total of 1125 lncRNAs were obtained from the lncBase database and 489 were obtained from the Starbase database. After the intersection of the two groups of lncRNAs, 113 lncRNAs were obtained (Fig. 8B, Tab. S8). Finally, integration analysis of key diagnostic genes, miRNAs, and lncRNAs was performed using the ceRNA network (Fig. 8C).

Discussion

Endometriosis is a common chronic disease in childbearing women; however, some patients with endometriosis endure chronic pelvic pain without a definitive diagnosis [4]. Delayed diagnosis of endometriosis is usually attributed to a lack of definite clinical symptoms. In addition, there is a strong correlation between endometriosis and the causes of endometrioid and clear-cell ovarian cancers [1]. Hence, there is an urgent need to identify novel biomarkers for diagnosis and new therapeutic targets for treating endometriosis. In recent years, some scholars have utilized bioinformatics methods to identify specific mRNAs and miRNAs clearly associated with endometriosis, offering a novel perspective for comprehending the mechanism of this condition and exploring new therapeutic avenues [19]. Histone proteins are the most important components of post-translational modifications that participate in almost every disease [20]. According to previous studies, post-translational modifications may be potential drivers for the development of endometriosis [21]. In this study, we explored the role of histones in the development of endometriosis and identified a histone-related diagnostic signature for patients with endometriosis. We performed a series of comprehensive analyses of four GEO datasets (GSE7305, GSE7307, GSE25628, and GSE51981), including 120 endometriosis samples, 111 normal samples, and 18 differentially expressed HRGs. GO and KEGG analyses indicate that these genes were mainly associated with spindles, cell cycle regulation, and oocyte maturation. Next, through LASSO-logistic analysis, we obtained a diagnostic signature (JUNB, FRY, LMNB1, and SPAG1), which was demonstrated to have clinical value for patients with endometriosis. In addition, we found that the number of plasmocytes increased significantly in endometriosis samples from all datasets via the CIBERSORT analysis. 
Finally, a ceRNA network of diagnostic biomarkers was constructed to further explore the function of this signature in endometriosis. We first identified histone-related DEGs between the endometriosis and normal samples from the four datasets. Next, we performed GO and KEGG enrichment analyses to investigate the functions of these genes. The results show that the enriched GO terms were mainly associated with spindles, positive regulation of cell cycle, and transcription corepressor binding, and the enriched KEGG pathway contained the p53 signaling pathway, progesterone-mediated oocyte maturation, cellular senescence, and cell cycle. Previous studies have shown that correct and accurate mitotic spindle assembly is a prerequisite for complete cell cycle progression, and the spindle of the oocyte is easily impaired in endometriosis, which may be associated with infertility in patients with this disease [22]. Furthermore, the follicular fluid of infertile women with endometriosis may impair the meiotic spindles and nuclear maturation of mature oocytes [23]. Our results reveal that histone-related DEGs are involved in the organization process, spindle components, and progesterone-mediated oocyte maturation pathway, which in turn illustrates that these genes may be correlated with endometriosis-related infertility. The cell cycle of normal endometrial epithelial and stromal cells is damaged by the oxidative imbalance of endometriosis; however, endometriotic cells can maintain high levels of proliferation [24]. Abnormal cell cycle regulation has been reported to be involved in the pathogenesis of endometriosis [25]. In addition, senescence of endometriotic ovarian cumulus granulosa cells, caused by excessive oxidative stress, has been reported to contribute to endometriosis-related infertility [26]. 
Senescent cell accumulation in endometriosis tissues was recently revealed to facilitate the maintenance of the inflammatory microenvironment in endometriosis, which plays a significant role in the disease progression [27]. In our study, histone-related DEGs participated in the regulation of the cell cycle and cellular senescence, demonstrating their importance in the development of endometriosis. Finally, the p53 signaling pathway was enriched in KEGG analysis. P53 may be associated with upregulation of autophagy in ovarian endometriosis [28]. In addition, as a hallmark of cancer, P53 inactivation may be correlated with the malignant potential of endometriosis. P53 null mutations have been reported to contribute to the poor prognosis of endometrial carcinoma, ovarian endometrioid carcinoma, and ovarian clear cell carcinoma [29]. Overall, our enrichment analyses illustrate that histone-related DEGs are closely related to the process, infertility, and malignant transformation of endometriosis.

In this study, we found that the number of plasmocytes increased significantly in endometriosis samples from all datasets compared to that in normal samples. Compared to normal endometrium, endometriotic lesions are essentially heterogeneous and exhibit altered immunoinflammatory profiles [30]. Immunological dysfunction plays a significant role in the growth of endometrial-like tissue outside the uterine cavity and may be a therapeutic target for endometriosis [31]. Plasmacytes have been regarded as the foundation of humoral immunity, and the presence of a plasmacytic mass has been reported to be a feature of endometriosis lesions [32]. In our study, high levels of plasma cell infiltration were observed in endometriosis samples, consistent with previous studies.

In our study, we constructed a diagnostic model to evaluate the clinical value of histone-related DEGs.
LASSO-logistic analysis was applied to select the potential diagnostic signature, which included JUNB, FRY, LMNB1, and SPAG1. Next, we developed and validated a diagnostic nomogram model. Based on evaluation of the discrimination and calibration of the model, we found that this is a novel clinically valuable diagnostic model for endometriosis. JunB is a member of the AP-1 family of transcription factors and has been shown to have a close relationship with immune cells. JunB has a significant impact on neutrophil activation and is an important transcriptional modulator of macrophage activation [33]. Both neutrophils and macrophages play a significant role in the early stages of endometriosis development [34]. JunB is essential for Th17 cell pathogenicity. Th17 cells contribute to the development of endometriosis via their major cytokine, IL-17, by inducing angiogenesis and inflammation [34]. FRY, which is one of the largest genes, has more than 10,000 base pairs in the coding sequence and its amino acid sequence is highly conserved among species. FRY is related to the regulation of spindle organization and chromosome alignment [35]. Overall, JUNB and FRY influence the progression of endometriosis through activation of immune cells or regulation of mitosis. These findings suggest that JUNB and FRY may not only be diagnostic biomarkers but also provide novel targets for the treatment of endometriosis. Despite the observed close correlation between histone-related DEGs and endometriosis, our study has some limitations. First, all the data were obtained from a public database. In the future, we are committed to concentrating our efforts on validating and refining our diagnostic models through clinical trials, while also thoroughly investigating the role of the immune system in endometriosis, with the goal of offering patients more precise and efficacious diagnostic and treatment options. 
In addition, the current analyses were carried out only at the RNA level, which may result in one-sided results and a high false-positive rate. Finally, our study could not illustrate the direct mechanism of action of these HRGs in endometriosis. Hence, further experimental validations, such as western blotting and real-time polymerase chain reaction, are needed to elucidate the mechanisms of action of these HRGs in endometriosis.

For the first time, we identified a signature of four HRGs (JUNB, FRY, LMNB1, and SPAG1) that may serve as novel diagnostic biomarkers for endometriosis. In addition, our results illustrate that JUNB may be correlated with the progression of endometriosis. Furthermore, a histone-related gene diagnostic model of endometriosis was constructed, which exhibited high diagnostic value. Our model might serve as a reliable and precise prediction method for endometriosis diagnosis. Finally, our results show that plasma cells may be involved in the progression of endometriosis. These findings extend our understanding of the role of histone genes in patients with endometriosis. Future research is warranted to further elucidate the mechanisms behind endometriosis and to develop diagnostic and therapeutic protocols to combat the disease.

Article information and declarations

Data availability statement

The data used and analyzed during the current study are available from Gene Expression Omnibus (GEO) (https://www.ncbi.nlm.nih.gov/geo/). Reference numbers: GSE7305, GSE7307, GSE25628 and GSE51981.

Ethics statement

An ethics statement was not required for this work. All data were obtained from the Gene Expression Omnibus (GEO) database.

Author contributions

HY: concept, analysis and interpretation of data, article draft; DG, XY: analysis and interpretation of data, article draft; CW: analysis and interpretation of data, revised article critically; XL: concept, study design, revised article critically.

Funding

None.

Acknowledgments

We would like to acknowledge the reviewers for their helpful comments on this manuscript.

Conflict of interest

The authors declare no conflict of interest.

Supplementary material

The supplementary materials comprise Figures S1–S4 and Tables S1–S8.


Condition tags

endometriosis, infertility

MeSH descriptors

Endometriosis


Source provenance

europepmc
last seen: 2026-06-11T06:19:48.454388+00:00
pubmed
last seen: 2026-06-11T06:17:12.891333+00:00
unpaywall
last seen: 2026-05-11T08:34:28.763810+00:00
License: CC-BY-NC-ND-4.0 · non-commercial use only · no derivatives · attribution required
Courtesy of the U.S. National Library of Medicine