Establishment Of A Novel Glycolysis-Immune-Related Diagnosis Gene Signature For Endometriosis By Machine Learning

In: Research Square · 2022 · doi:10.21203/rs.3.rs-1686939/v1 · W4282919537
preprint OA: green CC0
AI-generated summary by claude@2026-06+body, 2026-06-08

This study identified five glycolysis-related hub genes (CHPF, CITED2, GPC3, PDK3, ADH6) to establish a predictive model for endometriosis diagnosis, showing significant differences in immune cell infiltration between patients and controls.

One-sentence paraphrase of the abstract; not a substitute for reading it. No clinical advice.

Full text 110,941 characters · extracted from preprint-html
Research Article

Qizhen Chen, Yufan Jiao, Zhe Yin, Xiayan Fu, Shana Guo, Jun Xiang, and 1 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-1686939/v1
This work is licensed under a CC BY 4.0 License
Status: Posted. Version 1 posted.

Abstract

Purpose: The objective of this study was to investigate the key glycolysis-related genes linked to immune cell infiltration in endometriosis and to develop a new endometriosis (EMS) predictive model.

Methods: A training set and a test set were created from the NCBI GEO public database. We identified five glycolysis-related genes using LASSO and the Random Forest method. We then developed and tested a prediction model for EMS diagnosis.
CIBERSORT was used to compare the infiltration of 22 immune cell types. We examined the relationship between key glycolysis-related genes and immune factors in the eutopic endometrium of women with endometriosis. In addition, GO-based semantic similarity and logistic regression analyses were used to investigate the core genes.

Results: The five glycolysis-related hub genes (CHPF, CITED2, GPC3, PDK3, ADH6) were used to establish a predictive model for EMS. The AUC of the prediction model was 0.777 in the training set and 0.824 and 0.774 in the two test sets, respectively. Additionally, the immune environment differed markedly between EMS and control samples.

Conclusion: A glycolysis-immune-based predictive model was established for the diagnosis of EMS. A detailed understanding of the interactions between endometriosis, glycolysis, and the immune system may be vital for identifying novel therapeutic approaches and targets for EMS patients.

Keywords: endometriosis, glycolysis, immune infiltration, machine learning, diagnosis

Background

Endometriosis is a chronic inflammatory illness in which endometrial tissue outside the uterus causes pelvic pain and infertility[1]. It affects 5–10% of reproductive-aged women worldwide. Despite its prevalence, the rate of misdiagnosis is high. Most women have difficulty expressing their symptoms or feel that their symptoms are being inappropriately normalized[2]. Furthermore, the current requirement for surgical diagnosis, typically via diagnostic laparoscopy, creates a barrier to early detection and treatment[2]. Therefore, there is an urgent need for a reliable, non-invasive predictive model for patients in the early phase of the disease.
Although numerous theories have been proposed to explain the causes of endometriosis, Sampson's theory of retrograde menstruation is the most widely accepted[3]. It holds that fragments of menstrual endometrial tissue, including viable endometrial glands and stroma, are expelled retrogradely through the fallopian tubes into the peritoneal cavity, where they adhere to and invade the underlying mesothelium[3]. Other factors, however, are required to promote the invasion and proliferation of endometrial stromal and glandular cells, including changes in the immune environment, reprogramming of glucose metabolism, and complex local hormonal effects.

It has long been known that cells rely on glycolysis to generate energy. From an evolutionary perspective, cells arose in an anaerobic environment and can tolerate anaerobic glycolysis, so glycolysis is considered the oldest ATP production pathway[4]. Recent studies have clarified the specific advantages of aerobic glycolysis. Even though glycolysis produces less ATP than the tricarboxylic acid cycle coupled to oxidative phosphorylation, proliferative cells prefer glycolysis for several reasons[5]. First, glycolysis and the conversion of glucose to lactate are increased, resulting in faster and greater ATP generation[6, 7]. Glycolytic ATP generation can be 100 times faster than oxidative phosphorylation[8]. Second, the modest ATP yield of glycolysis is sufficient to meet intracellular demand. Taken together, glycolysis may confer a selective growth advantage on proliferative cells[9, 10].

Previous studies indicated that, compared with normal endometria, endometrial epithelial and stromal cells in endometriosis have a greater capacity for proliferation, adhesion, and invasion[11–13]. Elevated glycolysis (the Warburg effect) can lead to lactate production and biosynthesis.
Lactate accumulation promotes tumour cell migration, invasion, angiogenesis, and immune escape[14, 15]. Interestingly, all of these cancer-like processes are also involved in the survival and invasion of eutopic endometrial cells, thus contributing to the development of endometriosis[16].

A growing body of research suggests a link between glycolysis and immune evasion[17]. Although endometriosis is clinically and pathologically benign, it has cancer-like features such as spread, invasion, and hyperplasia[18]. Cancers with a pronounced Warburg effect develop a tumour microenvironment (TME) deprived of glucose, which limits local immune surveillance through nutrient competition[19]. Meanwhile, immune cells, like tumour cells, can promote glycolysis. Previous research has found that the immune environment of eutopic endometria in women with endometriosis differs from that of normal endometria, but the relationship between glycolysis and the immune milieu in endometriosis remains poorly understood[20].

We therefore set out to build a glycolysis-related model of endometriosis and to investigate its connection with the immune microenvironment. Gene expression microarray datasets were obtained from the Gene Expression Omnibus database. LASSO and Random Forest were used to identify five glycolysis-related signature genes. The association between the immune environment of the eutopic endometrium and the key genes was then investigated, and a logistic regression model was built, evaluated by ROC analysis, and verified in the test set. These findings could help researchers and clinicians better understand EMS.

Materials And Methods

Gene Expression Data Acquisition

Gene expression profiles of samples with and without endometriosis were downloaded from GEO.
The eligibility criteria were as follows: first, eutopic endometrial samples were collected from endometriosis patients and healthy controls; second, samples contained glandular and stromal components; third, the women in the study were in the proliferative or early secretory phase of the menstrual cycle. GSE25628, GSE51981, GSE7846 and GSE7305 were included as the training set (n=155). The test set comprised GSE120103 (n=36) and GSE6364 (n=37) and was used to confirm our predictive model (Table 1). Because the datasets came from different cohorts and array platforms, we used the "sva" package in R to remove batch effects. We obtained 15,926 common genes for further analysis.

Table 1. The gene expression profiles used in this study

GEO Accession  Platform  Experiment Type  EM (N)  Normal (N)  Tissue       Year  Dataset
GSE25628       GPL571    mRNA array       8       6           endometrium  2010  Training
GSE51981       GPL570    mRNA array       76      35          endometrium  2013  Training
GSE7846        GPL570    mRNA array       5       5           endometrium  2007  Training
GSE7305        GPL570    mRNA array       10      10          endometrium  2007  Training
GSE120103      GPL6480   mRNA array       18      18          endometrium  2019  Test
GSE6364        GPL570    mRNA array       21      16          endometrium  2007  Test

Glycolysis-related gene sets

The Molecular Signatures Database (MSigDB) is a library of annotated gene sets for GSEA. Five gene sets relevant to glycolysis were retrieved: BIOCARTA_GLYCOLYSIS_PATHWAY, BIOCARTA_FEEDER_PATHWAY, HALLMARK_GLYCOLYSIS, GO_GLYCOLYTIC_PROCESS, and REACTOME_GLYCOLYSIS. Using the "limma" package in R, we found 262 glycolysis-related genes in the 155 cases.

Identification and Validation of Predictive Gene Signature

Glycolysis-related diagnostic markers of EMS were selected using LASSO logistic regression and Random Forest. The LASSO analysis was conducted with the "glmnet" package, with the response type set to binomial and alpha set to 1. Random Forest is a technique that uses recursive partitioning to generate binary trees[21].
The number of trees in the Random Forest was set to 500. We then took the top 20 genes from the RF analysis, intersected them with the LASSO results, and used VennDiagram to visualize the intersection of the gene lists.

Construction of EMS diagnostic model

The machine learning algorithms Logistic Regression (LR), Random Forest (RF), and LASSO regression were used to construct the diagnostic model of EMS based on the glycolysis-related diagnostic markers. The model was defined as follows:

Risk score = expr_gene_1 × coef_gene_1 + expr_gene_2 × coef_gene_2 + … + expr_gene_n × coef_gene_n

The model's effectiveness and accuracy were assessed using ROC curves and AUC values. The nomogram's accuracy was assessed using calibration plots, in which the 45° line indicates perfect prediction; the closer the calibration curve lies to this line, the better the calibration. The clinical utility of the model was examined using decision curve analysis (DCA).

Evaluation of Immune Cell Subtype Distribution

The CIBERSORT algorithm was used to infer the relative proportions of 22 types of infiltrating immune cells from the expression data of women with and without EMS. Gene expression and immune-cell content were subjected to Spearman correlation analysis. A p-value < 0.05 was considered statistically significant.

Gene Set Enrichment Analysis (GSEA)

Based on the expression of the five hub genes, EMS patients were categorized into high- and low-expression groups. GSEA was used to compare signalling pathway differences between the two groups. The background gene sets were obtained from the Molecular Signatures Database. Gene sets were filtered by a maximum (500) and a minimum gene set size. Enriched gene sets were identified after 1,000 permutations with a cutoff of p < 0.05. The significantly enriched gene sets were then sorted in order of significance. GSEA was used to investigate the relationships between the expression groups and biological processes.
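The Spearman correlation between gene expression and immune-cell content mentioned in the immune cell evaluation paragraph above can be illustrated with a small, dependency-free sketch. This is not the authors' code (the study used R); the function names and the example values are hypothetical and only show how a rank correlation with tie handling is computed. No p-value is calculated here.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two sequences of equal length (at least 3)")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values: expression of one gene and the estimated fraction of
# one immune cell type across six samples (not data from the study).
gene_expression = [5.1, 6.3, 4.8, 7.2, 6.9, 5.5]
cell_fraction = [0.02, 0.05, 0.01, 0.09, 0.07, 0.04]
rho = spearman(gene_expression, cell_fraction)
print(f"Spearman rho = {rho:.3f}")
```

In practice one such coefficient would be computed for every gene and immune cell type pair, and a significance test would be applied before interpreting it.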
Semantic Similarity GO Annotations

We used the GOSemSim software package with Wang's method to explore the functional similarity between proteins[22]. Functional similarity was measured as the geometric mean of the GO semantic similarities in molecular function (MF), biological process (BP), and cellular component (CC).

Co-expression Analysis of the Hub Genes

The "corrplot" and "circlize" packages in R were used for correlation analysis. The "corrplot" package was used to plot the Pearson correlations of hub gene expression (version 1.64). The "circlize" package was used to generate circos plots. Red indicates a positive correlation and green a negative correlation. The stronger the relationship, the darker the colour and the thicker the chord.

Statistical Analysis

R software (version 4.1.2) was used for statistical analyses and visualization of results. A p-value of less than 0.05 was considered statistically significant. Correlation coefficients were considered significant when their absolute value was greater than 0.2 and the p-value was less than 0.05. The logistic regression technique was used to create the predictive model.

Results

Identification of five glycolysis-related hub genes

The training set was obtained from the NCBI GEO public database and comprised 155 patients in total (EMS group, 99; control group, 56). The expression profiles of 262 glycolysis-related genes were derived from the differential expression profile. We used LASSO regression for feature screening to search for glycolysis-related biomarkers in EMS. The LASSO regression identified 18 signature genes. The 262 glycolysis-related DEGs were then fed into the random forest classifier.
Variable importance was quantified in terms of the decrease in accuracy and the decrease in mean square error during construction of the random forest model. The top 20 DEGs, ranked by importance, were then chosen as candidate genes for further investigation. The intersection of the random forest genes and the LASSO regression genes yielded 5 DEGs (Figure 2).

Establishment and Validation of the Diagnostic Model Based on five glycolysis-related hub genes

We created a predictive model using the following five genes: chondroitin polymerizing factor (CHPF), Cbp/p300 interacting transactivator with Glu/Asp rich carboxy-terminal domain 2 (CITED2), glypican 3 (GPC3), alcohol dehydrogenase 6 (ADH6), and pyruvate dehydrogenase kinase 3 (PDK3). The following risk model was created using the coefficients for the five hub genes:

Risk score = (2.751 × CHPF) + (3.880 × PDK3) + (0.631 × CITED2) + (0.502 × GPC3) − (3.075 × ADH6)

CHPF, CITED2, GPC3, ADH6, and PDK3 were used to create a diagnostic prediction model for EMS using multivariable logistic regression, presented as a nomogram (Figure 3f). The performance of this model was examined using the area under the receiver operating characteristic (ROC) curve (AUC). In the training set, the AUC of the model was 0.777 (Figure 3a); in the two test sets, the AUC was 0.824 and 0.774, respectively (Figure 3d-e). The calibration curve showed good agreement between the predicted and actual probability of EMS (Figure 3b). The nomogram's C-index for predicting the presence of EMS was 0.777 [95% confidence interval (CI): 0.727–0.827]. Furthermore, decision curve analysis (DCA) revealed that the anticipated and observed values were nearly identical (Figure 3c). These results indicate the importance and independence of the risk score as a diagnostic model of EMS.
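As an illustration of how the risk score above could be computed and evaluated, the sketch below applies the reported coefficients to per-sample expression values and estimates AUC as the probability that a randomly chosen EMS sample scores higher than a randomly chosen control. It is a minimal Python sketch, not the authors' R pipeline: the sample values, labels, and function names are hypothetical, and the expression scale (whatever normalization the authors applied) is an assumption.

```python
# Coefficients as reported for the five hub genes in the risk model.
COEFFICIENTS = {
    "CHPF": 2.751,
    "PDK3": 3.880,
    "CITED2": 0.631,
    "GPC3": 0.502,
    "ADH6": -3.075,
}


def risk_score(expression):
    """Weighted sum of expression values; raises KeyError if a gene is missing."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def auc(scores, labels):
    """Rank-based AUC: fraction of (case, control) pairs in which the case
    has the higher score, counting ties as one half."""
    cases = [s for s, lab in zip(scores, labels) if lab == 1]
    controls = [s for s, lab in zip(scores, labels) if lab == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c in cases
        for n in controls
    )
    return wins / (len(cases) * len(controls))


# Hypothetical samples (expression values and labels are made up):
# label 1 = EMS, label 0 = control.
samples = [
    ({"CHPF": 1.2, "PDK3": 1.1, "CITED2": 1.0, "GPC3": 0.9, "ADH6": 0.4}, 1),
    ({"CHPF": 1.0, "PDK3": 1.3, "CITED2": 1.1, "GPC3": 1.0, "ADH6": 0.5}, 1),
    ({"CHPF": 0.8, "PDK3": 0.9, "CITED2": 1.0, "GPC3": 1.1, "ADH6": 1.0}, 0),
    ({"CHPF": 0.9, "PDK3": 0.8, "CITED2": 0.9, "GPC3": 1.0, "ADH6": 1.1}, 0),
]
scores = [risk_score(expr) for expr, _ in samples]
labels = [label for _, label in samples]
demo_auc = auc(scores, labels)
print("risk scores:", [round(s, 3) for s in scores])
print("AUC on the toy data:", demo_auc)
```

The negative coefficient for ADH6 means that higher ADH6 expression lowers the score, which matches its description as the protective hub gene. The toy data are separable by construction, so the AUC printed here says nothing about the model's real performance.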
The landscape of immune infiltration

We used the CIBERSORT algorithm to profile the infiltration of 22 immune cell subpopulations and to compare eutopic endometria from endometriosis patients with those from healthy controls. The abundance ratios of the 22 immune cells in the 155 samples are presented in Figure 4a. The percentage of immune cells in each sample is shown in Figure 4b. Figure 4c depicts the interaction of innate immune cells. Compared with control endometria, eutopic endometria from women with endometriosis contained higher proportions of follicular helper T cells, regulatory T cells (Tregs), M0 macrophages, activated NK cells, monocytes, activated dendritic cells, and resting mast cells. In contrast, plasma cells, CD8 T cells, resting memory CD4 T cells, resting NK cells, M1 macrophages, M2 macrophages, resting dendritic cells, activated mast cells, gamma delta T cells, and eosinophils were relatively lower (Figure 4d).

Analysis of Core Genes and Immune Infiltration

Further analysis of the relationship between hub genes and immune cells showed that expression of the risk hub genes (CHPF, CITED2, GPC3, PDK3) was positively correlated with plasma cells, M2 macrophages, CD8 T cells, activated mast cells, and resting memory CD4 T cells. The protective hub gene (ADH6) was positively correlated with naive CD4 T cells, activated NK cells, resting mast cells, regulatory T cells (Tregs), and M0 macrophages (Figure 5a-e).

Analysis of Core Genes and Immune Factors

The TISIDB database was then used to find associations between the five hub genes and various immune factors such as chemokines, receptors, immunosuppressive factors, and immunostimulatory factors. Correlation graphs were created between the immune factors and the EMS key genes (Figure 6a-d). We selected immune factors associated with the core genes (mean correlation coefficient > 0.4) and constructed an interaction network using Cytoscape and STRING (Figure 6e-f).
These findings indicated that the key genes contribute significantly to the endometrial immune microenvironment.

GSEA Analysis of Glycolysis-related Hub Genes

We performed GSEA on the five hub genes to investigate their functions and pathways. In the training set, samples with high expression of CHPF, CITED2, and GPC3 were primarily enriched in pathways related to "regulation of vesicle fusion", "calmodulin dependent protein kinase activity", "myelin maintenance", "phosphatidylglycerol metabolic process", and "positive regulation of actin cytoskeleton reorganization". Samples with low expression of ADH6 were mainly enriched in "centriole assembly", "structural constituent of nuclear pore", "regulation of centriole replication", "intraciliary transport", and "regulation of protein exit from endoplasmic reticulum" (Figure 7a-d). These results suggest that these genes are involved in biological functions such as energy metabolism, material transport, and cell proliferation, which in turn contribute to the progression of EMS. Samples with high expression of PDK3 were mainly enriched in pathways related to "methyl CpG binding", "retinal ganglion cell axon guidance", and "lactation" (Figure 7e). The molecular regulatory mechanisms of the core genes are shown in circle plots (Figure 8a-e).

GO Similarity And Co-Expression Of Hub Genes

To investigate the hub genes in EMS, we ranked the key genes by their average functional similarity to the other proteins. CITED2 (score: 0.324), PDK3 (score: 0.318), and GPC3 (score: 0.306) ranked highest. The remaining two genes, CHPF (score: 0.285) and ADH6 (score: 0.216), scored below 0.3 (Figure 9a). Pearson analysis was performed to investigate correlations between the hub genes. Compared with CHPF, the genes CITED2, GPC3, and PDK3 were more strongly positively correlated with each other.
Only ADH6 was negatively associated with the other hub genes (Figure 9b).

Discussion

Endometriosis (EMS) is a systemic inflammatory disease caused by the implantation and growth of ectopic endometrium outside the uterine cavity[2]. Recent studies have focused on glycolytic pathways in cancer cell growth and invasion. Endometrial cells, like cancer cells, can switch energy metabolism from mitochondrial oxidative phosphorylation (OXPHOS) to aerobic glycolysis, reduce ROS generation, and enhance survival[23]. It has been reported that glucose metabolism and energy production in ectopic endometriotic cells under hypoxia influence the incidence and invasion of endometriosis[24, 25]. Nevertheless, studies of the Warburg effect in eutopic endometrial cells are still lacking. Accumulating findings suggest that changes in glycolytic metabolism in the endometrial microenvironment may affect immune cell infiltration and other anti-immune processes, but the specific mechanism remains to be explored.

Early prediction and detection of EMS can enable early intervention and improve treatment outcomes. Therefore, the identification of possible biomarkers for predicting EMS is crucial. In recent years, advances in machine learning techniques and the availability of gene expression data in public databases have provided a new approach to identifying biomarkers for disease detection. In this work, we identified five glycolysis-related candidate genes (CHPF, CITED2, PDK3, GPC3, and ADH6) through LASSO regression analysis and the RF method. The CIBERSORT algorithm was then used to perform a deconvolution analysis of the immune microenvironment in order to determine the fractions of immune cells in EMS. The relationships between the core genes and the immunomodulators, chemokines, and receptors listed in TISIDB were then investigated. The five hub genes were characterized using GO semantic similarity and GSEA analysis.
The risk score based on the five glycolysis-related markers was then used to create a nomogram, which showed good predictive performance.

Chondroitin polymerizing factor (CHPF) is a type II transmembrane protein that is essential for chondroitin sulfate (CS) production. Many cell biological functions, such as cell adhesion, cell differentiation, and neural network formation, rely on CS[26]. Li et al.[27] reported that the expression of CHPF was linked to immune cells and various immune factors. At present, most studies focus on the function of CHPF in cancers; little has been studied in endometriosis.

CBP/p300-interacting-transactivator-with-an-ED-rich-tail 2 (CITED2) is a transcriptional regulator that regulates biological functions by co-activating or repressing multiple transcription factors[28]. The glycolytic gene CITED2 is also a hypoxia-related gene. CITED2 has been reported to be associated with primary ovarian insufficiency[29].

Glypican-3 (GPC3) is a membrane-associated proteoglycan involved in cell growth, differentiation, and migration. The specific expression of GPC3 in tumour cells has attracted considerable attention[30]. In a Canadian patient cohort, high membranous GPC3 expression was found in 20% of endometriosis-associated ovarian clear cell carcinomas (OCCCs)[31].

Pyruvate dehydrogenase kinase 3 (PDK3) is a member of the PDK family, which also contains PDK1, PDK2, and PDK4. Like CITED2, PDK3 mainly contributes to the metabolic switch and cell survival under hypoxia[32]. PDK3 also plays a crucial role in cancers and has been regarded as a promising therapeutic target[33]. Our results showed that PDK3 was over-expressed in the eutopic endometria of women with endometriosis. However, the function of PDK3 in endometriosis is unclear.

The serum levels of alcohol dehydrogenase 6 (ADH6) have been shown in numerous studies to be a potential diagnostic marker in cancers.
ADH6 has been reported to be involved in the P450-related pathway and in biological processes linked to the progression and treatment of pancreatic cancer[34]. Its involvement in endometriosis biology, however, is unknown.

The biological behaviour of endometriosis resembles that of malignant tumours. Glucose is the most readily available nutrient for cancer cells, but it is also required for T cell activation, differentiation, and function[35]. Proliferating tumour cells that consume large amounts of extracellular glucose secrete lactic acid into the cancer microenvironment. Lactate was later found to inhibit monocyte migration and cytokine release and to promote polarization of resident macrophages to the tumour-associated macrophage 2 (TAM2) phenotype, resulting in tumour progression and immune escape[35]. Although endometriosis is a benign illness, it exhibits neoplastic traits such as inflammation and tissue invasion[36]. We therefore speculate that a similar interplay between glycolysis and the immune environment occurs in endometriosis[35].

In our study, the abundances of follicular helper T (Tfh) cells, regulatory T cells (Tregs), M0 macrophages, activated NK cells, monocytes, activated dendritic cells, and resting mast cells (MCs) were higher in the eutopic endometria of women with endometriosis than in normal controls. Tfh cells are a subset of CD4+ T cells that plays a critical role in the adaptive immune response. The roles of Tfh cells in endometriosis have received little attention[37]. According to most research, Tregs are increased in the endometrium of women with and without disease; however, this is still debated[38]. Previous research has shown that, in the proliferative phase, more macrophages (Møs) and activated dendritic cells are found in the endometrium of women with endometriosis, regardless of the hormonal milieu[38].
Currently, it appears that uterine natural killer (uNK) cells from women with endometriosis are immature and that uNK cytotoxic activity could be an indicator of endometriosis-related infertility and recurrent miscarriage, even though their absolute numbers are the same as in normal endometrium[39]. Others have found higher mast cell infiltration in the endometrium of women with the disease, as well as enhanced MC activation in ectopic lesions, but activated MCs were rarely found in eutopic endometrium[40]. We also found increased monocytes in the eutopic endometrium of women with endometriosis; we infer that monocytes are largely recruited as a source of monocyte-derived macrophages[41].

Next, we analyzed the correlation between the expression of the glycolysis-related hub genes and the infiltration of various immune cells. Likewise, we investigated the relationship between the five hub genes and the immunomodulators, chemokines, and receptors listed in TISIDB. Finally, according to GSEA, the hub genes are primarily involved in biological processes such as energy metabolism, material transport, and cell proliferation, which in turn contribute to the progression of EMS. Moreover, compared with CHPF, the genes CITED2, GPC3, and PDK3 were more strongly positively correlated with each other, whereas ADH6 was negatively associated with the other hub genes.

Using bioinformatic analysis, we discovered five glycolysis-related hub genes that are closely associated with the molecular mechanism of EMS, verified the biological functions and important pathways of the hub genes, and performed immune cell infiltration and correlation analyses for the core genes. This work has certain limitations, even though the expression levels of these hub genes were also validated in the test set. Subject to ethical approval, these target genes will be confirmed in clinical samples using RT-qPCR in future research.
In addition, this study is only a proof of concept, and more in vitro and in vivo experiments are needed to confirm our findings and to investigate the mechanisms by which glycolysis-related genes regulate immune cell infiltration.

Conclusions

In conclusion, we discovered five glycolysis-related genes in endometriosis and developed a model for EMS assessment. Using multiple bioinformatics techniques, we identified hub genes in EMS and their correlations with infiltrating immune cells, as well as correlations among the 22 immune cell subpopulations. In addition, the GSEA and GO similarity analyses point to more specific mechanisms. The selected genes could be candidate predictive markers and potential therapeutic targets for EMS, but the exact mechanisms linking glycolysis-related genes and the immune environment (including immune cells and immune factors) in EMS require further exploration.

Abbreviations

EMS: Endometriosis; NCBI: National Center For Biotechnology Information; GEO: Gene Expression Omnibus; GSEA: Gene Set Enrichment Analysis; MSigDB: Molecular Signatures Database; LASSO: Least Absolute Shrinkage And Selection Operator; RF: Random Forest; TME: Tumor Microenvironment; DCA: Decision Curve Analysis; ROC: Receiver Operating Characteristic Curve; AUC: Area Under Curve; CHPF: Chondroitin polymerizing factor; CITED2: CBP/p300-interacting-transactivator-with-an ED-rich-tail 2; GPC3: Glypican-3; PDK3: Pyruvate dehydrogenase kinase 3; ADH6: Alcohol dehydrogenase 6; Tregs: regulatory T cells; Møs: Macrophages; uNK: uterine natural killer

Declarations

Data Availability Statements

The datasets generated during and/or analyzed during the current study are available in the GEO. The data supporting this study's findings are available from the corresponding author upon reasonable request.
The names of the repositories and the accession numbers are listed below:
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE25628
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE51981
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE7846
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE7305
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE120103
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE6364

Statements & Declarations

The authors would like to thank GEO, GSEA, MSigDB, and TISIDB for data availability.

Funding

This work was supported by the Shanghai Key Laboratory of Female Reproductive Endocrine-Related Diseases (Grant number 17DZ2273600). Author Yanqiu Wang has received research support.

Ethics approval and consent to participate

Not applicable

Consent for publication

Not applicable

Competing Interests

The authors have no relevant financial or non-financial interests to disclose.

Author Contributions

YQ Wang: Project development, Writing-reviewing
QZ Chen: Data analysis, Manuscript writing, Supervision and Validation
YF Jiao: Manuscript writing
Z Yin: Manuscript writing
XY Fu: Data collection
SN Guo: Data collection
J Xiang: Project development, Writing-reviewing, Supervision and Validation
All authors contributed to and have approved the final manuscript.

References

1. Chapron C, Marcellin L, Borghese B and Santulli P (2019) Rethinking mechanisms, diagnosis and management of endometriosis. Nature Reviews Endocrinology 15(11):666-82. http://doi.org/10.1038/s41574-019-0245-z
2. Taylor HS, Kotlyar AM and Flores VA (2021) Endometriosis is a chronic systemic disease: clinical challenges and novel innovations. Lancet 397(10276):839-52. http://doi.org/10.1016/S0140-6736(21)00389-5
3. Sampson JA (1927) Peritoneal endometriosis due to the menstrual dissemination of endometrial tissue into the peritoneal cavity. American Journal of Obstetrics and Gynecology 14(4):422-69. https://doi.org/10.1016/S0002-9378(15)30003-X
4. Courtnay R, Ngo DC, Malik N, Ververis K, Tortorella SM and Karagiannis TC (2015) Cancer metabolism and the Warburg effect: the role of HIF-1 and PI3K. Mol Biol Rep 42(4):841-51. http://doi.org/10.1007/s11033-015-3858-x
5. de Souza AC, Justo GZ, de Araújo DR and Cavagis AD (2011) Defining the molecular basis of tumor metabolism: a continuing challenge since Warburg's discovery. Cell Physiol Biochem 28(5):771-92. http://doi.org/10.1159/000335792
6. Pfeiffer T, Schuster S and Bonhoeffer S (2001) Cooperation and competition in the evolution of ATP-producing pathways. Science 292(5516):504-7. http://doi.org/10.1126/science.1058079
7. Zhou Y, Tozzi F, Chen J, Fan F, Xia L, Wang J, Gao G, Zhang A, Xia X, Brasher H, Widger W, Ellis LM and Weihua Z (2012) Intracellular ATP levels are a pivotal determinant of chemoresistance in colon cancer cells. Cancer Res 72(1):304-14. http://doi.org/10.1158/0008-5472.Can-11-1674
8. Locasale JW and Cantley LC (2010) Altered metabolism in cancer. BMC Biol 8:88. http://doi.org/10.1186/1741-7007-8-88
9. Gatenby RA and Gillies RJ (2004) Why do cancers have high aerobic glycolysis? Nat Rev Cancer 4(11):891-9. http://doi.org/10.1038/nrc1478
10. Lunt SY and Vander Heiden MG (2011) Aerobic glycolysis: meeting the metabolic requirements of cell proliferation. Annu Rev Cell Dev Biol 27:441-64. http://doi.org/10.1146/annurev-cellbio-092910-154237
11. García-Gómez E, Vázquez-Martínez ER, Reyes-Mayoral C, Cruz-Orozco OP, Camacho-Arroyo I and Cerbón M (2019) Regulation of Inflammation Pathways and Inflammasome by Sex Steroid Hormones in Endometriosis. Front Endocrinol (Lausanne) 10:935. http://doi.org/10.3389/fendo.2019.00935
12. Delbandi AA, Mahmoudi M, Shervin A, Akbari E, Jeddi-Tehrani M, Sankian M, Kazemnejad S and Zarnani AH (2013) Eutopic and ectopic stromal cells from patients with endometriosis exhibit differential invasive, adhesive, and proliferative behavior. Fertil Steril 100(3):761-9. http://doi.org/10.1016/j.fertnstert.2013.04.041
13. Sundqvist J, Andersson KL, Scarselli G, Gemzell-Danielsson K and Lalitkumar PG (2012) Expression of adhesion, attachment and invasion markers in eutopic and ectopic endometrium: a link to the aetiology of endometriosis. Hum Reprod 27(9):2737-46. http://doi.org/10.1093/humrep/des220
14. Voss K, Hong HS, Bader JE, Sugiura A, Lyssiotis CA and Rathmell JC (2021) A guide to interrogating immunometabolism. Nat Rev Immunol 21(10):637-52. http://doi.org/10.1038/s41577-021-00529-8
15. Chandel NS (2021) Glycolysis. Cold Spring Harb Perspect Biol 13(5). http://doi.org/10.1101/cshperspect.a040535
16. ao Q, Jing G, Zhang X, Li M, Yao Q and Wang L (2021) Cinnamic acid inhibits cell viability, invasion, and glycolysis in primary endometrial stromal cells by suppressing NF-κB-induced transcription of PKM2. Biosci Rep. http://doi.org/10.1042/bsr20211828
17. Kalezic A, Udicki M, Srdic Galic B, Aleksic M, Korac A, Jankovic A and Korac B (2021) Tissue-Specific Warburg Effect in Breast Cancer and Cancer-Associated Adipose Tissue-Relationship between AMPK and Glycolysis. Cancers (Basel) 13(11). http://doi.org/10.3390/cancers13112731
18. Hung SW, Zhang R, Tan Z, Chung JPW, Zhang T and Wang CC (2021) Pharmaceuticals targeting signaling pathways of endometriosis as potential new medical treatment: A review. Med Res Rev 41(4):2489-564. http://doi.org/10.1002/med.21802
19. an C, Kam S and Ramadori P (2021) Metabolism-Associated Epigenetic and Immunoepigenetic Reprogramming in Liver Cancer. Cancers (Basel) 13(20). http://doi.org/10.3390/cancers13205250
20. Wu XG, Chen JJ, Zhou HL, Wu Y, Lin F, Shi J, Wu HZ, Xiao HQ and Wang W (2021) Identification and Validation of the Signatures of Infiltrating Immune Cells in the Eutopic Endometrium Endometria of Women With Endometriosis. Front Immunol 12:671201. http://doi.org/10.3389/fimmu.2021.671201
21. Macedo HG, Fonseca NF and Brasil P (2019) Characterization of clinical patterns of dengue patients using an unsupervised machine learning approach. BMC Infect Dis 19(1):649. http://doi.org/10.1186/s12879-019-4282-y
22. Wang JZ, Du Z, Payattakool R, Yu PS and Chen CF (2007) A new method to measure the semantic similarity of GO terms. Bioinformatics 23(10):1274-81. http://doi.org/10.1093/bioinformatics/btm087
23. Kobayashi H, Shigetomi H and Imanaka S (2021) Nonhormonal therapy for endometriosis based on energy metabolism regulation. Reprod Fertil 2(4):C42-C57. http://doi.org/10.1530/RAF-21-0053
24. Kasvandik S, Samuel K, Peters M, Eimre M, Peet N, Roost AM, Padrik L, Paju K, Peil L and Salumets A (2016) Deep Quantitative Proteomics Reveals Extensive Metabolic Reprogramming and Cancer-Like Changes of Ectopic Endometriotic Stromal Cells. J Proteome Res 15(2):572-84. http://doi.org/10.1021/acs.jproteome.5b00965
25. Wu MH, Hsiao KY and Tsai SJ (2019) Hypoxia: The force of endometriosis. J Obstet Gynaecol Res 45(3):532-41. http://doi.org/10.1111/jog.13900
26. Lin XL, Han T, Xia Q, Cui JJ, Zhuo M, Liang YY, Su WY, Wang LS, Wang LW, Liu ZB and Xiao XY (2021) CHPF promotes gastric cancer tumorigenesis through the activation of E2F1. Cell Death Dis 12(10):876. http://doi.org/10.1038/s41419-021-04148-y
27. Li WW, Liu B, Dong SQ, He SQ, Liu YY, Wei SY, Mou JY, Zhang JX and Liu Z (2022) Bioinformatics and Experimental Analysis of the Prognostic and Predictive Value of the CHPF Gene on Breast Cancer. Front Oncol 12:856712. http://doi.org/10.3389/fonc.2022.856712
28. Lawson H, van de Lagemaat Louie N, Barile M, Tavosanis A, Durko J, Villacreces A, Bellani A, Mapperley C, Georges E, Martins-Costa C, Sepulveda C, Allen L, Campos J, Campbell KJ, O'Carroll D, Göttgens B, Cory S, Rodrigues NP, Guitart AV and Kranc KR (2021) CITED2 coordinates key hematopoietic regulatory pathways to maintain the HSC pool in both steady-state hematopoiesis and transplantation. Stem Cell Reports 16(11):2784-97. http://doi.org/10.1016/j.stemcr.2021.10.001
29. Fortuño C and Labarta E (2014) Genetics of primary ovarian insufficiency: a review. J Assist Reprod Genet 31(12):1573-85.
http://doi.org/10.1007/s10815-014-0342-9 heng XF, Liu X, Lei YN, Wang G and Liu M(2022)Glypican-3: A Novel and Promising Target for the Treatment of Hepatocellular Carcinoma. Front Oncol 12:824208-08. http://doi.org/10.3389/fonc.2022.824208 iedemeyer K, Köbel M, Koelkebeck H, Xiao Z and Vashisht K(2020)High glypican-3 expression characterizes a distinct subset of ovarian clear cell carcinomas in Canadian patients: an opportunity for targeted therapy. Human Pathology 98:56-63. https://doi.org/10.1016/j.humpath.2020.01.002 ui LZ, Cheng ZH, Liu Y, Dai YF, Pang YF, Jiao Y, Ke XY, Cui W, Zhang QY, Shi Jinlong and Fu Lin(2020)Overexpression of PDK2 and PDK3 reflects poor prognosis in acute myeloid leukemia. Cancer Gene Ther 271-2:15-21. http://doi.org/10.1038/s41417-018-0071-9 u J, Shi Q, Xu W, Zhou Q, Shi R, Ma Y, Chen D, Zhu L, Feng L, Cheng AS, Morrison H, Wang X and Jin H(2019)Metabolic enzyme PDK3 forms a positive feedback loop with transcription factor HSF1 to drive chemoresistance. Theranostics 910:2999-3013. http://doi.org/10.7150/thno.31301 iao XW, Huang R, Liu XG, Han CY, Yu L, Wang SJ, Sun N, Li BP, Ning X and Peng T (2017)Distinct prognostic values of alcohol dehydrogenase mRNA expression in pancreatic adenocarcinoma. OncoTargets and therapy 10:3719-32. http://doi.org/10.2147/OTT.S140221 un LC, Suo CX, Li ST, Zhang HF and Gao P(2018)Metabolic reprogramming for cancer cells and their microenvironment: Beyond the Warburg Effect. Biochimica et Biophysica Acta (BBA) - Reviews on Cancer 18701:51-66. 
https://doi.org/10.1016/j.bbcan.2018.06.005 nglesio MS, Papadopoulos N, Ayhan A, Nazeran TM, Noë M, Horlings HM, Lum A, Jones S, Senz J, Seckin T, Ho J, Wu RC, Lac V, Ogawa H, Tessier-Cloutier B, Alhassan R, Wang A, Wang Y, Cohen JD, Wong F, Hasanovic A, Orr N, Zhang M, Popoli M, McMahon W, Wood LD, Mattox A, Allaire C, Segars J, Williams C, Tomasetti C, Boyd N, Kinzler KW, Gilks CB, Diaz L, Wang TL, Vogelstein B, Yong PJ, Huntsman DG and Shih I M(2017)Cancer-Associated Mutations in Endometriosis without Cancer. N Engl J Med 37619:1835-48. http://doi.org/10.1056/NEJMoa1614814 aw H, Venturi V, Kelleher A and Munier C. M (2020)Tfh Cells in Health and Immunity: Potential Targets for Systems Biology Approaches to Vaccination. Int J Mol Sci http://doi.org/212210.3390/ijms21228524 allvé-Juanico J, Houshdaran S and Giudice LC(2019)The endometrial immune environment of women with endometriosis. Hum Reprod Update 255:564-91. http://doi.org/10.1093/humupd/dmz018 hiruchelvam U, Wingfield M and O'Farrelly C (2015)Natural Killer Cells: Key Players in Endometriosis. Am J Reprod Immunol 744:291-301. http://doi.org/10.1111/aji.12408 ugamata M, Ihara T and Uchiide I(2005)Increase of activated mast cells in human endometriosis. Am J Reprod Immunol 533:120-5. http://doi.org/10.1111/j.1600-0897.2005.00254.x ogg C, Panir K, Dhami P, Rosser M, Mack M, Soong D, Pollard JW, Jenkins SJ, Horne Andrew W. and Greaves Erin(2021)Macrophages inhibit and enhance endometriosis depending on their origin. Proc Natl Acad Sci USA 1186:e2013776118. http://doi.org/10.1073/pnas.2013776118 Supplementary Files Additionalfile1.docx Additional file 1: Fig S1Batch effect removal and normalization of the training set. (a-b) Box plot of data in the training set before and after normalized. (c-d) PCA results before and after batch effect removal. Additionalfile2.docx Additional file 2: Fig S2 Results of the Random Forest. (a) Error rate of the Random Forest (b) the rank of gene importance in the Random Forest. 
Additionalfile3.xls Additional file 3: Table S1. Gene result of LASSO method.
Additionalfile4.xls Additional file 4: Table S2. Gene result of Random Forest method.
Additionalfile5.xls Additional file 5: Table S3. Immune cell components in the training set.
{"props":{"pageProps":{"initialData":{"identity":"rs-1686939","acceptedTermsAndConditions":true,"allowDirectSubmit":true,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":111959494,"identity":"bb65be76-62ae-43cc-a3de-7a20c8f84959","order_by":0,"name":"Qizhen
Chen","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA/klEQVRIiWNgGAWjYDACCRBhwJDAwMDY+OBDhQ0PP38D0VqYmw1nnEmTkZxxgBgtDCAt7G3SvC2HbQwaEvDr4J/dfOzBm4K6PH7pRqCWhvM8BgwHGD98zMFjyZ1j6YZzDNiKJeccbLacu+M2jzlzA7PkzG24tRhI5JhJ8xjwJG64kdh44+2Z2zyWDQfYmHnxasn/BtQikbj/RmKDBG/bOR6DAwmEtOSwAbUYJG6QSGyS5G07QFiLxI00M8k5BgmJM24kggI5mUdyxsFmvH7hn5H8TOLNn7rE/hnpD4FRaWfPz9988MNHPFrAgAeVy9hAQD2mllEwCkbBKBgFqAAA7XBUztlZqyYAAAAASUVORK5CYII=","orcid":"https://orcid.org/0000-0002-3291-7097","institution":"Tongji University School of Medicine","correspondingAuthor":true,"prefix":"","firstName":"Qizhen","middleName":"","lastName":"Chen","suffix":""},{"id":111959495,"identity":"633d4853-4940-4dbc-9130-f57248d14145","order_by":1,"name":"Yufan Jiao","email":"","orcid":"","institution":"Tongji University School of Medicine","correspondingAuthor":false,"prefix":"","firstName":"Yufan","middleName":"","lastName":"Jiao","suffix":""},{"id":111959496,"identity":"5fe4aba1-c6bb-47ad-b4ed-49b739881b5d","order_by":2,"name":"Zhe Yin","email":"","orcid":"","institution":"Tongji Hospital Affiliated to Tongji University: Shanghai Tongji Hospital","correspondingAuthor":false,"prefix":"","firstName":"Zhe","middleName":"","lastName":"Yin","suffix":""},{"id":111959497,"identity":"96c27ade-489d-4b90-9f20-7acb12310237","order_by":3,"name":"Xiayan Fu","email":"","orcid":"","institution":"Tongji University School of Medicine","correspondingAuthor":false,"prefix":"","firstName":"Xiayan","middleName":"","lastName":"Fu","suffix":""},{"id":111959498,"identity":"964c5a92-0b1b-452d-bb14-9760cdb90374","order_by":4,"name":"Shana Guo","email":"","orcid":"","institution":"Tongji University School of Medicine","correspondingAuthor":false,"prefix":"","firstName":"Shana","middleName":"","lastName":"Guo","suffix":""},{"id":111959499,"identity":"b460ce64-a466-490f-9b88-ba9931be15c8","order_by":5,"name":"Jun 
Xiang","email":"","orcid":"","institution":"Tongji Hospital Affiliated to Tongji University: Shanghai Tongji Hospital","correspondingAuthor":false,"prefix":"","firstName":"Jun","middleName":"","lastName":"Xiang","suffix":""},{"id":111959500,"identity":"e4eb02ee-c439-4edf-b10f-bbb4fbc55eeb","order_by":6,"name":"Yanqiu Wang","email":"","orcid":"","institution":"Tongji Hospital Affiliated to Tongji University: Shanghai Tongji Hospital","correspondingAuthor":false,"prefix":"","firstName":"Yanqiu","middleName":"","lastName":"Wang","suffix":""}],"badges":[],"createdAt":"2022-05-24 03:52:32","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-1686939/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-1686939/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":22644763,"identity":"5b05ad03-7abb-438a-b198-77ba019f3b09","added_by":"auto","created_at":"2022-06-14 16:11:43","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":887984,"visible":true,"origin":"","legend":"\u003cp\u003eOverview of the study workflow\u003c/p\u003e","description":"","filename":"Onlinefloatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-1686939/v1/230c51e59e56452851b2b693.png"},{"id":22644764,"identity":"e0a4c967-57dd-4fef-a86f-7f7d1725df94","added_by":"auto","created_at":"2022-06-14 16:11:43","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":44140,"visible":true,"origin":"","legend":"\u003cp\u003eSelection of diagnostic biomarkers and identification of hub genes (a) LASSO coefficient profiles of the 18 differentially expressed genes (b) The misclassification error in the jackknife rates analysis (c) Venn diagram of genes extracted from LASSO and RF 
methods\u003c/p\u003e","description":"","filename":"Onlinefloatimage2.png","url":"https://assets-eu.researchsquare.com/files/rs-1686939/v1/eb7d580487a24dc8414afed7.png"},{"id":22644768,"identity":"2a80e595-0175-4d09-82c6-4eedb1bdc225","added_by":"auto","created_at":"2022-06-14 16:11:43","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":49575,"visible":true,"origin":"","legend":"\u003cp\u003eEstablishment and verification of the glycolysis-immune-related diagnostic model for EMS (a) ROC analysis of the glycolysis-immune-related diagnostic model using the training group (b) Nomogram-predicted probability of EMS in the training group (c) Decision curve analysis of the model in the training group (d-e) ROC analysis of the glycolysis-immune-related diagnostic model using the test group(GSE120103 and GSE6364, respectively) (f) Nomogram for diagnosis of EMS\u003c/p\u003e","description":"","filename":"Onlinefloatimage3.png","url":"https://assets-eu.researchsquare.com/files/rs-1686939/v1/7efbf2cd70e9b546ea276a62.png"},{"id":22646284,"identity":"19c34e16-9ee0-4489-b86b-79a04b862773","added_by":"auto","created_at":"2022-06-14 16:21:43","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":104441,"visible":true,"origin":"","legend":"\u003cp\u003eThe landscape of immune infiltration between EMS and normal controls (a) The box-plot diagram indicating the abundance ratio of immune cells in 116 sample (b) The heatmap indicating the abundance ratio of immune cells in EMS(n=71) and control group(n=45) (c) the cor-heatmap shows the relationship between abundance ratios of 22 immune cells (d) the difference of immune infiltration between EMS(red) and normal(blue) controls (\u003cem\u003ep\u003c/em\u003e-values \u0026lt;0.05 indicated statistical 
significance)\u003c/p\u003e","description":"","filename":"Onlinefloatimage4.png","url":"https://assets-eu.researchsquare.com/files/rs-1686939/v1/5334d7a30ca6338b82fb1047.png"},{"id":22646285,"identity":"536996ec-3b87-4a1f-8057-c2deaff80f02","added_by":"auto","created_at":"2022-06-14 16:21:43","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":88314,"visible":true,"origin":"","legend":"\u003cp\u003eThe association between the hub genes and the infiltration level (a) CHPF (b) CITED2 (c)GPC3 (d)PDK3 (e)ADH6\u0026nbsp;\u003c/p\u003e","description":"","filename":"Onlinefloatimage5.png","url":"https://assets-eu.researchsquare.com/files/rs-1686939/v1/0679a62154b8c0bf8b528dec.png"},{"id":22644774,"identity":"02147852-101b-4fe8-9898-fbc32bbd5751","added_by":"auto","created_at":"2022-06-14 16:11:43","extension":"png","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":91690,"visible":true,"origin":"","legend":"\u003cp\u003eThe association between the hub genes and immune cell infiltration (a-d) Correlation between hub genes and chemokines, immune receptors,\u0026nbsp;immunosuppressive factors, and immunostimulatory factors (e-f) Protein–protein interaction plot of hub genes and immune-related molecules\u003c/p\u003e\u003cp\u003e\u003cbr\u003e\u003c/p\u003e","description":"","filename":"Onlinefloatimage6.png","url":"https://assets-eu.researchsquare.com/files/rs-1686939/v1/32827f99d2ed35a15e005743.png"},{"id":22646287,"identity":"8a19306a-7c23-47cf-b65e-844da365a317","added_by":"auto","created_at":"2022-06-14 16:21:43","extension":"png","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":78203,"visible":true,"origin":"","legend":"\u003cp\u003eEnrichment analysis of pathway and gene ontology (GO) involved hub genes\u0026nbsp;\u003cstrong\u003e(a–e)\u003c/strong\u003e\u0026nbsp;Gene Set Enrichment Analysis of CHPF, CITED2, GPC3, ADH6, and 
PDK3\u003c/p\u003e","description":"","filename":"Onlinefloatimage7.png","url":"https://assets-eu.researchsquare.com/files/rs-1686939/v1/830f6be3ee9b8db556e5a0b9.png"},{"id":22645559,"identity":"26e9058d-86c6-4a15-9d23-e13afd6e76b1","added_by":"auto","created_at":"2022-06-14 16:16:43","extension":"png","order_by":8,"title":"Figure 8","display":"","copyAsset":false,"role":"figure","size":87591,"visible":true,"origin":"","legend":"\u003cp\u003eMolecular regulatory mechanism of core gene-related pathway and GO functional enrichment analyses\u0026nbsp;\u003cstrong\u003e(a–e)\u003c/strong\u003e\u0026nbsp;GSEA-related ccgraph plot of CHPF, CITED2, GPC3, ADH6, and PDK3\u003c/p\u003e","description":"","filename":"Onlinefloatimage8.png","url":"https://assets-eu.researchsquare.com/files/rs-1686939/v1/b81f4b17564ce2a5d0f7096b.png"},{"id":22646771,"identity":"38b29491-87d1-4dc3-aac1-66f94f022185","added_by":"auto","created_at":"2022-06-14 16:26:43","extension":"png","order_by":9,"title":"Figure 9","display":"","copyAsset":false,"role":"figure","size":475300,"visible":true,"origin":"","legend":"\u003cp\u003eCloseness score of semantic similarities between GO terms and coexpression analysis of hub genes\u0026nbsp;\u003cstrong\u003e(a)\u003c/strong\u003e\u0026nbsp;GO semantic similarity box plot of core genes \u003cstrong\u003e(b)\u003c/strong\u003e\u0026nbsp;The circos diagram depicts Pearson correlations between hub genes\u003c/p\u003e","description":"","filename":"Onlinefloatimage9.png","url":"https://assets-eu.researchsquare.com/files/rs-1686939/v1/83eb5d8ef6ced6192d479ea8.png"},{"id":22838404,"identity":"e1f72983-bf97-4e2f-a9f7-ac36855d00ce","added_by":"auto","created_at":"2022-06-20 
09:02:37","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":3424081,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-1686939/v1/b2f57981-b86f-464f-9539-352628778e68.pdf"},{"id":22645556,"identity":"ec394b19-6add-4581-b66f-5af3da127542","added_by":"auto","created_at":"2022-06-14 16:16:43","extension":"docx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":4673289,"visible":true,"origin":"","legend":"\u003cp\u003eAdditional file 1: Fig S1Batch effect removal and normalization of the training set. (a-b) Box plot of data in the training set before and after normalized. (c-d) PCA results before and after batch effect removal.\u003c/p\u003e","description":"","filename":"Additionalfile1.docx","url":"https://assets-eu.researchsquare.com/files/rs-1686939/v1/8874c46215ca77fafd5e28d1.docx"},{"id":22644771,"identity":"aa7336ce-1967-46a2-aae7-a482a4654341","added_by":"auto","created_at":"2022-06-14 16:11:43","extension":"docx","order_by":2,"title":"","display":"","copyAsset":false,"role":"supplement","size":4633439,"visible":true,"origin":"","legend":"\u003cp\u003eAdditional file 2: Fig S2 Results of the Random Forest. (a) Error rate of the Random \u003c/p\u003e\u003cp\u003eForest (b) the rank of gene importance in the Random Forest.\u003c/p\u003e","description":"","filename":"Additionalfile2.docx","url":"https://assets-eu.researchsquare.com/files/rs-1686939/v1/960e533f990b76fa75dd237b.docx"},{"id":22644762,"identity":"3b8b5a17-649a-410a-9827-330f4a8d4686","added_by":"auto","created_at":"2022-06-14 16:11:43","extension":"xls","order_by":3,"title":"","display":"","copyAsset":false,"role":"supplement","size":214,"visible":true,"origin":"","legend":"\u003cp\u003eAdditional file 3: Table S1. 
Gene result of LASSO method.\u003c/p\u003e","description":"","filename":"Additionalfile3.xls","url":"https://assets-eu.researchsquare.com/files/rs-1686939/v1/005aeaab32107c08fde80691.xls"},{"id":22644761,"identity":"c0caf683-dff8-4d4a-b1fe-62f4c365d529","added_by":"auto","created_at":"2022-06-14 16:11:43","extension":"xls","order_by":4,"title":"","display":"","copyAsset":false,"role":"supplement","size":142,"visible":true,"origin":"","legend":"\u003cp\u003eAdditional file 4: Table S2. Gene result of Random Forest method.\u003c/p\u003e","description":"","filename":"Additionalfile4.xls","url":"https://assets-eu.researchsquare.com/files/rs-1686939/v1/9ca1f4360739ea3b6077a94c.xls"},{"id":22645560,"identity":"5146a02a-9e8c-4626-a85a-6575d711049d","added_by":"auto","created_at":"2022-06-14 16:16:43","extension":"xls","order_by":5,"title":"","display":"","copyAsset":false,"role":"supplement","size":29859,"visible":true,"origin":"","legend":"\u003cp\u003eAdditional file 5: Table S3. Immune cell components in the training set.\u003c/p\u003e","description":"","filename":"Additionalfile5.xls","url":"https://assets-eu.researchsquare.com/files/rs-1686939/v1/46a6d65cb01e2f3b9dcbdce9.xls"}],"financialInterests":"","formattedTitle":"Establishment Of A Novel Glycolysis-Immune-Related Diagnosis Gene Signature For Endometriosis By Machine Learning","fulltext":[{"header":"Background","content":"\u003cp\u003eEndometriosis is a chronic inflammatory disease in which endometrial tissue grows outside the uterus, causing pelvic pain and infertility[\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e]. It affects 5\u0026ndash;10% of reproductive-aged women worldwide. Despite its prevalence, the rate of misdiagnosis is high. Most women have difficulty describing their symptoms, or find that their symptoms are inappropriately normalized[\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e]. 
Furthermore, the current requirement for surgical diagnosis, typically via diagnostic laparoscopy, creates a barrier to early detection and treatment[\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e]. Therefore, there is an urgent need for a reliable, non-invasive prediction model for patients in the early phase of the disease.\u003c/p\u003e \u003cp\u003eAlthough numerous theories have been proposed to explain the causes of endometriosis, Sampson's theory of retrograde menstruation is the most widely accepted[\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e]. It holds that fragments of menstrual endometrial tissue, including viable endometrial glands and stroma, are expelled retrogradely into the peritoneal cavity via the fallopian tubes, where they adhere to and invade the underlying mesothelium[\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e]. Other factors, however, are required to promote the invasion and proliferation of endometrial stromal and glandular cells, including changes in the immune environment, reprogramming of glucose metabolism, and complex local hormonal effects.\u003c/p\u003e \u003cp\u003eIt has long been known that cells rely on glycolysis to generate energy. From an evolutionary perspective, cells arose in an anaerobic environment and can tolerate anaerobic glycolysis, so glycolysis is considered the oldest ATP production pathway[\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e]. Recent studies have clarified the specific advantages of aerobic glycolysis. Even though glycolysis yields less ATP than the tricarboxylic acid cycle coupled to oxidative phosphorylation, proliferative cells prefer glycolysis for several reasons[\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e]. 
First, glycolysis and the conversion of glucose to lactate are increased, resulting in faster and greater ATP generation[\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e, \u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e]. Glycolytic ATP generation can be up to 100 times faster than ATP generation by oxidative phosphorylation[\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e]. Second, the modest ATP yield of glycolysis is still sufficient to meet intracellular demand. Taken together, glycolysis may confer a selective growth advantage on proliferative cells[\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e, \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e].\u003c/p\u003e \u003cp\u003ePrevious studies have indicated that, compared with normal endometria, endometrial epithelial and stromal cells from patients with endometriosis have a greater capacity for proliferation, adhesion, and invasion[\u003cspan additionalcitationids=\"CR12\" citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e]. Elevated glycolysis (the Warburg effect) increases lactate production and biosynthesis. Lactate accumulation promotes tumour cell migration, invasion, angiogenesis, and immune escape[\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e, \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e]. 
Interestingly, all of these cancer-like processes are also involved in the survival and invasion of eutopic endometrial cells, thus contributing to the development of endometriosis[\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eA growing body of research suggests a link between glycolysis and immune evasion[\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e]. Although endometriosis is clinically and pathologically benign, it has cancer-like features such as spread, invasion, and hyperplasia[\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e]. Cancers with a significant Warburg effect develop a tumour microenvironment (TME) deprived of glucose, limiting local immune surveillance via nutritional competition[\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e]. Meanwhile, immune cells can promote glycolysis in the same way that tumour cells can. Previous research has found that the immune environment of eutopic endometria in women with endometriosis differs from that of normal endometria, but the relationship between glycolysis and the immune milieu in endometriosis remains poorly understood[\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eWe therefore aimed to build a glycolysis-related model of endometriosis and to investigate its connection with the immune microenvironment. Gene expression microarray data were obtained from the Gene Expression Omnibus (GEO) database. LASSO and Random Forest were used to identify five key glycolysis-related genes. The association between these key genes and the immune environment of the eutopic endometrium was then investigated, and a logistic regression model was built, evaluated by ROC analysis, and verified on the test set. 
These findings could help researchers and clinicians better understand EMS.\u003c/p\u003e"},{"header":"Materials And Methods","content":"\u003cp\u003e\u003cstrong\u003eGene Expression Data Acquisition\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003emRNA expression profiles of samples from women with and without endometriosis were downloaded from GEO. The eligibility criteria were as follows: first, eutopic endometrial samples were collected from endometriosis patients and healthy controls; second, samples contained glandular and stromal components; third, the women in the study were in the proliferative or early secretory phase of the menstrual cycle. GSE25628, GSE51981, GSE7846 and GSE7305 were combined as the training set (n=155). GSE120103 (n=36) and GSE6364 (n=37) served as test sets to validate our predictive model (\u003cstrong\u003eTable 1\u003c/strong\u003e). Because the datasets came from different cohorts and array platforms, we used the \u0026quot;sva\u0026quot; package in R to remove batch effects. 
We obtained 15,926 common genes for further analysis.\u003c/p\u003e\n\u003cp style=\"text-align: center;\"\u003e\u003cstrong\u003eTable 1\u003c/strong\u003e\u0026nbsp;\u003c/p\u003e\n\u003cp style=\"text-align: center;\"\u003eThe mRNA expression profiles used in this study\u0026nbsp;\u003c/p\u003e\n\u003cdiv align=\"center\"\u003e\n \u003ctable border=\"1\" cellpadding=\"0\" cellspacing=\"0\" width=\"0\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" width=\"14.5%\"\u003e\n \u003cp\u003eGEO Accession\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"12%\"\u003e\n \u003cp\u003ePlatform\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"13.833333333333334%\"\u003e\n \u003cp\u003eExperiment Type\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"9.5%\"\u003e\n \u003cp\u003eEMS (n)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"16.5%\"\u003e\n \u003cp\u003eNormal (n)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"15.5%\"\u003e\n \u003cp\u003eTissue\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"7.333333333333333%\"\u003e\n \u003cp\u003eYear\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"10.833333333333334%\"\u003e\n \u003cp\u003eDataset\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" width=\"14.5%\"\u003e\n \u003cp\u003eGSE25628\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"12%\"\u003e\n \u003cp\u003eGPL571\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"13.833333333333334%\"\u003e\n \u003cp\u003emRNA array\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"9.5%\"\u003e\n \u003cp\u003e8\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"16.5%\"\u003e\n \u003cp\u003e6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"15.5%\"\u003e\n \u003cp\u003eendometrium\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd 
valign=\"top\" width=\"7.333333333333333%\"\u003e\n \u003cp\u003e2010\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"10.833333333333334%\"\u003e\n \u003cp\u003eTraining\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" width=\"14.5%\"\u003e\n \u003cp\u003eGSE51981\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"12%\"\u003e\n \u003cp\u003eGPL570\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"13.833333333333334%\"\u003e\n \u003cp\u003emRNA array\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"9.5%\"\u003e\n \u003cp\u003e76\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"16.5%\"\u003e\n \u003cp\u003e35\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"15.5%\"\u003e\n \u003cp\u003eendometrium\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"7.333333333333333%\"\u003e\n \u003cp\u003e2013\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"10.833333333333334%\"\u003e\n \u003cp\u003eTraining\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" width=\"14.5%\"\u003e\n \u003cp\u003eGSE7846\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"12%\"\u003e\n \u003cp\u003eGPL570\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"13.833333333333334%\"\u003e\n \u003cp\u003emRNA array\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"9.5%\"\u003e\n \u003cp\u003e5\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"16.5%\"\u003e\n \u003cp\u003e5\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"15.5%\"\u003e\n \u003cp\u003eendometrium\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"7.333333333333333%\"\u003e\n \u003cp\u003e2007\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"10.833333333333334%\"\u003e\n \u003cp\u003eTraining\u003c/p\u003e\n 
\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" width=\"14.5%\"\u003e\n \u003cp\u003eGSE7305\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"12%\"\u003e\n \u003cp\u003eGPL570\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"13.833333333333334%\"\u003e\n \u003cp\u003emRNA array\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"9.5%\"\u003e\n \u003cp\u003e10\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"16.5%\"\u003e\n \u003cp\u003e10\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"15.5%\"\u003e\n \u003cp\u003eendometrium\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"7.333333333333333%\"\u003e\n \u003cp\u003e2007\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"10.833333333333334%\"\u003e\n \u003cp\u003eTraining\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" width=\"14.5%\"\u003e\n \u003cp\u003eGSE120103\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"12%\"\u003e\n \u003cp\u003eGPL6480\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"13.833333333333334%\"\u003e\n \u003cp\u003emRNA array\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"9.5%\"\u003e\n \u003cp\u003e18\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"16.5%\"\u003e\n \u003cp\u003e18\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"15.5%\"\u003e\n \u003cp\u003eendometrium\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"7.333333333333333%\"\u003e\n \u003cp\u003e2019\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"10.833333333333334%\"\u003e\n \u003cp\u003etest\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" width=\"14.5%\"\u003e\n \u003cp\u003eGSE6364\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"12%\"\u003e\n 
\u003cp\u003eGPL570\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"13.833333333333334%\"\u003e\n \u003cp\u003emRNA array\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"9.5%\"\u003e\n \u003cp\u003e21\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"16.5%\"\u003e\n \u003cp\u003e16\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"15.5%\"\u003e\n \u003cp\u003eendometrium\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"7.333333333333333%\"\u003e\n \u003cp\u003e2007\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" width=\"10.833333333333334%\"\u003e\n \u003cp\u003etest\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n \u003c/table\u003e\n\u003c/div\u003e\n\u003cp\u003e\u003cstrong\u003eGlycolysis-related gene sets\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe Molecular Signatures Database (MSigDB) is a library of annotated gene sets for GSEA. Five gene sets relevant to glycolysis were retrieved, including BIOCARTA_GLYCOLYSIS_PATHWAY, BIOCARTA_FEEDER_PATHWAY, HALLMARK_GLYCOLYSIS, GO_GLYCOLYTIC_PROCESS, and REACTOME_GLYCOLYSIS. Using the \u0026quot;limma\u0026quot; package in R, we found 262 glycolysis-related genes in 155 cases.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eIdentification and Validation of Predictive Gene Signature\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe glycolysis-related diagnostic indicators of EMS were screened using LASSO logistic regression and Random Forest. The \u0026quot;glmnet\u0026quot; package was used to conduct the LASSO analysis, with the response type set to binomial and alpha set to one. Random Forest is an ensemble technique that uses recursive partitioning to grow binary decision trees[21]. The number of trees in the Random Forest was set to 500. 
Then, we selected the top 20 genes from the RF analysis, intersected them with the LASSO results, and used VennDiagram to visualize the intersection of the gene lists.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConstruction of EMS diagnostic model\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe machine learning algorithms Logistic Regression (LR), Random Forest (RF), and LASSO regression were used to construct the diagnostic model of EMS based on the glycolysis-related diagnostic markers. The risk score was calculated as follows:\u003c/p\u003e\n\u003cp\u003eRisk score = expr_gene_1 \u0026times; coef_gene_1 + expr_gene_2 \u0026times; coef_gene_2 + \u0026hellip; + expr_gene_n \u0026times; coef_gene_n\u003c/p\u003e\n\u003cp\u003eThe model\u0026apos;s effectiveness and accuracy were assessed using ROC curves and AUC values. The nomogram\u0026apos;s accuracy was assessed using calibration plots, in which the 45\u0026deg; line indicates perfect prediction: the closer the calibration curve lies to this line, the better the agreement between predicted and observed probabilities. The clinical utility of this model was examined using decision curve analysis (DCA).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eEvaluation of Immune Cell Subtype Distribution\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe CIBERSORT algorithm was used to infer the relative proportions of 22 types of infiltrating immune cells from the gene expression data of women with and without EMS. Gene expression and immune-cell content were subjected to Spearman correlation analysis. A value of \u003cem\u003ep\u003c/em\u003e\u0026lt;0.05 was considered statistically significant.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eGene Set Enrichment Analysis (GSEA)\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eBased on the expression of the five hub genes, EMS patients were categorized into high- and low-expression groups. GSEA was then used to compare signaling pathway differences between the two groups. The background gene set data were obtained from the Molecular Signatures Database. 
Gene sets were filtered by a maximum (500) and a minimum gene set size. Enriched gene sets were identified after 1,000 permutations with a \u003cem\u003ep\u003c/em\u003e\u0026lt;0.05 cutoff. The significantly enriched gene sets were then sorted in order of their significance. GSEA was used to investigate the relationships between the expression groups and biological processes.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eSemantic Similarity GO Annotations\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe used Wang\u0026rsquo;s method, as implemented in the GOSemSim package, to explore the functional similarity between proteins[22]. Functional similarity was measured as the geometric mean of the GO semantic similarities for molecular function (MF), biological process (BP), and cellular component (CC).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCo-expression Analysis of the Hub Genes\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe \u0026ldquo;corrplot\u0026rdquo; and \u0026ldquo;circlize\u0026rdquo; packages in R were used for correlation analysis. The \u0026ldquo;corrplot\u0026rdquo; package was used to plot the Pearson correlation of hub gene expression (version 1.64). The \u0026ldquo;circlize\u0026rdquo; package was used to generate circos plots. The colours \u0026quot;red\u0026quot; and \u0026quot;green\u0026quot; represent correlation coefficients. A positive correlation is indicated by the red colour, whereas a negative correlation is indicated by the green colour. The stronger the relationship, the darker the colour and the thicker the chord.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eStatistical Analysis\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eR software (version 4.1.2) was used for statistical analyses and visualization of results. 
A\u003cem\u003e\u0026nbsp;p\u003c/em\u003e-value of less than 0.05 was considered statistically significant.\u0026nbsp;Significant correlation coefficients were defined as those with an absolute value greater than 0.2 and a \u003cem\u003ep\u003c/em\u003e-value less than 0.05. Logistic regression was used to build the predictive model.\u003c/p\u003e"},{"header":"Results","content":"\u003cp\u003e\u003cstrong\u003eIdentification of five glycolysis-related\u0026nbsp;hub genes\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe training set was obtained from the NCBI GEO public database. There were 155 patients in total (EMS group, 99; control group, 56). The expression profiles of 262 glycolysis-related genes were extracted from the differential expression profile.\u0026nbsp;We used LASSO regression for feature screening to search for glycolysis-related biomarkers in EMS. LASSO regression identified 18 signature genes. The 262 glycolysis-related DEGs were then fed into the random forest classifier.\u0026nbsp;During construction of the random forest model, variable importance was quantified in terms of the decrease in accuracy and the decrease in mean squared error. The top 20 DEGs, ranked in order of importance, were then chosen as candidate genes for further investigation. 
The intersection of the random forest genes and the LASSO regression genes yielded 5 DEGs (\u003cstrong\u003eFigure 2\u003c/strong\u003e).\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eEstablishment and Validation of the Diagnostic Model Based on five glycolysis-related hub genes\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe created a predictive model using the following five genes:\u0026nbsp;chondroitin polymerizing factor (CHPF),\u0026nbsp;Cbp/p300 interacting transactivator with Glu/Asp rich carboxy-terminal domain 2 (CITED2), glypican 3 (GPC3),\u0026nbsp;alcohol dehydrogenase 6 (ADH6), and\u0026nbsp;pyruvate dehydrogenase kinase 3 (PDK3). The following risk model was created using coefficients for the five hub genes:\u003c/p\u003e\n\u003cp\u003eRisk score = (2.751 \u0026times; CHPF) + (3.880 \u0026times; PDK3) + (0.631 \u0026times; CITED2) + (0.502 \u0026times; GPC3) \u0026minus; (3.075 \u0026times; ADH6)\u003c/p\u003e\n\u003cp\u003eCHPF, CITED2, GPC3, ADH6, and PDK3 were used to build a diagnostic prediction model for EMS with multivariable logistic regression, which is shown as a nomogram (\u003cstrong\u003eFigure 3f\u003c/strong\u003e). The performance of this model was examined using the area under the receiver operating characteristic (ROC) curve. In the training set, the area under the ROC curve (AUC) for this model was 0.777 (\u003cstrong\u003eFigure 3a\u003c/strong\u003e), and the AUCs in the two test sets were 0.824 and 0.774, respectively (\u003cstrong\u003eFigure 3d-e\u003c/strong\u003e). The calibration curve showed good agreement between the predicted and actual probabilities of EMS (\u003cstrong\u003eFigure 3b\u003c/strong\u003e). The nomogram\u0026apos;s C-index for predicting the presence of EMS was 0.777 [95% confidence interval (CI): 0.727\u0026ndash;0.827]. 
Furthermore, decision curve analysis (DCA) revealed that the anticipated and observed values were nearly identical (\u003cstrong\u003eFigure 3c\u003c/strong\u003e).\u0026nbsp;The above results indicate the importance and independence of risk score as a diagnostic model of EMS.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eThe landscape of immune infiltration\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe employed the CIBERSORT algorithm to explore the difference between eutopic endometria in endometriosis patients and healthy controls after revealing the landscape of 22 immune cell subpopulations infiltration. The abundance ratios of 22 immune cells in the 155 samples are presented in \u003cstrong\u003eFigure 4a\u003c/strong\u003e. The percentage of immune cells in each sample was revealed in \u003cstrong\u003eFigure 4b\u003c/strong\u003e. \u003cstrong\u003eFigure 4c\u003c/strong\u003e depicts the interaction of innate immune cells. Compared with control endometria, eutopic endometria from women with endometriosis contained a greater number of\u0026nbsp;T cells follicular helper, T cells regulator (Tregs), Macrophages M0, NK cells activated, Monocytes, Dendritic cells activated, and Mast cells resting. However, Plasma cells, T cells CD8, T cells CD4 memory resting, NK cells resting, Macrophages M1, Macrophages M2, Dendritic cells resting, Mast cells activated, T cells gamma delta and Eosinophils were relatively lower(\u003cstrong\u003eFigure 4d\u003c/strong\u003e).\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAnalysis of Core Genes and Immune Infiltration\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWhen we looked into the interaction between hub genes and immune cells further, the expression of risk hub genes (CHPF, CITED2, GPC3, PDK3) was shown to be positively connected with Plasma cells, Macrophages M2, T cells CD8, Mast cells activated, and T cells CD4 memory resting. 
The protective hub gene (ADH6) was positively correlated with T cells CD4 naive, NK cells activated, Mast cells resting, T cells regulatory (Tregs), and Macrophages M0 (\u003cstrong\u003eFigure 5a-e\u003c/strong\u003e).\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAnalysis of Core Genes and Immune Factors\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe TISIDB database was then used to find associations between these five hub genes and various immunological factors such as chemokines, receptors, immunosuppressive factors, and immunostimulatory factors. A correlation graph was created between immunological factors and EMS key genes (\u003cstrong\u003eFigure 6a-d\u003c/strong\u003e). We selected immune factors associated with core genes (mean correlation coefficient \u0026gt; 0.4) and constructed an interaction network using Cytoscape and STRING (\u003cstrong\u003eFigure 6e-f\u003c/strong\u003e). These findings indicated that key genes contribute significantly to the endometrial immune microenvironment. \u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eGSEA Analysis of Glycolysis-related Hub Genes\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe applied GSEA to these five hub genes to investigate their functions and pathways. In the\u0026nbsp;training set, GSEA\u0026nbsp;demonstrated that samples with high expression of CHPF, CITED2, and GPC3 were primarily enriched in pathways related to \u0026ldquo;regulation of vesicle fusion\u0026rdquo;, \u0026ldquo;calmodulin dependent protein kinase activity\u0026rdquo;, \u0026ldquo;myelin maintenance\u0026rdquo;, \u0026ldquo;phosphatidylglycerol metabolic process\u0026rdquo;, and \u0026ldquo;positive regulation of actin cytoskeleton reorganization\u0026rdquo;. 
In addition, samples with low expression of ADH6 were mainly enriched in \u0026ldquo;centriole assembly\u0026rdquo;, \u0026ldquo;structural constituent of nuclear pore\u0026rdquo;, \u0026ldquo;regulation of centriole replication\u0026rdquo;, \u0026ldquo;intraciliary transport\u0026rdquo; and \u0026ldquo;regulation of protein exit from endoplasmic reticulum\u0026rdquo; (\u003cstrong\u003eFigure 7a-d\u003c/strong\u003e).\u0026nbsp;These results suggested that the above genes are involved in biological functions such as energy metabolism, material transport, and cell proliferation, which in turn contribute to the progression of EMS. Meanwhile, highly expressed PDK3 mainly participated in pathways related to \u0026ldquo;methyl-CpG binding\u0026rdquo;, \u0026ldquo;retinal ganglion cell axon guidance\u0026rdquo;, and \u0026ldquo;lactation\u0026rdquo; (\u003cstrong\u003eFigure 7e\u003c/strong\u003e). The molecular regulatory mechanisms of the core genes are shown in a circle plot (\u003cstrong\u003eFigure 8a-e\u003c/strong\u003e).\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eGO Similarity And Co-Expression Of Hub Genes\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eTo investigate the hub genes in EMS, we ranked the key genes by the average functional similarity among their proteins. CITED2 (score: 0.324), PDK3 (score: 0.318), and GPC3 (score: 0.306) ranked highest. The remaining two genes, CHPF (score: 0.285) and ADH6 (score: 0.216), scored below 0.3 (\u003cstrong\u003eFigure 9a\u003c/strong\u003e). Pearson analysis was performed to investigate correlations between hub genes. CITED2, GPC3, and PDK3 were more strongly positively correlated with each other than with CHPF. 
Only ADH6 was negatively associated with the other hub genes (\u003cstrong\u003eFigure 9b\u003c/strong\u003e).\u003c/p\u003e"},{"header":"Discussion","content":"\u003cp\u003eEndometriosis (EMS) is a systemic inflammatory disease caused by ectopic endometrial implantation and development outside the uterine cavity[\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e]. Recent studies have focused on glycolytic pathways in cancer cell growth and invasion. Endometrial cells, like cancer cells, have the ability to switch energy metabolism from mitochondrial oxidative phosphorylation (OXPHOS) to aerobic glycolysis, reduce ROS generation, and enhance survival[\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e]. It has been reported that glucose metabolism and energy production in ectopic endometriotic cells under hypoxia influence the incidence and invasion of endometriosis[\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e, \u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e]. Nevertheless, studies of the Warburg effect in eutopic endometrial cells are still lacking. A growing body of evidence suggests that changes in glycolytic metabolism in the endometrial microenvironment may affect immune cell infiltration and other anti-immune processes, but the specific mechanism remains to be explored.\u003c/p\u003e \u003cp\u003eEarly prediction and detection of EMS enable early intervention and can improve treatment outcomes. Therefore, the identification of possible biomarkers for predicting EMS is crucial. 
In recent years, advances in Machine Learning Techniques and the availability of gene expression data in public databases have provided a new approach to identifying biomarkers for disease detection.\u003c/p\u003e \u003cp\u003eIn this work, we identified five glycolysis-related potential genes (CHPF, CITED2, PDK3, GPC3, and ADH6) through LASSO regression analysis and the RF method. The CIBERSORT algorithm was then used to do a deconvolution study of the immune microenvironment in order to determine the fraction of immune cells in EMS. The relationship between core genes and other immunomodulators, as well as the majority of chemokines and receptors mentioned in TISIDB, is then investigated. The profiles of the five hub genes were identified using GO semantic similarity and GSEA analysis. The risk score based on the five glycolysis-related markers was then used to create a nomogram, the nomogram had a good predictive performance.\u003c/p\u003e \u003cp\u003eChondroitin polymerizing factor (CHPF) is a type II transmembrane protein that is essential for chondroitin sulfate (CS)production. Many cell biological functions, such as cell adhesion, cell differentiation, and neural network creation, rely on CS[\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e]. Li et al [\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e] reported that the expression of the CHPF was linked to immune cells and various immune factors. At present, most studies focus on the function of CHPF in cancers, little has been studied in endometriosis.\u003c/p\u003e \u003cp\u003eCBP/p300-interacting-transactivator-with-an-ED-rich-tail 2 (CITED2) is a transcriptional regulator that regulates biological functions by co-activating or repressing multiple transcription factors[\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e]. The glycolytic gene CITED2 is also a hypoxia-related gene. 
CITED2 has been reported to be associated with primary ovarian insufficiency[\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eGlypican-3 (GPC3) is a membrane-associated proteoglycan involved in cell growth, differentiation, and migration. The specific expression of GPC3 in tumour cells has attracted considerable attention[\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e]. In a Canadian patient cohort, high membranous GPC3 expression was found in 20% of endometriosis-associated ovarian clear cell carcinomas (OCCCs) [\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e].\u003c/p\u003e \u003cp\u003ePyruvate dehydrogenase kinase 3 (PDK3) is a member of the PDK family, which contains PDK1, PDK2, PDK3, and PDK4. Like CITED2, PDK3 mainly contributes to the metabolic switch and cell survival under hypoxia[\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e]. PDK3 also plays a crucial role in cancers and has been regarded as a promising therapeutic target[\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e]. Our results showed that PDK3 was over-expressed in eutopic endometria of women with endometriosis. However, the function of PDK3 in endometriosis is unclear.\u003c/p\u003e \u003cp\u003eThe serum level of alcohol dehydrogenase 6 (ADH6) has been shown in numerous studies to be a potential diagnostic marker in cancers. ADH6 has been reported to be involved in the P450-related pathway and biological processes linked to the progression and treatment of pancreatic cancer[\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e]. Its involvement in endometriosis biology, however, is unknown.\u003c/p\u003e \u003cp\u003eThe biological behaviour of endometriosis is known to resemble that of malignant tumours. 
Glucose is the most readily available nutrient for cancer cells, but it is also required for T cell activation, differentiation, and function[\u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e]. Proliferating tumour cells that consume a lot of extracellular glucose secrete lactic acid into the cancer microenvironment. Lactate was later discovered to inhibit monocyte migration and cytokine release, as well as promote resident macrophage polarization to the tumour-associated macrophage 2 (TAM2) phenotype, resulting in tumour progression and immune escape[\u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eAlthough endometriosis is a benign illness, it exhibits neoplastic traits such as inflammation and tissue invasion[\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e]. We therefore speculate that the same biological process linking glycolysis and the immune environment occurs in endometriosis[\u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e]. In our study, the abundances of T cells follicular helper, T cells regulatory (Tregs), Macrophages M0, NK cells activated, Monocytes, Dendritic cells activated, and Mast cells (MC) resting were higher in the eutopic endometria of women with endometriosis than in normal controls. Follicular helper T (Tfh) cells are a subset of CD4\u0026thinsp;+\u0026thinsp;T cells that plays a critical role in the adaptive immune response. The roles of Tfh cells in endometriosis have received little attention[\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e]. According to most research, regulatory T cells (Tregs) are increased in the endometrium of women with and without disease. However, there is still debate[\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e]. 
Previous research has shown that in the proliferative phase of endometriosis, more macrophages (M\u0026oslash;s) and activated dendritic cells are found in the endometrium of women with endometriosis, regardless of the hormonal milieu[\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e]. Currently, it appears that uterine natural killer (uNK) cells from women with endometriosis are immature and that uNK cytotoxic activity could be an indicator of endometriosis-related infertility and recurrent miscarriage, despite the fact that the absolute numbers are the same as in normal endometrium[\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e]. Others have found higher mast cell infiltration in the endometrium of women with the disease, as well as enhanced MC activation in ectopic lesions, but activated MCs were rarely found in eutopic endometrium[\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e]. We also found increased monocytes in the eutopic endometrium of women with endometriosis; we infer that monocytes are largely recruited as a source of monocyte-derived macrophages[\u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eNext, we analyzed the correlation between hub glycolysis-related gene expression and the infiltration of various immune cells. Likewise, we investigated the relationship between the five hub genes and the various immunomodulators, chemokines and receptors listed in TISIDB. Finally, according to GSEA, the hub genes are primarily involved in biological processes such as energy metabolism, material transport, and cell proliferation, which in turn contribute to the progression of EMS. Moreover, CITED2, GPC3, and PDK3 were more strongly positively correlated with each other than with CHPF. 
ADH6 was negatively associated with the other hub genes.\u003c/p\u003e \u003cp\u003eUsing bioinformatic analysis, we discovered five glycolysis-related hub genes that are closely associated with the molecular mechanism of EMS, verified the biological functions and important pathways of the hub genes, and performed immune cell infiltration and correlation analysis for the target core genes. This work has certain limitations, although the expression levels of these hub genes were also validated in the test set. If ethical approval is given, these target genes will be confirmed in clinical samples using RT-qPCR in future research. In addition, this study is only a proof of concept, and more \u003cem\u003ein vitro\u003c/em\u003e and \u003cem\u003ein vivo\u003c/em\u003e experiments are needed to confirm our findings and investigate the mechanisms by which glycolysis-related genes regulate the infiltration of immune cells.\u003c/p\u003e"},{"header":"Conclusions","content":"\u003cp\u003eIn conclusion, we discovered five glycolysis-related genes in endometriosis and developed a model for EMS assessment. Based on numerous bioinformatics techniques, we discovered hub genes in EMS and their correlation with infiltrating immune cells, as well as correlations between 22 immune cell subpopulations. Meanwhile, the GSEA and GO similarity analyses reveal more specific mechanisms. 
The selected genes could be candidate predictive markers and potential therapeutic targets for EMS, but the exact mechanisms linking glycolysis-related genes and the immune environment (including immune cells and immune factors) in EMS require further exploration.\u003c/p\u003e"},{"header":"Abbreviations","content":"\u003cp\u003eEMS: Endometriosis; NCBI: National Center For Biotechnology Information; GEO: Gene Expression Omnibus; GSEA: Gene Set Enrichment Analysis; MSigDB: Molecular Signatures Database; LASSO: Least Absolute Shrinkage And Selection Operator; RF: Random Forest; TME: Tumor Microenvironment; DCA: Decision Curve Analysis; ROC: Receiver Operating Characteristic Curve; AUC: Area Under Curve; CHPF: Chondroitin polymerizing factor; CITED2: CBP/p300-interacting-transactivator-with-an ED-rich-tail 2; GPC3: Glypican-3; PDK3: Pyruvate dehydrogenase kinase 3; ADH6: Alcohol dehydrogenase 6; Tregs: Regulatory T cells; M\u0026oslash;s: Macrophages; uNK: uterine natural killer\u0026nbsp;\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eData Availability Statements\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe datasets generated and/or analyzed during the current study are available in GEO. The data supporting this study\u0026rsquo;s findings are available from the corresponding author upon reasonable request. 
The names of the repository/repositories and accession number(s) are listed below:\u003c/p\u003e\n\u003cp\u003ehttps://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE25628\u003c/p\u003e\n\u003cp\u003ehttps://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE51981\u003c/p\u003e\n\u003cp\u003ehttps://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE7846\u003c/p\u003e\n\u003cp\u003ehttps://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE7305\u003c/p\u003e\n\u003cp\u003ehttps://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE120103\u003c/p\u003e\n\u003cp\u003ehttps://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE6364\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eStatements \u0026amp; Declarations\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors would like to thank GEO, GSEA, MSigDB, and TISIDB for data availability.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding \u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis work was supported by the Shanghai Key Laboratory of Female Reproductive Endocrine-Related Diseases (Grant number 17DZ2273600). 
Author Yanqiu Wang has received research support.

Ethics approval and consent to participate

Not applicable

Consent for publication

Not applicable

Competing Interests

The authors have no relevant financial or non-financial interests to disclose.

Author Contributions

YQ Wang: Project development, Writing-reviewing
QZ Chen: Data analysis, Manuscript writing, Supervision and Validation
YF Jiao: Manuscript writing
Z Yin: Manuscript writing
XY Fu: Data collection
SN Guo: Data collection
J Xiang: Project development, Writing-reviewing, Supervision and Validation

All authors contributed to and have approved the final manuscript.

Keywords: endometriosis, glycolysis, immune infiltration, machine learning, diagnosis

Posted June 14th, 2022 (version 1) · doi:10.21203/rs.3.rs-1686939/v1 · License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

Condition tags

endometriosis

Source provenance

europepmc
last seen: 2026-06-04T01:45:00.660873+00:00
openalex
last seen: 2026-06-04T00:00:01.174412+00:00
License: CC0 · commercial use OK