Section 5
By integrating bioinformatics analysis and machine learning algorithms, this study identified FZD4, SRPX2, and COL8A1 as hub genes associated with angiogenesis in endometriosis (EM). Gene enrichment analysis clarified the potential molecular pathways through which these genes regulate EM-related angiogenesis; immune infiltration analysis revealed the regulatory role of key immune cell subsets in the endometrial microenvironment on EM angiogenesis; the constructed ceRNA regulatory network systematically elucidated the multi-level molecular regulatory mechanisms underlying EM angiogenesis; meanwhile, potential active ingredients and traditional Chinese medicines that may act on this process were predicted. This study may provide certain theoretical references for the screening of EM angiogenesis-related biomarkers and is also expected to expand new research directions for the development of EM-targeted drugs.
Certainly, this study has limitations in several aspects. Specifically, it was conducted entirely as a bioinformatics analysis based on computer simulations and public data, without performing qRT-PCR, animal experiments, or functional experiments. It lacks both in vitro/in vivo experimental support and clinical validation, which constitutes a major limitation. To address this issue in future research, verification of FZD4, SRPX2, and COL8A1 should be carried out using independent tissue cohorts or in vitro angiogenesis experiments. Meanwhile, during the study, genes related to pathways such as estrogen and hypoxia were not excluded a priori; instead, screening was conducted throughout based on uniform statistical criteria. Moreover, due to the absence of a designed crosstalk quantification module, the degree of overlap between angiogenesis and the estrogen/hypoxia pathways could not be systematically evaluated, and further clarification will be required in subsequent studies through hypergeometric tests or pathway interaction analyses. Furthermore, potential confounding factors of the samples, such as disease subtypes, staging, and hormonal status, were not fully adjusted for, which may affect the accuracy of the results. The constructed diagnostic model has not undergone validation with external independent cohorts, posing a risk of overfitting, and its generalization ability remains to be confirmed. In addition, both the immune deconvolution analysis and the construction of the ceRNA network rely on algorithmic predictions, and the reliability of the relevant results still requires direct corroboration through subsequent experiments.
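The hypergeometric test proposed above for quantifying pathway crosstalk can be sketched as follows. This is a minimal illustration using only the Python standard library, not part of the study's pipeline; the universe size, the estrogen-pathway set size, and the overlap count in the example are hypothetical placeholders (only the 555 AAGs figure comes from the Results).

```python
from math import comb

def hypergeom_overlap_p(universe_size, set_a_size, set_b_size, overlap):
    """Upper-tail probability P(X >= overlap) of seeing at least `overlap`
    shared genes when set B is drawn at random from the gene universe."""
    max_overlap = min(set_a_size, set_b_size)
    total = comb(universe_size, set_b_size)
    tail = sum(
        comb(set_a_size, k) * comb(universe_size - set_a_size, set_b_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total

# Hypothetical numbers: 20000 background genes, 555 angiogenesis genes,
# 200 estrogen-pathway genes, 20 genes shared between the two sets.
p_value = hypergeom_overlap_p(20000, 555, 200, 20)
```

A small P-value would indicate that the two gene sets share more members than expected by chance, which is one way to express the degree of overlap that the current analysis could not evaluate.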
Intro
Endometriosis (EM) is a chronic, estrogen-dependent, inflammatory disease defined by the presence of endometrial-like tissue (lesions) outside the uterine lining. [ 1 ] It affects approximately 10% of reproductive-age women. [ 2 ] Despite well-established theories of retrograde menstruation, coelomic metaplasia, and lymphatic and venous dissemination, no single mechanism fully accounts for the multifaceted etiology of EM. Consequently, the pathogenesis of EM remains a subject of considerable controversy.
The essential “3A” pathogenic sequence of “Attachment,” “Aggregation,” and “Angiogenesis” constitutes a mandatory pathway for viable endometrial fragments to develop into EM lesions. [ 3 ] Sprouting angiogenesis describes the formation of new blood vessels by budding from established vasculature. [ 4 ] Neovascularization in ectopic lesions and their adjacent tissues is essential to sustain the survival of implanted ectopic endometrium and to promote lesion growth and progression to EM. [ 5 ] Most EM lesions are surrounded by abdominal blood vessels and are highly vascularized. [ 6 , 7 ] This indicates that angiogenesis plays a fundamental role in the progression of EM. Consequently, suppressing and blocking angiogenesis may represent a crucial therapeutic strategy for controlling the implantation and growth of ectopic lesions.
By integrating bioinformatics analysis and machine learning, this study aims to identify angiogenesis hub genes and elucidate their molecular mechanisms in EM, thereby proposing innovative strategies for clinical management.
Author
Conceptualization: Jiaoyue Li, Xiaona Ma.
Data curation: Jiaoyue Li, Fawei Li, Sijia Zhang, Xiaona Ma.
Formal analysis: Fawei Li, Sijia Zhang, Changming Zhai.
Funding acquisition: Xiaona Ma.
Methods
The GSE7305 (10 EM patients vs 10 healthy controls), GSE23339 (10 EM patients vs 9 healthy controls) and GSE25628 (16 EM patients vs 6 healthy controls) datasets were retrieved from the Gene Expression Omnibus (GEO) database [ 8 ] ( https://www.ncbi.nlm.nih.gov/geo/ ). Detailed information about the datasets is provided in Table S1 (Supplemental Digital Content, https://links.lww.com/MD/Q431 ). Angiogenesis-associated genes (AAGs) were systematically obtained from AMIGO2 [ 9 ] (Gene Ontology Consortium; http://amigo.geneontology.org ). First, we performed principal component analysis (PCA) on the merged raw dataset to assess the presence of batch effects. The results indicated a certain degree of heterogeneity among samples from different batches. To address this, we applied the widely used ComBat algorithm (implemented in the R package “sva”) for batch correction. This algorithm is based on a linear model: “gene expression ~ disease status (EM/healthy control) + batch + potential confounders.” It specifically targets batch-related systematic variation while preserving biological variation associated with disease status. The corrected data were then used for subsequent analyses to support the reliability and reproducibility of the results. Differentially expressed genes (DEGs) were identified using a threshold of adjusted P-value < .05 and |logFC| > 1. This criterion was selected based on 3 considerations: First, the Benjamini-Hochberg correction was applied to control the false discovery rate (FDR) below 5%, ensuring the statistical significance of the results. Second, the |logFC| > 1 criterion ensures that the detected gene expression changes are biologically meaningful. Third, this threshold is widely adopted in endometriosis (EM) studies and has been validated through pre-experiments to effectively enrich pathways related to angiogenesis. The volcano plot visualizing the DEG distribution was generated using the “ggplot2” package in R (version 3.5.2).
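The DEG criterion described above (Benjamini-Hochberg adjustment followed by an FDR and |logFC| filter) can be illustrated with a short sketch. It is written in plain Python rather than the R workflow used in the study, and the gene names, fold changes, and P-values are invented placeholders.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def select_degs(genes, log_fc, p_values, fc_cut=1.0, fdr_cut=0.05):
    """Keep genes with adjusted P < fdr_cut and |logFC| > fc_cut."""
    adj = benjamini_hochberg(p_values)
    return [g for g, fc, q in zip(genes, log_fc, adj)
            if abs(fc) > fc_cut and q < fdr_cut]

# Placeholder data: four hypothetical genes.
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
log_fc = [2.1, -1.5, 0.4, 3.0]
p_values = [0.001, 0.02, 0.03, 0.8]
degs = select_degs(genes, log_fc, p_values)
```

In this toy example GENE_C fails the fold-change filter and GENE_D fails the FDR filter, so only GENE_A and GENE_B are retained.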
This work utilized exclusively publicly available data and involved no human or animal experimentation, thus qualifying for ethical review exemption.
First, the “WGCNA” package in R (version 1.73) was used to process the sample data and remove outliers. The soft-threshold power was determined with the pickSoftThreshold function, and the resulting adjacency matrix was converted into a topological overlap matrix. Hierarchical clustering was then performed on the dissimilarity derived from this matrix. Genes with highly similar co-expression patterns were clustered into the same module (ME), and the module most strongly related to EM was selected for subsequent analysis. We intersected the DEGs, the module genes obtained from WGCNA, and the AAGs to obtain the endometriosis-angiogenesis-associated genes (EM-AAGs).
Kyoto Encyclopedia of Genes and Genomes (KEGG) [ 10 ] pathway enrichment and gene ontology (GO) [ 11 ] functional enrichment (covering biological process, cellular component, and molecular function) were performed on the EM-AAGs using the “clusterProfiler” package in R (version 4.10.1). Enrichment was considered statistically significant at P < .05.
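The three-way intersection that defines the EM-AAGs amounts to a set operation, sketched below in plain Python. The input lists are toy placeholders, not the study's actual DEG, module, or AAG lists, and the helper name is hypothetical.

```python
def intersect_gene_sets(*gene_lists):
    """Return the sorted genes that appear in every input list."""
    if not gene_lists:
        return []
    common = set(gene_lists[0])
    for genes in gene_lists[1:]:
        common &= set(genes)
    return sorted(common)

# Toy inputs; GENE_X, GENE_Y and GENE_Z are made-up identifiers.
degs = ["FZD4", "SRPX2", "COL8A1", "GENE_X"]
module_genes = ["FZD4", "SRPX2", "COL8A1", "GENE_Y"]
aags = ["FZD4", "SRPX2", "COL8A1", "GENE_Z"]
em_aags = intersect_gene_sets(degs, module_genes, aags)
```

The same helper accepts any number of lists, so it would also cover an overlap across five feature-selection outputs or five miRNA prediction databases.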
To further identify the key EM-AAGs critical for EM diagnosis, this study constructed a binary classification prediction model and analyzed data using 5 machine learning algorithms: Random Forest (RF), Least Absolute Shrinkage and Selection Operator (LASSO), eXtreme Gradient Boosting (XGBoost), Gradient Boosting Machine (GBM), and Support Vector Machine-Recursive Feature Elimination (SVM-RFE). The overlapping genes identified by the 5 machine learning algorithms were ultimately selected as the hub genes for EM-AAGs.
For stability verification, 56 EM-AAGs from the original study were used as input, and cross-validation (CV) was conducted for 5 machine learning algorithms (i.e., RF, LASSO, XGBoost, GBM, SVM-RFE). To reduce random errors, the process was repeated 10 times, with “model classification accuracy” as the core index to verify algorithm stability across different data subsets. This addressed the original study limitation of only screening hub genes via algorithm intersection, verifying the reliability of FZD4, SRPX2, and COL8A1 as EM angiogenesis hub genes and eliminating the risk of “hub genes being caused by algorithmic preference for specific data.”
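As a rough illustration of repeated cross-validation, the sketch below generates shuffled k-fold splits for several repeats and summarizes per-repeat accuracy by its mean and standard deviation. The fold count (k = 5), the random seed, and the accuracy values are assumptions made for the example; the text specifies only that the procedure was repeated 10 times.

```python
import random
from statistics import mean, stdev

def repeated_kfold(n_samples, k, repeats, seed=0):
    """Yield (repeat, train_idx, test_idx) for `repeats` shuffled k-fold splits."""
    rng = random.Random(seed)
    for r in range(repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        folds = [idx[i::k] for i in range(k)]  # k disjoint folds
        for f in range(k):
            test = folds[f]
            train = [i for g in range(k) if g != f for i in folds[g]]
            yield r, train, test

def summarize_accuracy(acc_by_repeat):
    """Mean and sample standard deviation of accuracy across repeats."""
    return mean(acc_by_repeat), stdev(acc_by_repeat)

# 61 samples (36 EM + 25 controls in the training datasets), assumed k = 5.
splits = list(repeated_kfold(61, 5, 10, seed=1))
# Hypothetical per-repeat accuracies for one algorithm.
acc_mean, acc_sd = summarize_accuracy([0.90, 0.80, 1.00])
```

A small standard deviation across repeats is one simple indication that an algorithm behaves consistently on different data subsets.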
For external validation, 3 independent GEO datasets ( GSE11691 , GSE120103 , GSE7846 ) not included in the original study were used to build an external validation set (62 EM patients, 45 healthy controls), with no sample overlap with the original training set ( GSE7305 + GSE23339 + GSE25628 ). The ComBat algorithm (R “sva” package) was applied to eliminate batch effects, verified via PCA before and after removal. Meanwhile, FZD4, SRPX2, and COL8A1 expression trends in the validation set were checked for consistency with the original training set. Subsequently, a “FZD4 + SRPX2 + COL8A1” combined diagnostic model was built using the original study’s nomogram weights. Area under the curve (AUC, with 95% CI) served as the core index to compare the model’s efficacy with single genes and the original training set’s combined model; additionally, with sensitivity = 0.85 and specificity = 0.80 as thresholds, decision curve analysis (DCA) was used to assess clinical net benefit.
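The clinical net benefit assessed by DCA follows the standard definition NB = TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. The sketch below computes it for a model and for the "treat all" reference strategy; the labels and predicted probabilities are invented for illustration and do not come from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of calling a sample positive when y_prob >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that calls every sample positive."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Hypothetical labels (1 = EM, 0 = control) and model probabilities.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
nb_model = net_benefit(y_true, y_prob, 0.5)
nb_all = net_benefit_treat_all(y_true, 0.5)
```

A decision curve plots these quantities over a range of thresholds; the model is useful at thresholds where its curve lies above both the "treat all" and "treat none" (net benefit 0) references.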
Receiver operating characteristic (ROC) curves were generated using the “pROC” package in R (version 1.18.5), with the AUC quantifying the diagnostic efficacy of hub genes. Subsequently, a nomogram was constructed, and calibration curves with DCA were performed to evaluate prediction accuracy.
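The AUC has a direct probabilistic reading: it is the fraction of case-control pairs in which the case receives the higher score, with ties counted as one half. The sketch below computes it that way in plain Python; it is an illustration, not the “pROC” implementation, and the expression values are hypothetical.

```python
def roc_auc(y_true, scores):
    """AUC as the proportion of (case, control) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties contribute half a win
    return wins / (len(pos) * len(neg))

# Hypothetical expression of one gene in 3 EM samples and 3 controls.
y_true = [1, 1, 1, 0, 0, 0]
expression = [8.2, 7.9, 6.1, 6.5, 5.0, 4.8]
auc = roc_auc(y_true, expression)
```

Here 8 of the 9 case-control pairs are ordered correctly, giving an AUC of 8/9.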
To explore potential pathways associated with hub genes in the pathogenesis of EM, we performed gene set enrichment analysis (GSEA). The C2.cp.KEGG.v7.4.symbols.gmt gene set from the Molecular Signatures Database (MSigDB, https://www.gsea-msigdb.org/gsea/msigdb ) served as the reference. Significantly enriched pathways were identified using the following thresholds: absolute normalized enrichment score (|NES|) > 1.5, nominal P < .05, and FDR < 0.25.
This study used the “Cibersort” package in R (version 1.03) and the immune cell feature matrix gene expression profiles provided by the CIBERSORTx [ 12 ] (cibersortx.stanford.edu) to calculate the immune cell proportions of EM patients and the healthy control group. The “ggplot2” package in R was used to draw box plots and cluster overlay histograms. Subsequently, Spearman correlation analysis was used to explore the correlation between hub genes and immune cells.
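Spearman correlation is the Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below shows this in plain Python; the expression values and immune cell fractions are hypothetical and serve only to illustrate the calculation.

```python
def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman(x, y):
    """Spearman rank correlation between x and y."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical hub-gene expression and NK cell fraction in 5 samples.
gene_expr = [1.0, 2.0, 3.0, 4.0, 5.0]
nk_fraction = [0.30, 0.25, 0.20, 0.22, 0.10]
rho = spearman(gene_expr, nk_fraction)
```

In this toy example the cell fraction falls almost monotonically as expression rises, so the coefficient is strongly negative.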
Transcriptional regulatory networks involving hub genes and transcription factors (TFs) were analyzed using Network Analyst 3.0 [ 13 ] ( https://www.networkanalyst.ca/ ) to assess their interactions and functional impacts. The resulting networks were visualized using Cytoscape 3.8.0 software. To identify candidate regulatory miRNAs, predictions from 5 databases were integrated: miRWalk [ 14 ] ( http://mirwalk.umm.uni-heidelberg.de/ ), miRNet [ 15 ] ( https://www.mirnet.ca/ ), miRTarBase [ 16 ] ( https://mirtarbase.cuhk.edu.cn/~miRTarBase/miRTarBase_2025/php/index.php ), StarBase [ 17 ] ( https://rna.sysu.edu.cn/encori/index.php ), and TargetScan Human 8.0 [ 18 ] ( http://www.targetscan.org/vert_80/ ). Only miRNAs predicted by all 5 databases were retained.
Results
The GSE7305 , GSE23339 , and GSE25628 datasets were integrated to correct for batch effects. Significant batch effects were observed across the 3 datasets before batch effect correction (Fig. 1 A), whereas gene expression distributions converged after batch effect correction (Fig. 1 B). Comparison between the EM groups and the healthy controls identified 1528 DEGs, comprising 821 upregulated and 707 downregulated genes; the corresponding volcano plot is presented in Figure 1 C. Additionally, 555 AAGs were retrieved from the AMIGO2 database using “angiogenesis” as the search term.
Differential gene expression analysis. (A) Datasets before batch correction; (B) datasets after batch correction; (C) Volcano plot of DEGs: red represents significantly upregulated genes; green represents significantly downregulated genes. DEGs = differentially expressed genes.
Cluster analysis validated that all samples met quality control criteria without outliers (Fig. 2 A). Guided by scale-free topology fit indices and mean connectivity metrics, a soft threshold power (β = 9) was empirically determined (Fig. 2 B) to construct the topological overlap matrix. Hierarchical clustering analysis identified 14 co-expression modules (Fig. 2 C), among which the MEgrey60 module exhibited the strongest biological relevance, comprising 1543 genes. Intersection analysis between module genes, DEGs, and AAGs uncovered 56 EM-AAGs (Fig. 2 D). Differential expression patterns of EM-AAGs were visualized in a heatmap (Fig. 2 E).
Screening of co-expression gene modules using weighted gene correlation network analysis (WGCNA). (A) stratified clustering diagram of EM and control group samples; (B) optimal soft threshold fitting analysis diagram; (C) heatmap of the correlation between module and EM trait; (D) Venn diagram of DEGs, AAGs and MEgrey60; (E) heatmap of EM-AAGs. AAGs = angiogenesis-associated genes, DEGs = differentially expressed genes, EM = endometriosis.
To elucidate potential regulatory mechanisms, GO and KEGG enrichment analyses were performed on the 56 identified EM-AAGs. In the biological process category, the genes were predominantly enriched in angiogenesis regulation, vasculature development regulation, and positive regulation of angiogenesis; in the cellular component category, they were primarily associated with the basement membrane, cell junction, and collagen trimer; in the molecular function category, they were significantly enriched in glycosaminoglycan binding, heparin binding, and sulfur compound binding (Fig. 3 A). KEGG analysis revealed significant enrichment in focal adhesion, cell adhesion molecules, and the PI3K-Akt signaling pathway. The top 10 enriched pathways are visualized in Figure 3 B.
GO and KEGG enrichment analyses of candidate genes. (A) GO enrichment bubble plot, showing significantly enriched terms in biological process (BP, green), cellular component (CC, red), and molecular function (MF, blue) for 56 candidate genes. The y-axis denotes term names, and the x-axis represents enrichment score. (B) KEGG enrichment results. Left: Chord diagram of “candidate gene–KEGG pathway” associations (lines indicate associations). Right: KEGG enrichment bubble plot, where the x-axis is the gene ratio, the y-axis is pathway name, dot color reflects −log10 (adjusted P-value), and dot size represents the count of candidate genes in the pathway. GO = gene ontology, KEGG = Kyoto Encyclopedia of Genes and Genomes.
Feature importance evaluation of the 56 EM-AAGs was first performed using RF, identifying the top 15 genes by significance score (Fig. 4 A and B). Subsequent LASSO regression analysis yielded 9 candidate genes (Fig. 4 C and D), while GBM and XGBoost algorithms identified the top 15 candidate genes by ranking (Fig. 4 E and F). SVM-RFE analysis identified 9 genes (Fig. 4 G and H). Intersection analysis revealed FZD4, SRPX2, and COL8A1 as robustly overlapping hub genes across all 5 machine learning methods. Notably, FZD4 consistently ranked highest in feature importance scores throughout all algorithmic evaluations (Fig. 4 I), suggesting its pivotal role in EM regulatory networks.
Machine learning identifies biomarkers. (A and B) Feature importance identification based on the Random Forest algorithm; (C) LASSO regression cross-validation curve; (D) LASSO regression coefficient path diagram; (E) GBM algorithm; (F) XGBoost algorithm; (G and H) SVM-RFE algorithm prediction true value and error value change curve; (I) Intersection of biomarkers of the 5 algorithms. GBM = gradient boosting machine, LASSO = least absolute shrinkage and selection operator.
For CV of hub gene screening algorithms, 56 original EM-associated angiogenesis genes (EM-AAGs) were used as input. Five algorithms – RF, LASSO, XGBoost, GBM, and SVM-RFE – underwent 10 repeated CVs to reduce random errors, with model classification accuracy as the core index to quantify algorithm stability across data subsets (Fig. 5 A). This fixed the original study’s flaw of “screening hub genes only via multi-algorithm intersection”: FZD4, SRPX2, and COL8A1 were stably identified as core genes in all validations, eliminating the risk of “hub genes being biased by specific algorithms toward the original training set” and supporting their reliability as EM angiogenesis hub genes.
Stability and external validation of hub genes and diagnostic model. (A) Changes in accuracy of 10 repeated cross-validations for 5 machine learning algorithms (GBM, LASSO, RF, SVM-RFE, XGBoost), used to evaluate algorithm stability and verify the reliability of hub gene screening. (B) PCA plot of the external validation set before batch effect correction, where different markers represent datasets GSE11691, GSE120103, and GSE7846, showing obvious batch differences in the original data. (C) PCA plot of the external validation set after batch effect correction using the ComBat algorithm, with increased overlap of sample distribution indicating effective batch effect correction. (D) ROC curves of individual genes FZD4, SRPX2, and COL8A1 in the external validation set, showing the diagnostic efficacy of each gene. (E) ROC curve of the “FZD4 + SRPX2 + COL8A1” combined diagnostic model in the external validation set, with an AUC of 0.933 (95% confidence interval: 0.906–0.952), reflecting the diagnostic performance of the combined model. AUC = area under the curve, GBM = gradient boosting machine, LASSO = least absolute shrinkage and selection operator, PCA = principal component analysis, RF = random forest, ROC = receiver operating characteristic, SVM-RFE = support vector machine-recursive feature elimination, XGBoost = extreme gradient boosting.
Regarding external validation of the diagnostic model (to address the original lack of independent validation), 3 new, non-overlapping GEO datasets ( GSE11691 , GSE120103 , GSE7846 ) were used to build an external set (62 EM patients, 45 healthy controls; no overlap with the original training set GSE7305 + GSE23339 + GSE25628 ). Batch effects were corrected via the ComBat algorithm (R “sva” package), confirmed by improved sample overlap in PCA plots before/after correction (Fig. 5 B and C).
External validation results are shown in Fig. 5 D (single-gene efficacy) and Fig. 5 E (combined model efficacy): FZD4, SRPX2, and COL8A1 were significantly upregulated in EM patients ( P < .01) in the external set, consistent with the original training set (proving disease-specific expression). The “FZD4 + SRPX2 + COL8A1” combined model (using original nomogram weights) had an AUC of 0.933 (95% CI: 0.906–0.952) in the external set – higher than single genes (e.g., COL8A1, AUC = 0.876) and close to the original training set’s AUC (0.945). At sensitivity = 0.85 and specificity = 0.80, DCA showed consistent clinical net benefit with the original set.
To visualize the expression levels of characteristic genes, violin plots were generated (Fig. 6 A). The results showed that compared with the healthy controls, the expression of hub genes was significantly upregulated in EM samples.
Construction and validation of diagnostic model based on hub genes. (A) Expression of hub genes in EM; (B) ROC curve of hub genes; (C) nomogram model; (D) calibration curve; (E) DCA curve. DCA = decision curve analysis, EM = endometriosis, ROC = receiver operating characteristic.
Subsequently, ROC curve analysis was conducted for the hub genes FZD4, SRPX2, and COL8A1 to evaluate their clinical diagnostic value. In the training set, all 3 hub genes exhibited AUC values > 0.9 (Fig. 6 B). Based on this, a nomogram was constructed using the 3 hub genes to quantify their diagnostic efficacy for EM (Fig. 6 C). The calibration curve showed good agreement with the ideal curve, indicating high prediction accuracy of the model (Fig. 6 D). The DCA curve (Fig. 6 E) further confirmed the model’s significant clinical net benefit. These results indicate that all 3 hub genes exhibited satisfactory diagnostic performance and may serve as potential diagnostic markers for angiogenesis in EM.
To investigate signaling pathways associated with hub genes in EM pathogenesis, we performed GSEA on each hub gene individually. GSEA revealed distinct pathway associations: FZD4 showed significant enrichment in cytoskeleton in muscle cells and cell cycle related pathways (Fig. 7 A), while SRPX2 was predominantly associated with Cytokine-Cytokine Receptor Interaction and Nucleocytoplasmic Transport mechanisms (Fig. 7 B). COL8A1 demonstrated enrichment in Cytokine-Cytokine Receptor Interaction signaling pathways and Spearman disease related pathways (Fig. 7 C). Visualization of the top 15 enriched pathways (Fig. 7 D–F) demonstrated convergent downregulation of the cell cycle pathway by all 3 hub genes. Our findings suggest that the hub genes collectively suppress cell cycle progression, thereby inhibiting EM pathogenesis through this pivotal mechanism.
GSEA and pathway enrichment visualization results of hub genes FZD4, SRPX2, and COL8A1. (A) GSEA results of hub gene COL8A1, showing significantly enriched pathways (e.g., cytokine-cytokine receptor interaction, osteoclast differentiation, complement and coagulation cascades); (B) GSEA results of hub gene FZD4, showing significantly enriched pathways (e.g., cytoskeleton in muscle cells, complement and coagulation cascades, olfactory transduction); (C) GSEA results of hub gene SRPX2, showing significantly enriched pathways (e.g., cytokine-cytokine receptor interaction, cytoskeleton in muscle cells, focal adhesion); (D–F) waterfall plots of the top 15 enriched pathways for FZD4 (D), SRPX2 (E), and COL8A1 (F), where red represents upregulated pathways and blue represents downregulated pathways; the results show that all 3 genes are associated with downregulation of cell cycle pathways. GSEA = gene set enrichment analysis.
In this study, we used the CIBERSORT algorithm to analyze the composition of immune cells and explore the differences in the immune microenvironment between EM patients and healthy controls.
The results showed that EM patients had higher proportions of resting memory CD4+ T cells, M1 macrophages, M2 macrophages, activated mast cells, and neutrophils, whereas plasma cells, follicular helper T cells, resting NK cells, activated NK cells, and activated dendritic cells were less abundant than in the control group (Fig. 8 A). There were also individual differences in the proportions of immune cells among EM patients. The proportions of immune cells in the 61 samples are shown in Figure 8 B.
Immune infiltration analysis of hub genes. (A) Analysis of immune cell infiltration in EM group and healthy control group; (B) relative percentages of 22 immune cell subpopulations in 61 samples; (C) correlation between FZD4 expression level and immune cells; (D) correlation between SRPX2 expression level and immune cells; (E) correlation between COL8A1 expression level and immune cells; (F) correlation analysis of FZD4 and NK cells; (G) correlation analysis of SRPX2 with neutrophils; (H) correlation analysis of SRPX2 with regulatory T cells; (I) correlation analysis of COL8A1 with resting memory CD4+ T cells; (J) correlation analysis of COL8A1 with regulatory T cells. EM = endometriosis, NK cells = natural killer cells.
To better understand the functional roles of hub genes in immune infiltration, we performed correlation analyses for each hub gene separately (Fig. 8 C–E). The analysis revealed a negative correlation between FZD4 expression and NK cell abundance ( R = −0.42, P = .012) (Fig. 8 F), whereas none of the positive correlations between FZD4 and other immune cell types reached statistical significance. SRPX2 exhibited a positive correlation with neutrophil abundance ( R = 0.42, P = .011) and a negative correlation with regulatory T cell abundance ( R = −0.36, P = .033) (Fig. 8 G and H). COL8A1 demonstrated a positive correlation with resting memory CD4+ T cell abundance ( R = 0.47, P = .0041) and a negative correlation with regulatory T cell abundance ( R = −0.47, P = .0039) (Fig. 8 I and J). These correlations indicate that hub gene expression is significantly associated with specific immune cell populations, supporting a link between these genes and disease progression.
TFs prediction analysis indicated that FZD4 is regulated by 5 TFs, SRPX2 by 6 TFs, and COL8A1 by 7 TFs. FOXC1 can concurrently regulate all 3 genes (Fig. 9 A). By merging the prediction results from 5 databases and identifying the common elements, a total of 13 miRNAs were identified (Fig. 9 B). Subsequently, a ceRNA network was constructed to explore the regulatory mechanisms of FZD4, SRPX2, and COL8A1 (Fig. 9 C). The network comprises 21 nodes (3 genes, 13 miRNAs, 5 lncRNAs) and 28 interacting edges. C10orf91 was found to bind to miR-31-5p and regulate COL8A1, FZD4, and SRPX2 simultaneously.
Prediction of TFs for hub genes and construction of ceRNA network. (A) hub genes – TFs interaction network: yellow arrow indicates hub genes; red ellipse indicates upregulated TFs; green ellipse indicates downregulated TFs; purple ellipse indicates non-differentially expressed TFs; (B) 13 miRNAs were obtained from intersection of 5 databases; (C) ceRNA network: pink diamonds for hub genes; purple ovals for miRNAs; green arrows for lncRNAs. ceRNA = competing endogenous RNA, lncRNAs = long non-coding RNAs, miRNAs = microRNAs, TFs = transcription factors.
Discussion
Through analysis of DEGs and WGCNA, we identified 56 EM-AAGs. These genes were significantly enriched in angiogenesis regulation and cell adhesion, suggesting that they may participate in the physiological and pathological processes of EM by regulating the vascular microenvironment, cell interactions, and signal transduction, thereby providing direction for elucidating the molecular mechanisms of EM. Through multi-algorithm CV, 3 hub genes, FZD4, SRPX2, and COL8A1, were ultimately identified. Their expression levels were significantly upregulated in EM patients and demonstrated good diagnostic efficacy.
FZD4 is a transmembrane receptor for WNT ligands that orchestrates canonical WNT/β-catenin signaling. In EM lesions, FZD4 expression was significantly elevated, corroborating previous reports implicating aberrant WNT signaling in endometrial pathophysiology. Upon WNT ligand binding, FZD4 recruits LRP5/6 co-receptors, leading to β-catenin stabilization, nuclear translocation, and transcriptional activation of proangiogenic and proliferative target genes including VEGF and Cyclin D1. [ 19 , 20 ] Inhibition of FZD4 or blockade of upstream activators in EM models has been shown to attenuate neovascularization and lesion growth. [ 21 ] These findings support a model in which FZD4-mediated WNT/β-catenin activation constitutes a critical driver of angiogenesis and tissue expansion in endometriosis, making it a promising target for therapeutic intervention.
SRPX2 functions as an extracellular matrix protein that promotes early angiogenic remodeling. Knockout studies in endothelial cell models demonstrate that loss of SRPX2 specifically impairs endothelial cell migration and delays vascular sprouting. [ 22 ] Mechanistic investigations reveal that SRPX2 interacts with focal adhesion kinase (FAK) and integrinβ1, triggering FAK phosphorylation and downstream activation of Src, PI3K/Akt, and Rac1 pathways. [ 23 , 24 ] This signaling cascade enhances endothelial cell adhesion, motility, and tube formation – hallmarks of active angiogenesis. Aberrant FAK activation within EM lesions potentiates the proangiogenic microenvironment, reinforcing SRPX2 as a candidate biomarker and therapeutic target to disrupt lesion vascularization.
COL8A1 is a short-chain collagen family member localized to basement membranes and smooth muscle layers. COL8A1 overexpression in EM lesions contributes to angiogenesis via 2 complementary mechanisms: it enhances VEGF secretion to activate endothelial cells, and it increases matrix stiffness through augmented collagen deposition, thereby facilitating mechanotransduction signals that stabilize nascent vessels. Furthermore, COL8A1 cooperates with matrix metalloproteinases (MMPs) to remodel the extracellular matrix, creating conduits for endothelial invasion. [ 25 ] Given its dual role in biochemical and biomechanical regulation of angiogenesis, COL8A1 represents a novel therapeutic entry point to attenuate aberrant vascular support for EM lesions.
GSEA revealed that FZD4, COL8A1, and SRPX2 are collectively associated with suppression of the cell cycle pathway. Crucially, aberrant cell cycle regulation is closely linked to angiogenesis in EM. Studies demonstrate that within endometriotic lesions, dysregulated cell cycle proteins, specifically overexpression of cyclins D1, A, and B1 and reduced expression of cyclin-dependent kinase inhibitors, including p21 and p27kip1, drive aberrant cell proliferation. Concurrently, by upregulating pro-angiogenic factors such as VEGF, this dysregulation drives angiogenesis in ectopic lesions. [ 26 ] Furthermore, Arcyriaflavin A, a targeted inhibitor of the cyclin D1-CDK4 complex, induces apoptosis and suppresses proliferation in ectopic cells while reducing VEGF secretion, thereby inhibiting angiogenesis. [ 27 ] Notably, cell cycle proteins can modulate angiogenesis-related signaling pathways by participating in transforming growth factor (TGF)-β-mediated epithelial-mesenchymal transition (EMT). This ultimately promotes angiogenesis in ectopic lesions. [ 28 ] These findings reveal an interplay between the cell cycle regulatory network and angiogenesis in the pathogenesis of EM. Targeting this crosstalk holds promise as a novel therapeutic strategy to intervene in cell cycle dysregulation and curb lesion progression in EM.
Immune infiltration analysis showed that EM patients had higher proportions of M1/M2 macrophages than healthy women, whereas protective immune cells like NK cells were notably reduced. Dysregulated M1/M2 macrophage balance drives pathological angiogenesis in EM. Research indicates that M2 macrophages directly promote angiogenesis in ectopic lesions by secreting pro-angiogenic factors such as VEGF and platelet-derived growth factor (PDGF). For instance, Bacci et al [ 29 ] demonstrated in animal models that macrophages from EM patients exhibit an M2-polarized phenotype. These macrophages significantly upregulate VEGF expression through PI3K/Akt pathway activation while secreting interleukin (IL)-8 and TGF-β to collaboratively promote endothelial cell migration and lumen formation. Furthermore, lactate derived from glycolytic activity in ectopic endometrial stromal cells induces M2 polarization, which via the Mettl3/Trib1/ERK/STAT3 axis enhances VEGF secretion and augments pro-angiogenic capacity. Conversely, in the initial phase of EM, M1 macrophages exert anti-angiogenic effects through IFN-γ/TNF-α secretion and concurrently initiate Th1-polarized immunity to eradicate ectopic endometrial cells. Thiruchelvam et al [ 30 ] revealed that M1 macrophage-derived IL-12/Angptl4 suppress endothelial proliferation. Critically, M2 polarization counteracts this effect by downregulating anti-angiogenic factors in their secretome. Disease progression is thus characterized by M1-to-M2 transition, skewing the inflammatory milieu toward pro-angiogenic dominance.
Dysfunction of natural killer (NK) cells is closely linked to vascularization of ectopic lesions. Studies demonstrate that NK cells within peritoneal fluid and ectopic lesions of EM patients exhibit significant phenotypic and functional alterations, including overexpression of inhibitory receptors and downregulation of activating receptors. These changes impair cytotoxic activity, compromising the clearance of ectopic endometrial cells. Consequently, this defect in immune surveillance creates a favorable microenvironment for ectopic tissue survival and angiogenesis. [ 31 ]
Transcription factor prediction analysis suggests that FOXC1 may co-regulate 3 hub genes. Previous studies have confirmed significantly elevated mRNA and protein expression of FOXC1 in ectopic endometrial tissues of EM patients. FOXC1 promotes cell proliferation, migration, and invasion by activating the PI3K/Akt signaling pathway. [ 32 ] The PI3K/Akt pathway is a key regulatory pathway for angiogenesis, capable of upregulating pro-angiogenic factors and promoting neovascularization, thereby supporting the survival of ectopic endometrial tissues. [ 33 , 34 ] In the constructed ceRNA network, we observed an interaction between C10orf91 and miR-31-5p. Current research indicates that miR-31-5p can promote endothelial cell proliferation, migration, and angiogenesis. [ 35 ] The pro-angiogenic effects mediated by the FOXC1-PI3K/Akt axis, coupled with the potential regulation by miR-31-5p, may collectively form the molecular basis of a multi-level regulatory network for angiogenesis in EM. This offers novel insights for future mechanistic exploration and targeted therapeutic interventions.