Biomarker Prediction and Immune Landscape of Angiogenesis in Inflammatory Bowel Disease: Insights from Bioinformatics and Machine Learning Approaches | Research Square

Pengliang Zhang, Shuang Chen, Xianmin Liu, Lijuan Wu, Yingjian Zhang

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-6554786/v1 This work is licensed under a CC BY 4.0 License. Status: Posted. Version 1 posted.

Abstract

Background: The pathogenesis of inflammatory bowel disease (IBD) remains poorly understood, with angiogenesis playing a crucial role in its development. This study primarily aims to identify effective biomarkers of angiogenesis in IBD and to enhance the understanding of the disease's immunological characteristics.
Methods: Datasets related to IBD were obtained from the GEO database: one for bioinformatics analysis and machine learning, and another for external validation. Angiogenesis gene sets were obtained from the MSigDB database, and IBD-related angiogenesis genes were identified by intersecting them with the IBD dataset. Support Vector Machine (SVM), Lasso regression, and Random Forest (RF) models were employed to identify marker genes. The diagnostic performance of the marker genes was evaluated using the receiver operating characteristic (ROC) curve and a diagnostic nomogram. Single-sample gene set enrichment analysis (ssGSEA) was used to characterize the immune landscape, and correlation analysis was conducted to explore the relationship between the marker genes and immune infiltration.

Results: The Lasso, RF, and SVM analyses converged on three key genes: CXCL8, THY1, and COL4A2. The biological processes associated with these genes primarily involve cytokine interactions, chemotaxis, extracellular matrix-receptor interactions, and oxidative phosphorylation, among others. Immune infiltration analysis demonstrated a significant increase in 11 immune cell types within the IBD samples. Furthermore, these signature genes exhibited a strong correlation with various immune cells.

Conclusions: CXCL8, THY1, and COL4A2 were identified as potential biomarkers for angiogenesis in IBD. The immune responses mediated by these biomarkers play a critical role in IBD angiogenesis through interactions with immune-infiltrating cells.

Biological sciences/Immunology Health sciences/Gastroenterology

1. Introduction

Inflammatory bowel disease (IBD) is a chronic, progressive, immune-mediated inflammatory condition of the gastrointestinal tract, predominantly diagnosed during adolescence and young adulthood [1]. It primarily encompasses Crohn's disease and ulcerative colitis, with a prevalence of approximately 1% and an increasing incidence among children in recent years [2, 3]. In Crohn's disease, intestinal involvement is typically segmental, often affecting the terminal ileum, and is characterized histologically by epithelioid granulomas. In contrast, ulcerative colitis presents with diffuse inflammation, usually beginning in the rectum and extending proximally through the colon. Clinically, IBD commonly manifests as chronic diarrhea (with or without hematochezia), abdominal pain, and weight loss [4, 5, 6]. Although the precise etiology of IBD remains unclear, it is hypothesized to result from a complex interplay of genetic predisposition, undefined environmental factors, alterations in the gut microbiome, and immune dysregulation [7, 8]. Regarding the diversity of the host gut microbiome, a reduction in Firmicutes and an increase in Proteobacteria and Bacteroidetes may result in decreased production of short-chain fatty acids and impaired function of regulatory T cells and epithelial cells, thereby elevating the risk of developing IBD [9, 10]. Moreover, research indicates that first-degree relatives of IBD patients have a fivefold increased risk of developing the condition [9]. For diagnostic purposes, fecal calprotectin can be used to identify patients with gastrointestinal symptoms who may have IBD, while imaging techniques can safely, easily, and reliably detect inflammation. Regular re-evaluation of disease status is essential for patients [11].
Given the high prevalence of IBD among younger populations and its significant impact on patients' quality of life, it is anticipated that IBD will become a major health concern in developing countries in the near future [ 12 ]. Consequently, early diagnosis and treatment of IBD are crucial. In particular, early intervention in Crohn’s disease may alter the disease's natural progression and reduce the risk of disability. Angiogenesis is a highly intricate process involving various cell types, growth factors, cytokines, adhesion molecules, and signaling pathways [ 13 ]. It plays a crucial role in the pathogenesis of inflammatory bowel disease (IBD), with chronic inflammation and angiogenesis being interrelated processes; specifically, chronic intestinal inflammation is reliant on angiogenesis. Angiogenesis facilitates the inflammatory response by promoting leukocyte migration and supplying oxygen and nutrients, while also contributing significantly to wound healing [ 14 – 18 ]. Moreover, numerous studies have documented physiological alterations in vascular anatomy and the upregulation of angiogenic mediators in IBD patients, indicating that a deeper understanding of the angiogenic process could lead to more effective treatments for chronic intestinal inflammation [ 19 ]. Research has revealed significantly elevated levels of serum vascular endothelial growth factor (VEGF) and basic fibroblast growth factor (b-FGF) in IBD patients compared to controls [ 20 ]. Moreover, studies utilizing animal models of experimental intestinal inflammation have demonstrated significant angiogenesis in the two primary forms of inflammatory bowel disease (IBD)—Crohn's disease and ulcerative colitis—suggesting that inhibiting this process could hold therapeutic potential [ 21 ]. Additionally, it has been observed that mucosal extracts from IBD patients induce strong angiogenic responses in corneal and chorioallantoic membrane assays compared to extracts from normal mucosa [ 22 ]. 
Nonetheless, the specific role of angiogenic genes in the pathogenesis of IBD requires further investigation. This study aims to identify diagnostic and therapeutic biomarkers associated with angiogenesis in IBD using bioinformatics approaches, and to elucidate the role of these biomarkers through functional and molecular mechanism analyses, thereby providing a theoretical foundation for the clinical diagnosis and treatment of the disease. The complete analysis workflow is presented in Fig. 1.

2. Method

2.1 Data Source and Preprocessing

Transcriptome microarray data from colon tissue samples of patients with IBD were retrieved from the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/) and processed using R software. Probes lacking corresponding gene symbols were excluded. Where multiple probes targeted the same gene, the mean expression value was used. Expression profiles were normalized using the between-array normalization function from the limma package. Two datasets were used: GSE165512 and GSE75214. The GSE165512 dataset comprises 46 control samples, 84 samples from patients with Crohn's disease, and 40 samples from patients with ulcerative colitis. The GSE75214 dataset, serving as an external validation set, includes 22 control samples, 8 samples from patients with Crohn's disease, and 97 samples from patients with ulcerative colitis. Additionally, 48 angiogenesis-related genes (ARGs) were obtained from the MSigDB database (Hallmark gene set).

2.2 Differential Expression Analysis of Genes

Differential expression analysis was conducted using the R package limma to calculate fold changes in gene expression levels and the corresponding p-values.
The p-values were adjusted for multiple comparisons using the False Discovery Rate (FDR) method. Differentially expressed genes were defined by an absolute log fold change (|logFC|) > 0.5 and an adjusted p-value (P.adjust) < 0.05. Heat maps illustrating the differential expression of genes were generated using the R packages pheatmap and ggplot2.

2.3 Gene enrichment analysis

Gene enrichment analysis of the ARGs was conducted using the clusterProfiler package and the DAVID website. The analysis covered Gene Ontology biological processes (BP), cellular components (CC), and molecular functions (MF), as well as KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways, with the objective of identifying the biological processes and signaling pathways significantly associated with ARGs. The results were visualized using the ggplot2 and circlize packages.

2.4 Machine learning predictive model construction

The most diagnostic biomarkers for IBD were identified by screening the significantly differentially expressed ARGs with a machine learning approach. Specifically, five machine learning models were constructed using the caret, randomForest, and glmnet packages in R: Random Forest (RF), Support Vector Machine (SVM), Generalized Linear Model (GLM), eXtreme Gradient Boosting (XGB), and Lasso (Least Absolute Shrinkage and Selection Operator) regression. Each model was run with default parameter settings and evaluated through cross-validation to assess robustness and generalizability. To facilitate interpretation of the predictions, the DALEX package was used to visualize the residual distribution and feature importance.
This visualization identified the features exerting the greatest influence on model predictions, thereby highlighting potential biomarkers. Furthermore, the pROC package was used to construct receiver operating characteristic (ROC) curves and compute the area under the curve (AUC), thereby evaluating each model's ability to differentiate between patients with IBD and healthy control subjects.

2.5 Diagnostic nomogram

The rms package in R was used to develop a nomogram, a receiver operating characteristic (ROC) curve, and a calibration curve, thereby offering a clinical perspective on the diagnosis of IBD. Each patient was first assigned a score for each gene based on its expression level, and these scores were summed to generate a total score. The total score indicates the potential risk of IBD for each individual, which is read off the nomogram. The nomogram serves as an intuitive graphical tool for illustrating the outcomes of predictive models. ROC curves were used to assess the diagnostic performance of the model, specifically its ability to differentiate between IBD patients and healthy controls.

2.6 Analysis of the immune infiltration

The R packages GSVA and GSEABase were used to conduct single-sample Gene Set Enrichment Analysis (ssGSEA) in order to assess differences in immune infiltration between patients with IBD and control groups. Immune-related gene sets were sourced from the Molecular Signatures Database (MSigDB) to quantify the extent of immune infiltration in IBD patients. Initially, an enrichment score for each immune gene set was computed for each sample using the ssGSEA approach.
After normalization of these enrichment scores, the enrichment scores of the various immune cell types were compared between IBD and healthy samples using the Wilcoxon test to identify immune cell types significantly associated with disease status. Furthermore, the Spearman rank correlation coefficient was used to evaluate the relationship between marker gene expression levels and immune cell enrichment scores, thereby identifying gene and immune cell pairs that exhibited both statistically significant and strong correlations.

2.7 Construction of the TF-miRNA-mRNA regulatory network

The miRNet database was used to predict microRNAs (miRNAs) and transcription factors (TFs) associated with the biomarker genes, and a regulatory network describing the interactions among TFs, miRNAs, and messenger RNAs (mRNAs) was constructed using the Cytoscape platform.

3. Results

3.1 The screening of the differentially expressed genes

A total of 1,945 differentially expressed genes (DEGs) were identified from the expression data using screening criteria of an adjusted p-value < 0.05 and |logFC| > 0.5. Among these, 1,078 genes were upregulated and 867 genes were downregulated in patients with IBD. These findings indicate a marked difference in gene expression between IBD patients and healthy controls, providing a foundation for further exploration of the molecular mechanisms underlying IBD. To visualize the expression patterns of the DEGs, a heat map (Fig. 2A) and a volcano plot (Fig. 2B) were constructed. Subsequent intersection with the angiogenesis-related genes (ARGs) from the MSigDB database identified nine key differentially expressed ARGs (Fig. 2C).
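The screening step described above (adjust p-values, apply cutoffs, then intersect with the angiogenesis gene list) can be sketched in a few lines. This is a minimal illustration, not the study's limma pipeline: the cutoffs |logFC| > 0.5 and adjusted p < 0.05 are an assumption about the intended thresholds, the function names are hypothetical, and the toy statistics are invented for demonstration only.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def screen_degs(stats, lfc_cutoff=0.5, padj_cutoff=0.05):
    """stats maps gene -> (logFC, raw p-value); returns (up, down) gene sets."""
    genes = list(stats)
    padj = benjamini_hochberg([stats[g][1] for g in genes])
    up, down = set(), set()
    for gene, q in zip(genes, padj):
        lfc = stats[gene][0]
        if q < padj_cutoff and abs(lfc) > lfc_cutoff:
            (up if lfc > 0 else down).add(gene)
    return up, down


# Hypothetical toy statistics (not values from the study).
toy_stats = {
    "CXCL8": (2.1, 0.0001),
    "THY1": (0.9, 0.002),
    "GENE_A": (-1.2, 0.004),
    "GENE_B": (0.2, 0.001),  # significant, but below the fold-change cutoff
    "GENE_C": (1.5, 0.60),   # large change, but not significant
}
angiogenesis_set = {"CXCL8", "THY1", "GENE_C"}  # stand-in for the MSigDB list

up, down = screen_degs(toy_stats)
key_args = (up | down) & angiogenesis_set
print(sorted(key_args))
```

Requiring both cutoffs excludes genes that are statistically significant but barely changed, as well as genes with large but unreliable changes.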
To investigate the functional enrichment of angiogenesis-related genes within biological processes (BP), cellular components (CC), and molecular functions (MF), as well as their involvement in specific pathological pathways, Gene Ontology (GO) enrichment analysis was conducted on angiogenesis-related genes (ARGs). The findings were visualized using histograms and chord plots (Fig. 3A and 3B). The GO enrichment analysis revealed that ARGs are prominently associated with angiogenesis and its regulatory mechanisms, including both positive and negative regulation, as well as the response to hypoxia, underscoring the critical role of oxygen deficiency in inflammatory responses. The analysis also highlighted the negative regulation of cell proliferation and growth, indicating a potential role for dysregulated cell growth in inflammatory bowel disease (IBD). Additionally, ARGs were implicated in the regulation of cell adhesion and inflammatory responses. These genes were predominantly enriched in extracellular regions, such as the extracellular space, platelet α-granule lumen, and cell surface. The enriched molecular functions included protein binding, cytokine activity, and growth factor activity, suggesting a significant role for ARGs in cell signaling and intercellular communication. The KEGG pathway enrichment analysis results underscored several pathways significantly associated with ARGs, such as the cancer pathway, AGE-RAGE signaling in diabetic complications, cytokine-cytokine receptor interactions, rheumatoid arthritis, and inflammatory bowel disease, as depicted in Figs. 3C and 3D. By integrating the enrichment analysis findings from both GO and KEGG pathways, we enhanced our comprehension of the pathogenesis of inflammatory bowel disease (IBD) mediated by ARGs, particularly in relation to angiogenesis, cell-cell interactions, inflammatory responses, and immune regulation. 
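Over-representation analyses of the kind summarized above are commonly scored with a hypergeometric tail probability. The sketch below shows that calculation for a single term. It is a general illustration rather than the exact statistic computed by clusterProfiler or DAVID in this study, and the function name and the numbers in the example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(universe, term_size, list_size, overlap):
    """P(X >= overlap) when list_size genes are drawn from a universe in
    which term_size genes carry the annotation of interest."""
    upper = min(term_size, list_size)
    total = comb(universe, list_size)
    tail = sum(
        comb(term_size, k) * comb(universe - term_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 9 query genes, 4 of which fall in a 200-gene term
# drawn from a 20,000-gene background.
p_value = hypergeom_enrichment_p(universe=20000, term_size=200,
                                 list_size=9, overlap=4)
print(f"enrichment p-value: {p_value:.3g}")
```

With a short query list such as nine genes, even a small overlap can give a small p-value, so multiple-testing correction across terms still matters.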
3.2 Chromosome localization

The chromosomal localization of the nine key ARGs, identified by intersecting the differentially expressed genes with the ARG list, was analyzed using the RCircos package. Using the built-in UCSC.HG19.Human.CytoBandIdeogram, human chromosome data were represented in the outer circle of the plot, encompassing chromosomes 1 to 22 as well as the X and Y sex chromosomes. The chromosomal locations of the marker genes were annotated using the R package biomaRt. The chromosomes were distinguished by differently colored bands, and the positions of the nine marker genes were indicated on the diagram (Fig. 4A). Specifically, RHOB was located on chromosome 2, PROK2 on chromosome 3, CXCL8 on chromosome 4, ROBO4 and THY1 on chromosome 11, COL4A2 on chromosome 13, CHRNA7 on chromosome 15, and SERPINF1 and SPHK1 on chromosome 17.

3.3 Co-expression analysis of genes in key ARGs

For the nine key ARGs, namely CXCL8, THY1, COL4A2, PROK2, ROBO4, SPHK1, RHOB, CHRNA7, and SERPINF1, we analyzed the correlations between their expression levels. The results are visualized in the correlation heat map in Fig. 4B. The values in the figure are Spearman rank correlation coefficients between the genes, which range from −1 to 1. Positive values indicate a positive correlation and negative values a negative correlation; the greater the absolute value, the stronger the correlation. The figure shows that CXCL8 is strongly positively correlated with PROK2, THY1 is strongly positively correlated with SPHK1 and ROBO4, COL4A2 is strongly correlated with ROBO4, RHOB, and SERPINF1, and both ROBO4 and RHOB are strongly positively correlated with SERPINF1. In contrast, CHRNA7 shows a relatively low correlation with the other genes.
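The Spearman coefficient used for the heat map is the Pearson correlation computed on ranks, with tied values sharing their average rank. A minimal standard-library sketch follows. The function names are illustrative and the expression vectors are invented, not taken from the study.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical expression values for two genes across six samples.
gene_a = [2.1, 3.4, 1.0, 5.6, 4.2, 2.9]
gene_b = [1.8, 2.9, 0.7, 9.9, 3.1, 3.0]
print(round(spearman(gene_a, gene_b), 3))
```

Because only ranks enter the calculation, the coefficient is insensitive to monotone transformations such as taking logarithms of expression values.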
3.4 Machine learning screening for biomarkers

For diagnostic gene screening in IBD, the dataset was first partitioned into a training set comprising 70% of the data and a test set comprising the remaining 30%. Using the caret package in R, five machine learning models were developed: Random Forest (RF), Support Vector Machine (SVM), Generalized Linear Model (GLM), Lasso regression (Lasso), and Extreme Gradient Boosting (XGB). Cross-validation was used to assess the overall performance of these models. The DALEX package was used to visualize the models' residual distributions and feature importance, and the pROC package was used to compute and plot the ROC curves and AUC values. The residual distributions of the five models are shown in Figs. 5A and 5B. In each boxplot, the red dot represents the root-mean-square (RMS) value of the residuals for that model. The boxplots indicate that the residual distributions of the RF, SVM, and Lasso models are the most concentrated, with the SVM model showing the tightest distribution and the lowest RMS value. This implies a high level of predictive accuracy for the SVM model. In contrast, the XGB model exhibits the widest range of residuals and the highest RMS value, suggesting a potentially greater prediction error. The GLM occupies an intermediate position with respect to these metrics.
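The RMS summary shown by the red dots can be reproduced by hand. The sketch below assumes that a residual is the observed class label (0 or 1) minus the predicted probability of IBD. The function name and the two sets of predictions are hypothetical and do not represent the study's models.

```python
from math import sqrt


def residual_rms(y_true, y_prob):
    """Root-mean-square of residuals (observed label minus predicted probability)."""
    residuals = [y - p for y, p in zip(y_true, y_prob)]
    return sqrt(sum(r * r for r in residuals) / len(residuals))


# Hypothetical labels and predicted probabilities from two models.
labels = [1, 0, 1, 0]
model_a = [0.9, 0.1, 0.8, 0.2]   # confident and correct
model_b = [0.6, 0.4, 0.5, 0.5]   # hesitant

rms_a = residual_rms(labels, model_a)
rms_b = residual_rms(labels, model_b)
print(f"model A RMS = {rms_a:.3f}, model B RMS = {rms_b:.3f}")
```

A lower RMS indicates predictions that sit closer to the true labels on average, which is why the most concentrated boxplot also tends to carry the lowest red dot.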
Upon evaluating the prediction accuracy of the models using the Receiver Operating Characteristic (ROC) curve, it was confirmed that the Random Forest (RF), Support Vector Machine (SVM), and Lasso models demonstrated superior predictive accuracy among the five machine learning models assessed, as evidenced by their respective Area Under the Curve (AUC) values (Fig. 5C). Specifically, the RF, SVM, and Lasso models exhibited relatively higher AUC values in distinguishing inflammatory bowel disease (IBD) status, with values of 0.886, 0.873, and 0.871, respectively. An AUC value approaching 1 indicates high predictive accuracy, whereas an AUC value near 0.5 suggests performance akin to random guessing, rendering it ineffective. In this study, an AUC greater than 0.7 is considered indicative of good predictive performance. Although the Generalized Linear Model (GLM) and Extreme Gradient Boosting (XGB) models also demonstrated some predictive significance, the combined analysis of residual distribution results further supports the superior performance of the RF, SVM, and Lasso models. In this study, Random Forest (RF), Support Vector Machine (SVM), and Lasso regression—three machine learning models known for their superior performance—were employed to assess the feature importance of nine genes (refer to Fig. 5D) and to identify the most critical marker genes. The RF models were developed using the randomForest package within the R package caret, while the SVM models were constructed utilizing the R package e1071 in conjunction with caret. The method of "one minus AUC loss after permutations" was applied, wherein each feature was randomly permuted to evaluate its impact on model accuracy, as measured by the Area Under the Curve (AUC). The permutation test was used to assess the effect of each feature's alteration on model performance. 
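The "one minus AUC loss after permutations" measure can be illustrated as follows: shuffle one feature at a time, recompute 1 − AUC, and report the increase over the unshuffled loss. This is a simplified sketch of the idea, not the DALEX implementation. The helper names, the toy data, and the stand-in model `toy_predict` are hypothetical.

```python
import random


def auc(labels, scores):
    """Rank-based AUC: chance that a random positive outscores a random
    negative, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_importance(predict, rows, labels, n_features, repeats=20, seed=0):
    """Mean increase in (1 - AUC) after shuffling each feature column."""
    rng = random.Random(seed)
    base_loss = 1 - auc(labels, [predict(r) for r in rows])
    importances = []
    for j in range(n_features):
        losses = []
        for _ in range(repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            losses.append(1 - auc(labels, [predict(r) for r in shuffled]))
        importances.append(sum(losses) / repeats - base_loss)
    return importances


def toy_predict(row):
    """Hypothetical model that relies only on the first feature."""
    return row[0]


# Hypothetical data: feature 0 separates the classes, feature 1 is noise.
rows = [[0.9, 0.3], [0.8, 0.7], [0.7, 0.1], [0.2, 0.9], [0.3, 0.4], [0.1, 0.6]]
labels = [1, 1, 1, 0, 0, 0]

importance = permutation_importance(toy_predict, rows, labels, n_features=2)
print(importance)
```

A feature the model ignores has an importance of exactly zero, whereas shuffling an informative feature raises the loss; this is the basis for ranking the nine genes.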
Based on the combined results from the SVM and RF models, it was preliminarily concluded that RHOB, SERPINF1, and PROK2 are of relatively low importance. Furthermore, a Lasso regression model was used to predict IBD status from the expression levels of the nine genes. Lasso regression adds a penalty term to the conventional loss function, constraining the coefficients and shrinking some of them to exactly zero. In Fig. 5E, the x-axis shows the logarithm of the penalty parameter lambda (Log Lambda), and the y-axis shows the magnitude of the coefficients. Each line represents the coefficient of one variable, which shrinks from a non-zero value toward zero as lambda increases (moving from left to right). As lambda increases, more coefficients are driven to zero, resulting in a more parsimonious model. Figure 5F presents the cross-validation error plot, where the x-axis again shows Log Lambda and the y-axis shows the binomial deviance, a widely used loss function in classification tasks. The red solid line depicts the mean cross-validation error for each lambda value, while the dotted lines show the error range, represented by the standard error. The second vertical dashed line, selected in this study, marks the most regularized model whose error lies within one standard error of the minimum (the 1-SE rule). This rule favors a simpler, more regularized model that tends to generalize better, thereby mitigating the risk of overfitting. In conclusion, the most parsimonious model for predicting IBD status was built from CXCL8, THY1, and COL4A2.
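The 1-SE rule described above amounts to a short selection procedure over the cross-validation summary: find the lambda with the lowest mean error, then take the largest lambda whose mean error is still within one standard error of that minimum. The sketch below uses invented cross-validation numbers and a hypothetical function name; it does not reproduce the values behind Fig. 5F.

```python
def select_lambda_1se(lambdas, cv_mean, cv_se):
    """Return (lambda_min, lambda_1se) from per-lambda CV mean error and SE."""
    best = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    threshold = cv_mean[best] + cv_se[best]
    # Larger lambda means stronger shrinkage, hence a sparser model.
    eligible = [lam for lam, err in zip(lambdas, cv_mean) if err <= threshold]
    return lambdas[best], max(eligible)


# Hypothetical cross-validation summary (binomial deviance per lambda).
lambdas = [0.001, 0.01, 0.05, 0.1, 0.5]
cv_mean = [0.82, 0.80, 0.81, 0.83, 1.10]
cv_se = [0.04, 0.04, 0.04, 0.05, 0.06]

lambda_min, lambda_1se = select_lambda_1se(lambdas, cv_mean, cv_se)
print(lambda_min, lambda_1se)
```

In this toy example the minimum error occurs at lambda = 0.01, but lambda = 0.1 is statistically indistinguishable from it and keeps fewer coefficients, so the 1-SE rule prefers it.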
These markers were ultimately identified as the key indicators in our analysis.

3.5 Analysis using Gene Set Enrichment Analysis (GSEA)

For each marker gene, we divided patient samples into high and low expression groups based on the expression level of that gene. We then conducted differential expression analysis between the two groups and ranked the genes by log fold change (logFC) in descending order. Using this ranked list, we performed Gene Set Enrichment Analysis (GSEA) against the human KEGG database to identify pathways enriched in the high expression group compared to the low expression group. This approach allowed us to evaluate the KEGG pathways potentially influenced by each marker gene. GSEA was carried out for each of the key biomarkers identified by the machine learning models. Figure 6A illustrates the GSEA results for CXCL8, revealing significant enrichment of several related pathways in the high expression cohort. These pathways include cytokine-cytokine receptor interaction, chemokine signaling, Toll-like receptor signaling, and Nod-like receptor signaling pathways. Notably, these pathways exhibit substantial negative enrichment scores (ES), suggesting significant downregulation in the high expression group. Additionally, the Leishmania infection pathway is enriched, potentially indicating a broader immune response pattern rather than a direct pathogen infection. This observation implies that elevated CXCL8 expression may be associated with the downregulation of these immune-related pathways. Figure 6B presents the GSEA results for THY1, indicating that pathways associated with maturity-onset diabetes of the young (MODY), drug metabolism via cytochrome P450, and other drug-metabolizing enzymes were significantly enriched in the low-expression group.
The enrichment scores (ES) for these pathways were notably high, suggesting significant upregulation in the low-expression group. This finding may imply that reduced THY1 expression is closely linked to MODY and could potentially enhance the metabolism of various drugs. Furthermore, pathways related to neuroactive ligand-receptor interactions and extracellular matrix-receptor interactions were significantly enriched in the high-expression group, with relatively large negative enrichment scores. This suggests significant downregulation in the high-expression group, implying that upregulation of THY1 may downregulate pathways involved in neuroactive ligand-receptor interactions and extracellular matrix-receptor interactions, thereby affecting intercellular communication. Figure 6C presents the GSEA results for COL4A2, highlighting three principal pathways associated with the low expression group: ribosomes, oxidative phosphorylation, and Huntington's disease. The substantial enrichment scores suggest a significant upregulation of these pathways in the low expression group, implying that COL4A2 expression may contribute to the upregulation of pathways related to ribosomes and oxidative phosphorylation, and may be closely linked to Huntington's disease. Conversely, pathways associated with cytokine-cytokine receptor interaction and extracellular matrix-receptor interaction are markedly enriched in the high expression group, as indicated by large negative enrichment scores. This suggests a significant downregulation of these immune-related pathways in the high expression group, indicating that elevated COL4A2 expression may lead to the downregulation of immune-related pathways.

3.6 Validation of external datasets for biomarker analysis
To evaluate how well the expression level of each individual gene distinguishes IBD from healthy states, we conducted a receiver operating characteristic (ROC) analysis using the pROC package. ROC curves for each gene were constructed by coding healthy status as 0 and IBD status as 1, and the corresponding area under the curve (AUC) was calculated to quantify predictive performance. The ROC analysis (Fig. 7A) indicated that CXCL8 had the highest AUC (0.796), followed by COL4A2 (0.748) and THY1 (0.734). These findings suggest that the expression levels of these genes have potential as biomarkers for predicting IBD status. To assess the reliability of these results, the analysis was repeated on the 127 samples of the external validation set (GSE75214) to evaluate how well each gene distinguishes IBD from healthy conditions. The findings, presented in Fig. 7B, demonstrate that all three marker genes exhibit strong predictive power. Additionally, boxplots (Figs. 7C and 7D) show that the expression levels of the three genes are significantly elevated in the IBD group compared to the healthy group. This observation is consistent with the expression patterns of the marker genes in the GSE165512 dataset used at the outset of the study, further supporting the reliability of the results.

3.7 Development of a diagnostic nomogram utilizing biomarkers

We used the rms R package to develop a nomogram, receiver operating characteristic (ROC) curve, and calibration curve for the clinical diagnosis of IBD. Initially, we constructed a nomogram (refer to Fig.
8A), which generates a total score by assigning a score to each gene based on its expression level, subsequently summing these scores to estimate the risk of IBD. Within the nomogram, the uppermost 'Points' band delineates a score range for each gene or variable. Scores corresponding to the patient's gene expression levels can be identified on this band, and these scores are aggregated to derive the 'Total Points.' The 'Linear Predictor' scale converts the total score into a linear predictive value, facilitating the mapping of cumulative scores onto a probability scale. At the bottom of the nomogram, the 'Probability of IBD' band displays the likelihood of IBD occurrence as determined by the cumulative score. This probability scale typically ranges from 0 to 1, with each probability value corresponding to distinct risk categories. We additionally employed the Receiver Operating Characteristic (ROC) curve to illustrate the relationship between the sensitivity and specificity of the model, as depicted in Fig. 8B. The model achieved an Area Under the Curve (AUC) value of 0.853, signifying a high level of diagnostic accuracy. AUC values approaching 1.0 are indicative of substantial predictive capability. In this context, the elevated AUC value suggests that the gene expression level serves as a robust biomarker for differentiating between Inflammatory Bowel Disease (IBD) and healthy status. Subsequently, we validated the predictive accuracy of the model through the application of a calibration curve (Fig. 8C), which demonstrated a strong concordance between the observed probabilities and those predicted by the model. Ideally, the calibration curve should align with the 45-degree line (denoted as the Ideal line in the figure). 
Our analysis reveals that both the apparent probability predicted by the model and the bias-corrected predicted probability closely approximate the ideal line, thereby indicating the model's proficiency in accurately estimating the probability of inflammatory bowel disease (IBD). Finally, a decision curve analysis was conducted (refer to Fig. 8D) to illustrate the anticipated net benefit of employing the model for predictions, as opposed to taking no action (i.e., All or None), across various risk thresholds. The results indicate that the model offers a net benefit surpassing baseline predictions at the majority of risk thresholds when utilized for predicting inflammatory bowel disease (IBD). 3.8 Analysis of immune cell infiltration Through immune infiltration analysis, we investigated the differences in immune cell composition between individuals with inflammatory bowel disease (IBD) and healthy controls. We assessed the enrichment of various immune cell types in each sample and compared the enrichment scores of these cells between IBD and healthy samples using the Wilcoxon test. This analysis allowed us to identify immune cell types closely associated with disease status. Figure 9A illustrates the enrichment scores of different immune cell types in IBD versus healthy samples, as determined by single-sample gene set enrichment analysis (ssGSEA), with the presence of various immune cells clearly indicated. By comparing the ssGSEA scores between the IBD and healthy groups, we observed significant differences in the enrichment scores of 11 immune cell types, including neutrophils, Th17 cells, and regulatory T cells (TReg), in the IBD group. These findings suggest that these cell types may play a crucial role in the pathogenesis of IBD. Additionally, we illustrated the relative infiltration levels through the use of cloud and rain plots (Fig. 9B) for the seven immune cell types that demonstrated the most significant differences between the IBD and healthy groups. 
These immune cell types include regulatory T cells (T Reg), Th 17 cells, Th 1 cells, effector memory T cells (Tem), neutrophils, activated dendritic cells (aDC), and macrophages. Subsequently, we conducted an in-depth analysis of the correlation between the expression levels of specific marker genes and the immune cell enrichment scores. Our primary focus was on evaluating the strength of the association between marker gene expression and significantly different immune cell subsets. Initially, we integrated gene expression data with immune cell enrichment scores derived from single sample gene set enrichment analysis (ssGSEA). This integration allowed for the alignment and comparison of gene expression levels with immune cell abundance scores for each sample. To assess the correlation between marker gene expression levels and immune cell enrichment scores, we employed the Spearman rank correlation coefficient. In all correlation analyses conducted, we selected gene-cell pairs with a p-value less than 0.05 and an absolute Spearman correlation coefficient (ρ) exceeding 0.5. This criterion was designed to identify gene and cell type pairs exhibiting both statistically significant and robust correlations. Based on this screening threshold, we identified four relational pairs: COL4A2-macrophages, CXCL8-aDC, CXCL8-macrophages, and THY1-NK cells. The results demonstrated a positive correlation between the expression level of CXCL8 and the aDC enrichment score, as illustrated in Fig. 10A. The Spearman correlation coefficient was 0.55, with a 95% confidence interval of [0.43, 0.65], and a p-value of 1.12e-14, indicating a strong positive correlation. Additionally, the expression level of CXCL8 was positively associated with macrophage enrichment scores, as shown in Fig. 10B. The Spearman correlation coefficient for this relationship was 0.66, with a 95% confidence interval of [0.56, 0.74], and a p-value of 1.32e-22, indicating a very strong positive correlation. 
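The screening step described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' code: the helper names (`average_ranks`, `spearman_rho`, `strong_pairs`), the placeholder gene and cell labels, and the toy values are hypothetical, and for brevity only the |ρ| > 0.5 criterion is applied; the p-value filter would be added on top of it.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / (spread_x * spread_y)


def strong_pairs(gene_expression, cell_scores, min_abs_rho=0.5):
    """Keep gene-cell pairs whose |rho| exceeds the threshold."""
    kept = []
    for gene, expression in gene_expression.items():
        for cell, scores in cell_scores.items():
            rho = spearman_rho(expression, scores)
            if abs(rho) > min_abs_rho:
                kept.append((gene, cell, rho))
    return kept


# Hypothetical per-sample values, not taken from the study.
toy_expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_B": [2.0, 1.0, 4.0, 3.0, 5.0],
}
toy_scores = {
    "cell_x": [0.1, 0.2, 0.3, 0.4, 0.5],
    "cell_y": [0.5, 0.1, 0.4, 0.2, 0.3],
}

toy_pairs = strong_pairs(toy_expression, toy_scores)
for gene, cell, rho in toy_pairs:
    print(f"{gene}-{cell}: rho={rho:.2f}")
```

Each list holds one value per sample, in the same sample order, which mirrors the alignment of expression data with ssGSEA scores described above.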
These findings suggest that increases in CXCL8 expression levels correspond to higher enrichment scores for both aDCs and macrophages. The expression level of COL4A2 was positively correlated with the macrophage enrichment score, as depicted in Fig. 10C. The Spearman correlation coefficient was 0.64, with a 95% confidence interval of [0.53, 0.72] and a p-value of 1.37e-20, indicating a strong positive correlation: as COL4A2 expression increases, the macrophage enrichment score increases as well. Similarly, the expression level of THY1 was positively correlated with the NK cell enrichment score, as shown in Fig. 10D. The Spearman correlation coefficient was 0.57, with a 95% confidence interval of [0.45, 0.66] and a p-value of 7.13e-16, indicating a strong positive correlation. Higher THY1 expression is therefore associated with greater NK cell enrichment.
3.9 Development of a transcription factor-microRNA-messenger RNA regulatory network
Finally, the miRNet database was used to predict microRNAs (miRNAs) and transcription factors (TFs) associated with the three biomarker genes. The prediction yielded 162 interaction relationships for the three biomarker genes, comprising 31 transcription factors and 20 miRNAs. Subsequently, a Marker-miRNA-TF network was constructed using Cytoscape software, as illustrated in Fig. 11. In this network, the red nodes represent the three biomarker genes, the light green nodes denote transcription factors, and the yellow nodes indicate miRNAs. The figure shows the predicted interactions between COL4A2, CXCL8, and THY1 and various miRNAs and TFs, suggesting that these miRNAs and TFs may influence the three genes either directly or indirectly and thereby contribute to the pathogenesis of inflammatory bowel disease (IBD).
CXCL8 is associated with multiple miRNAs and TFs, indicating that these molecules may play a role in the direct regulation of CXCL8 gene expression. In contrast, COL4A2 is primarily linked to miRNAs, suggesting that its expression may be predominantly regulated by these miRNAs. Meanwhile, THY1 is associated only with the transcription factor POU5F1 and the miRNAs hsa-mir-494-3p and hsa-mir-16-5p, suggesting that THY1 expression may be regulated by this TF and these two miRNAs. This points to a relatively simple regulatory mechanism for THY1 expression, which may facilitate further mechanistic investigations.
4. Discussion
Recent studies have increasingly focused on the association between inflammatory bowel disease (IBD) and angiogenesis. Angiogenesis, the formation of new blood vessels, plays a critical role in the pathophysiology of IBD. Research has demonstrated that levels of angiogenic factors, such as vascular endothelial growth factor (VEGF), are significantly elevated in patients with IBD and may correlate with disease activity [ 23 ]. Moreover, the inhibition of angiogenesis has been proposed as a promising therapeutic approach for IBD. For instance, ginsenoside Rg3, derived from ginseng, a herb used in traditional Chinese medicine, has attracted attention for its antiangiogenic properties, and its efficacy in IBD treatment may be enhanced through delivery via a thermosensitive hydrogel system [ 24 ]. In the management of IBD, anti-tumor necrosis factor (TNF) antibodies, such as infliximab, have been shown to support the recovery of endothelial nitric oxide synthase (eNOS) and vascular endothelial growth factor receptor 2 (VEGFR2) protein expression in endothelial cells. This observation implies that modulation of angiogenesis may contribute to mitigating the inflammatory processes associated with IBD [ 25 ].
These findings indicate that angiogenesis not only plays a crucial role in the pathogenesis of IBD but also represents a potential target for future therapeutic interventions. Further investigation into the specific mechanisms underlying angiogenesis in IBD could yield novel insights for the development of more effective treatment modalities. In the present study, three signature biomarkers—CXCL8, COL4A2, and THY1—were identified as significantly associated with angiogenesis in inflammatory bowel disease (IBD) through the application of LASSO regression, random forest (RF), and support vector machine (SVM) models, as well as weighted gene co-expression network analysis (WGCNA). Receiver operating characteristic (ROC) analysis and diagnostic nomogram demonstrated that these biomarkers exhibited excellent discriminatory power in differentiating IBD samples from healthy controls. Results from single-sample gene set enrichment analysis (ssGSEA) revealed a significant increase in the infiltration of various immune cells, with the study highlighting the differential infiltration levels of the seven most significantly distinct immune cell types between the IBD and healthy groups. Furthermore, the three biomarkers showed associations with macrophages, activated dendritic cells (aDC), and natural killer (NK) cells. These findings suggest that CXCL8, COL4A2, and THY1 are closely linked to angiogenesis in IBD and may serve as promising diagnostic and therapeutic biomarkers for the disease. CXCL8, also referred to as interleukin-8 (IL-8), is integral to the inflammatory response, functioning as a potent neutrophil chemokine that orchestrates the directed migration of leukocytes. This is achieved through its activation upon binding to specific chemokine G protein-coupled receptors (GPCRs). The activity of CXCL8 is contingent upon its interaction with the human CXC chemokine receptors CXCR1 and CXCR2, as well as its binding to cell surface glycosaminoglycans (GAGs) [ 26 ]. 
Such binding facilitates the formation of a solid-phase chemotactic gradient, which optimally presents CXCL8 to circulating neutrophils [ 27 ]. Nonetheless, it has been demonstrated that CXCL8's activity is not solely reliant on receptor interaction but is also intricately regulated at the levels of transcription, translation, and post-translational modifications. These regulatory mechanisms are crucial for ensuring the precise spatiotemporal activity of CXCL8 in the context of inflammatory diseases and cancer [ 26 ]. In conjunction with the Gene Set Enrichment Analysis (GSEA) conducted in this study, the overexpression of CXCL8 in inflammatory bowel disease (IBD) appears to modulate various immune-related pathways, including cytokine-cytokine receptor interaction, chemokine signaling, and Toll-like receptor and Nod-like receptor signaling. This modulation results in the accumulation of neutrophils at the lesion site, thereby exacerbating tissue dysfunction and the inflammatory response. Furthermore, it has been observed that the activation of the P2Y6 receptor can enhance CXCL8 expression in an AP-1-dependent manner, further facilitating the recruitment of neutrophils. This finding suggests that P2Y6 may serve as a potential therapeutic target for mitigating intestinal inflammation. Beyond its role in inflammation, CXCL8 is also implicated in cancer progression[ 28 ]. Research indicates that CXCL8 contributes to tumor progression and metastasis by influencing macrophages within the tumor microenvironment. Specifically, CXCL8 regulates the ATF3-CXCL8 axis via the PI3K/AKT/mTOR signaling pathway, thereby altering the phenotype of macrophages surrounding tumors and promoting tumor growth[ 27 ]. This aligns with our study's conclusion that CXCL8 is closely associated with macrophage enrichment. CXCL8 is a critical mediator in the process of angiogenesis. Empirical evidence indicates that CXCL8 facilitates endothelial cell recruitment and angiogenesis. 
In the context of breast cancer, CXCL8 secreted by human adipose-derived mesenchymal stem cells (hADSCs) has been observed to enhance tumor growth by promoting angiogenesis. Specifically, CXCL8 augments the migration and tube formation of human umbilical vein endothelial cells (HUVECs) via the CXCR1 and CXCR2 signaling pathways, thereby contributing to tumor angiogenesis [ 29 ]. Additionally, Epstein-Barr virus (EBV) infection has been shown to upregulate CXCL8 expression, which in turn promotes angiogenesis and tumor growth through activation of the NF-κB signaling pathway. Overexpression of CXCL8 correlates with a poorer prognosis in gastric cancer patients, underscoring its role in angiogenesis and tumor progression in this cancer type [ 30 ]. In colorectal cancer, CXCL8 enhances FOXD1 expression through the AKT/NF-κB signaling pathway, thereby facilitating angiogenesis. The ability of CXCL8 to enhance the tube formation, proliferation, and migration of HUVECs further supports the involvement of the CXCR2-dependent pathway [ 31 ]. In summary, CXCL8 is integral to angiogenesis in various inflammatory diseases and cancers. It facilitates endothelial cell migration and proliferation through specific receptor interactions and the activation of associated signaling pathways, thereby augmenting angiogenesis. These insights may inform new therapeutic approaches for the management of inflammatory bowel disease.
The COL4A2 (type IV collagen α2 chain) gene is critically involved in angiogenesis. Previous research has demonstrated that the heterotrimer formed by COL4A2 together with COL4A1 is a vital component of the basement membrane and is essential for maintaining the stability and function of the vascular basement membrane. Mutations in COL4A2 can result in angiogenic disorders, potentially leading to a variety of vascular-related diseases [ 32 ].
In a particular study, molecular and genetic analyses of a COL4A2 mutant mouse model revealed that such mutations result in abnormal vascular development, which can precipitate small vessel disease, recurrent hemorrhagic stroke, and age-related macroscopic vascular lesions. Furthermore, it has been demonstrated that the intracellular accumulation of collagen in vascular endothelial cells and pericytes, due to COL4A2 mutations, is a significant predisposing factor for intracerebral hemorrhage (ICH) [ 33 ]. Furthermore, mutations in the COL4A2 gene impact the secretion of both COL4A1 and COL4A2, leading to intracellular accumulation and endoplasmic reticulum (ER) stress, which may induce cytotoxic effects. Research indicates that treatment with chemical chaperones can mitigate the intracellular accumulation of mutant collagen and ameliorate the cellular phenotype associated with COL4A2 mutations [ 34 ]. The involvement of COL4A2 in inflammatory bowel disease (IBD) is potentially linked to its role in preserving the integrity of the intestinal epithelial barrier. It is widely recognized that compromised intestinal barrier function constitutes a significant factor in the initiation and progression of IBD. Research has demonstrated that the disruption of intestinal epithelial barrier function is associated with various factors, including the overexpression of inflammatory mediators, heightened apoptosis, and modifications in tight junction proteins [ 35 ]. In the context of inflammatory bowel disease (IBD), the remodeling of the extracellular matrix (ECM) is recognized as a significant pathological process that can influence the expression and functionality of type IV collagen protein [ 36 ]. This assertion is corroborated by the Gene Set Enrichment Analysis (GSEA) conducted in the present study. 
Furthermore, it has been observed that aberrant ECM metabolism occurs in the intestines of individuals with IBD, potentially leading to collagen degradation and subsequent compromise of intestinal barrier integrity [ 37 ]. Consequently, the involvement of COL4A2 in IBD may pertain to its critical role in maintaining basement membrane integrity and regulating intestinal barrier function. In summary, COL4A2 is integral to the stability of the vascular basement membrane, with mutations in this gene linked to various vascular diseases. Its involvement in inflammatory bowel disease (IBD) is likely connected to its function in preserving the integrity of the intestinal epithelial barrier and regulating extracellular matrix remodeling. Future research should aim to elucidate the specific mechanisms by which COL4A2 influences IBD, potentially identifying novel targets and strategies for therapeutic intervention. THY1, also referred to as CD90, is a glycoprotein prevalent across various cell types, including fibroblasts, neurons, and immune cells, and it plays significant roles in numerous biological processes, particularly within the immune system and in fibrotic diseases. Firstly, the involvement of THY1 in immune regulation appears to be intricately linked to the pathophysiological mechanisms of inflammatory bowel disease (IBD). Evidence suggests that THY1 is instrumental in modulating T cell activation and proliferation, processes that are pivotal in the immune response associated with IBD [ 38 ]. This finding aligns with our conclusion that THY1 is closely associated with the enrichment of natural killer (NK) cells. Secondly, THY1 may influence the onset and progression of IBD by impacting intestinal barrier function. The maintenance of intestinal barrier integrity is crucial for preventing the invasion of pathogens and toxins. 
THY1 may affect this barrier function by regulating intercellular junctions and the composition of the extracellular matrix [ 39 ], a hypothesis that is consistent with the gene set enrichment analysis (GSEA) conducted in this study. Disruption of intestinal barrier function is a common pathological feature in patients with inflammatory bowel disease (IBD) and may be linked to aberrant expression and function of THY1. Furthermore, the involvement of THY1 in IBD may relate to its role in fibrotic processes. Patients with IBD frequently develop intestinal fibrosis, and elevated THY1 expression has been noted in fibrotic tissues, indicating a potential contribution to the development and progression of fibrosis [ 40 ]. Through its regulation of extracellular matrix remodeling and fibrocyte activation, THY1 may affect the severity of fibrosis associated with IBD. Recently, the role of THY1 in angiogenesis has attracted considerable interest. Research indicates that THY1 has an important regulatory function in angiogenesis. First, THY1 influences the migratory and invasive capabilities of cancer cells through its interaction with integrins. For instance, one study demonstrated that THY1 activates the calcium ion channel and P2X7 receptor signaling pathway via its interaction with αVβ3 integrin, thereby facilitating cancer cell migration and invasion [ 41 ]. This mechanism suggests a potential involvement of THY1 in tumor angiogenesis. Furthermore, the expression level of THY1 can be modulated by various signaling pathways. For example, PMA (phorbol 12-myristate 13-acetate) has been shown to upregulate THY1 expression through activation of the PKC-δ/Syk/NF-κB signaling pathway, consequently inhibiting endothelial cell migration and the formation of capillary-like tubular structures [ 42 ].
These findings indicate that THY1 is involved in a sophisticated signaling network that regulates angiogenesis. Furthermore, THY1 is integral to the wound-healing process, as it facilitates wound repair by improving blood perfusion in the skin. Evidence from studies on THY1 knockout mice, which exhibited delayed reepithelialization and reduced blood perfusion during wound healing, underscores the essential role of THY1 in angiogenesis [ 43 ]. In conclusion, THY1 appears to play a multifaceted role in inflammatory bowel disease (IBD), encompassing immune regulation, intestinal barrier function, and fibrotic processes. Additionally, THY1 is integral to angiogenesis. Further investigation into the specific mechanisms by which THY1 influences IBD is essential to elucidate its contribution to disease pathogenesis and to identify novel therapeutic targets for IBD treatment. Despite our rigorous efforts to enhance the reliability of our findings through the use of extensive sample datasets, diverse analytical methodologies, and both internal and external validation, certain limitations of our study must be acknowledged. Firstly, this research involves secondary data mining and analysis of previously published datasets, and variations in dataset selection and analytical approaches may yield different outcomes. Secondly, a substantial amount of clinical information pertaining to the sample was not obtained, leading to the omission of potential effects related to patient complications, gender, and age. Thirdly, the precise and well-defined mechanisms by which these signature biomarkers influence angiogenesis to drive the pathological processes of inflammatory bowel disease remain unclear. Consequently, further research is warranted, involving larger sample sizes from diverse regions or ethnic groups, as well as additional in vitro and in vivo experiments. 5. 
Conclusion This study ultimately identified three angiogenesis biomarkers of significance in inflammatory bowel disease (IBD): CXCL8, COL4A2, and THY1. These biomarkers are implicated in fundamental biological processes, including cytokine interactions, chemotaxis, extracellular matrix-receptor interactions, oxidative phosphorylation, and other related reactions. Immunoprofiling revealed a notable increase in 11 immune cell types within IBD samples. Moreover, a significant positive correlation was observed between these biomarkers and infiltrating immune cells. These findings suggest that the immune response plays a critical role in the angiogenic mechanisms underlying IBD, which can be attributed to the interaction between these signature biomarkers and immune-infiltrating cells. Declarations Ethical Approval and Consent to participate Ethical approval was not required for this study, as it utilized anonymized/publicly available data and did not involve direct interaction with human or animal subjects. Consent to participate was therefore not applicable. Consent for publication Not applicable. This study did not involve human participants, personal data, identifiable images, or case reports that require consent for publication. Data Availability/Availability of data and materials The data supporting this study were derived from the following publicly available resources: GEO Database: https://www.ncbi.nlm.nih.gov/geo/ MSigDB Database: https://www.gsea-msigdb.org/gsea/msigdb/index.jsp Processed datasets and analysis scripts are available from the authors on request. Competing interests The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. Funding This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. 
Author Contributions Statement Pengliang Zhang: Conceptualization, Methodology, Formal Analysis, Data Curation, Software, Visualization, Writing – Original Draft. Shuang Chen: Validation, Investigation, Resources, Writing – Review & Editing. Xianmin Liu: Validation, Investigation, Resources, Writing – Review & Editing. Lijuan Wu: Validation, Investigation, Resources, Writing – Review & Editing. Yingjian Zhang: Supervision, Project Administration, Funding Acquisition, Writing – Review & Editing. Acknowledgements No additional acknowledgements beyond the listed authors are applicable to this study. References SINGH N, BERNSTEIN C N. Environmental risk factors for inflammatory bowel disease [J]. United European gastroenterology journal, 2022, 10(10): 1047-53. ROSEN M J, DHAWAN A, SAEED S A. Inflammatory Bowel Disease in Children and Adolescents [J]. JAMA pediatrics, 2015, 169(11): 1053-60. KOLIANI-PACE J L, SIEGEL C A. Prognosticating the Course of Inflammatory Bowel Disease [J]. Gastrointestinal endoscopy clinics of North America, 2019, 29(3): 395-404. FABIáN O, KAMARADOVá K. Morphology of inflammatory bowel diseases (IBD) [J]. Ceskoslovenska patologie, 2022, 58(1): 27-37. BRUNER L P, WHITE A M, PROKSELL S. Inflammatory Bowel Disease [J]. Primary care, 2023, 50(3): 411-27. HEMMER A, FOREST K, RATH J, et al. Inflammatory Bowel Disease: A Concise Review [J]. South Dakota medicine : the journal of the South Dakota State Medical Association, 2023, 76(9): 416-23. ZHAO M, FENG R, BEN-HORIN S, et al. Systematic review with meta-analysis: environmental and dietary differences of inflammatory bowel disease in Eastern and Western populations [J]. Alimentary pharmacology & therapeutics, 2022, 55(3): 266-76. FLYNN S, EISENSTEIN S. Inflammatory Bowel Disease Presentation and Diagnosis [J]. The Surgical clinics of North America, 2019, 99(6): 1051-62. RAMOS G P, PAPADAKIS K A. Mechanisms of Disease: Inflammatory Bowel Diseases [J]. Mayo Clinic proceedings, 2019, 94(1): 155-65. 
SCHULTSZ C, VAN DEN BERG F M, TEN KATE F W, et al. The intestinal mucus layer from patients with inflammatory bowel disease harbors high numbers of bacteria compared with controls [J]. Gastroenterology, 1999, 117(5): 1089-97. WRIGHT E K, DING N S, NIEWIADOMSKI O. Management of inflammatory bowel disease [J]. The Medical journal of Australia, 2018, 209(7): 318-23. SEYEDIAN S S, NOKHOSTIN F, MALAMIR M D. A review of the diagnosis, prevention, and treatment methods of inflammatory bowel disease [J]. Journal of medicine and life, 2019, 12(2): 113-22. EDER P, KORYBALSKA K, LINKE K, et al. Angiogenesis-related proteins--their role in the pathogenesis and treatment of inflammatory bowel disease [J]. Current protein & peptide science, 2015, 16(3): 249-58. ALKIM C, ALKIM H, KOKSAL A R, et al. Angiogenesis in Inflammatory Bowel Disease [J]. International journal of inflammation, 2015, 2015: 970890. BENELLI R, LORUSSO G, ALBINI A, et al. Cytokines and chemokines as regulators of angiogenesis in health and disease [J]. Current pharmaceutical design, 2006, 12(24): 3101-15. POUSA I D, MATé J, GISBERT J P. Angiogenesis in inflammatory bowel disease [J]. European journal of clinical investigation, 2008, 38(2): 73-81. CHIDLOW J H, JR., SHUKLA D, GRISHAM M B, et al. Pathogenic angiogenesis in IBD and experimental colitis: new ideas and therapeutic avenues [J]. American journal of physiology Gastrointestinal and liver physiology, 2007, 293(1): G5-g18. DEBAN L, CORREALE C, VETRANO S, et al. Multiple pathogenic roles of microvasculature in inflammatory bowel disease: a Jack of all trades [J]. The American journal of pathology, 2008, 172(6): 1457-66. KOUTROUBAKIS I E, TSIOLAKIDOU G, KARMIRIS K, et al. Role of angiogenesis in inflammatory bowel disease [J]. Inflammatory bowel diseases, 2006, 12(6): 515-23. AZZAM N. Angiogenesis and inflammatory bowel disease [J]. Saudi journal of gastroenterology : official journal of the Saudi Gastroenterology Association, 2007, 13(1): 37-8. DANESE S. 
Negative regulators of angiogenesis in inflammatory bowel disease: thrombospondin in the spotlight [J]. Pathobiology : journal of immunopathology, molecular and cellular biology, 2008, 75(1): 22-4. DANESE S, SANS M, DE LA MOTTE C, et al. Angiogenesis as a novel component of inflammatory bowel disease pathogenesis [J]. Gastroenterology, 2006, 130(7): 2060-73. DAvgerinos Efthimios,Katergiannakis Vaggelogiannis,Kopanakis Nikolaos,et al: Serum VEGF and bFGF in patients with inflammatory bowel diseases. Ann Ital Chir, 2014 May-Jun;85(3):203-6. Xie Yiqiong,Ma Ying,Xu Lu,et al: Inhibition of Angiogenesis and Effect on Inflammatory Bowel Disease of Ginsenoside Rg3-Loaded Thermosensitive Hydrogel. Pharmaceutics, 2024 Sep 25;16(10):1243. Altorjay I,Bacskai I,Bátori R,et al: Anti-TNF-alpha antibody (infliximab) therapy supports the recovery of eNOS and VEGFR2 protein expression in endothelial cells. Int J Immunopathol Pharmacol, 2011 Apr-Jun;24(2):323-35. Cambier Seppe,Gouwy Mieke,Proost Paul : The chemokines CXCL8 and CXCL12: molecular and functional properties, role in disease and efforts towards pharmacological intervention. Cell Mol Immunol, 2023 Mar;20(3):217-251. Adage Tiziana,Bartley Michael R,Del Bene Francesca,et al: PA401, a novel CXCL8-based biologic therapeutic with increased glycosaminoglycan binding, reduces bronchoalveolar lavage neutrophils and systemic inflammatory markers in a murine model of LPS-induced lung inflammation.Cytokine, 2015 Dec;76(2):433-441. Arguin Guillaume,Bilodeau Maude S,Degagné Émilie,et al: P2Y6 receptor contributes to neutrophil recruitment to inflamed intestinal mucosa by increasing CXC chemokine ligand 8 expression in an AP-1-dependent manner in epithelial cells. Inflamm Bowel Dis, 2012 Aug;18(8):1456-69. Wang Yuan,Liu Junli,Jiang Qingyuan,et al: Human Adipose-Derived Mesenchymal Stem Cell-Secreted CXCL1 and CXCL8 Facilitate Breast Tumor Growth By Promoting Angiogenesis. Stem Cells, 2017 Sep;35(9):2060-2070. 
Zhang Jing-Yue,Du Yu,Gong Li-Ping,et al: EBV-Induced CXCL8 Upregulation Promotes Vasculogenic Mimicry in Gastric Carcinoma via NF-κB Signaling. Front Cell Infect Microbiol, 2022 Mar 7:12:780416. Chen Chun,Xu Zhuo-Qing,Zong Ya-Ping,et al: CXCL5 induces tumor angiogenesis via enhancing the expression of FOXD1 mediated by the AKT/NF-κB pathway in colorectal cancer. Cell Death Dis, 2019 Feb 21;10(3):178. Jeanne Marion,Labelle-Dumais Cassandre,Jorgensen Jeff,et al: COL4A2 mutations impair COL4A1 and COL4A2 secretion and cause hemorrhagic stroke. Am J Hum Genet, 2012 Jan 13;90(1):91-101. Jeanne Marion,Jorgensen Jeff,Gould Douglas B : Molecular and Genetic Analyses of Collagen Type IV Mutant Mouse Models of Spontaneous Intracerebral Hemorrhage Identify Mechanisms for Stroke Prevention. Circulation, 2015 May 5;131(18):1555-65. Murray Lydia S,Lu Yinhui,Taggart Aislynn,et al: Chemical chaperone treatment reduces intracellular accumulation of mutant collagen IV and ameliorates the cellular phenotype of a COL4A2 mutation that causes haemorrhagic stroke. Hum Mol Genet, 2014 Jan 15;23(2):283-92. Ye Xiaolin,Sun Mei : AGR2 ameliorates tumor necrosis factor-α-induced epithelial barrier dysfunction via suppression of NF-κB p65-mediated MLCK/p-MLC pathway activation. Int J Mol Med, 2017 May;39(5):1206-1214. Mortensen Joachim Høg,Manon-Jensen Tina,Jensen Michael Dam,et al: Ulcerative colitis, Crohns disease, and irritable bowel syndrome have different profiles of extracellular matrix turnover, which also reflects disease activity in Crohns disease. PLoS One, 2017 Oct 13;12(10):e0185855. Fischer Andreas,Gluth Markus,Weege Friderike,et al: Glucocorticoids regulate barrier function and claudin expression in intestinal epithelial cells via MKP-1. Am J Physiol Gastrointest Liver Physiol, 2014 Feb;306(3):G218-28. Raza Ali,Yousaf Wajeeha,Giannella Ralph,et al: Th17 cells: interactions with predisposing factors in the immunopathogenesis of inflammatory bowel disease. 
Expert Rev Clin Immunol, 2012 Feb;8(2):161-8. Shi Ruoran,Yu Fazheng,Hu Xueyu,et al: Protective Effect of Lactiplantibacillus plantarum subsp. plantarum SC-5 on Dextran Sulfate Sodium-Induced Colitis in Mice. Foods, 2023 Feb 20;12(4):897. Al-Araimi Amna,Al Kharusi Amira,Bani Oraba Asma,et al: Deletion of SOCS2 Reduces Post-Colitis Fibrosis via Alteration of the TGFβ Pathway. Int J Mol Sci, 2020 Apr 27;21(9):3073. Brenet Marianne,Martínez Samuel,Pérez-Nuñez Ramón,et al: Thy-1 (CD90)-Induced Metastatic Cancer Cell Migration and Invasion Are β3 Integrin-Dependent and Involve a Ca2+/P2X7 Receptor Signaling Axis. Front Cell Dev Biol, 2021 Jan 12:8:592442. Wen Heng-Ching,Huo Yen Nien,Chou Chih-Ming,et al: PMA inhibits endothelial cell migration through activating the PKC-δ/Syk/NF-κB-mediated up-regulation of Thy-1. Sci Rep, 2018 Nov 2;8(1):16247. Pérez Leonardo A,León José,López Juan,et al: The GPI-Anchored Protein Thy-1/CD90 Promotes Wound Healing upon Injury to the Skin by Enhancing Skin Perfusion. Int J Mol Sci, 2022 Oct 19;23(20):12539. Additional Declarations No competing interests reported.
© Research Square 2026 | ISSN 2693-5015 (online)
{"props":{"pageProps":{"initialData":{"identity":"rs-6554786","acceptedTermsAndConditions":true,"allowDirectSubmit":true,"archivedVersions":[],"articleType":"Article","associatedPublications":[],"authors":[{"id":460053488,"identity":"0fea9c4b-c17f-4855-ac22-4825143c123d","order_by":0,"name":"pengliang zhang","email":"","orcid":"","institution":"First Affiliated Hospital of Henan University of Science and Technology","correspondingAuthor":false,"prefix":"","firstName":"pengliang","middleName":"","lastName":"zhang","suffix":""},{"id":460053489,"identity":"74f20348-00e4-437b-9096-9e2826067f95","order_by":1,"name":"shuang chen","email":"","orcid":"","institution":"First Affiliated Hospital of Henan University of Science and Technology","correspondingAuthor":false,"prefix":"","firstName":"shuang","middleName":"","lastName":"chen","suffix":""},{"id":460053490,"identity":"a63fba3c-e0ee-40af-a0eb-3e6fb41d19fc","order_by":2,"name":"xianmin liu","email":"","orcid":"","institution":"First Affiliated Hospital of Henan University of Science and Technology","correspondingAuthor":false,"prefix":"","firstName":"xianmin","middleName":"","lastName":"liu","suffix":""},{"id":460053491,"identity":"440b6bea-157d-4a45-8def-8f5064f94ad8","order_by":3,"name":"lijuan wu","email":"","orcid":"","institution":"First Affiliated Hospital of Henan University of Science and Technology","correspondingAuthor":false,"prefix":"","firstName":"lijuan","middleName":"","lastName":"wu","suffix":""},{"id":460053492,"identity":"b3acee8b-5c3d-42cd-a777-6a42a9621837","order_by":4,"name":"yingjian 
zhang","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAABIklEQVRIiWNgGAWjYDACCTh5sIGBoeIAVJiNaC1nQFqYidICBYxtRGiRn9387OHXNos8ecfDjZ8L592RNzh//gDDh7LDDPyzG7BqYZxzzNxYtk2i2PDAwWbpmdueGW64kczAOOPcYQaJOwewamGWSDCTlmyTSNzYcLBBmnfbYcYNN5gZmHnbDjMYSCRg1cImkf4NpqX5N++cw/Ybzh9mYP6LRwuPRI6Z5EeglvkMB9ukeRsOJ244kMzAzIhHi4RETpk0wzmJxA1ALdY8xw4nz7yRbHCw51w6j8QN7FrkZ6Rvk/xRVpc4f8bxx7d5ag7b9p0/+PDBjzJrOf4Z2LWAg4AXGAsGN5DCB8TkwakeCBh//AFa19+AT80oGAWjYBSMZAAAuLVngIq7VaQAAAAASUVORK5CYII=","orcid":"","institution":"First Affiliated Hospital of Henan University of Science and Technology","correspondingAuthor":true,"prefix":"","firstName":"yingjian","middleName":"","lastName":"zhang","suffix":""}],"badges":[],"createdAt":"2025-04-29 09:23:38","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-6554786/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-6554786/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":83307616,"identity":"88ccb041-31b0-45e6-8a49-0f9cd859b401","added_by":"auto","created_at":"2025-05-22 17:05:55","extension":"jpg","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":131384,"visible":true,"origin":"","legend":"\u003cp\u003eSee image above for figure legend\u003c/p\u003e","description":"","filename":"1.jpg","url":"https://assets-eu.researchsquare.com/files/rs-6554786/v1/3e7154dbc63e0b312bb4cce0.jpg"},{"id":83308072,"identity":"c4df80fe-fe8f-479f-9d13-ee609f356860","added_by":"auto","created_at":"2025-05-22 17:13:55","extension":"jpg","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":391721,"visible":true,"origin":"","legend":"\u003cp\u003eSee image above for figure 
legend\u003c/p\u003e","description":"","filename":"2.jpg","url":"https://assets-eu.researchsquare.com/files/rs-6554786/v1/b90b64ac38082dff4bf73fe2.jpg"},{"id":83307625,"identity":"201e49eb-de97-4349-8257-e5b23e883526","added_by":"auto","created_at":"2025-05-22 17:05:55","extension":"jpg","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":338720,"visible":true,"origin":"","legend":"\u003cp\u003eSee image above for figure legend\u003c/p\u003e","description":"","filename":"3.jpg","url":"https://assets-eu.researchsquare.com/files/rs-6554786/v1/976e528825c1817315ef2d37.jpg"},{"id":83307619,"identity":"1ac38260-2e97-4859-bfa8-0203d9647b50","added_by":"auto","created_at":"2025-05-22 17:05:55","extension":"jpg","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":209069,"visible":true,"origin":"","legend":"\u003cp\u003eSee image above for figure legend\u003c/p\u003e","description":"","filename":"4.jpg","url":"https://assets-eu.researchsquare.com/files/rs-6554786/v1/ca96b789e3b829193ae09bec.jpg"},{"id":83307622,"identity":"d5adcee4-aeeb-48b9-a04d-83c80063d1aa","added_by":"auto","created_at":"2025-05-22 17:05:55","extension":"jpg","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":304892,"visible":true,"origin":"","legend":"\u003cp\u003eSee image above for figure legend\u003c/p\u003e","description":"","filename":"5.jpg","url":"https://assets-eu.researchsquare.com/files/rs-6554786/v1/654d9a3b3b385746896797c9.jpg"},{"id":83307617,"identity":"7d08aeeb-3908-4a55-a8fb-c481fe3031c6","added_by":"auto","created_at":"2025-05-22 17:05:55","extension":"jpg","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":356716,"visible":true,"origin":"","legend":"\u003cp\u003eSee image above for figure 
legend\u003c/p\u003e","description":"","filename":"6.jpg","url":"https://assets-eu.researchsquare.com/files/rs-6554786/v1/7e0a1316c52ce78916ce34a1.jpg"},{"id":83308178,"identity":"cf6f0adc-4bf1-4924-9fa5-ad696a4567fd","added_by":"auto","created_at":"2025-05-22 17:21:55","extension":"jpg","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":253147,"visible":true,"origin":"","legend":"\u003cp\u003eSee image above for figure legend\u003c/p\u003e","description":"","filename":"7.jpg","url":"https://assets-eu.researchsquare.com/files/rs-6554786/v1/d41b5d9eaae5b803b603bb10.jpg"},{"id":83308528,"identity":"badd2c63-121e-48ed-a094-299718e6a17b","added_by":"auto","created_at":"2025-05-22 17:29:55","extension":"jpg","order_by":8,"title":"Figure 8","display":"","copyAsset":false,"role":"figure","size":237191,"visible":true,"origin":"","legend":"\u003cp\u003eSee image above for figure legend\u003c/p\u003e","description":"","filename":"8.jpg","url":"https://assets-eu.researchsquare.com/files/rs-6554786/v1/dcb7815717f083ba4415a9ba.jpg"},{"id":83308177,"identity":"b6002d06-4793-4fa4-b8dd-9fa31c66f5b4","added_by":"auto","created_at":"2025-05-22 17:21:55","extension":"jpg","order_by":9,"title":"Figure 9","display":"","copyAsset":false,"role":"figure","size":236061,"visible":true,"origin":"","legend":"\u003cp\u003eSee image above for figure legend\u003c/p\u003e","description":"","filename":"9.jpg","url":"https://assets-eu.researchsquare.com/files/rs-6554786/v1/4c6c19889e486d4fe5ff10f9.jpg"},{"id":83307626,"identity":"f313631c-9d6a-4434-89c9-413fdc98ed23","added_by":"auto","created_at":"2025-05-22 17:05:55","extension":"jpg","order_by":10,"title":"Figure 10","display":"","copyAsset":false,"role":"figure","size":342228,"visible":true,"origin":"","legend":"\u003cp\u003eSee image above for figure 
legend\u003c/p\u003e","description":"","filename":"10.jpg","url":"https://assets-eu.researchsquare.com/files/rs-6554786/v1/b120e0e78c404322b3ba4087.jpg"},{"id":83308069,"identity":"8b7793c5-4672-453a-82ed-5ed662ed5a79","added_by":"auto","created_at":"2025-05-22 17:13:55","extension":"jpg","order_by":11,"title":"Figure 11","display":"","copyAsset":false,"role":"figure","size":198466,"visible":true,"origin":"","legend":"\u003cp\u003eSee image above for figure legend\u003c/p\u003e","description":"","filename":"11.jpg","url":"https://assets-eu.researchsquare.com/files/rs-6554786/v1/f7411da1b96e4ffbd92dcc4f.jpg"},{"id":91896584,"identity":"26c6ed51-adad-4d70-9960-4f66c5fbb064","added_by":"auto","created_at":"2025-09-22 18:16:31","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":3849464,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-6554786/v1/d028d5a8-afcc-4787-aafa-c49e8a8ccb42.pdf"}],"financialInterests":"No competing interests reported.","formattedTitle":"Biomarker Prediction and Immune Landscape of Angiogenesis in Inflammatory Bowel Disease: Insights from Bioinformatics and Machine Learning Approaches","fulltext":[{"header":"1. Introduction","content":"\u003cp\u003eInflammatory bowel disease (IBD) is a chronic, progressive, immune-mediated inflammatory condition of the gastrointestinal tract, predominantly diagnosed during adolescence and young adulthood[\u003cspan class=\"CitationRef\"\u003e1\u003c/span\u003e]. It primarily encompasses Crohn\u0026apos;s disease and ulcerative colitis, with a prevalence of approximately 1% and an increasing incidence among children in recent years[\u003cspan class=\"CitationRef\"\u003e2\u003c/span\u003e, \u003cspan class=\"CitationRef\"\u003e3\u003c/span\u003e]. 
In Crohn\u0026apos;s disease, intestinal involvement is typically segmental, often affecting the terminal ileum, and is characterized histologically by epithelioid granulomas. In contrast, ulcerative colitis presents with diffuse inflammation, usually beginning in the rectum and extending proximally to the terminal ileum. Clinically, IBD commonly manifests as chronic diarrhea (with or without hematochezia), abdominal pain, and weight loss[\u003cspan class=\"CitationRef\"\u003e4\u003c/span\u003e, \u003cspan class=\"CitationRef\"\u003e5\u003c/span\u003e, \u003cspan class=\"CitationRef\"\u003e6\u003c/span\u003e]. Although the precise etiology of IBD remains unclear, it is hypothesized to result from a complex interplay of genetic predisposition, undefined environmental factors, alterations in the gut microbiome, and immune dysregulation[\u003cspan class=\"CitationRef\"\u003e7\u003c/span\u003e, \u003cspan class=\"CitationRef\"\u003e8\u003c/span\u003e].\u003c/p\u003e\n\u003cp\u003eRegarding the diversity of the host gut microbiome, a reduction in Firmicutes and an increase in Proteobacteria and Bacteroidetes may result in decreased production of short-chain fatty acids and impaired function of regulatory T cells and epithelial cells, thereby elevating the risk of developing inflammatory bowel disease (IBD) [\u003cspan class=\"CitationRef\"\u003e9\u003c/span\u003e, \u003cspan class=\"CitationRef\"\u003e10\u003c/span\u003e]. Moreover, research indicates that first-degree relatives of IBD patients have a fivefold increased risk of developing the condition [\u003cspan class=\"CitationRef\"\u003e9\u003c/span\u003e]. For diagnostic purposes, fecal calprotectin can be utilized to identify patients with gastrointestinal symptoms who may have IBD, while imaging techniques can safely, easily, and reliably detect inflammation. Regular re-evaluation of disease status is essential for patients [\u003cspan class=\"CitationRef\"\u003e11\u003c/span\u003e]. 
Given the high prevalence of IBD among younger populations and its significant impact on patients\u0026apos; quality of life, it is anticipated that IBD will become a major health concern in developing countries in the near future [\u003cspan class=\"CitationRef\"\u003e12\u003c/span\u003e]. Consequently, early diagnosis and treatment of IBD are crucial. In particular, early intervention in Crohn\u0026rsquo;s disease may alter the disease\u0026apos;s natural progression and reduce the risk of disability.\u003c/p\u003e\n\u003cp\u003eAngiogenesis is a highly intricate process involving various cell types, growth factors, cytokines, adhesion molecules, and signaling pathways [\u003cspan class=\"CitationRef\"\u003e13\u003c/span\u003e]. It plays a crucial role in the pathogenesis of inflammatory bowel disease (IBD), with chronic inflammation and angiogenesis being interrelated processes; specifically, chronic intestinal inflammation is reliant on angiogenesis. Angiogenesis facilitates the inflammatory response by promoting leukocyte migration and supplying oxygen and nutrients, while also contributing significantly to wound healing [\u003cspan class=\"CitationRef\"\u003e14\u003c/span\u003e\u0026ndash;\u003cspan class=\"CitationRef\"\u003e18\u003c/span\u003e]. Moreover, numerous studies have documented physiological alterations in vascular anatomy and the upregulation of angiogenic mediators in IBD patients, indicating that a deeper understanding of the angiogenic process could lead to more effective treatments for chronic intestinal inflammation [\u003cspan class=\"CitationRef\"\u003e19\u003c/span\u003e]. Research has revealed significantly elevated levels of serum vascular endothelial growth factor (VEGF) and basic fibroblast growth factor (b-FGF) in IBD patients compared to controls [\u003cspan class=\"CitationRef\"\u003e20\u003c/span\u003e]. 
Moreover, studies utilizing animal models of experimental intestinal inflammation have demonstrated significant angiogenesis in the two primary forms of inflammatory bowel disease (IBD)\u0026mdash;Crohn\u0026apos;s disease and ulcerative colitis\u0026mdash;suggesting that inhibiting this process could hold therapeutic potential [\u003cspan class=\"CitationRef\"\u003e21\u003c/span\u003e]. Additionally, it has been observed that mucosal extracts from IBD patients induce strong angiogenic responses in corneal and chorioallantoic membrane assays compared to extracts from normal mucosa [\u003cspan class=\"CitationRef\"\u003e22\u003c/span\u003e]. Nonetheless, the specific role of angiogenic genes in the pathogenesis of IBD requires further investigation.\u003c/p\u003e\n\u003cp\u003eThis study aims to identify diagnostic and therapeutic biomarkers associated with angiogenesis in inflammatory bowel disease (IBD) utilizing bioinformatics approaches. Furthermore, it seeks to elucidate the role of these biomarkers in IBD through functional and molecular mechanism analyses, thereby providing a theoretical foundation for the clinical diagnosis and treatment of the disease. A comprehensive analysis workflow is presented in Fig. 1.\u003c/p\u003e"},{"header":"2. Method","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e\n \u003ch2\u003e2.1 Data Source and Preprocessing\u003c/h2\u003eIn this study, transcriptome microarray data from colon tissue samples of patients with inflammatory bowel disease (IBD) were retrieved from the Gene Expression Omnibus (GEO) database (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.ncbi.nlm.nih.gov/geo/\u003c/span\u003e\u003c/span\u003e) and subsequently processed using R software. To preserve data integrity, probes lacking corresponding gene symbols were excluded. For instances where multiple probes targeted the same gene, the mean expression value was calculated to ensure accuracy and consistency. 
Expression profiles were normalized using the interarray normalization function from the limma software package. Metadata were constructed by integrating two datasets: GSE165512 and GSE75214. The GSE165512 dataset comprises 46 control samples, 84 samples from patients with Crohn\u0026apos;s disease, and 40 samples from patients with ulcerative colitis. The GSE75214 dataset, serving as an external validation set, includes 22 control samples, 8 samples from patients with Crohn\u0026apos;s disease, and 97 samples from patients with ulcerative colitis. Additionally, 48 angiogenesis-related genes (ARGs) were obtained from the MSigDB database (Hallmark gene set).\n\u003c/div\u003e\n\u003cdiv id=\"Sec4\" class=\"Section2\"\u003e\n \u003ch2\u003e2.2 Differential Expression Analysis of Genes\u003c/h2\u003eIn this study, differential expression analysis was conducted using the limma package in R to calculate fold changes in gene expression levels and the corresponding p-values. The p-values were adjusted for multiple comparisons using the False Discovery Rate (FDR) method, and differentially expressed genes were identified using thresholds of |log2 fold change|\u0026thinsp;\u0026gt;\u0026thinsp;0.5 and an adjusted p-value (P.adjust)\u0026thinsp;\u0026lt;\u0026thinsp;0.05. Heat maps illustrating the differential expression of genes were generated utilizing the R packages pheatmap and ggplot2.\n\u003c/div\u003e\n\u003ch3\u003e2.3 Gene enrichment analysis\u003c/h3\u003e\n\u003cp\u003eGene enrichment analysis was conducted utilizing the clusterProfiler software package and the DAVID website to investigate the functional enrichment of ARGs. This analysis aimed to elucidate the underlying biological processes (BP), cellular components (CC), and molecular functions (MF), as well as enrichment within Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. 
The objective was to identify the biological processes and signaling pathways significantly associated with ARGs. In this study, the results of the analysis were visualized using the ggplot2 and circlize packages.\u003c/p\u003e\n\u003cp\u003e2.4 Machine learning predictive model construction\u003c/p\u003e\n\u003cp\u003eIn this study, the most informative diagnostic biomarkers for inflammatory bowel disease (IBD) were identified by screening the significantly differentially expressed ARGs using a machine learning approach. Specifically, five machine learning models were constructed using the caret, randomForest, and glmnet packages in the R programming language. These models included Random Forest (RF), Support Vector Machine (SVM), Generalized Linear Model (GLM), eXtreme Gradient Boosting (XGB), and Least Absolute Shrinkage and Selection Operator (Lasso) regression. Each model was fitted with default parameter settings and evaluated through cross-validation to assess robustness and generalizability. To facilitate interpretation of the prediction outcomes, the DALEX package was employed to visualize the residual distribution and feature importance. This visualization process enabled the identification of features exerting the most significant influence on model predictions, thereby highlighting potential biomarkers. 
Furthermore, the pROC package was employed to construct the receiver operating characteristic (ROC) curves and compute the area under the curve (AUC) values, thereby evaluating each model\u0026apos;s efficacy in differentiating between patients with inflammatory bowel disease (IBD) and healthy control subjects.\u003c/p\u003e\n\u003cp\u003e2.5 Diagnostic nomogram\u003c/p\u003e\n\u003cp\u003eIn this study, the rms package in the R programming language was employed to develop a nomogram, a receiver operating characteristic (ROC) curve, and a calibration curve, thereby offering a clinical perspective on the diagnosis of inflammatory bowel disease (IBD). Initially, each patient was assigned a score based on the expression levels of individual genes, and these scores were aggregated to generate a total score. The total score corresponds to each individual\u0026apos;s predicted risk of IBD, which is read from the nomogram. The nomogram serves as an intuitive graphical tool for illustrating the outcomes of predictive models. Furthermore, this study utilized ROC curves to assess the diagnostic performance of the model, specifically its ability to differentiate between IBD patients and healthy controls.\u003c/p\u003e\n\u003cp\u003e2.6 Analysis of immune infiltration\u003c/p\u003e\n\u003cp\u003eIn this study, the R packages GSVA and GSEABase were employed to conduct single-sample Gene Set Enrichment Analysis (ssGSEA) in order to assess differences in immune infiltration between patients with inflammatory bowel disease (IBD) and control groups. Immune-related gene sets were sourced from the Molecular Signatures Database (MSigDB) to facilitate the ssGSEA, which aimed to quantify the extent of immune infiltration in IBD patients. Initially, an enrichment score for each immune gene set was computed for each sample using the ssGSEA approach. 
Following the normalization of these enrichment scores, comparisons of various immune cell enrichment scores between IBD and healthy samples were performed using the Wilcoxon test to identify immune cell types significantly associated with disease status. Furthermore, the study employed the Spearman rank correlation coefficient to evaluate the relationship between marker gene expression levels and immune cell enrichment scores, thereby identifying gene and immune cell pairs that exhibited both statistically significant and strong correlations.\u003c/p\u003e\n\n\u003ch2\u003e2.7 Construction of the TF-miRNA-mRNA regulatory network\u003c/h2\u003e\u003cp\u003eIn this study, the miRNet database was employed to predict microRNAs (miRNAs) and transcription factors (TFs) in conjunction with biomarker genes, facilitating the construction of a comprehensive regulatory network that elucidates the interactions among TFs, miRNAs, and messenger RNAs (mRNAs) using the Cytoscape platform.\u003c/p\u003e"},{"header":"3. Results","content":"\u003cdiv id=\"Sec11\" class=\"Section2\"\u003e\n \u003ch2\u003e3.1 The screening of the differentially expressed genes\u003c/h2\u003eA total of 1,945 differentially expressed genes (DEGs) were identified from the metadata using screening criteria of an adjusted p-value\u0026thinsp;\u0026lt;\u0026thinsp;0.05 and an absolute log2 fold change (FC)\u0026thinsp;\u0026gt;\u0026thinsp;0.5. Among these, 1,078 genes were found to be upregulated, while 867 genes were downregulated in patients with inflammatory bowel disease (IBD). These findings indicate a significant disparity in gene expression levels between IBD patients and healthy controls, providing a crucial foundation for further exploration of the molecular mechanisms underlying IBD. To visually represent the relationship between the expression patterns of the DEGs, a heat map (Fig. 2A) and a volcano plot (Fig. 2B) were constructed. 
Subsequent intersection with angiogenesis-related genes (ARGs) from the MSigDB database identified nine key differentially expressed ARGs (Fig. 2C).\u003cp\u003eTo investigate the functional enrichment of the ARGs within biological processes (BP), cellular components (CC), and molecular functions (MF), as well as their involvement in specific pathological pathways, Gene Ontology (GO) enrichment analysis was conducted on these genes. The findings were visualized using histograms and chord plots (Fig.\u0026nbsp;3A and 3B). The GO enrichment analysis revealed that ARGs are prominently associated with angiogenesis and its regulatory mechanisms, including both positive and negative regulation, as well as the response to hypoxia, underscoring the critical role of oxygen deficiency in inflammatory responses. The analysis also highlighted the negative regulation of cell proliferation and growth, indicating a potential role for dysregulated cell growth in inflammatory bowel disease (IBD). Additionally, ARGs were implicated in the regulation of cell adhesion and inflammatory responses. These genes were predominantly enriched in extracellular regions, such as the extracellular space, platelet \u0026alpha;-granule lumen, and cell surface. The enriched molecular functions included protein binding, cytokine activity, and growth factor activity, suggesting a significant role for ARGs in cell signaling and intercellular communication.\u003c/p\u003e\n \u003cp\u003eThe KEGG pathway enrichment analysis results underscored several pathways significantly associated with ARGs, such as the cancer pathway, AGE-RAGE signaling in diabetic complications, cytokine-cytokine receptor interactions, rheumatoid arthritis, and inflammatory bowel disease, as depicted in Figs. 3C and 3D. 
By integrating the enrichment analysis findings from both GO and KEGG pathways, we enhanced our comprehension of the pathogenesis of inflammatory bowel disease (IBD) mediated by ARGs, particularly in relation to angiogenesis, cell-cell interactions, inflammatory responses, and immune regulation.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec12\" class=\"Section2\"\u003e\n \u003ch2\u003e3.2 Chromosome localization\u003c/h2\u003e\n \u003cp\u003eThe chromosomal localization analysis of the nine key ARGs, identified by intersecting the differentially expressed genes with the ARG set, was conducted using the RCircos package. Utilizing the built-in UCSC.HG19.Human.CytoBandIdeogram, human genomic chromosome data were represented in the outer circle of the plot, encompassing chromosomes 1 to 22 as well as the X and Y sex chromosomes. The chromosomal locations of the marker genes were annotated using the R package biomaRt. Subsequently, the chromosomes were distinguished by different colored stripes, and the positions of the nine identified marker genes were indicated on the diagram (Fig. 4A). Specifically, RHOB was located on chromosome 2, PROK2 on chromosome 3, CXCL8 on chromosome 4, ROBO4 and THY1 on chromosome 11, COL4A2 on chromosome 13, CHRNA7 on chromosome 15, and SERPINF1 and SPHK1 on chromosome 17.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec13\" class=\"Section2\"\u003e\n \u003ch2\u003e3.3 Co-expression analysis of genes in key ARGs\u003c/h2\u003e\n \u003cp\u003eFor the nine key ARGs identified, namely CXCL8, THY1, COL4A2, PROK2, ROBO4, SPHK1, RHOB, CHRNA7, and SERPINF1, we conducted an analysis of the correlations between their gene expression levels. This analysis is visualized in the correlation heat map presented in Fig. 4B. The values depicted in the figure represent the Spearman rank correlation coefficients between the genes, which range from \u0026minus;1 to 1. 
Positive values indicate a positive correlation, while negative values indicate a negative correlation; the greater the absolute value, the stronger the correlation. The figure reveals that CXCL8 exhibits a strong positive correlation with PROK2, THY1 shows a strong positive correlation with SPHK1 and ROBO4, COL4A2 is strongly correlated with ROBO4, RHOB, and SERPINF1, and both ROBO4 and RHOB demonstrate a strong positive correlation with SERPINF1. In contrast, CHRNA7 exhibits a relatively low correlation with the other genes.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec14\" class=\"Section2\"\u003e\n \u003ch2\u003e3.4 Machine learning screening for biomarkers\u003c/h2\u003e\n \u003cp\u003eTo screen diagnostic genes for inflammatory bowel disease (IBD), the dataset was first partitioned into a training set comprising 70% of the data and a test set comprising the remaining 30%. Utilizing the \u0026apos;caret\u0026apos; package in R, five machine learning models were developed: Random Forest (RF), Support Vector Machine (SVM), Generalized Linear Model (GLM), Lasso Regression Model (Lasso), and Extreme Gradient Boosting (XGB). Cross-validation techniques were employed to assess the overall performance of these models. The DALEX package facilitated the visualization of the models\u0026apos; residual distributions and feature importance, while the pROC package was utilized to compute and visualize the Receiver Operating Characteristic (ROC) curves and the Area Under the Curve (AUC) values for the models.\u003c/p\u003e\n \u003cp\u003eBased on the residual distributions illustrated in Figs.\u0026nbsp;5A and 5B, we compared the outcomes of the five models. In each boxplot, the red dots represent the root-mean-square (RMS) values of the residuals for each model. 
The boxplots indicate that the residual distributions for the Random Forest (RF), Support Vector Machine (SVM), and Lasso models are the most concentrated, with the SVM model demonstrating the highest concentration and the lowest RMS value. This observation implies a high level of predictive accuracy for the SVM model. In contrast, the Extreme Gradient Boosting (XGB) model exhibits the widest range of residuals and the highest RMS value, suggesting a potentially greater prediction error. The Generalized Linear Model (GLM) occupies an intermediate position with respect to these metrics.\u003c/p\u003e\n \u003cp\u003eUpon evaluating the prediction accuracy of the models using the Receiver Operating Characteristic (ROC) curve, it was confirmed that the Random Forest (RF), Support Vector Machine (SVM), and Lasso models demonstrated superior predictive accuracy among the five machine learning models assessed, as evidenced by their respective Area Under the Curve (AUC) values (Fig.\u0026nbsp;5C). Specifically, the RF, SVM, and Lasso models exhibited relatively higher AUC values in distinguishing inflammatory bowel disease (IBD) status, with values of 0.886, 0.873, and 0.871, respectively. An AUC value approaching 1 indicates high predictive accuracy, whereas an AUC value near 0.5 suggests performance akin to random guessing, rendering it ineffective. In this study, an AUC greater than 0.7 is considered indicative of good predictive performance. 
Although the Generalized Linear Model (GLM) and Extreme Gradient Boosting (XGB) models also demonstrated some predictive significance, the combined analysis of residual distribution results further supports the superior performance of the RF, SVM, and Lasso models.\u003c/p\u003e\n \u003cp\u003eIn this study, Random Forest (RF), Support Vector Machine (SVM), and Lasso regression\u0026mdash;three machine learning models known for their superior performance\u0026mdash;were employed to assess the feature importance of nine genes (refer to Fig.\u0026nbsp;5D) and to identify the most critical marker genes. The RF models were developed using the randomForest package within the R package caret, while the SVM models were constructed utilizing the R package e1071 in conjunction with caret. The method of \u0026quot;one minus AUC loss after permutations\u0026quot; was applied, wherein each feature was randomly permuted to evaluate its impact on model accuracy, as measured by the Area Under the Curve (AUC). The permutation test was used to assess the effect of each feature\u0026apos;s alteration on model performance. Based on the combined results from the SVM and RF models, it was preliminarily concluded that RHOB, SERPINF1, and PROK2 exhibit relatively low importance.\u003c/p\u003e\n \u003cp\u003eFurthermore, this study utilized the Lasso regression model to construct a predictive framework for assessing inflammatory bowel disease (IBD) status, based on the gene expression levels of nine specific genes. The Lasso regression technique enhances model coefficient estimation by integrating a penalty term into the conventional least squares approach, thereby imposing constraints on the coefficients. In Fig. 5E, the x-axis depicts the logarithmic value (Log Lambda) of the penalty parameter lambda, while the y-axis indicates the magnitude of the coefficients. 
Each line represents the coefficient of a variable, which shrinks from a non-zero value toward zero as lambda increases (moving from left to right). It is apparent that with increasing lambda values, a larger number of coefficients are driven to zero, resulting in a more parsimonious model. Figure 5F presents the cross-validation error plot, where the x-axis similarly represents the logarithmic value (Log Lambda) of the penalty parameter lambda. The y-axis denotes the binomial deviance, a widely utilized loss function in classification tasks. The red solid line depicts the mean cross-validation error for each lambda value, while the dotted line illustrates the error range, represented by the standard error. The second dashed line, which was selected in this study, marks the largest lambda whose cross-validation error lies within one standard error of the minimum, known as the 1-SE rule. This rule favors a simpler, more regularized model that tends to generalize better, thereby mitigating the risk of overfitting. In conclusion, our study developed the most succinct model for predicting inflammatory bowel disease (IBD) states, utilizing CXCL8, THY1, and COL4A2. These markers were ultimately identified as the key indicators in our analysis.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec15\" class=\"Section2\"\u003e\n \u003ch2\u003e3.5 Analysis using Gene Set Enrichment Analysis (GSEA).\u003c/h2\u003e\n \u003cp\u003eThrough the analysis of gene expression levels in patient samples, we categorized patients into high and low expression groups based on the expression level of each marker gene. Following this classification, we conducted differential gene expression analysis and ranked the genes according to the log-transformed fold change (logFC) of the differentially expressed genes, in descending order. 
Utilizing this ranked list, we performed Gene Set Enrichment Analysis (GSEA) against the human KEGG pathway gene sets to identify pathways that are enriched in the high expression group compared to the low expression group. This approach allowed us to evaluate the KEGG pathways potentially influenced by the marker gene.\u003c/p\u003e\n \u003cp\u003eWe conducted a Gene Set Enrichment Analysis (GSEA) on the key biomarkers identified by our machine learning models. Figure\u0026nbsp;6A illustrates the GSEA results for CXCL8, revealing significant enrichment of several related pathways in the high expression cohort. These pathways include cytokine-cytokine receptor interaction, chemokine signaling, Toll-like receptor signaling, and Nod-like receptor signaling pathways. Notably, these pathways exhibit substantial negative enrichment scores (Enrichment Score, ES), suggesting significant downregulation in the high expression group. Additionally, the Leishmania infection pathway is enriched, potentially indicating a broader immune response pattern rather than a direct pathogen infection. This observation implies that elevated CXCL8 expression may be associated with the downregulation of these immune-related pathways.\u003c/p\u003e\n \u003cp\u003eFigure 6B presents the GSEA of THY1, indicating that pathways associated with maturity-onset diabetes of the young (MODY), drug metabolism via cytochrome P450, and drug metabolism via other enzymes were significantly enriched in the low-expression group. The enrichment scores (ES) for these pathways were notably high, suggesting significant upregulation in the low-expression group. This finding may imply that the inhibition of THY1 expression is closely linked to MODY and could potentially increase the rate at which various drugs are metabolized. 
Furthermore, pathways related to neuroactive ligand-receptor interactions and extracellular matrix-receptor interactions were significantly enriched in the high-expression group, with enrichment scores showing relatively large negative values. This suggests significant downregulation in the high-expression group, implying that the upregulation of THY1 may downregulate pathways involved in neuroactive ligand-receptor interactions and extracellular matrix-receptor interactions, thereby affecting intercellular communication.\u003c/p\u003e\n \u003cp\u003eFigure 6C presents the Gene Set Enrichment Analysis (GSEA) of COL4A2, highlighting three principal pathways associated with the low expression group: ribosomes, oxidative phosphorylation, and Huntington\u0026apos;s disease. The substantial enrichment score (ES) suggests a significant upregulation of these pathways in the low expression group, implying that reduced COL4A2 expression may be associated with the upregulation of pathways related to ribosomes and oxidative phosphorylation, and may be closely linked to Huntington\u0026apos;s disease. Conversely, pathways associated with cytokine-cytokine receptor interaction and extracellular matrix-receptor interaction are markedly enriched in the high expression group, as indicated by large negative enrichment scores. This suggests a significant downregulation of these immune-related pathways in the high expression group, indicating that elevated COL4A2 expression may lead to the downregulation of immune-related pathways.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec16\" class=\"Section2\"\u003e\n \u003ch2\u003e3.6 Validation of external datasets for biomarker analysis.\u003c/h2\u003e\n \u003cp\u003eTo evaluate the predictive capacity of individual gene expression levels in distinguishing between inflammatory bowel disease (IBD) and healthy states, we conducted a receiver operating characteristic (ROC) analysis utilizing the pROC package. 
ROC curves for each gene were constructed by assigning a value of 0 to healthy status and a value of 1 to IBD status, with the corresponding area under the curve (AUC) values calculated to quantify predictive performance. The specific results of the ROC analysis (Fig.\u0026nbsp;7A) indicated that the CXCL8 gene exhibited the highest AUC value of 0.796, followed by COL4A2 with an AUC of 0.748, and THY1 with an AUC of 0.734. These findings suggest that these gene expression levels demonstrate significant potential as biomarkers for the reliable prediction of IBD status.\u003c/p\u003e\n \u003cp\u003eTo assess the reliability of the analytical results, the study reanalyzed 127 samples to evaluate the predictive capability of individual gene expression levels in distinguishing between Inflammatory Bowel Disease (IBD) and healthy conditions. The findings, presented in Fig. 7B, demonstrate that all three marker genes exhibit strong predictive power. Additionally, boxplots (Figs. 7C and 7D) illustrate the differences in expression levels, revealing that the expression levels of the three genes are significantly elevated in the IBD group compared to the healthy group. This observation aligns with the expression patterns of the marker genes in the GSE165512 dataset utilized at the study\u0026apos;s outset, thereby further corroborating the reliability of the study\u0026apos;s analytical results.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec17\" class=\"Section2\"\u003e\n \u003ch2\u003e3.7 Development of a diagnostic nomogram utilizing biomarkers\u003c/h2\u003e\n \u003cp\u003eWe employed the R package rms to develop a nomogram, receiver operating characteristic (ROC) curve, and calibration curve for the clinical diagnosis of inflammatory bowel disease (IBD). 
Initially, we constructed a nomogram (refer to Fig.\u0026nbsp;8A), which generates a total score by assigning a score to each gene based on its expression level, subsequently summing these scores to estimate the risk of IBD. Within the nomogram, the uppermost \u0026apos;Points\u0026apos; band delineates a score range for each gene or variable. Scores corresponding to the patient\u0026apos;s gene expression levels can be identified on this band, and these scores are aggregated to derive the \u0026apos;Total Points.\u0026apos; The \u0026apos;Linear Predictor\u0026apos; scale converts the total score into a linear predictive value, facilitating the mapping of cumulative scores onto a probability scale. At the bottom of the nomogram, the \u0026apos;Probability of IBD\u0026apos; band displays the likelihood of IBD occurrence as determined by the cumulative score. This probability scale typically ranges from 0 to 1, with each probability value corresponding to distinct risk categories.\u003c/p\u003e\n \u003cp\u003eWe additionally employed the Receiver Operating Characteristic (ROC) curve to illustrate the relationship between the sensitivity and specificity of the model, as depicted in Fig.\u0026nbsp;8B. The model achieved an Area Under the Curve (AUC) value of 0.853, signifying a high level of diagnostic accuracy. AUC values approaching 1.0 are indicative of substantial predictive capability. In this context, the elevated AUC value suggests that the gene expression level serves as a robust biomarker for differentiating between Inflammatory Bowel Disease (IBD) and healthy status.\u003c/p\u003e\n \u003cp\u003eSubsequently, we validated the predictive accuracy of the model through the application of a calibration curve (Fig.\u0026nbsp;8C), which demonstrated a strong concordance between the observed probabilities and those predicted by the model. Ideally, the calibration curve should align with the 45-degree line (denoted as the Ideal line in the figure). 
Our analysis reveals that both the apparent probability predicted by the model and the bias-corrected predicted probability closely approximate the ideal line, thereby indicating the model\u0026apos;s proficiency in accurately estimating the probability of inflammatory bowel disease (IBD).\u003c/p\u003e\n \u003cp\u003eFinally, a decision curve analysis was conducted (refer to Fig. 8D) to illustrate the anticipated net benefit of employing the model for predictions, compared with the reference strategies of classifying all samples or no samples as IBD (the All and None curves), across various risk thresholds. The results indicate that the model offers a net benefit surpassing both reference strategies at the majority of risk thresholds when utilized for predicting inflammatory bowel disease (IBD).\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec18\" class=\"Section2\"\u003e\n \u003ch2\u003e3.8 Analysis of immune cell infiltration\u003c/h2\u003e\n \u003cp\u003eThrough immune infiltration analysis, we investigated the differences in immune cell composition between individuals with inflammatory bowel disease (IBD) and healthy controls. We assessed the enrichment of various immune cell types in each sample and compared the enrichment scores of these cells between IBD and healthy samples using the Wilcoxon test. This analysis allowed us to identify immune cell types closely associated with disease status. Figure\u0026nbsp;9A illustrates the enrichment scores of different immune cell types in IBD versus healthy samples, as determined by single-sample gene set enrichment analysis (ssGSEA). By comparing the ssGSEA scores between the IBD and healthy groups, we observed significant differences in the enrichment scores of 11 immune cell types, including neutrophils, Th17 cells, and regulatory T cells (Treg), in the IBD group. These findings suggest that these cell types may play a crucial role in the pathogenesis of IBD. 
Additionally, we illustrated the relative infiltration levels using raincloud plots (Fig.\u0026nbsp;9B) for the seven immune cell types that demonstrated the most significant differences between the IBD and healthy groups. These immune cell types include regulatory T cells (Treg), Th17 cells, Th1 cells, effector memory T cells (Tem), neutrophils, activated dendritic cells (aDC), and macrophages.\u003c/p\u003e\n \u003cp\u003eSubsequently, we conducted an in-depth analysis of the correlation between the expression levels of specific marker genes and the immune cell enrichment scores. Our primary focus was on evaluating the strength of the association between marker gene expression and significantly different immune cell subsets. Initially, we integrated gene expression data with immune cell enrichment scores derived from single sample gene set enrichment analysis (ssGSEA). This integration allowed for the alignment and comparison of gene expression levels with immune cell abundance scores for each sample. To assess the correlation between marker gene expression levels and immune cell enrichment scores, we employed the Spearman rank correlation coefficient. In all correlation analyses conducted, we selected gene-cell pairs with a p-value less than 0.05 and an absolute Spearman correlation coefficient (\u0026rho;) exceeding 0.5. This criterion was designed to identify gene and cell type pairs exhibiting both statistically significant and robust correlations. Based on this screening threshold, we identified four relational pairs: COL4A2-macrophages, CXCL8-aDC, CXCL8-macrophages, and THY1-NK cells. The results demonstrated a positive correlation between the expression level of CXCL8 and the aDC enrichment score, as illustrated in Fig. 10A. The Spearman correlation coefficient was 0.55, with a 95% confidence interval of [0.43, 0.65], and a p-value of 1.12e-14, indicating a strong positive correlation. 
Additionally, the expression level of CXCL8 was positively associated with macrophage enrichment scores, as shown in Fig. 10B. The Spearman correlation coefficient for this relationship was 0.66, with a 95% confidence interval of [0.56, 0.74], and a p-value of 1.32e-22, indicating a very strong positive correlation. These findings suggest that increases in CXCL8 expression levels correspond to higher enrichment scores for both aDCs and macrophages. The expression level of COL4A2 demonstrated a positive correlation with the macrophage enrichment score, as depicted in Fig. 10C. The Spearman correlation coefficient was calculated to be 0.64, with a 95% confidence interval of [0.53, 0.72], and a p-value of 1.37e-20, indicating a very strong positive correlation between these variables. This suggests that as the expression levels of COL4A2 increase, there is a corresponding increase in the macrophage enrichment score. Similarly, the expression level of THY1 exhibited a positive correlation with the NK cell enrichment score, as shown in Fig. 10D. The Spearman correlation coefficient was determined to be 0.57, with a 95% confidence interval of [0.45, 0.66], and a p-value of 7.13e-16, indicating a strong positive correlation. This implies that upregulation of THY1 expression is associated with a corresponding enhancement in NK cell enrichment.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec19\" class=\"Section2\"\u003e\n \u003ch2\u003e3.9 Development of a transcription factor-microRNA-messenger RNA regulatory network\u003c/h2\u003e\n \u003cp\u003eFinally, the miRNet database was utilized to predict microRNAs (miRNAs) and transcription factors (TFs) associated with the three biomarker genes. For these three biomarker genes, 162 interactions were predicted, involving 31 transcription factors and 20 miRNAs. Subsequently, a Marker-miRNA-TF network was constructed using Cytoscape software, as illustrated in Fig. 11. 
In this network, the red nodes represent the three biomarker genes, the light green nodes denote transcription factors, and the yellow nodes indicate miRNAs. The figure illustrates the interactions between COL4A2, CXCL8, THY1, and various microRNAs (miRNAs) and transcription factors (TFs), suggesting that these miRNAs and TFs may influence COL4A2, CXCL8, and THY1 either directly or indirectly, thereby contributing to the pathogenesis of inflammatory bowel disease (IBD). CXCL8 is associated with multiple miRNAs and TFs, indicating that these molecules may play a role in the direct regulation of CXCL8 gene expression. In contrast, COL4A2 is primarily linked to miRNAs, suggesting that its gene expression is predominantly regulated by these miRNAs. Meanwhile, THY1 is associated only with the transcription factor POU5F1 and the miRNAs hsa-mir-494-3p and hsa-mir-16-5p, indicating that the expression of the THY1 gene is directly regulated by these specific miRNAs and TF. This suggests a relatively straightforward regulatory mechanism for THY1 gene expression, which may facilitate further mechanistic investigations.\u003c/p\u003e\n\u003c/div\u003e"},{"header":"4. Discussion","content":"\u003cp\u003eRecent studies have increasingly focused on the association between inflammatory bowel disease (IBD) and angiogenesis. Angiogenesis, the formation of new blood vessels, plays a critical role in the pathophysiology of IBD. Research has demonstrated that levels of angiogenic factors, such as vascular endothelial growth factor (VEGF), are significantly elevated in patients with IBD, potentially correlating with disease activity [\u003cspan class=\"CitationRef\"\u003e23\u003c/span\u003e]. Moreover, the inhibition of angiogenesis has been proposed as a promising therapeutic approach for IBD. 
For instance, ginsenoside Rg3, derived from traditional Chinese medicine ginseng, has garnered attention for its antiangiogenic properties, and its efficacy in IBD treatment may be enhanced through delivery via a thermosensitive hydrogel system [\u003cspan class=\"CitationRef\"\u003e24\u003c/span\u003e]. In the management of inflammatory bowel disease (IBD), anti-tumor necrosis factor (TNF) antibodies, such as infliximab, have demonstrated efficacy in restoring endothelial nitric oxide synthase (eNOS) and vascular endothelial growth factor receptor 2 (VEGFR2) protein expression in endothelial cells. This observation implies that modulation of angiogenesis may contribute to mitigating the inflammatory processes associated with IBD [\u003cspan class=\"CitationRef\"\u003e25\u003c/span\u003e]. These findings indicate that angiogenesis not only plays a crucial role in the pathogenesis of IBD but also represents a potential target for future therapeutic interventions. Further investigation into the specific mechanisms underlying angiogenesis in IBD could yield novel insights for the development of more effective treatment modalities.\u003c/p\u003e\n\u003cp\u003eIn the present study, three signature biomarkers\u0026mdash;CXCL8, COL4A2, and THY1\u0026mdash;were identified as significantly associated with angiogenesis in inflammatory bowel disease (IBD) through the application of LASSO regression, random forest (RF), and support vector machine (SVM) models, as well as weighted gene co-expression network analysis (WGCNA). Receiver operating characteristic (ROC) analysis and diagnostic nomogram demonstrated that these biomarkers exhibited excellent discriminatory power in differentiating IBD samples from healthy controls. 
Results from single-sample gene set enrichment analysis (ssGSEA) revealed a significant increase in the infiltration of various immune cells, with the study highlighting the differential infiltration levels of the seven most significantly distinct immune cell types between the IBD and healthy groups. Furthermore, the three biomarkers showed associations with macrophages, activated dendritic cells (aDC), and natural killer (NK) cells. These findings suggest that CXCL8, COL4A2, and THY1 are closely linked to angiogenesis in IBD and may serve as promising diagnostic and therapeutic biomarkers for the disease.\u003c/p\u003e\n\u003cp\u003eCXCL8, also referred to as interleukin-8 (IL-8), is integral to the inflammatory response, functioning as a potent neutrophil chemokine that orchestrates the directed migration of leukocytes. This is achieved through its activation upon binding to specific chemokine G protein-coupled receptors (GPCRs). The activity of CXCL8 is contingent upon its interaction with the human CXC chemokine receptors CXCR1 and CXCR2, as well as its binding to cell surface glycosaminoglycans (GAGs) [\u003cspan class=\"CitationRef\"\u003e26\u003c/span\u003e]. Such binding facilitates the formation of a solid-phase chemotactic gradient, which optimally presents CXCL8 to circulating neutrophils [\u003cspan class=\"CitationRef\"\u003e27\u003c/span\u003e]. Nonetheless, it has been demonstrated that CXCL8\u0026apos;s activity is not solely reliant on receptor interaction but is also intricately regulated at the levels of transcription, translation, and post-translational modifications. 
These regulatory mechanisms are crucial for ensuring the precise spatiotemporal activity of CXCL8 in the context of inflammatory diseases and cancer [\u003cspan class=\"CitationRef\"\u003e26\u003c/span\u003e].\u003c/p\u003e\n\u003cp\u003eIn conjunction with the Gene Set Enrichment Analysis (GSEA) conducted in this study, the overexpression of CXCL8 in inflammatory bowel disease (IBD) appears to modulate various immune-related pathways, including cytokine-cytokine receptor interaction, chemokine signaling, and Toll-like receptor and Nod-like receptor signaling. This modulation results in the accumulation of neutrophils at the lesion site, thereby exacerbating tissue dysfunction and the inflammatory response. Furthermore, it has been observed that the activation of the P2Y6 receptor can enhance CXCL8 expression in an AP-1-dependent manner, further facilitating the recruitment of neutrophils. This finding suggests that P2Y6 may serve as a potential therapeutic target for mitigating intestinal inflammation. Beyond its role in inflammation, CXCL8 is also implicated in cancer progression[\u003cspan class=\"CitationRef\"\u003e28\u003c/span\u003e]. Research indicates that CXCL8 contributes to tumor progression and metastasis by influencing macrophages within the tumor microenvironment. Specifically, CXCL8 regulates the ATF3-CXCL8 axis via the PI3K/AKT/mTOR signaling pathway, thereby altering the phenotype of macrophages surrounding tumors and promoting tumor growth[\u003cspan class=\"CitationRef\"\u003e27\u003c/span\u003e]. This aligns with our study\u0026apos;s conclusion that CXCL8 is closely associated with macrophage enrichment.\u003c/p\u003e\n\u003cp\u003eCXCL8 is a critical mediator in the process of angiogenesis. Empirical evidence indicates that CXCL8 facilitates endothelial cell recruitment and angiogenesis. 
In the context of breast cancer, CXCL8 secreted by human adipose-derived mesenchymal stem cells (hADSCs) has been observed to enhance tumor growth by promoting angiogenesis. Specifically, CXCL8 augments the migratory and tubular structure formation capabilities of human umbilical vein endothelial cells (HUVECs) via the CXCR1 and CXCR2 signaling pathways, thereby contributing to tumor angiogenesis [\u003cspan class=\"CitationRef\"\u003e29\u003c/span\u003e]. Additionally, Epstein-Barr virus (EBV) infection has been shown to upregulate CXCL8 expression, which in turn promotes angiogenesis and tumor growth through the activation of the NF-\u0026kappa;B signaling pathway. The overexpression of CXCL8 correlates with a poorer prognosis in gastric cancer patients, underscoring its significant role in angiogenesis and tumor progression in this cancer type [\u003cspan class=\"CitationRef\"\u003e30\u003c/span\u003e]. In colorectal cancer, CXCL8 enhances FOXD1 expression through the AKT/NF-\u0026kappa;B signaling pathway, thereby facilitating angiogenesis. The ability of CXCL8 to enhance the tubular structure formation, proliferation, and migration of HUVECs further supports the involvement of the CXCR2-dependent pathway [\u003cspan class=\"CitationRef\"\u003e31\u003c/span\u003e].\u003c/p\u003e\n\u003cp\u003eIn summary, CXCL8 is integral to the process of angiogenesis in various inflammatory diseases and cancers. It facilitates endothelial cell migration and proliferation through specific receptor interactions and the activation of associated signaling pathways, thereby augmenting angiogenesis. These insights offer novel therapeutic approaches for the management of inflammatory bowel disease.\u003c/p\u003e\n\u003cp\u003eThe COL4A2 (type IV collagen \u0026alpha; 2 chain) gene is critically involved in the process of angiogenesis. 
Previous research has demonstrated that the heterotrimer formed by COL4A2 in conjunction with COL4A1 constitutes a vital component of the basement membrane, which is essential for maintaining the stability and functionality of the vascular basement membrane. Mutations in COL4A2 can result in angiogenic disorders, potentially leading to a variety of vascular-related diseases [\u003cspan class=\"CitationRef\"\u003e32\u003c/span\u003e]. In a particular study, molecular and genetic analyses of a COL4A2 mutant mouse model revealed that such mutations result in abnormal vascular development, which can precipitate small vessel disease, recurrent hemorrhagic stroke, and age-related macroscopic vascular lesions. Furthermore, it has been demonstrated that the intracellular accumulation of collagen in vascular endothelial cells and pericytes, due to COL4A2 mutations, is a significant predisposing factor for intracerebral hemorrhage (ICH) [\u003cspan class=\"CitationRef\"\u003e33\u003c/span\u003e]. In addition, mutations in the COL4A2 gene impair the secretion of both COL4A1 and COL4A2, leading to intracellular accumulation and endoplasmic reticulum (ER) stress, which may induce cytotoxic effects. Research indicates that treatment with chemical chaperones can mitigate the intracellular accumulation of mutant collagen and ameliorate the cellular phenotype associated with COL4A2 mutations [\u003cspan class=\"CitationRef\"\u003e34\u003c/span\u003e].\u003c/p\u003e\n\u003cp\u003eThe involvement of COL4A2 in inflammatory bowel disease (IBD) is potentially linked to its role in preserving the integrity of the intestinal epithelial barrier. It is widely recognized that compromised intestinal barrier function constitutes a significant factor in the initiation and progression of IBD. 
Research has demonstrated that the disruption of intestinal epithelial barrier function is associated with various factors, including the overexpression of inflammatory mediators, heightened apoptosis, and modifications in tight junction proteins [\u003cspan class=\"CitationRef\"\u003e35\u003c/span\u003e]. In the context of inflammatory bowel disease (IBD), the remodeling of the extracellular matrix (ECM) is recognized as a significant pathological process that can influence the expression and functionality of type IV collagen protein [\u003cspan class=\"CitationRef\"\u003e36\u003c/span\u003e]. This assertion is corroborated by the Gene Set Enrichment Analysis (GSEA) conducted in the present study. Furthermore, it has been observed that aberrant ECM metabolism occurs in the intestines of individuals with IBD, potentially leading to collagen degradation and subsequent compromise of intestinal barrier integrity [\u003cspan class=\"CitationRef\"\u003e37\u003c/span\u003e]. Consequently, the involvement of COL4A2 in IBD may pertain to its critical role in maintaining basement membrane integrity and regulating intestinal barrier function.\u003c/p\u003e\n\u003cp\u003eIn summary, COL4A2 is integral to the stability of the vascular basement membrane, with mutations in this gene linked to various vascular diseases. Its involvement in inflammatory bowel disease (IBD) is likely connected to its function in preserving the integrity of the intestinal epithelial barrier and regulating extracellular matrix remodeling. 
Future research should aim to elucidate the specific mechanisms by which COL4A2 influences IBD, potentially identifying novel targets and strategies for therapeutic intervention.\u003c/p\u003e\n\u003cp\u003eTHY1, also referred to as CD90, is a glycoprotein prevalent across various cell types, including fibroblasts, neurons, and immune cells, and it plays significant roles in numerous biological processes, particularly within the immune system and in fibrotic diseases. Firstly, the involvement of THY1 in immune regulation appears to be intricately linked to the pathophysiological mechanisms of inflammatory bowel disease (IBD). Evidence suggests that THY1 is instrumental in modulating T cell activation and proliferation, processes that are pivotal in the immune response associated with IBD [\u003cspan class=\"CitationRef\"\u003e38\u003c/span\u003e]. This finding aligns with our conclusion that THY1 is closely associated with the enrichment of natural killer (NK) cells. Secondly, THY1 may influence the onset and progression of IBD by impacting intestinal barrier function. The maintenance of intestinal barrier integrity is crucial for preventing the invasion of pathogens and toxins. THY1 may affect this barrier function by regulating intercellular junctions and the composition of the extracellular matrix [\u003cspan class=\"CitationRef\"\u003e39\u003c/span\u003e], a hypothesis that was corroborated by the gene set enrichment analysis (GSEA) conducted in this study. The disruption of intestinal barrier function is a prevalent pathological characteristic observed in patients with inflammatory bowel disease (IBD), potentially linked to the aberrant expression and functionality of THY1. Furthermore, the involvement of THY1 in IBD may be pertinent to its role in fibrotic processes. 
Patients with IBD frequently experience intestinal fibrosis, and elevated THY1 expression has been noted in fibrotic tissues, indicating its potential contribution to the development and progression of fibrosis [\u003cspan class=\"CitationRef\"\u003e40\u003c/span\u003e]. Through its regulation of extracellular matrix remodeling and fibrocyte activation, THY1 may affect the severity of fibrosis associated with IBD.\u003c/p\u003e\n\u003cp\u003eRecently, the role of THY1 in angiogenesis has garnered significant scholarly interest. Research indicates that THY1 serves a crucial regulatory function in the process of angiogenesis. First, THY1 influences the migratory and invasive capabilities of cancer cells through its interaction with integrins. For instance, one study demonstrated that THY1 activates the calcium ion channel and P2X7 receptor signaling pathway via its interaction with \u0026alpha;V\u0026beta;3 integrin, thereby facilitating cancer cell migration and invasion [\u003cspan class=\"CitationRef\"\u003e41\u003c/span\u003e]. This mechanism suggests a potential involvement of THY1 in tumor angiogenesis. Second, the expression level of THY1 can be modulated by various signaling pathways. For example, PMA (phorbol 12-myristate 13-acetate) has been shown to upregulate THY1 expression through the activation of the PKC-\u0026delta;/Syk/NF-\u0026kappa;B signaling pathway, consequently inhibiting endothelial cell migration and the formation of capillary-like tubular structures [\u003cspan class=\"CitationRef\"\u003e42\u003c/span\u003e]. These findings indicate that THY1 is involved in a sophisticated signaling network that regulates angiogenesis. Finally, THY1 is integral to the wound-healing process, as it facilitates wound repair by improving blood perfusion in the skin. 
Evidence from studies on THY1 knockout mice, which exhibited delayed reepithelialization and reduced blood perfusion during wound healing, underscores the essential role of THY1 in angiogenesis [\u003cspan class=\"CitationRef\"\u003e43\u003c/span\u003e].\u003c/p\u003e\n\u003cp\u003eIn conclusion, THY1 appears to play a multifaceted role in inflammatory bowel disease (IBD), encompassing immune regulation, intestinal barrier function, and fibrotic processes. Additionally, THY1 is integral to angiogenesis. Further investigation into the specific mechanisms by which THY1 influences IBD is essential to elucidate its contribution to disease pathogenesis and to identify novel therapeutic targets for IBD treatment.\u003c/p\u003e\n\u003cp\u003eDespite our rigorous efforts to enhance the reliability of our findings through the use of extensive sample datasets, diverse analytical methodologies, and both internal and external validation, certain limitations of our study must be acknowledged. Firstly, this research involves secondary data mining and analysis of previously published datasets, and variations in dataset selection and analytical approaches may yield different outcomes. Secondly, a substantial amount of clinical information pertaining to the sample was not obtained, leading to the omission of potential effects related to patient complications, gender, and age. Thirdly, the precise and well-defined mechanisms by which these signature biomarkers influence angiogenesis to drive the pathological processes of inflammatory bowel disease remain unclear. Consequently, further research is warranted, involving larger sample sizes from diverse regions or ethnic groups, as well as additional in vitro and in vivo experiments.\u003c/p\u003e"},{"header":"5. Conclusion","content":"\u003cp\u003eThis study ultimately identified three angiogenesis biomarkers of significance in inflammatory bowel disease (IBD): CXCL8, COL4A2, and THY1. 
These biomarkers are implicated in fundamental biological processes, including cytokine interactions, chemotaxis, extracellular matrix-receptor interactions, oxidative phosphorylation, and other related reactions. Immunoprofiling revealed a notable increase in 11 immune cell types within IBD samples. Moreover, a significant positive correlation was observed between these biomarkers and infiltrating immune cells. These findings suggest that the immune response plays a critical role in the angiogenic mechanisms underlying IBD, which can be attributed to the interaction between these signature biomarkers and immune-infiltrating cells.\u003c/p\u003e\n"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eEthical Approval and Consent to participate\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eEthical approval was not required for this study, as it utilized anonymized/publicly available data and did not involve direct interaction with human or animal subjects. Consent to participate was therefore not applicable.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConsent for publication\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable. 
This study did not involve human participants, personal data, identifiable images, or case reports that require consent for publication.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eData Availability/Availability of data and materials\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe data supporting this study were derived from the following publicly available resources:\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eGEO Database: https://www.ncbi.nlm.nih.gov/geo/\u003c/p\u003e\n\u003cp\u003eMSigDB Database: https://www.gsea-msigdb.org/gsea/msigdb/index.jsp\u003c/p\u003e\n\u003cp\u003eProcessed datasets and analysis scripts are available from the authors on request.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthor Contributions Statement\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003ePengliang Zhang: Conceptualization, Methodology, Formal Analysis, Data Curation, Software, Visualization, Writing\u0026nbsp;\u0026ndash;\u0026nbsp;Original Draft.\u003c/p\u003e\n\u003cp\u003eShuang Chen: Validation, Investigation, Resources, Writing\u0026nbsp;\u0026ndash;\u0026nbsp;Review \u0026amp; Editing.\u003c/p\u003e\n\u003cp\u003eXianmin Liu: Validation, Investigation, Resources, Writing\u0026nbsp;\u0026ndash;\u0026nbsp;Review \u0026amp; Editing.\u003c/p\u003e\n\u003cp\u003eLijuan Wu: Validation, Investigation, Resources, Writing\u0026nbsp;\u0026ndash;\u0026nbsp;Review \u0026amp; Editing.\u003c/p\u003e\n\u003cp\u003eYingjian Zhang: Supervision, Project Administration, Funding 
Acquisition, Writing\u0026nbsp;\u0026ndash;\u0026nbsp;Review \u0026amp; Editing.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAcknowledgements\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNo additional acknowledgements beyond the listed authors are applicable to this study.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eSINGH N, BERNSTEIN C N. Environmental risk factors for inflammatory bowel disease [J]. United European gastroenterology journal, 2022, 10(10): 1047-53.\u003c/li\u003e\n\u003cli\u003eROSEN M J, DHAWAN A, SAEED S A. Inflammatory Bowel Disease in Children and Adolescents [J]. JAMA pediatrics, 2015, 169(11): 1053-60.\u003c/li\u003e\n\u003cli\u003eKOLIANI-PACE J L, SIEGEL C A. Prognosticating the Course of Inflammatory Bowel Disease [J]. Gastrointestinal endoscopy clinics of North America, 2019, 29(3): 395-404.\u003c/li\u003e\n\u003cli\u003eFABI\u0026aacute;N O, KAMARADOV\u0026aacute; K. Morphology of inflammatory bowel diseases (IBD) [J]. Ceskoslovenska patologie, 2022, 58(1): 27-37.\u003c/li\u003e\n\u003cli\u003eBRUNER L P, WHITE A M, PROKSELL S. Inflammatory Bowel Disease [J]. Primary care, 2023, 50(3): 411-27.\u003c/li\u003e\n\u003cli\u003eHEMMER A, FOREST K, RATH J, et al. Inflammatory Bowel Disease: A Concise Review [J]. South Dakota medicine : the journal of the South Dakota State Medical Association, 2023, 76(9): 416-23.\u003c/li\u003e\n\u003cli\u003eZHAO M, FENG R, BEN-HORIN S, et al. Systematic review with meta-analysis: environmental and dietary differences of inflammatory bowel disease in Eastern and Western populations [J]. Alimentary pharmacology \u0026amp; therapeutics, 2022, 55(3): 266-76.\u003c/li\u003e\n\u003cli\u003eFLYNN S, EISENSTEIN S. Inflammatory Bowel Disease Presentation and Diagnosis [J]. The Surgical clinics of North America, 2019, 99(6): 1051-62.\u003c/li\u003e\n\u003cli\u003eRAMOS G P, PAPADAKIS K A. Mechanisms of Disease: Inflammatory Bowel Diseases [J]. 
Mayo Clinic proceedings, 2019, 94(1): 155-65.\u003c/li\u003e\n\u003cli\u003eSCHULTSZ C, VAN DEN BERG F M, TEN KATE F W, et al. The intestinal mucus layer from patients with inflammatory bowel disease harbors high numbers of bacteria compared with controls [J]. Gastroenterology, 1999, 117(5): 1089-97.\u003c/li\u003e\n\u003cli\u003eWRIGHT E K, DING N S, NIEWIADOMSKI O. Management of inflammatory bowel disease [J]. The Medical journal of Australia, 2018, 209(7): 318-23.\u003c/li\u003e\n\u003cli\u003eSEYEDIAN S S, NOKHOSTIN F, MALAMIR M D. A review of the diagnosis, prevention, and treatment methods of inflammatory bowel disease [J]. Journal of medicine and life, 2019, 12(2): 113-22.\u003c/li\u003e\n\u003cli\u003eEDER P, KORYBALSKA K, LINKE K, et al. Angiogenesis-related proteins--their role in the pathogenesis and treatment of inflammatory bowel disease [J]. Current protein \u0026amp; peptide science, 2015, 16(3): 249-58.\u003c/li\u003e\n\u003cli\u003eALKIM C, ALKIM H, KOKSAL A R, et al. Angiogenesis in Inflammatory Bowel Disease [J]. International journal of inflammation, 2015, 2015: 970890.\u003c/li\u003e\n\u003cli\u003eBENELLI R, LORUSSO G, ALBINI A, et al. Cytokines and chemokines as regulators of angiogenesis in health and disease [J]. Current pharmaceutical design, 2006, 12(24): 3101-15.\u003c/li\u003e\n\u003cli\u003ePOUSA I D, MAT\u0026eacute; J, GISBERT J P. Angiogenesis in inflammatory bowel disease [J]. European journal of clinical investigation, 2008, 38(2): 73-81.\u003c/li\u003e\n\u003cli\u003eCHIDLOW J H, JR., SHUKLA D, GRISHAM M B, et al. Pathogenic angiogenesis in IBD and experimental colitis: new ideas and therapeutic avenues [J]. American journal of physiology Gastrointestinal and liver physiology, 2007, 293(1): G5-G18.\u003c/li\u003e\n\u003cli\u003eDEBAN L, CORREALE C, VETRANO S, et al. Multiple pathogenic roles of microvasculature in inflammatory bowel disease: a Jack of all trades [J]. 
The American journal of pathology, 2008, 172(6): 1457-66.\u003c/li\u003e\n\u003cli\u003eKOUTROUBAKIS I E, TSIOLAKIDOU G, KARMIRIS K, et al. Role of angiogenesis in inflammatory bowel disease [J]. Inflammatory bowel diseases, 2006, 12(6): 515-23.\u003c/li\u003e\n\u003cli\u003eAZZAM N. Angiogenesis and inflammatory bowel disease [J]. Saudi journal of gastroenterology : official journal of the Saudi Gastroenterology Association, 2007, 13(1): 37-8.\u003c/li\u003e\n\u003cli\u003eDANESE S. Negative regulators of angiogenesis in inflammatory bowel disease: thrombospondin in the spotlight [J]. Pathobiology : journal of immunopathology, molecular and cellular biology, 2008, 75(1): 22-4.\u003c/li\u003e\n\u003cli\u003eDANESE S, SANS M, DE LA MOTTE C, et al. Angiogenesis as a novel component of inflammatory bowel disease pathogenesis [J]. Gastroenterology, 2006, 130(7): 2060-73.\u003c/li\u003e\n\u003cli\u003eDAvgerinos Efthimios,Katergiannakis Vaggelogiannis,Kopanakis Nikolaos,et al: Serum VEGF and bFGF in patients with inflammatory bowel diseases. Ann Ital Chir, 2014 May-Jun;85(3):203-6.\u003c/li\u003e\n\u003cli\u003eXie Yiqiong,Ma Ying,Xu Lu,et al: Inhibition of Angiogenesis and Effect on Inflammatory Bowel Disease of Ginsenoside Rg3-Loaded Thermosensitive Hydrogel. Pharmaceutics, 2024 Sep 25;16(10):1243.\u003c/li\u003e\n\u003cli\u003eAltorjay I,Bacskai I,B\u0026aacute;tori R,et al: Anti-TNF-alpha antibody (infliximab) therapy supports the recovery of eNOS and VEGFR2 protein expression in endothelial cells. Int J Immunopathol Pharmacol, 2011 Apr-Jun;24(2):323-35. \u003c/li\u003e\n\u003cli\u003eCambier Seppe,Gouwy Mieke,Proost Paul : The chemokines CXCL8 and CXCL12: molecular and functional properties, role in disease and efforts towards pharmacological intervention. Cell Mol Immunol, 2023 Mar;20(3):217-251. 
\u003c/li\u003e\n\u003cli\u003eAdage Tiziana,Bartley Michael R,Del Bene Francesca,et al: PA401, a novel CXCL8-based biologic therapeutic with increased glycosaminoglycan binding, reduces bronchoalveolar lavage neutrophils and systemic inflammatory markers in a murine model of LPS-induced lung inflammation.Cytokine, 2015 Dec;76(2):433-441.\u003c/li\u003e\n\u003cli\u003eArguin Guillaume,Bilodeau Maude S,Degagn\u0026eacute; \u0026Eacute;milie,et al: P2Y6 receptor contributes to neutrophil recruitment to inflamed intestinal mucosa by increasing CXC chemokine ligand 8 expression in an AP-1-dependent manner in epithelial cells. Inflamm Bowel Dis, 2012 Aug;18(8):1456-69.\u003c/li\u003e\n\u003cli\u003eWang Yuan,Liu Junli,Jiang Qingyuan,et al: Human Adipose-Derived Mesenchymal Stem Cell-Secreted CXCL1 and CXCL8 Facilitate Breast Tumor Growth By Promoting Angiogenesis. Stem Cells, 2017 Sep;35(9):2060-2070.\u003c/li\u003e\n\u003cli\u003eZhang Jing-Yue,Du Yu,Gong Li-Ping,et al: EBV-Induced CXCL8 Upregulation Promotes Vasculogenic Mimicry in Gastric Carcinoma via NF-\u0026kappa;B Signaling. Front Cell Infect Microbiol, 2022 Mar 7:12:780416. \u003c/li\u003e\n\u003cli\u003eChen Chun,Xu Zhuo-Qing,Zong Ya-Ping,et al: CXCL5 induces tumor angiogenesis via enhancing the expression of FOXD1 mediated by the AKT/NF-\u0026kappa;B pathway in colorectal cancer. Cell Death Dis, 2019 Feb 21;10(3):178. \u003c/li\u003e\n\u003cli\u003eJeanne Marion,Labelle-Dumais Cassandre,Jorgensen Jeff,et al: COL4A2 mutations impair COL4A1 and COL4A2 secretion and cause hemorrhagic stroke. Am J Hum Genet, 2012 Jan 13;90(1):91-101. \u003c/li\u003e\n\u003cli\u003eJeanne Marion,Jorgensen Jeff,Gould Douglas B : Molecular and Genetic Analyses of Collagen Type IV Mutant Mouse Models of Spontaneous Intracerebral Hemorrhage Identify Mechanisms for Stroke Prevention. 
Circulation, 2015 May 5;131(18):1555-65.\u003c/li\u003e\n\u003cli\u003eMurray Lydia S,Lu Yinhui,Taggart Aislynn,et al: Chemical chaperone treatment reduces intracellular accumulation of mutant collagen IV and ameliorates the cellular phenotype of a COL4A2 mutation that causes haemorrhagic stroke. Hum Mol Genet, 2014 Jan 15;23(2):283-92.\u003c/li\u003e\n\u003cli\u003eYe Xiaolin,Sun Mei : AGR2 ameliorates tumor necrosis factor-\u0026alpha;-induced epithelial barrier dysfunction via suppression of NF-\u0026kappa;B p65-mediated MLCK/p-MLC pathway activation. Int J Mol Med, 2017 May;39(5):1206-1214. \u003c/li\u003e\n\u003cli\u003eMortensen Joachim H\u0026oslash;g,Manon-Jensen Tina,Jensen Michael Dam,et al: Ulcerative colitis, Crohn's disease, and irritable bowel syndrome have different profiles of extracellular matrix turnover, which also reflects disease activity in Crohn's disease. PLoS One, 2017 Oct 13;12(10):e0185855.\u003c/li\u003e\n\u003cli\u003eFischer Andreas,Gluth Markus,Weege Friderike,et al: Glucocorticoids regulate barrier function and claudin expression in intestinal epithelial cells via MKP-1. Am J Physiol Gastrointest Liver Physiol, 2014 Feb;306(3):G218-28. \u003c/li\u003e\n\u003cli\u003eRaza Ali,Yousaf Wajeeha,Giannella Ralph,et al: Th17 cells: interactions with predisposing factors in the immunopathogenesis of inflammatory bowel disease. Expert Rev Clin Immunol, 2012 Feb;8(2):161-8.\u003c/li\u003e\n\u003cli\u003eShi Ruoran,Yu Fazheng,Hu Xueyu,et al: Protective Effect of Lactiplantibacillus plantarum subsp. plantarum SC-5 on Dextran Sulfate Sodium-Induced Colitis in Mice. Foods, 2023 Feb 20;12(4):897.\u003c/li\u003e\n\u003cli\u003eAl-Araimi Amna,Al Kharusi Amira,Bani Oraba Asma,et al: Deletion of SOCS2 Reduces Post-Colitis Fibrosis via Alteration of the TGF\u0026beta; Pathway. Int J Mol Sci, 2020 Apr 27;21(9):3073. 
\u003c/li\u003e\n\u003cli\u003eBrenet Marianne,Mart\u0026iacute;nez Samuel,P\u0026eacute;rez-Nu\u0026ntilde;ez Ram\u0026oacute;n,et al: Thy-1 (CD90)-Induced Metastatic Cancer Cell Migration and Invasion Are \u0026beta;3 Integrin-Dependent and Involve a Ca2+/P2X7 Receptor Signaling Axis. Front Cell Dev Biol, 2021 Jan 12:8:592442.\u003c/li\u003e\n\u003cli\u003eWen Heng-Ching,Huo Yen Nien,Chou Chih-Ming,et al: PMA inhibits endothelial cell migration through activating the PKC-\u0026delta;/Syk/NF-\u0026kappa;B-mediated up-regulation of Thy-1. Sci Rep, 2018 Nov 2;8(1):16247.\u003c/li\u003e\n\u003cli\u003eP\u0026eacute;rez Leonardo A,Le\u0026oacute;n Jos\u0026eacute;,L\u0026oacute;pez Juan,et al: The GPI-Anchored Protein Thy-1/CD90 Promotes Wound Healing upon Injury to the Skin by Enhancing Skin Perfusion. Int J Mol Sci, 2022 Oct 19;23(20):12539.\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"","lastPublishedDoi":"10.21203/rs.3.rs-6554786/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-6554786/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eBackground: The pathogenesis of inflammatory bowel disease (IBD) remains poorly understood, with angiogenesis playing a crucial role in its development. This study primarily aims to identify effective biomarkers of angiogenesis in IBD and to enhance the understanding of the disease's immunological characteristics.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eMethods: Data sets related to IBD were sourced from the GEO database, including one set for bioinformatics analysis and machine learning, and another for external validation. Gene sets associated with angiogenesis were obtained from the MSigDB database, and IBD-related angiogenesis gene sets were identified by intersecting with the IBD data set. Support Vector Machine (SVM), Lasso regression, and Random Forest (RF) models were employed to identify marker genes. The diagnostic performance of the eigengene was evaluated using the receiver operating characteristic (ROC) curve and a diagnostic nomogram. Single-sample gene set enrichment analysis (ssGSEA) was utilized to elucidate the immune landscape, and correlation analysis was conducted to explore the relationship between eigengenes and immune infiltration.\u003c/p\u003e\n\u003cp\u003eResults: The convergence of results from LASSO, Random Forest (RF), and Support Vector Machine (SVM) analyses identified three key genes: CXCL8, THY1, and COL4A2. 
The biological processes associated with these genes primarily involve cytokine interactions, chemotaxis, extracellular matrix-receptor interactions, and oxidative phosphorylation, among others. Immune infiltration analysis demonstrated a significant increase in 11 immune cell types within the inflammatory bowel disease (IBD) samples. Furthermore, these signature genes exhibited a strong correlation with various immune cells.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eConclusions: CXCL8, THY1, and COL4A2 have been identified as reliable potential biomarkers for angiogenesis in IBD. The immune responses mediated by these biomarkers play a critical role in IBD angiogenesis through interactions with immune-infiltrating cells.\u003c/p\u003e","manuscriptTitle":"Biomarker Prediction and Immune Landscape of Angiogenesis in Inflammatory Bowel Disease: Insights from Bioinformatics and Machine Learning Approaches","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-05-22 17:05:50","doi":"10.21203/rs.3.rs-6554786/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"dbb99812-fd73-4125-8605-07903b3f9643","owner":[],"postedDate":"May 22nd, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[{"id":48871794,"name":"Biological sciences/Immunology"},{"id":48871795,"name":"Health sciences/Gastroenterology"}],"tags":[],"updatedAt":"2025-09-22T18:08:23+00:00","versionOfRecord":[],"versionCreatedAt":"2025-05-22 17:05:50","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-6554786","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-6554786","identity":"rs-6554786","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}