Machine Learning-Driven Identification of Diagnostic Biomarkers in Ischemic Stroke: Focus on PI3K Pathway | Research Square

Research Article

Shasha Lei, Zhi-Xin Huang, Zhenzhen Wang, Kaili Huang, Zhihui Chen, and 1 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-6186335/v1
This work is licensed under a CC BY 4.0 License
Status: Under Review
Version 1 posted 11

Abstract

High mortality and disability rates in ischemic stroke patients continue to pose substantial societal challenges, with the PI3K signaling pathway emerging as a critical mediator of post-stroke pathological processes. While this pathway's involvement in stroke pathophysiology is established, the complex interplay between PI3K-associated genes, stroke outcomes, and the immune microenvironment remains poorly understood, limiting the development of targeted immunotherapies.
Here, we conducted a comprehensive analysis of PI3K pathway-related gene expression patterns in ischemic stroke samples, employing consensus clustering and immune infiltration analysis, coupled with machine learning algorithms and molecular docking experiments. Our analysis revealed two distinct patient subgroups with significant differences in immune infiltration profiles and identified five key diagnostic genes (PIN1, CDK2, VAV3, YWHAB, and CFL1). The developed predictive nomogram demonstrated high accuracy in disease onset prediction, validated through ROC analysis, while molecular docking experiments confirmed strong binding affinities between these genes and potential therapeutic compounds. These findings establish the PI3K signaling pathway as a crucial regulator of cerebrovascular and neural tissue repair following ischemic stroke, with the identified gene signature offering promising applications for early detection and prognostic assessment. Importantly, this classification system may enable the development of personalized immunotherapy strategies, potentially transforming the landscape of individualized stroke management.

Keywords: Ischemic stroke, PI3K Signaling Pathway, Biomarkers, Immune infiltration, Machine Learning

Introduction

Stroke remains the second leading cause of death globally, with its aftermath placing a profound burden on individuals and society due to the resulting disability. The World Health Organization projects a concerning increase in stroke-related mortality, underscoring the urgent need for the development of advanced diagnostic and therapeutic strategies [ 1 ]. Ischemic stroke (IS), which accounts for approximately 70% of all strokes, is a major contributor to this trend. A review of the global burden of IS over the past three decades predicts a continued rise, with a staggering 4.9 million deaths expected by 2030 [ 2 ].
The significant social and economic toll of IS calls for intensified efforts to reduce its incidence and improve patient outcomes. The pathophysiological process of cerebral ischemia rapidly reduces blood perfusion to the brain's affected regions, triggering a complex cascade of pathological events, including oxidative stress, apoptosis, and inflammation[ 3 – 5 ]. Among these, inflammation plays a central role, initiating a sequence of molecular reactions that contribute to tissue damage and repair. A key regulator of these processes is the Phosphoinositide 3-kinase (PI3K) signaling pathway, which is notably upregulated after stroke. This pathway regulates immune responses and cellular survival, while also enhancing the production of Vascular Endothelial Growth Factor (VEGF), a critical factor for vascular and neuronal remodeling. VEGF, through the activation of Focal Adhesion Kinase (FAK) and Paxillin, fosters neurorepair and angiogenesis, promoting the generation of epithelial cells and astrocytes around microvessels, thus underscoring the PI3K pathway’s regenerative capacity [ 6 , 7 ]. These molecular events make the PI3K signaling pathway an attractive target for therapeutic strategies aimed at improving IS outcomes. Inflammation remains a cornerstone of IS pathophysiology and is closely linked with patient prognosis. The activation of the PI3K pathway post-stroke triggers a cascade of immune responses, particularly the activation and regulation of central nervous system (CNS)-resident immune cells such as microglia and astrocytes [ 8 ]. Understanding how the PI3K pathway modulates the generation of these immune cells could provide crucial insights into IS progression and recovery. Despite the growing body of research on the PI3K signaling pathway, the precise relationships between its associated genes, stroke prognosis, the immune microenvironment, and the effectiveness of immunotherapies remain insufficiently understood. 
This study aims to comprehensively investigate the role of PI3K-related genes in IS, exploring their impact on disease progression, prognosis, and immune responses. By identifying potential biomarkers for diagnosis and prognosis, this research not only enhances our ability to predict clinical outcomes but also provides a foundation for developing targeted therapeutic strategies to improve IS treatment.

Materials and methods

Acquisition and Processing of Gene Expression Data

We acquired gene expression microarray data of IS patients from two NCBI Gene Expression Omnibus (GEO) datasets, GSE16561 and GSE37587 ( https://www.ncbi.nlm.nih.gov/geo/ ). Both datasets were generated on the same platform and consist of human peripheral whole blood specimens collected within 24 to 48 hours post-IS. The data were processed in three steps. First, the probe expression matrix files downloaded from the GEO database were normalized and log2-transformed. Second, the platform annotation files were aligned with each probe expression matrix, retaining only those probes with well-defined annotations. To ensure the accuracy of the included data, we took the mean expression value when multiple probes corresponded to a single gene. Finally, we employed the 'ComBat' function from the sva R package, installed from Bioconductor ( https://bioconductor.org/ ), to mitigate heterogeneity across different experimental batches. The final dataset comprised 24 healthy controls and 73 IS patient specimens. Additionally, we downloaded two independent datasets from the GEO database, namely GSE22255 and GSE58294, to serve as validation cohorts. The GSE22255 dataset comprises 20 IS patients and 20 age- and sex-matched controls. The GSE58294 dataset includes 69 cardiogenic embolism stroke samples and 23 control samples.
By analyzing these external datasets, we can validate the diagnostic biomarkers identified in the PI3K signaling pathway, thereby enhancing the reliability and biological significance of our findings. Through a meticulous search of the MSigDB database, we identified 105 PI3K signaling-associated genes. Utilizing the curated expression data, we constructed a PI3K-related gene expression matrix. This research leveraged existing data, obviating the need for additional human or animal experimentation. A schematic of our methodology is presented in the graphical abstract (Fig. 1).

Data processing and identification of PI3K signaling-related DEGs

We extracted an expression matrix for genes associated with the PI3K pathway from our processed dataset and used the 'ggplot2' R package to visualize these genes as volcano plots and boxplots. Employing the 'limma' R package, we performed differential gene expression analysis, identifying differentially expressed genes (DEGs) at an adjusted p-value threshold of less than 0.05. Additionally, we utilized the 'ComplexHeatmap' package to generate two types of clustered heatmaps: one to delineate the 39 DEGs pertinent to the PI3K pathway, and another to highlight the correlation coefficients among the DEGs. This streamlined analysis offers an insightful visual synopsis of the PI3K pathway's role in IS, setting the stage for in-depth biological interpretation.

GO and KEGG enrichment analyses

To further elucidate the functional mechanisms of the PI3K signaling pathway in IS, we conducted Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses[ 9 , 10 ]. GO analysis provides a comprehensive description of gene functions across three categories: Molecular Function (MF), Biological Process (BP), and Cellular Component (CC).
KEGG pathway analysis, on the other hand, enables us to visualize the interactions and functional relationships among the DEGs within established metabolic and signaling pathways. We utilized R packages, including "clusterProfiler", "BiocManager", and "org.Hs.eg.db", to perform GO and KEGG enrichment analyses on the 39 DEGs associated with the PI3K pathway. This allowed us to gain insights into the potential roles of the PI3K signaling axis in key biological processes such as metabolism, growth, proliferation, survival, transcription, and protein synthesis. We displayed the top five enriched GO terms and KEGG pathways in ascending order of statistical significance (P < 0.05) using bubble plots to provide a concise summary of the functional relevance of the PI3K pathway in IS. Our research focuses on further exploring the role of the PI3K signaling pathway in inflammation, particularly in the context of immune infiltration following IS. The enrichment results above provide strong theoretical evidence for considering immune infiltration as a key aspect of the pathophysiology of IS.

Immune landscape analysis

The Immuno-Oncology-Biological-Research (IOBR) package in R is a sophisticated computational toolkit designed for studies in immunobiology[ 11 ]. It integrates six widely used algorithms to analyze gene expression data and quantify immune cell populations: MCPcounter, TIMER, xCell, CIBERSORT, EPIC, and quanTiseq. It also includes a comprehensive collection of 255 curated gene sets designed to investigate intricate aspects of the tumor microenvironment, such as metabolic pathways, m6A modifications, exosome biology, and more. We used this toolkit to analyze immune cell infiltration patterns across the cohort of 131 patients and to extract biologically relevant gene signatures.
CIBERSORT utilizes the LM22 gene signature matrix, comprising 547 characteristic genes associated with 22 leukocyte subpopulations, including myeloid cells, natural killer cells, plasma cells, memory B cells, and seven T cell subsets. By applying CIBERSORT and the LM22 matrix, researchers can estimate the proportions of these 22 cellular phenotypes within each sample, with the sum of all immune cell proportions equaling 1. xCell complements CIBERSORT by performing cell type enrichment analysis using gene expression data from 64 distinct immune and stromal cell types. Leveraging machine learning algorithms trained on diverse cell type gene signatures, xCell effectively captures the cellular heterogeneity inherent in tissue expression landscapes. Together, these two algorithms quantify the abundance levels of immune cells, stem cells, and stromal cells within samples. Statistical tests (e.g., Wilcoxon test) are applied to analyze differences in the immune microenvironment between pathological states, laying the foundation for in-depth research on the interplay between immunity and cancer.

Consensus clustering for IS samples

By harnessing the ConsensusClusterPlus package, we conducted a comprehensive clustering analysis on IS patient cohorts, with the potential number of clusters (K) ranging from 1 to 9. This analytical approach integrated the K-Means algorithm based on Euclidean distance, and the hierarchical clustering process was meticulously repeated 1000 times to ensure cluster stability. The determination of the optimal cluster count and the achievement of a relatively stable clustering outcome were guided by the application of the cumulative distribution function (CDF). This systematic methodology enabled the identification of various molecular subtypes, and their distribution among different IS subgroups was effectively visualized through principal component analysis (PCA) and heatmaps.
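To make the consensus step concrete, the sketch below computes a consensus matrix from repeated clustering runs on subsampled data, together with the empirical CDF of its entries. It is a minimal illustration of the general idea and not the ConsensusClusterPlus implementation: the function names, the toy runs, and the sample indices are all assumptions introduced here.

```python
from itertools import combinations


def consensus_matrix(runs, n_samples):
    """Build a consensus matrix from resampled clustering runs.

    runs: list of dicts mapping a sampled sample index to its cluster label.
    Entry (i, j) is the fraction of runs containing both i and j in which
    the two samples received the same label.
    """
    together = [[0] * n_samples for _ in range(n_samples)]
    sampled = [[0] * n_samples for _ in range(n_samples)]
    for labels in runs:
        for i, j in combinations(sorted(labels), 2):
            sampled[i][j] += 1
            sampled[j][i] += 1
            if labels[i] == labels[j]:
                together[i][j] += 1
                together[j][i] += 1
    matrix = [[0.0] * n_samples for _ in range(n_samples)]
    for i in range(n_samples):
        matrix[i][i] = 1.0  # a sample always co-clusters with itself
        for j in range(n_samples):
            if i != j and sampled[i][j] > 0:
                matrix[i][j] = together[i][j] / sampled[i][j]
    return matrix


def consensus_cdf(matrix, threshold):
    """Fraction of distinct sample pairs whose consensus value is <= threshold."""
    n = len(matrix)
    values = [matrix[i][j] for i, j in combinations(range(n), 2)]
    return sum(v <= threshold for v in values) / len(values)


# Hypothetical toy input: three resampling runs over four samples (0..3).
toy_runs = [
    {0: "A", 1: "A", 2: "B"},
    {0: "A", 1: "A", 3: "B"},
    {1: "A", 2: "B", 3: "B"},
]
toy_matrix = consensus_matrix(toy_runs, 4)
```

A stable partition pushes consensus values toward 0 or 1, which flattens the CDF over the middle of the range. Comparing CDF curves across candidate values of K is one common way to choose the cluster count.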
To explore the clinical significance of these distinct subgroups, a comparative analysis was performed to evaluate the differences in immune cell infiltration rates between the two subgroups.

Identification of PI3K signaling-related diagnostic biomarkers

In this study, we harnessed advanced machine learning algorithms, including Support Vector Machine (SVM), Random Forest (RF), and Least Absolute Shrinkage and Selection Operator (LASSO), to dissect the complex interplay of genetic factors associated with IS, thereby enhancing the predictive accuracy of our diagnostic models. The SVM algorithm, implemented via the 'e1071' R package, was adept at discerning genes with heightened discriminatory power, thereby facilitating the extraction of optimal genes for the diagnosis of IS. The RF algorithm, renowned for its high flexibility and precision in capturing nonlinear relationships between dependent and independent variables, is fundamentally grounded in the construction of multiple decision trees. By leveraging bootstrap aggregating, or bagging, to diminish the inter-tree correlation, this approach effectively mitigates the overfitting issue. In our research, we harnessed the RF algorithm to forecast the risk of developing IS. LASSO, executed through the 'glmnet' R package, represents a regression analysis technique that employs regularization for variable selection. Utilizing LASSO, we were able to pinpoint genes that exhibit significant differences between IS and normal samples. Having applied these three algorithms, we subsequently employed a Venn diagram analysis to examine the intersection of genes identified by the algorithms, thereby further validating the expression levels of candidate diagnostic biomarkers.

Construction and validation of a Predictive Model for IS

Utilizing three distinct machine learning algorithms, we identified key diagnostic biomarkers for IS and developed a binary logistic regression model for risk prediction.
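As a rough illustration of how a binary logistic regression model of this kind turns biomarker expression into a risk probability, the sketch below applies the logistic function to a linear predictor. The gene names are the five diagnostic genes named in the abstract. The intercept and coefficients are hypothetical placeholders chosen for illustration only and are not fitted values from this study.

```python
import math

# Hypothetical intercept and coefficients, for illustration only.
# They are NOT the fitted values of the model described in the text.
INTERCEPT = 0.0
COEFFICIENTS = {
    "PIN1": -1.0,
    "CDK2": -0.5,
    "VAV3": 0.8,
    "YWHAB": 0.6,
    "CFL1": -0.7,
}


def linear_predictor(expression):
    """Weighted sum of expression values plus the intercept (the log-odds)."""
    return INTERCEPT + sum(
        coef * expression[gene] for gene, coef in COEFFICIENTS.items()
    )


def predicted_probability(expression):
    """Map the log-odds to a probability with the logistic function."""
    z = linear_predictor(expression)
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical standardized expression profile for one sample.
example_sample = {"PIN1": 0.0, "CDK2": 0.0, "VAV3": 1.0, "YWHAB": 0.0, "CFL1": 0.0}
example_risk = predicted_probability(example_sample)
```

A nomogram is a graphical rendering of the same calculation: each predictor's contribution to the linear predictor is rescaled to points, and the total points map back to a probability through the logistic function.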
The most efficient model was selected and implemented with the 'rms' R package to create a nomogram predicting IS likelihood. The nomogram assigns individual 'scores' to each biomarker, with a 'total score' reflecting their cumulative impact. Each gene's score is determined by a vertical reference to the nomogram scale, and these scores are summed to calculate the overall risk probability for IS. The model's predictive accuracy was evaluated using the Receiver Operating Characteristic (ROC) curve, and the GSE22255 and GSE58294 datasets further validated its performance, confirming the diagnostic biomarkers' expression levels and predictive value.

Drug discovery in DSigDB

Assessing protein-drug interactions is pivotal for discerning the viability of target genes as practical pharmaceutical targets. In this study, we used the DSigDB drug signature database ( https://amp.pharm.mssm.edu/Enrichr/ ) to curate a selection of candidate drugs. DSigDB is an expansive repository that integrates 22,527 gene sets with 17,389 distinct compounds across 19,531 genes, effectively linking drugs and other chemicals to their target genes. By uploading the identified target genes to DSigDB, we can predict potential candidate drugs and evaluate the pharmacological activity of the target genes.

Molecular docking analysis

To enhance our understanding of drug-target gene interactions and assess the potential of these genes as druggable targets, our study incorporates molecular docking at the atomic level. Molecular docking allows for the calculation of binding energy, indicative of the predicted affinity between ligands and receptor proteins. A negative binding energy suggests spontaneous binding between the two molecules, with a lower energy indicating a more stable conformation. Identifying ligands with strong binding affinity and optimal interaction profiles enables us to prioritize targets for experimental validation and refine the development of promising drug candidates.
Our molecular docking studies employed AutoDock version 1.5.6 to simulate interactions between potential drugs and their corresponding target gene-encoded proteins. The three-dimensional structures of candidate drugs were obtained from PubChem, and protein structures were retrieved from the Protein Data Bank (PDB). Docking outcomes were rendered using PyMOL 2.4.1, yielding detailed structures for five proteins and two candidate drugs.

Results

Differential Expression of PI3K Signaling Pathway Genes in IS

After meticulous processing, we successfully merged the GSE16561 and GSE37587 datasets. We then integrated the gene expression matrix with genes related to PI3K signaling, constructing an expression matrix that encompasses genes associated with the PI3K pathway, from which we generated a volcano plot and a box plot (Fig. 2A, 2C). Through rigorous differential gene expression analysis, we delineated a cohort of 39 differentially expressed genes (DEGs), which exhibited significant dysregulation between the IS patient group and controls, as illustrated in the accompanying heatmap (Fig. 2B). In the IS group, we identified upregulation of genes, including PTEN, ITPR2, MAPK1, MYD88, SLA, RAC1, RAF1, CAB39, UBE2D3, RPS6KA3, GSK3B, PTPN11, DDIT3, VAV3, RIT1, TBK1, CLTC, PLCB1, RALB, GRB2, RPS6KA1, E2F1, YWHAB, DUSP3, MKNK1, and ACTR2, alongside downregulation of genes like PIN1, CDK4, CDK2, LCK, FASLG, TRAF2, PLCG1, THEM4, NCK1, UBE2N, CSNK2B, HRAS, and CFL1. This differential expression pattern offers valuable biological insights into the molecular changes occurring in IS. Further analysis revealed significant correlations among the DEGs, including a positive correlation between PIN1 and CDK4 and a negative correlation between PTEN and CDK4 (Supplementary Figure S1). These correlations enhance our comprehension of the PI3K pathway's role in IS and suggest promising avenues for future investigations.
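The adjusted p-value cutoff used to call DEGs can be illustrated with a short sketch. It assumes the Benjamini-Hochberg procedure, which is limma's default adjustment method but is not stated explicitly in the text. The gene labels and raw p-values below are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each adjusted
    # value is capped by the adjusted values of all larger p-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four placeholder genes.
raw = {"gene_a": 0.001, "gene_b": 0.02, "gene_c": 0.03, "gene_d": 0.5}
adjusted = dict(zip(raw, benjamini_hochberg(list(raw.values()))))

# Keep genes whose adjusted p-value is below the 0.05 threshold.
called = [gene for gene, p in adjusted.items() if p < 0.05]
```

Adjusted values are never smaller than the raw ones, so a gene can pass a raw 0.05 cutoff and still fail after adjustment when many genes are tested at once.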
Functional enrichment analysis of PI3K signaling-related DEGs

We conducted GO and KEGG pathway enrichment analyses on the identified 39 genes, focusing on the top five terms for MF, BP, and CC from the GO analysis, as well as the top five pathways from the KEGG analysis. The GO analysis revealed that the top biological processes associated with the PI3K pathway included peptidyl-serine phosphorylation, peptidyl-serine modification, ERBB signaling pathway, immune response-activating cell surface receptor signaling, and immune response-regulating cell surface receptor signaling pathway (Fig. 3A). The most relevant cellular component was the serine/threonine protein kinase complex (Fig. 3B), and the primary molecular function was phosphoprotein binding (Fig. 3C). KEGG pathway analysis further highlighted the enrichment of these PI3K-related DEGs in pathways associated with infectious diseases (both bacterial and viral), as well as nervous system and immune system signaling cascades (Fig. 3D). These findings suggest that the PI3K signaling pathway may play a multifaceted regulatory role in IS, potentially through modulating immune responses and immune system-related processes. The enrichment in infectious disease pathways also implies a potential involvement of the PI3K pathway in the interplay between stroke pathogenesis and inflammatory/immune mechanisms triggered by microbial infections.

PI3K Signaling-Associated Gene Sets for Stroke Stratification

In our analysis of 107 IS samples, we employed consensus clustering to identify two distinct subclasses, termed cluster A and cluster B (Figs. 4A-B, Supplementary Figure S2). To explore the correlation between different subclasses of IS patients and the immune microenvironment, we analyzed immune cells and stromal cells, which are two major non-tumor components that have been proven to be of significant value in the diagnosis and prognostic assessment of various diseases.
In this research, we calculated the scores for immune cells and stromal cells (Supplementary Figure S3). Cluster A was characterized by an increased expression of key immune cells, including CD4 memory T cells, CD8 T cells, CD8 Central Memory T cells, naive B cells, plasma cells, smooth muscle cells, Th1, and Th2 cells. Additionally, cluster A demonstrated elevated ImmuneScore and MicroenvironmentScore, suggesting a more active immune profile (Fig. 4C). In stark contrast, cluster B was defined by a higher expression of NKT cells and megakaryocytes, indicating a distinct immunological pattern. Further analysis using the CIBERSORT algorithm delineated the infiltration levels of 22 immune cell types across the IS subclasses. Cluster A showed a significant predominance of CD8 T cells and CD4 memory T cells, suggesting a robust immune response. Conversely, cluster B had a lower infiltration of M0 macrophages and neutrophils, pointing to a subdued inflammatory reaction (Fig. 4D).

Identification of immune microenvironment in different PI3K clusters

To investigate the biological nuances of IS subtypes, we pinpointed 112 DEGs distinguishing clusters A and B. Utilizing these DEGs, we applied unsupervised clustering to segment IS patients into two novel clusters, G-1 and G-2, closely associated with the DEGs' profiles (Supplementary Figure S4). Principal component analysis (PCA) confirmed the distinct separability of these clusters, as illustrated in Fig. 5A-B. To further elucidate the immunological differences between the subtypes, we conducted an immunological analysis. This analysis revealed that the levels of CD4 memory resting T cells, CD4 memory T cells, CD8 T cells, CD8 central memory T cells, plasma cells, smooth muscle cells, Th1 cells, and Th2 cells were significantly higher in cluster G-2.
In contrast, cluster G-1 exhibited higher levels of M0 macrophages, neutrophils, mast cells, megakaryocytes, microvascular endothelial cells, and NKT cells (Fig. 5C-D). Based on these findings, we concluded that the gene-based grouping method more accurately describes the characteristics of patients than the traditional PI3K grouping.

Construction and analysis of the prediction model

To screen candidate diagnostic biomarkers, we employed three distinct algorithms. The LASSO logistic regression identified 15 key variables (Figs. 6A-B). The SVM algorithm selected 38 features (Fig. 6C). The random forest algorithm complemented this by identifying nine candidate features (Figs. 6D-E). The intersection of these analyses yielded seven consensus genes: PIN1, CDK2, VAV3, PTPN11, ITPR2, YWHAB, and CFL1 (Fig. 6F). In the selection of biomarkers, we strictly adhered to two core criteria: significant correlation with the research outcomes and substantial impact on the prediction of stroke events[ 12 ]. Initially, we utilized the Forward Stepwise Likelihood Ratio method to identify six genes. Subsequently, we eliminated those genes that did not contribute significantly to the predictive model, while monitoring the potential effects of multicollinearity. This variable selection not only simplified the model structure and reduced the risk of overfitting but also enhanced the model's interpretability and predictive power. After comprehensive review and refinement, we ultimately determined five robust biomarkers: PIN1, CDK2, VAV3, YWHAB, and CFL1.

Clinical Nomogram for Stroke Risk Prediction: Development and Validation

We constructed a clinical nomogram utilizing multiple machine learning techniques to predict the risk of stroke based on diagnostic biomarkers (Fig. 7A).
This tool provides a scoring system for individuals, translating biomarker data into stroke risk probabilities. The predictive accuracy was affirmed through ROC analysis, which showed an AUC of 0.984, indicative of the nomogram's high predictive power (Fig. 7B). Further validation with an independent test cohort (GSE22255) yielded an AUC of 0.77, supporting the nomogram's reliability and robustness. Additionally, validation with another dataset resulted in an AUC of 0.962, further confirming the model's predictive accuracy (Fig. 7C).

Prioritizing Drug Candidates with Bioinformatics Tools

Leveraging the Enrichr platform, we harnessed the five core genes to identify potential drug candidates. We curated data from the DSigDB database to shortlist the top 20 candidate drugs, visually presented in a circular chart format (Supplementary Figure S5). A P-value-based analysis led us to prioritize five promising drugs (embelin, okadaic acid, luteolin, staurosporine, and Cyperquat) that demonstrated significant interaction potential with the core genes under study.

Molecular docking results

Among the drug candidates, Embelin and Okadaic acid stood out for their potential interactions with our panel of five diagnostic genes. Our analysis revealed that both drugs exhibited binding energies below -4 kcal/mol with these genes (Table 1), a threshold suggesting favorable molecular docking. The structural details of these interactions are portrayed in Fig. 8, illustrating the local architecture of the molecular complexes. These preliminary results call for deeper exploration of Embelin and Okadaic acid's therapeutic efficacy and their potential to shape future treatment strategies.

Table 1 The binding energy of medications with diagnostic genes

Gene     Embelin   Okadaic acid
PIN1     -5.5      -10.1
YWHAB    -5.3      -10.2
VAV3     -5.8      -9.2
CDK2     -6.3      -8.4
CFL1     -5.4      -7.7

Data are presented in kilocalories per mole (kcal/mol).
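For a rough sense of scale, the docking scores in Table 1 can be converted into approximate dissociation constants with the standard thermodynamic relation ΔG = RT ln(Kd). The sketch below does this at an assumed temperature of 298.15 K. Docking scores are only estimates of binding free energy, so the resulting values are order-of-magnitude illustrations rather than measured affinities. The function names are assumptions introduced here.

```python
import math

R_KCAL = 1.987204e-3   # gas constant in kcal/(mol*K)
TEMPERATURE = 298.15   # assumed temperature in kelvin (not given in the text)

# Binding energies from Table 1, in kcal/mol.
BINDING_ENERGY = {
    "PIN1": {"Embelin": -5.5, "Okadaic acid": -10.1},
    "YWHAB": {"Embelin": -5.3, "Okadaic acid": -10.2},
    "VAV3": {"Embelin": -5.8, "Okadaic acid": -9.2},
    "CDK2": {"Embelin": -6.3, "Okadaic acid": -8.4},
    "CFL1": {"Embelin": -5.4, "Okadaic acid": -7.7},
}


def dissociation_constant(delta_g):
    """Approximate Kd in mol/L from a binding free energy in kcal/mol."""
    return math.exp(delta_g / (R_KCAL * TEMPERATURE))


def all_below(threshold):
    """True if every gene-drug pair has a binding energy below the threshold."""
    return all(
        energy < threshold
        for drugs in BINDING_ENERGY.values()
        for energy in drugs.values()
    )


# Approximate Kd for every gene-drug pair in the table.
KD_ESTIMATES = {
    gene: {drug: dissociation_constant(e) for drug, e in drugs.items()}
    for gene, drugs in BINDING_ENERGY.items()
}
```

Because Kd depends exponentially on the energy, the difference of a few kcal/mol between the Embelin and Okadaic acid columns corresponds to several orders of magnitude in estimated affinity.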
Discussion

IS carries a high incidence and significant economic burden, posing substantial challenges to public health. Timely diagnosis is essential for better patient outcomes, as delays can lead to less effective treatments and poorer prognoses. The scientific community continues to advance in early detection and treatment strategies for IS, with a growing focus on microRNAs (miRNAs) and messenger RNAs (mRNAs) as potential biomarkers[ 13 , 14 ]. The PI3K signaling pathway is a key player in post-stroke pathophysiology, and our study meticulously investigates its associated genes for new diagnostic insights. Our findings could significantly bolster clinical diagnostics in IS. Consequently, the present study is poised to pinpoint candidate biomarkers for the detection of IS and delve into the immunological mechanisms through which the PI3K signaling pathway exerts its influence on IS. Building on the potential of the PI3K pathway as a diagnostic avenue, our research integrated a suite of machine learning algorithms to delve into the roles of PI3K-associated genes in IS. We conducted a rigorous analysis of the expression profiles across this pathway. Through GO term and KEGG pathway analyses, we identified that peptidyl-serine phosphorylation and modification can regulate protein functions[ 15 ], influencing cellular responses to ischemic injury. The ERBB signaling pathway, part of the cell surface receptor tyrosine kinase family, triggers downstream signaling including PI3K/Akt[ 16 ], which is crucial for cell proliferation and survival. Furthermore, immune response-activating and regulating cell surface receptor signaling pathways play key roles in neuroinflammation following IS, and the PI3K/Akt pathway may modulate these pathways to affect immune responses and neural repair in stroke.
Our analysis established a molecular framework linking the PI3K pathway to immune responses and cellular metabolism, which is vital for understanding the pathophysiology of IS. KEGG analysis further revealed potential connections between PI3K pathway activation and immune microenvironment dynamics, emphasizing the close association between immune microenvironment exhaustion and the PI3K pathway. Based on the significant correlation between inflammation and the PI3K pathway, which was corroborated by both existing literature and our study[ 17 – 19 ], our research focuses on further exploring the role of the PI3K signaling pathway in inflammation, especially in the context of immune infiltration following IS. To delve deeper into these mechanisms, we employed cluster analysis to identify distinct patient groups. This approach allowed us to scrutinize immune cell infiltration and to pinpoint specific genes with prognostic value, culminating in the development of a risk model tailored for IS. Comparative analysis of the transcriptomic profiles identified 39 DEGs associated with the PI3K signaling pathway when comparing IS patients to healthy controls. Of these, 26 genes were upregulated, and 13 were downregulated in the IS group. A detailed co-expression analysis among these genes highlighted significant interactions, particularly involving PIN1 with CDK4 and PTEN with CFL1 . These findings underscored the heterogeneity within IS patients, suggesting that they could be classified into two subtypes based on gene expression patterns. By examining the immune infiltration profiles, we identified novel subtypes that may respond differently to therapeutic interventions. This classification could refine treatment strategies, offering a more personalized approach to IS management. Our findings thus introduce the PI3K gene set as a promising resource for both diagnosis and therapy in IS. 
Our research further delves into the intricate dynamics of the immune response post-IS, with a particular focus on the adaptive immune system's role. Resting CD4+ memory T cells, once activated by specific antigens, initiate a swift defense, highlighting their critical position within the immune response[ 20 ]. Their interaction with P-selectin and platelets through P-selectin glycoprotein ligand-1[ 21 ] intertwines the immune reaction with hemostasis, a significant consideration in IS. Furthermore, CD8+ T lymphocytes have been identified to promote the proliferation of microglia and oligodendrocytes, modulating the brain's post-injury immune response[ 22 ]. The CD8+ central memory T cell subset, with its long-term memory capacity, may play a role in clearing necrotic cells post-stroke and mediating regulatory responses to protect brain tissue under autoimmune conditions[ 23 ]. Plasma cells also contribute significantly by secreting memory B cells and cytokines, influencing the stroke's injury and recovery processes[ 24 ]. The early stages of cerebral infarction may benefit from smooth muscle cell proliferation, which can support angiogenesis and collateral circulation development[ 25 ]. However, excessive proliferation can lead to macrophage-like differentiation, potentially leading to atherosclerosis and increased risk of recurrent vascular events. Th1 cells, secreting pro-inflammatory cytokines such as IFN-γ and TNF-α, are vital for combating infections but must be carefully balanced to prevent exacerbating brain damage[ 26 , 27 ]. Conversely, a shift towards a Th2 phenotype post-CNS injury aids in wound healing and regeneration while guarding against autoimmune diseases within the CNS[ 28 ]. Current research indicates that unstable plaques harbor a higher proportion of M0 macrophages compared to stable ones[ 29 ]. The formation of neutrophil extracellular traps (NETs) could also significantly affect the severity of brain injury and patient prognosis[ 30 ].
Mast cells are implicated in intensifying inflammatory responses in CNS injuries, potentially contributing to blood-brain barrier (BBB) breakdown and associated complications [31]. Megakaryocytes, as platelet progenitors, promote thrombosis, while brain microvascular endothelial cells protect the BBB, playing a crucial role in preserving brain tissue and neurons [32]. NKT cells have been associated with increased cerebral infarction volume and neurological deficits within the critical first 24 hours post-stroke [33]. In our study, the elevated levels of neutrophils, M0 macrophages, mast cells, megakaryocytes, and NKT cells in Group G-1 hinted at a more severe prognosis, while the higher counts of resting CD4+ memory T cells, CD8+ T cells, and other immune cells in Group G-2 pointed to better neural repair and vascular regeneration, reducing the risk of complications. Our application of integrated bioinformatics and machine learning methods identified five key PI3K-associated genes (PIN1, CDK2, VAV3, YWHAB, and CFL1) as potential diagnostic markers for IS. We developed a nomogram based on peripheral blood samples to evaluate the diagnostic efficacy of these markers, offering an efficient and clinically relevant method for assessing the risk of IS. This approach enhances diagnostic precision and highlights the applicability of personalized medicine in stroke diagnostics, paving the way for immunotherapies tailored to a patient's immune profile. Building on this, we also predicted gene-drug interactions, a critical step for identifying targeted treatments for IS. Using Enrichr [34], we mapped a range of drugs to the five key PI3K-related biomarkers, enriching our understanding of potential therapeutic targets.
Among these, embelin and okadaic acid have emerged as particularly promising candidates for further investigation [35]. Embelin is a benzoquinone compound extracted from Embelia ribes and is known for its diverse pharmacological profile. Its molecular complexity underpins multifaceted biological activities, including anti-inflammatory, antioxidant, and antibacterial actions, alongside potential analgesic, anxiolytic, and contraceptive effects. Its noted antitumor and immunosuppressive capabilities make it a standout candidate for therapeutic development. Most notably, embelin's neuroprotective properties have attracted significant interest, particularly its ability to cross the blood-brain barrier, positioning it as a potential therapeutic agent in neurodegenerative diseases, including stroke [36]. Its role in stroke treatment remains contested, with evidence suggesting exacerbation of injury in some cases [37] and neuroprotection in others [38], so deeper exploration of embelin's molecular mechanisms and pharmacological targets is clearly needed. This will help clarify its potential as a stroke treatment, as illustrated in Fig. 7. We next turn to okadaic acid, a marine toxin with potent protein phosphatase inhibitory activity, notably against PP2A. Its role in neurodegenerative research, especially in Alzheimer's disease through tau protein phosphorylation, suggests potential relevance to post-stroke cellular recovery [39]. We hypothesize that okadaic acid may positively modulate cellular signaling to reduce damage and promote neurological healing post-stroke, as illustrated in Fig. 7. However, these hypotheses require rigorous scientific validation and clinical trials to confirm efficacy and safety.
While our study provides valuable insights, we recognize its limitations. First, because of the retrospective design, our biomarkers for IS require external validation. Second, further wet-lab experiments are needed to validate our findings and to understand the underlying mechanisms. In addition, traditional Chinese medicine (TCM) offers unique advantages in IS treatment through its multi-component, multi-target, and multi-pathway effects. To leverage these benefits, Tian et al. developed the Integrated Traditional Chinese Medicine (ITCM) platform [40], the largest herb-based pharmacotranscriptomics database, along with the COIMMR framework for rapid screening of active compounds [41]. We believe that employing pharmacotranscriptomic analysis to explore how TCM components and active agents contribute to IS treatment via the PI3K signaling pathway holds significant promise for enhancing therapeutic strategies and deepening our understanding of the underlying molecular mechanisms.

Conclusion

In conclusion, our research underscores the pivotal role of the PI3K signaling pathway in post-ischemic cerebrovascular and neurorestorative processes, offering a foundation for understanding the underlying pathophysiology of IS. The refined stratification of IS patients through secondary clustering has elucidated distinct immune profiles, which are essential for the development of personalized immunotherapeutic approaches. Our identification of the PI3K-related genes PIN1, CDK2, VAV3, YWHAB, and CFL1 introduces a robust diagnostic framework with significant implications for early detection and prognosis. Additionally, the discovery of embelin and okadaic acid as potential therapeutic agents presents a novel avenue for intervention, highlighting the need for further exploration into their molecular mechanisms and clinical efficacy.
Looking forward, the imperative for extensive clinical trials is clear, with the goal of validating our findings and translating them into effective, personalized stroke management strategies.

Abbreviations
IS: Ischemic stroke
PI3K: Phosphoinositide 3-kinase
VEGF: Vascular Endothelial Growth Factor
FAK: Focal Adhesion Kinase
CNS: Central nervous system
GEO: Gene Expression Omnibus
NCBI: National Center for Biotechnology Information
DEGs: Differentially Expressed Genes
GO: Gene Ontology
KEGG: Kyoto Encyclopedia of Genes and Genomes
MF: Molecular Function
BP: Biological Process
CC: Cellular Component
IOBR: Immuno-Oncology-Biological-Research
CDF: Cumulative distribution function
PCA: Principal component analysis
SVM: Support Vector Machine
RF: Random Forest
LASSO: Least Absolute Shrinkage and Selection Operator
ROC: Receiver Operating Characteristic
PDB: Protein Data Bank
NETs: Neutrophil extracellular traps
BBB: Blood-brain barrier
ITCM: Integrated Traditional Chinese Medicine
TCM: Traditional Chinese medicine

Declarations

Acknowledgements
Not applicable.

Author contributions
LSS and HZX designed the study. WZZ, HKL, OYDD and CZH conducted literature collection and summary. LSS drafted the manuscript. All authors critically revised the manuscript.

Funding
This work was supported by the Science and Technology Program of Guangzhou, China [2024B03J0436]; and Research Funds of Centre for Leading Medicine and Advanced Technologies of IHM [No. 2023IHM01052]. The funding sources had no role in study design, data collection, analysis, or interpretation.

Availability of data and materials
The datasets generated during the current study are available in the GEO repository, https://www.ncbi.nlm.nih.gov/geo/

Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Competing interests
Not applicable.

References
Feigin VL, Owolabi MO. Pragmatic solutions to reduce the global burden of stroke: A world stroke organization-lancet neurology commission. Lancet Neurol.
2023;22:1160–206. https://doi.org/10.1016/s1474-4422(23)00277-6
Fan J, Li X, Yu X, et al. Global burden, risk factor analysis, and prediction study of ischemic stroke, 1990–2030. Neurology. 2023;101:e137–50. https://doi.org/10.1212/wnl.0000000000207387
del Zoppo GJ. The neurovascular unit in the setting of stroke. J Intern Med. 2010;267:156–71. https://doi.org/10.1111/j.1365-2796.2009.02199.x
Doyle KP, Simon RP, Stenzel-Poore MP. Mechanisms of ischemic brain damage. Neuropharmacology. 2008;55:310–8. https://doi.org/10.1016/j.neuropharm.2008.01.005
Woodruff TM, Thundyil J, Tang SC, et al. Pathophysiology, treatment, and animal and cellular models of human ischemic stroke. Mol Neurodegener. 2011;6:11. https://doi.org/10.1186/1750-1326-6-11
Wang HJ, Ran HF, Yin Y, et al. Catalpol improves impaired neurovascular unit in ischemic stroke rats via enhancing vegf-pi3k/akt and vegf-mek1/2/erk1/2 signaling. Acta Pharmacol Sin. 2022;43:1670–85. https://doi.org/10.1038/s41401-021-00803-4
Shi G, Chen J, Zhang C, et al. Astragaloside iv promotes cerebral angiogenesis and neurological recovery after focal ischemic stroke in mice via activating pi3k/akt/mtor signaling pathway. Heliyon. 2023;9:e22800. https://doi.org/10.1016/j.heliyon.2023.e22800
Kerr N, Dietrich DW, Bramlett HM, et al. Sexually dimorphic microglia and ischemic stroke. CNS Neurosci Ther. 2019;25:1308–17. https://doi.org/10.1111/cns.13267
The Gene Ontology resource: 20 years and still going strong. Nucleic Acids Res. 2019;47:D330–8.
Kanehisa M, Goto S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000;28:27–30. https://doi.org/10.1093/nar/28.1.27
Zeng D, Ye Z, Shen R, et al. Iobr: Multi-omics immuno-oncology biological research to decode tumor microenvironment and signatures. Front Immunol. 2021;12:687975. https://doi.org/10.3389/fimmu.2021.687975
Jaddoe VW, de Jonge LL, Hofman A, et al.
First trimester fetal growth restriction and cardiovascular risk factors in school age children: Population based cohort study. BMJ (Clinical Res ed). 2014;348:g14. https://doi.org/10.1136/bmj.g14
Rink C, Khanna S. Microrna in ischemic stroke etiology and pathology. Physiol Genom. 2011;43:521–8. https://doi.org/10.1152/physiolgenomics.00158.2010
Kanki H, Matsumoto H, Togami Y, et al. Importance of micrornas by mrna-microrna integration analysis in acute ischemic stroke patients. J Stroke Cerebrovasc Dis. 2023;32:107277. https://doi.org/10.1016/j.jstrokecerebrovasdis.2023.107277
Marmelstein AM, Moreno J, Fiedler D. Chemical approaches to studying labile amino acid phosphorylation. Top Curr Chem (Cham). 2017;375:22. https://doi.org/10.1007/s41061-017-0111-1
Milton CK, Self AJ, Clarke PA, et al. A genome-scale crispr screen identifies the erbb and mtor signaling networks as key determinants of response to pi3k inhibition in pancreatic cancer. Mol Cancer Ther. 2020;19:1423–35. https://doi.org/10.1158/1535-7163.Mct-19-1131
Vanhaesebroeck B, Perry MWD, Brown JR, et al. Pi3k inhibitors are finally coming of age. Nat Rev Drug Discov. 2021;20:741–69. https://doi.org/10.1038/s41573-021-00209-1
Fruman DA, Chiu H, Hopkins BD, et al. The pi3k pathway in human disease. Cell. 2017;170:605–35. https://doi.org/10.1016/j.cell.2017.07.029
Chu E, Mychasiuk R, Hibbs ML, et al. Dysregulated phosphoinositide 3-kinase signaling in microglia: Shaping chronic neuroinflammation. J Neuroinflamm. 2021;18:276. https://doi.org/10.1186/s12974-021-02325-6
Chalmin F, Humblin E, Ghiringhelli F, et al. Transcriptional programs underlying cd4 t cell differentiation and functions. Int Rev Cell Mol Biol. 2018;341:1–61. https://doi.org/10.1016/bs.ircmb.2018.07.002
Salas-Perdomo A, Miró-Mur F, Urra X, et al. T cells prevent hemorrhagic transformation in ischemic stroke by p-selectin binding. Arterioscler Thromb Vasc Biol. 2018;38:1761–71.
https://doi.org/10.1161/atvbaha.118.311284
Kaya T, Mattugini N, Liu L, et al. Cd8(+) t cells induce interferon-responsive oligodendrocytes and microglia in white matter aging. Nat Neurosci. 2022;25:1446–57. https://doi.org/10.1038/s41593-022-01183-6
Su B, Ng LG. Immunological modulation in health and disease. Cell Mol Immunol. 2023;20:981–2. https://doi.org/10.1038/s41423-023-01066-1
Wang R, Li H, Ling C, et al. A novel phenotype of b cells associated with enhanced phagocytic capability and chemotactic function after ischemic stroke. Neural Regen Res. 2023;18:2413–23. https://doi.org/10.4103/1673-5374.371365
Zhai M, Gong S, Luan P, et al. Extracellular traps from activated vascular smooth muscle cells drive the progression of atherosclerosis. Nat Commun. 2022;13:7500. https://doi.org/10.1038/s41467-022-35330-1
Prass K, Meisel C, Höflich C, et al. Stroke-induced immunodeficiency promotes spontaneous bacterial infections and is mediated by sympathetic activation reversal by poststroke t helper cell type 1-like immunostimulation. J Exp Med. 2003;198:725–36. https://doi.org/10.1084/jem.20021098
Gisterå A, Hansson GK. The immunology of atherosclerosis. Nat Rev Nephrol. 2017;13:368–80. https://doi.org/10.1038/nrneph.2017.51
Hendrix S, Nitsch R. The role of t helper cells in neuroprotection and regeneration. J Neuroimmunol. 2007;184:100–12. https://doi.org/10.1016/j.jneuroim.2006.11.019
Wang J, Kang Z, Liu Y, et al. Identification of immune cell infiltration and diagnostic biomarkers in unstable atherosclerotic plaques by integrated bioinformatics analysis and machine learning. Front Immunol. 2022;13:956078. https://doi.org/10.3389/fimmu.2022.956078
Denorme F, Portier I, Rustad JL, et al. Neutrophil extracellular traps regulate ischemic stroke brain injury. J Clin Investig. 2022;132. https://doi.org/10.1172/jci154225
Parrella E, Porrini V, Benarese M, et al. The role of mast cells in stroke. Cells. 2019;8.
https://doi.org/10.3390/cells8050437
Lei W, Liu Z, Su Z, et al. Hyperhomocysteinemia potentiates megakaryocyte differentiation and thrombopoiesis via gh-pi3k-akt axis. J Hematol Oncol. 2023;16:84. https://doi.org/10.1186/s13045-023-01481-x
Wang ZK, Xue L, Wang T, et al. Infiltration of invariant natural killer t cells occur and accelerate brain infarction in permanent ischemic stroke in mice. Neurosci Lett. 2016;633:62–8. https://doi.org/10.1016/j.neulet.2016.09.010
Kuleshov MV, Jones MR, Rouillard AD, et al. Enrichr: A comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 2016;44:W90–7. https://doi.org/10.1093/nar/gkw377
Sheng Z, Ge S, Gao M, et al. Synthesis and biological activity of embelin and its derivatives: An overview. Mini Rev Med Chem. 2020;20:396–407. https://doi.org/10.2174/1389557519666191015202723
Anika AR, Virendra SA, et al. Mechanistic study on the possible role of embelin in treating neurodegenerative disorders. CNS Neurol Disord Drug Targets. 2024;23:55–66. https://doi.org/10.2174/1871527322666230119100053
Siegel C, Li J, Liu F, et al. Mir-23a regulation of x-linked inhibitor of apoptosis (xiap) contributes to sex differences in the response to cerebral ischemia. Proc Natl Acad Sci USA. 2011;108:11662–7. https://doi.org/10.1073/pnas.1102635108
Thippeswamy BS, Nagakannan P, Shivasharan BD, et al. Protective effect of embelin from embelia ribes burm. against transient global ischemia-induced brain damage in rats. Neurotox Res. 2011;20:379–86. https://doi.org/10.1007/s12640-011-9258-7
Kamat PK, Rai S, Swarnkar S, et al. Molecular and cellular mechanism of okadaic acid (oka)-induced neurotoxicity: A novel tool for alzheimer's disease therapeutic application. Mol Neurobiol. 2014;50:852–65. https://doi.org/10.1007/s12035-014-8699-4
Tian S, Zhang J, Yuan S et al. Exploring pharmacological active ingredients of traditional chinese medicine by pharmacotranscriptomic map in itcm.
Briefings in bioinformatics 2023;24. https://doi.org/10.1093/bib/bbad027
Tian S, Li Y, Xu J et al. Coimmr: A computational framework to reveal the contribution of herbal ingredients against human cancer via immune microenvironment and metabolic reprogramming. Briefings in bioinformatics 2023;24. https://doi.org/10.1093/bib/bbad346

Additional Declarations
No competing interests reported.

Supplementary Files
SupplementaryFigure.docx

Status: Under Review Version 1 posted Editorial decision: Revision requested 22 Sep, 2025 Reviews received at journal 19 Sep, 2025 Reviews received at journal 19 Sep, 2025 Reviews received at journal 10 Sep, 2025 Reviewers agreed at journal 29 Aug, 2025 Reviewers agreed at journal 27 Aug, 2025 Reviewers agreed at journal 23 Aug, 2025 Reviewers invited by journal 11 Apr, 2025 Editor assigned by journal 06 Apr, 2025 Submission checks completed at journal 10 Mar, 2025 First submitted to journal 08 Mar, 2025
Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-6186335","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":426556121,"identity":"524c61f2-3e7c-4b66-82fb-cb419d594ba1","order_by":0,"name":"Shasha Lei","email":"","orcid":"","institution":"The Affiliated Guangdong Second Provincial General Hospital of Jinan University","correspondingAuthor":false,"prefix":"","firstName":"Shasha","middleName":"","lastName":"Lei","suffix":""},{"id":426556122,"identity":"2164eb21-0c1b-446c-ad23-f0336150ffd4","order_by":1,"name":"Zhi-Xin Huang","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAAoklEQVRIiWNgGAWjYBACAyCW+FAhISdPkhbJGWcsjA0bSNEizdlWkchwgFgt5vxnDG8zzpNIYGxgfvjoBjFaLBuOJVsXbpPIY2dgMzbOIcphB5uPSc/cJlHM2MDDJk2clsOMbdK8cyQSGw4QreUY8zFp3gaStJxhS7accUzC2LCZaL+cP2N440NNnZw8e/PDx0RpQQBm0pSPglEwCkbBKMAHAArhLVxVdREDAAAAAElFTkSuQmCC","orcid":"","institution":"The Affiliated Guangdong Second Provincial General Hospital of Jinan University","correspondingAuthor":true,"prefix":"","firstName":"Zhi-Xin","middleName":"","lastName":"Huang","suffix":""},{"id":426556123,"identity":"ae9e8ae1-0a99-4d1e-9968-fc6527170702","order_by":2,"name":"Zhenzhen Wang","email":"","orcid":"","institution":"The Affiliated Guangdong Second Provincial General Hospital of Jinan University","correspondingAuthor":false,"prefix":"","firstName":"Zhenzhen","middleName":"","lastName":"Wang","suffix":""},{"id":426556124,"identity":"9796b746-ad01-4740-a880-d6a2368bf3ef","order_by":3,"name":"Kaili 
Huang","email":"","orcid":"","institution":"The Affiliated Guangdong Second Provincial General Hospital of Jinan University","correspondingAuthor":false,"prefix":"","firstName":"Kaili","middleName":"","lastName":"Huang","suffix":""},{"id":426556125,"identity":"5d4ca83c-11c1-4646-98bc-d6a183bd77a4","order_by":4,"name":"Zhihui Chen","email":"","orcid":"","institution":"The Affiliated Guangdong Second Provincial General Hospital of Jinan University","correspondingAuthor":false,"prefix":"","firstName":"Zhihui","middleName":"","lastName":"Chen","suffix":""},{"id":426556126,"identity":"5d0d2a65-d391-4660-93c3-7e61f5416d7d","order_by":5,"name":"Dandang Ouyang","email":"","orcid":"","institution":"The Affiliated Guangdong Second Provincial General Hospital of Jinan University","correspondingAuthor":false,"prefix":"","firstName":"Dandang","middleName":"","lastName":"Ouyang","suffix":""}],"badges":[],"createdAt":"2025-03-09 02:08:08","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-6186335/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-6186335/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":78424827,"identity":"d97033a6-2a54-40f4-b759-2719b91ac811","added_by":"auto","created_at":"2025-03-13 06:18:52","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":2135721,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eGraphical abstract\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"floatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-6186335/v1/a10fa42f23338b99293b664b.png"},{"id":78426210,"identity":"2403158d-1f82-421e-9079-90bd77c4d9ef","added_by":"auto","created_at":"2025-03-13 06:26:52","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":423590,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eExpression profile of PI3K signaling-related genes 
in ischemic stroke. \u003c/strong\u003e(\u003cstrong\u003eA\u003c/strong\u003e) Boxplots showing the expression of PI3K signaling-related genes after combined datasets, the abscissa is the gene associated with PI3K signaling, and the ordinate is the expression of the gene. *P \u0026lt; 0.05; **P \u0026lt; 0.01; ***P \u0026lt; 0.001; the blank, not significant.(\u003cstrong\u003eB\u003c/strong\u003e) Heatmap showing the expression of 39 differentially expressed PI3K signaling-related genes, the abscissa is the sample, and the ordinate is the gene. Red represents genes with high expression, and blue represents genes with low expression. (\u003cstrong\u003eC\u003c/strong\u003e) Volcano plot of gene expression profile data between IS and control group in datasets. Red dots: significantly upregulated genes in IS; Blue dots: significantly downregulated genes in IS; black dots: nondifferentially expressed genes. P\u0026lt;0.05 were considered as significant.\u003c/p\u003e","description":"","filename":"floatimage2.png","url":"https://assets-eu.researchsquare.com/files/rs-6186335/v1/d6c4b96aa684bdf0ce914e3d.png"},{"id":78426555,"identity":"a594f81d-38dd-4c08-bb36-a59a335d75c1","added_by":"auto","created_at":"2025-03-13 06:34:52","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":223982,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eGene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) for the identified 39 PI3K signaling-related differentially expressed genes. \u003c/strong\u003e(A-C) Biological processes (BP), cellular components(CC), and molecular functions(MF) are mostly related to the differentially expressed genes. 
(D) Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis of the differentially expressed genes.\u003c/p\u003e","description":"","filename":"floatimage3.png","url":"https://assets-eu.researchsquare.com/files/rs-6186335/v1/835f50876d937731d3e622ce.png"},{"id":78426213,"identity":"b639c0c6-4c2b-48e4-b922-1d590c6cf1d3","added_by":"auto","created_at":"2025-03-13 06:26:52","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":665517,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003e(A) \u003c/strong\u003ePrincipal component analysis indicates a partial but significant separation between cluster A and cluster B for taxonomic and functional profiles. (\u003cstrong\u003eB\u003c/strong\u003e) Heatmap showing the expression profile of 39 PI3K signaling-related differentially expressed genes (DEGs) between two PI3K clusters. (\u003cstrong\u003eC\u003c/strong\u003e) PI3K signaling-related molecular subtypes characterized by distinct immune and metabolism landscapes. Boxplot shows the abundance of 64 distinct immune and stromal cell types in two PI3K signaling-related molecular subtypes. The upper and lower ends of the boxes represented the interquartile range of values. The lines in the boxes represented the median value, and the black dots showed outliers. (\u003cstrong\u003eD\u003c/strong\u003e) Boxplot shows the abundance of 22 immune infiltrating cells in two PI3K signaling-related molecular subtypes. The upper and lower ends of the boxes represented the interquartile range of values. The lines in the boxes represented the median value, and the black dots showed outliers. 
The asterisks represented the statistical pvalue (*P\u0026lt;0.05; **P\u0026lt;0.01; ***P\u0026lt;0.001).\u003c/p\u003e","description":"","filename":"floatimage4.png","url":"https://assets-eu.researchsquare.com/files/rs-6186335/v1/8f5f7999b264cd00bc710106.png"},{"id":78424831,"identity":"b0c3ab31-632c-40f7-9a59-ae68ea3c61c8","added_by":"auto","created_at":"2025-03-13 06:18:52","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":801322,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003e(A) \u003c/strong\u003ePrincipal component analysis indicates a partial but significant separation between cluster G-1 and cluster G-2 for taxonomic and functional profiles. (\u003cstrong\u003eB\u003c/strong\u003e) Heatmap showing the DEGs between PI3K clusters. (\u003cstrong\u003eC\u003c/strong\u003e) Boxplot shows the abundance of 64 distinct immune and stromal cell types in two gene clusters. The upper and lower ends of the boxes represented the interquartile range of values. The lines in the boxes represented the median value, and the black dots showed outliers. (\u003cstrong\u003eD\u003c/strong\u003e) Boxplot shows the abundance of 22 immune infiltrating cells in two gene clusters. The upper and lower ends of the boxes represented the interquartile range of values. The lines in the boxes represented the median value, and the black dots showed outliers. 
The asterisks represented the statistical pvalue (*P\u0026lt;0.05; **P\u0026lt;0.01; ***P\u0026lt;0.001).\u003c/p\u003e","description":"","filename":"floatimage5.png","url":"https://assets-eu.researchsquare.com/files/rs-6186335/v1/085823896bc6be2d6d52687d.png"},{"id":78424833,"identity":"7f064478-f369-4cfa-bf53-7c016840f36a","added_by":"auto","created_at":"2025-03-13 06:18:52","extension":"png","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":522201,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eScreening candidate diagnostic markers for ischemic stroke.\u003c/strong\u003e (\u003cstrong\u003eA\u003c/strong\u003e)Tencross-validations of adjusted parameter selection in the LASSO model. Each curve corresponds to one gene. (\u003cstrong\u003eB\u003c/strong\u003e)LASSO coefficient analysis. Vertical dashed lines are plotted at the best lambda. (\u003cstrong\u003eC\u003c/strong\u003e) Diagnostic markers were screened by a support vector machine-recursive feature elimination (SVM-RFE) algorithm, and the importance of each hub gene is depicted on the right side. (\u003cstrong\u003eD\u003c/strong\u003e) Variable importance plot for the RF model. The features are ranked by the percentage increase in Mean Squared Error (MSE) when they are permuted. The more the percentage in MSE increase, the more important the variable is. 
(\u003cstrong\u003eE\u003c/strong\u003e) Venn diagram showing the feature genes shared by LASSO, random forest, and SVM-RFE algorithms.\u003c/p\u003e","description":"","filename":"floatimage6.png","url":"https://assets-eu.researchsquare.com/files/rs-6186335/v1/260152099e5ce94b8f0780aa.png"},{"id":78424836,"identity":"7c131aa6-47e7-4d0f-9659-d17799817630","added_by":"auto","created_at":"2025-03-13 06:18:52","extension":"png","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":203397,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eConstruction of a nomogram model for ischemic stroke diagnosis. \u003c/strong\u003e(\u003cstrong\u003eA\u003c/strong\u003e) Nomogram of the five-gene PI3K signaling diagnostic signature for IS probability. (\u003cstrong\u003eB\u003c/strong\u003e) The receiver operating characteristic (ROC) analysis of nomogram in training cohort and each candidate gene. (\u003cstrong\u003eC\u003c/strong\u003e) The receiver operating characteristic (ROC) analysis of nomogram in test cohort (GSE22255).\u003c/p\u003e","description":"","filename":"floatimage7.png","url":"https://assets-eu.researchsquare.com/files/rs-6186335/v1/bf65efb324480bf920e7c5ef.png"},{"id":78424839,"identity":"30e2a8d2-a1a2-4454-a981-36152dfa05b0","added_by":"auto","created_at":"2025-03-13 06:18:52","extension":"png","order_by":8,"title":"Figure 8","display":"","copyAsset":false,"role":"figure","size":4255408,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eMolecular docking patterns for the top two candidate drugs and proteins corresponding to key genes.\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"floatimage8.png","url":"https://assets-eu.researchsquare.com/files/rs-6186335/v1/c179d3871298c9037cc8fba3.png"},{"id":78427653,"identity":"95721ea6-a11e-42a3-b448-2d754f70ce8d","added_by":"auto","created_at":"2025-03-13 
06:51:05","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":11067008,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-6186335/v1/9e1a2221-b656-4a5a-a303-f13a60801c8d.pdf"},{"id":78424847,"identity":"bb616f87-fc4a-4149-a0e3-b9646925683f","added_by":"auto","created_at":"2025-03-13 06:18:53","extension":"docx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":3013603,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryFigure.docx","url":"https://assets-eu.researchsquare.com/files/rs-6186335/v1/1ecdb488a3c045e6a392cd2b.docx"}],"financialInterests":"No competing interests reported.","formattedTitle":"Machine Learning-Driven Identification of Diagnostic Biomarkers in Ischemic Stroke: Focus on PI3K Pathway ","fulltext":[{"header":"Introduction","content":"\u003cp\u003eStroke remains the second leading cause of death globally, with its aftermath placing a profound burden on individuals and society due to the resulting disability. The World Health Organization projects a concerning increase in stroke-related mortality, underscoring the urgent need for the development of advanced diagnostic and therapeutic strategies [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e]. Ischemic stroke (IS), which accounts for approximately 70% of all strokes, is a major contributor to this trend. A review of the global burden of IS over the past three decades predicts a continued rise, with a staggering 4.9\u0026nbsp;million deaths expected by 2030 [\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e]. 
The significant social and economic toll of IS calls for intensified efforts to reduce its incidence and improve patient outcomes.\u003c/p\u003e \u003cp\u003eThe pathophysiological process of cerebral ischemia rapidly reduces blood perfusion to the brain's affected regions, triggering a complex cascade of pathological events, including oxidative stress, apoptosis, and inflammation[\u003cspan additionalcitationids=\"CR4\" citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e]. Among these, inflammation plays a central role, initiating a sequence of molecular reactions that contribute to tissue damage and repair. A key regulator of these processes is the Phosphoinositide 3-kinase (PI3K) signaling pathway, which is notably upregulated after stroke. This pathway regulates immune responses and cellular survival, while also enhancing the production of Vascular Endothelial Growth Factor (VEGF), a critical factor for vascular and neuronal remodeling. VEGF, through the activation of Focal Adhesion Kinase (FAK) and Paxillin, fosters neurorepair and angiogenesis, promoting the generation of epithelial cells and astrocytes around microvessels, thus underscoring the PI3K pathway\u0026rsquo;s regenerative capacity [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e, \u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e]. These molecular events make the PI3K signaling pathway an attractive target for therapeutic strategies aimed at improving IS outcomes.\u003c/p\u003e \u003cp\u003eInflammation remains a cornerstone of IS pathophysiology and is closely linked with patient prognosis. 
The activation of the PI3K pathway post-stroke triggers a cascade of immune responses, particularly the activation and regulation of central nervous system (CNS)-resident immune cells such as microglia and astrocytes [\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e]. Understanding how the PI3K pathway modulates the generation of these immune cells could provide crucial insights into IS progression and recovery.\u003c/p\u003e \u003cp\u003eDespite the growing body of research on the PI3K signaling pathway, the precise relationships between its associated genes, stroke prognosis, the immune microenvironment, and the effectiveness of immunotherapies remain insufficiently understood. This study aims to comprehensively investigate the role of PI3K-related genes in IS, exploring their impact on disease progression, prognosis, and immune responses. By identifying potential biomarkers for diagnosis and prognosis, this research not only enhances our ability to predict clinical outcomes but also provides a foundation for developing targeted therapeutic strategies to improve IS treatment.\u003c/p\u003e"},{"header":"Materials and methods","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003eAcquisition and Processing of Gene Expression Data\u003c/h2\u003e \u003cp\u003eWe acquired gene expression microarray data of IS patients from the NCBI Gene Expression Omnibus (GEO) databases GSE16561 and GSE37587 (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.ncbi.nlm.nih.gov/geo/\u003c/span\u003e\u003cspan address=\"https://www.ncbi.nlm.nih.gov/geo/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e). Both datasets, derived from the same platform, consist of whole blood specimens from human peripheral blood, collected within 24 to 48 hours post- IS. The data processing was conducted in a three-tiered approach. 
Initially, the single-probe expression matrix files downloaded from the GEO database underwent normalization and log2 transformation. Subsequently, the platform annotation files were aligned with each probe expression matrix, retaining only those probes with well-defined annotations. When multiple probes mapped to a single gene, we used their mean expression value. Finally, we applied the 'ComBat' function from the 'sva' R package, installed from Bioconductor (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://bioconductor.org/\u003c/span\u003e\u003cspan address=\"https://bioconductor.org/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e), to mitigate heterogeneity across different experimental batches. The final dataset comprised 24 healthy controls and 73 IS patient specimens.\u003c/p\u003e \u003cp\u003eAdditionally, we downloaded two independent datasets from the GEO database, namely GSE22255 and GSE58294, to serve as validation cohorts. The GSE22255 dataset comprises 20 IS patients and 20 age- and sex-matched controls. The GSE58294 dataset includes 69 cardioembolic stroke samples and 23 control samples. Analyzing these external datasets allows us to validate the diagnostic biomarkers identified in the PI3K signaling pathway, thereby strengthening the reliability and biological significance of our findings.\u003c/p\u003e \u003cp\u003eBy searching the MSigDB database, we identified 105 PI3K signaling-associated genes. Using the curated expression data, we constructed a PI3K-related gene expression matrix. This research relied on existing data and required no additional human or animal experimentation. 
A schematic of our methodology is presented in the graphical abstract (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec4\" class=\"Section2\"\u003e \u003ch2\u003eData processing and identification of PI3K signaling-related DEGs\u003c/h2\u003e \u003cp\u003eWe extracted an expression matrix for genes associated with the PI3K pathway from our processed dataset and used the 'ggplot2' R package to visualize these genes in volcano plots and boxplots. Using the 'limma' R package, we performed differential gene expression analysis and defined differentially expressed genes (DEGs) as those with an adjusted p-value below 0.05. Additionally, we used the 'ComplexHeatmap' package to generate two types of clustered heatmaps: one to delineate the 39 DEGs pertinent to the PI3K pathway, and another to highlight the correlation coefficients among the DEGs.\u003c/p\u003e \u003cp\u003eThis analysis offers a visual synopsis of the PI3K pathway's role in IS, setting the stage for in-depth biological interpretation.\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eGO and KEGG enrichment analyses\u003c/h3\u003e\n\u003cp\u003eTo further elucidate the functional mechanisms of the PI3K signaling pathway in IS, we conducted Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e, \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e]. GO analysis provides a comprehensive description of gene functions across three categories: Molecular Function (MF), Biological Process (BP), and Cellular Component (CC). 
KEGG pathway analysis, on the other hand, enables us to visualize the interactions and functional relationships among the DEGs within established metabolic and signaling pathways. We used the R packages \"clusterProfiler\" and \"org.Hs.eg.db\" (installed via \"BiocManager\") to perform GO and KEGG enrichment analyses on the 39 DEGs associated with the PI3K pathway. This allowed us to gain insights into the potential roles of the PI3K signaling axis in key biological processes such as metabolism, growth, proliferation, survival, transcription, and protein synthesis. We displayed the top five enriched GO terms and KEGG pathways in ascending order of statistical significance (P\u0026thinsp;\u0026lt;\u0026thinsp;0.05) using bubble plots to provide a concise summary of the functional relevance of the PI3K pathway in IS. Our research focuses on further exploring the role of the PI3K signaling pathway in inflammation, particularly in the context of immune infiltration following IS. The enrichment results provide a theoretical rationale for considering immune infiltration as a key aspect of the pathophysiology of IS.\u003c/p\u003e\n\u003ch3\u003eImmune landscape analysis\u003c/h3\u003e\n\u003cp\u003eThe Immuno-Oncology-Biological-Research (IOBR) package in R is a computational toolkit designed for studies in immunobiology [\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e]. It integrates six widely used algorithms to analyze gene expression data and quantify immune cell populations: MCPcounter, TIMER, xCell, CIBERSORT, EPIC, and quanTiseq.\u0026nbsp;It also includes a collection of 255 curated gene sets designed to investigate aspects of the tumor microenvironment such as metabolic pathways, m6A modifications, and exosome biology. 
This toolkit excels at analyzing immune cell infiltration patterns across a cohort of 131 patients and extracting biologically relevant gene signatures.\u003c/p\u003e \u003cp\u003eCIBERSORT utilizes the LM22 gene signature matrix, comprising 547 characteristic genes associated with 22 leukocyte subpopulations, including myeloid cells, natural killer cells, plasma cells, memory B cells, and seven T cell subsets. By applying CIBERSORT and the LM22 matrix, researchers can estimate the proportions of these 22 cellular phenotypes within each sample, with the sum of all immune cell proportions equaling 1.\u003c/p\u003e \u003cp\u003exCell complements CIBERSORT by performing cell type enrichment analysis using gene expression data from 64 distinct immune and stromal cell types. Leveraging machine learning algorithms trained on diverse cell type gene signatures, xCell effectively captures the cellular heterogeneity inherent in tissue expression landscapes.\u003c/p\u003e \u003cp\u003eTogether, these two algorithms quantify the abundance levels of immune cells, stem cells, and stromal cells within samples. Statistical tests (e.g., Wilcoxon test) are applied to analyze differences in the immune microenvironment between pathological states, laying the foundation for in-depth research on the interplay between immunity and cancer.\u003c/p\u003e \u003cdiv id=\"Sec7\" class=\"Section2\"\u003e \u003ch2\u003eConsensus clustering for IS samples\u003c/h2\u003e \u003cp\u003eBy harnessing the ConsensusClusterPlus package, we conducted a comprehensive clustering analysis on IS patient cohorts, with the potential number of clusters (K) ranging from 1 to 9. This analytical approach integrated the K-Means algorithm based on Euclidean distance, and the hierarchical clustering process was meticulously repeated 1000 times to ensure cluster stability. 
The optimal number of clusters, and a relatively stable clustering outcome, were determined from the consensus cumulative distribution function (CDF). This approach enabled the identification of molecular subtypes, and their distribution among different IS subgroups was visualized through principal component analysis (PCA) and heatmaps. To explore the clinical significance of these distinct subgroups, a comparative analysis was performed to evaluate the differences in immune cell infiltration between the two subgroups.\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eIdentification of PI3K signaling-related diagnostic biomarkers\u003c/h3\u003e\n\u003cp\u003eIn this study, we used three machine learning algorithms, Support Vector Machine (SVM), Random Forest (RF), and Least Absolute Shrinkage and Selection Operator (LASSO), to dissect the interplay of genetic factors associated with IS and thereby improve the predictive accuracy of our diagnostic models. The SVM algorithm, implemented via the 'e1071' R package, was used to identify the genes with the greatest discriminatory power, thereby facilitating the extraction of optimal genes for the diagnosis of IS. The RF algorithm, known for its flexibility and precision in capturing nonlinear relationships between dependent and independent variables, is built on the construction of multiple decision trees. By combining bootstrap aggregating (bagging) with random feature selection at each split, RF reduces the correlation between trees and thereby mitigates overfitting. In our research, we used the RF algorithm to predict the risk of developing IS. LASSO, executed through the 'glmnet' R package, is a regression technique that employs regularization for variable selection. Using LASSO, we identified genes that differ significantly between IS and normal samples. 
Having applied these three algorithms, we used a Venn diagram to examine the intersection of the genes they identified, yielding a set of candidate diagnostic biomarkers for further validation.\u003c/p\u003e\n\u003ch3\u003eConstruction and validation of a Predictive Model for IS\u003c/h3\u003e\n\u003cp\u003eUsing three distinct machine learning algorithms, we identified key diagnostic biomarkers for IS and developed a binary logistic regression model for risk prediction. The most efficient model was selected and implemented with the 'rms' R package to create a nomogram predicting IS likelihood.\u003c/p\u003e \u003cp\u003eThe nomogram assigns individual 'scores' to each biomarker, with a 'total score' reflecting their cumulative impact. Each gene's score is determined by a vertical reference to the nomogram scale, and these scores are summed to calculate the overall risk probability for IS.\u003c/p\u003e \u003cp\u003eThe model's predictive accuracy was evaluated using the Receiver Operating Characteristic (ROC) curve, and the GSE22255 and GSE58294 datasets were used to further validate its performance, including the diagnostic biomarkers' expression levels and predictive value.\u003c/p\u003e \u003cdiv id=\"Sec10\" class=\"Section2\"\u003e \u003ch2\u003eDrug discovery in DSigDB\u003c/h2\u003e \u003cp\u003eAssessing protein-drug interactions is pivotal for judging the viability of target genes as practical pharmaceutical targets. In this study, we used the DSigDB drug signature database (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://amp.pharm.mssm.edu/Enrichr/\u003c/span\u003e\u003cspan address=\"https://amp.pharm.mssm.edu/Enrichr/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e) to curate a selection of candidate drugs. 
DSigDB is an expansive repository that integrates 22,527 gene sets with 17,389 distinct compounds across 19,531 genes, effectively linking drugs and other chemicals to their target genes. By uploading the identified target genes to DSigDB, we can prognosticate potential candidate drugs and evaluate the pharmacological activity of the target genes.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003eMolecular docking analysis\u003c/h2\u003e \u003cp\u003eTo enhance our understanding of drug-target gene interactions and assess the potential of these genes as druggable targets, our study incorporates molecular docking at the atomic level. Molecular docking allows for the calculation of binding energy, indicative of the predicted affinity between ligands and receptor proteins. A negative binding energy suggests spontaneous binding between the two molecules, with a lower energy indicating a more stable conformation. Identifying ligands with strong binding affinity and optimal interaction profiles enables us to prioritize targets for experimental validation and refine the development of promising drug candidates. Our molecular docking studies employed AutoDock version 1.5.6 to simulate interactions between potential drugs and their corresponding target gene-encoded proteins. The three-dimensional structures of candidate drugs were obtained from PubChem, and protein structures were retrieved from the Protein Data Bank (PDB). Docking outcomes were rendered using PyMOL 2.4.1, yielding detailed structures for five proteins and two candidate drugs.\u003c/p\u003e \u003c/div\u003e"},{"header":"Results","content":"\u003cdiv id=\"Sec13\" class=\"Section2\"\u003e \u003ch2\u003eDifferential Expression of PI3K Signaling Pathway Genes in IS\u003c/h2\u003e \u003cp\u003eAfter meticulous processing, we successfully merged the GSE16561 and GSE37587 datasets. 
We then integrated the gene expression matrix with genes related to PI3K signaling, constructing an expression matrix that encompasses genes associated with the PI3K pathway, from which we generated a volcano plot and a box plot (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eA, \u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eC). Through rigorous differential gene expression analysis, we delineated a cohort of 39 differentially expressed genes (DEGs), which exhibited significant dysregulation between the IS patient group and controls, as illustrated in the accompanying heatmaps (Figs.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eB).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eIn the IS group, we identified upregulation of genes, including \u003cem\u003ePTEN\u003c/em\u003e, \u003cem\u003eITPR2\u003c/em\u003e, \u003cem\u003eMAPK1\u003c/em\u003e, \u003cem\u003eMYD88\u003c/em\u003e, \u003cem\u003eSLA\u003c/em\u003e, \u003cem\u003eRAC1\u003c/em\u003e, \u003cem\u003eRAF1\u003c/em\u003e, \u003cem\u003eCAB39\u003c/em\u003e, \u003cem\u003eUBE2D3\u003c/em\u003e, \u003cem\u003eRPS6KA3\u003c/em\u003e, \u003cem\u003eGSK3B\u003c/em\u003e, \u003cem\u003ePTPN11\u003c/em\u003e, \u003cem\u003eDDIT3\u003c/em\u003e, \u003cem\u003eVAV3\u003c/em\u003e, \u003cem\u003eRIT1\u003c/em\u003e, \u003cem\u003eTBK1\u003c/em\u003e, \u003cem\u003eCLTC\u003c/em\u003e, \u003cem\u003ePLCB1\u003c/em\u003e, \u003cem\u003eRALB\u003c/em\u003e, \u003cem\u003eGRB2\u003c/em\u003e, \u003cem\u003eRPS6KA1\u003c/em\u003e, \u003cem\u003eE2F1\u003c/em\u003e, \u003cem\u003eYWHAB\u003c/em\u003e, \u003cem\u003eDUSP3\u003c/em\u003e, \u003cem\u003eMKNK1\u003c/em\u003e, and \u003cem\u003eACTR2\u003c/em\u003e, alongside downregulation of genes like \u003cem\u003ePIN1\u003c/em\u003e, \u003cem\u003eCDK4\u003c/em\u003e, \u003cem\u003eCDK2\u003c/em\u003e, \u003cem\u003eLCK\u003c/em\u003e, 
\u003cem\u003eFASLG\u003c/em\u003e, \u003cem\u003eTRAF2\u003c/em\u003e, \u003cem\u003ePLCG1\u003c/em\u003e, \u003cem\u003eTHEM4\u003c/em\u003e, \u003cem\u003eNCK1\u003c/em\u003e, \u003cem\u003eUBE2N\u003c/em\u003e, \u003cem\u003eCSNK2B\u003c/em\u003e, \u003cem\u003eHRAS\u003c/em\u003e, and \u003cem\u003eCFL1\u003c/em\u003e. This differential expression pattern offers valuable biological insights into the molecular changes occurring in IS.\u003c/p\u003e \u003cp\u003eFurther analysis revealed significant correlations among the DEGs, including a positive correlation between \u003cem\u003ePIN1\u003c/em\u003e and \u003cem\u003eCDK4\u003c/em\u003e and a negative correlation between \u003cem\u003ePTEN\u003c/em\u003e and \u003cem\u003eCDK4\u003c/em\u003e (Supplementary Figure \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e). These correlations enhance our comprehension of the PI3K pathway's role in IS and suggest promising avenues for future investigations.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec14\" class=\"Section2\"\u003e \u003ch2\u003eFunctional enrichment analysis of PI3K signaling-related DEGs\u003c/h2\u003e \u003cp\u003eWe conducted GO and KEGG pathway enrichment analyses on the identified 39 genes, focusing on the top five terms for MF, BP, and CC from the GO analysis, as well as the top five pathways from the KEGG analysis.\u003c/p\u003e \u003cp\u003eThe GO analysis revealed that the top biological processes associated with the PI3K pathway included peptidyl-serine phosphorylation, peptidyl-serine modification, ERBB signaling pathway, immune response-activating cell surface receptor signaling, and immune response-regulating cell surface receptor signaling pathway (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eA). 
The most relevant cellular component was the serine/threonine protein kinase complex (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eB), and the primary molecular function was phosphoprotein binding (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eC).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eKEGG pathway analysis further highlighted the enrichment of these PI3K-related DEGs in pathways associated with infectious diseases (both bacterial and viral), as well as nervous system and immune system signaling cascades (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eD).\u003c/p\u003e \u003cp\u003eThese findings suggest that the PI3K signaling pathway may play a multifaceted regulatory role in IS, potentially through modulating immune responses and immune system-related processes. The enrichment in infectious disease pathways also implies a potential involvement of the PI3K pathway in the interplay between stroke pathogenesis and inflammatory/immune mechanisms triggered by microbial infections.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec15\" class=\"Section2\"\u003e \u003ch2\u003ePI3K Signaling-Associated Gene Sets for Stroke Stratification\u003c/h2\u003e \u003cp\u003eIn our analysis of 107 IS samples, we employed consensus clustering to identify two distinct subclasses, termed cluster A and cluster B (Figs.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eA-B, Supplementary Figure S2). To explore the correlation between different subclasses of IS patients and the immune microenvironment, we analyzed immune cells and stromal cells, which are two major non-tumor components that have been proven to be of significant value in the diagnosis and prognostic assessment of various diseases. 
In this research, we calculated the scores for immune cells and stromal cells (Supplementary Figure S3).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eWe quantified the presence of immune and stromal cells, two non-tumor components with significant diagnostic and prognostic value. Cluster A was characterized by an increased expression of key immune cells, including CD4 memory T cells, CD8 T cells, CD8 Central Memory T cells, naive B cells, plasma cells, smooth muscle cells, Th1, and Th2 cells. Additionally, cluster A demonstrated elevated ImmuneScore and MicroenvironmentScore, suggesting a more active immune profile (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eC). In stark contrast, cluster B was defined by a higher expression of NKT cells and megakaryocytes, indicating a distinct immunological pattern.\u003c/p\u003e \u003cp\u003eFurther analysis using the CIBERSORT algorithm delineated the infiltration levels of 22 immune cell types across the IS subclasses. Cluster A showed a significant predominance of CD8 T cells and CD4 memory T cells, suggesting a robust immune response. Conversely, cluster B had a lower infiltration of M0 macrophages and neutrophils, pointing to a subdued inflammatory reaction (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eD).\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec16\" class=\"Section2\"\u003e \u003ch2\u003eIdentification of immune microenvironment in different PI3K clusters\u003c/h2\u003e \u003cp\u003eTo investigate the biological nuances of IS subtypes, we pinpointed 112 DEGs distinguishing clusters A and B. Utilizing these DEGs, we applied unsupervised clustering to segment IS patients into two novel clusters, G-1 and G-2, closely associated with the DEGs' profiles (Supplementary Figure S4). 
Principal component analysis (PCA) confirmed the distinct separability of these clusters, as illustrated in Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eA-B. To further elucidate the immunological differences caused by different subtypes, we conducted an immunological analysis.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe immunological analysis revealed that the expression levels of CD4 memory resting T cells, CD4 memory T cells, CD8 T cells, CD8 central memory T cells, plasma cells, smooth muscle cells, Th1 cells, and Th2 cells were significantly higher in cluster G-2. In contrast, cluster G-1 exhibited higher levels of expression for macrophages M0, neutrophils, mast cells, megakaryocytes, microvascular endothelial cells, and NKT cells (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eC-D). Based on these findings, we concluded that the gene-based grouping method more accurately describes the characteristics of patients than the traditional PI3K grouping.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec17\" class=\"Section2\"\u003e \u003ch2\u003eConstruction and analysis of the prediction model\u003c/h2\u003e \u003cp\u003eTo screen candidate diagnostic biomarkers, we employed three distinct algorithms. The LASSO logistic regression initially identified 15 key variables, which were visually represented in Figs.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003eA-B. The SVM algorithm further refined the list to 38 features, detailed in Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003eC. The random forest algorithm complemented this by identifying nine candidate features, as shown in Figs.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003eD-E. 
The intersection of these analyses yielded seven consensus genes: \u003cem\u003ePIN1\u003c/em\u003e, \u003cem\u003eCDK2\u003c/em\u003e, \u003cem\u003eVAV3\u003c/em\u003e, \u003cem\u003ePTPN11\u003c/em\u003e, \u003cem\u003eITPR2\u003c/em\u003e, \u003cem\u003eYWHAB\u003c/em\u003e, and \u003cem\u003eCFL1\u003c/em\u003e (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003eF).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eIn the selection of biomarkers, we strictly adhered to two core criteria: significant correlation with the research outcomes and substantial impact on the prediction of stroke events[\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e]. Initially, we utilized the Forward Stepwise Likelihood Ratio method to successfully identify six genes. Subsequently, we eliminated those genes that did not significantly contribute to the predictive model statistically, while vigilantly monitoring the potential effects of multicollinearity. This strategic variable selection not only simplified the model structure and reduced the risk of overfitting but also significantly enhanced the model's interpretability and predictive power. After comprehensive review and rigorous refinement, we ultimately determined five robust biomarkers: \u003cem\u003ePIN1\u003c/em\u003e, \u003cem\u003eCDK2\u003c/em\u003e, \u003cem\u003eVAV3\u003c/em\u003e, \u003cem\u003eYWHAB\u003c/em\u003e, and \u003cem\u003eCFL1\u003c/em\u003e.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec18\" class=\"Section2\"\u003e \u003ch2\u003eClinical Nomogram for Stroke Risk Prediction: Development and Validation\u003c/h2\u003e \u003cp\u003eWe constructed a clinical nomogram utilizing multiple machine learning techniques to predict the risk of stroke based on diagnostic biomarkers (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003eA). 
This tool provides a scoring system for individuals, translating biomarker data into stroke risk probabilities. The predictive accuracy was assessed through ROC analysis, which showed an AUC of 0.984 in the training data (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003eB). Validation with an independent test cohort (GSE22255) yielded an AUC of 0.77, supporting the nomogram's reliability. Validation with the second external dataset (GSE58294) yielded an AUC of 0.962, further supporting the model's predictive accuracy (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003eC).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec19\" class=\"Section2\"\u003e \u003ch2\u003ePrioritizing Drug Candidates with Bioinformatics Tools\u003c/h2\u003e \u003cp\u003eUsing the Enrichr platform, we submitted the five core genes to identify potential drug candidates. We curated data from the DSigDB database to shortlist the top 20 candidate drugs, presented as a circular chart (Supplementary Figure S5). Ranking by P-value led us to prioritize five drugs\u0026mdash;embelin, okadaic acid, luteolin, staurosporine, and cyperquat\u0026mdash;that showed significant predicted interaction with the core genes under study.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec20\" class=\"Section2\"\u003e \u003ch2\u003eMolecular docking results\u003c/h2\u003e \u003cp\u003eAmong the drug candidates, embelin and okadaic acid stood out for their potential interactions with our panel of five diagnostic genes. Both drugs exhibited binding energies below \u0026minus;\u0026thinsp;4 kcal/mol with the proteins encoded by these genes (Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e), a threshold suggesting favorable molecular docking. 
The structural details of these interactions are vividly portrayed in Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e8\u003c/span\u003e, illustrating the local architecture of the molecular complexes. These preliminary yet compelling results call for deeper exploration of Embelin and Okadaic acid's therapeutic efficacy and their potential to shape future treatment strategies.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eThe binding energy of medications with diagnostic genes\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"3\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eEmbelin\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eOkadaic acid\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003ePIN1\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e-5.5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e-10.1\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eYWHAB\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" 
colname=\"c2\"\u003e \u003cp\u003e-5.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e-10.2\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eVAV3\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e-5.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e-9.2\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eCDK2\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e-6.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e-8.4\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eCFL1\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e-5.4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e-7.7\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eData are presented in kilocalories per mole (kcal/mol).\u003c/p\u003e \u003c/div\u003e"},{"header":"Discussion","content":"\u003cp\u003eIS carries a high incidence and significant economic burden, posing substantial challenges to public health. Timely diagnosis is essential for better patient outcomes, as delays can lead to less effective treatments and poorer prognoses. 
The scientific community continues to advance in early detection and treatment strategies for IS, with a growing focus on microRNAs (miRNAs) and messenger RNAs (mRNAs) as potential biomarkers, as noted in references[\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e, \u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e]. The PI3K signaling pathway is a key player in post-stroke pathophysiology, and our study meticulously investigates its associated genes for new diagnostic insights. Our findings could significantly bolster clinical diagnostics in IS. Consequently, the present study is poised to pinpoint candidate biomarkers for the detection of IS and delve into the immunological mechanisms through which the PI3K signaling pathway exerts its influence on IS.\u003c/p\u003e \u003cp\u003eBuilding on the potential of the PI3K pathway as a diagnostic avenue, our research integrated a suite of machine learning algorithms to delve into the roles of PI3K-associated genes in IS. We conducted a rigorous analysis of the expression profiles across this pathway. Through GO term and KEGG pathway analyses, we identified that peptidyl-serine phosphorylation and modification can regulate protein functions[\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e], influencing cellular responses to ischemic injury. The ERBB signaling pathway, part of the cell surface receptor tyrosine kinase family, triggers downstream signaling including PI3K/Akt[\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e], which is crucial for cell proliferation and survival. Furthermore, immune response-activating and regulating cell surface receptor signaling pathways play key roles in neuroinflammation following IS, and the PI3K/Akt pathway may modulate these pathways to affect immune responses and neural repair in stroke. 
Our analysis established a molecular framework linking the PI3K pathway to immune responses and cellular metabolism, which is vital for understanding the pathophysiology of IS. KEGG analysis further revealed potential connections between PI3K pathway activation and immune microenvironment dynamics, emphasizing the close association between immune microenvironment exhaustion and the PI3K pathway. Based on the significant correlation between inflammation and the PI3K pathway, which was corroborated by both existing literature and our study[\u003cspan additionalcitationids=\"CR18\" citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e], our research focuses on further exploring the role of the PI3K signaling pathway in inflammation, especially in the context of immune infiltration following IS.\u003c/p\u003e \u003cp\u003eTo delve deeper into these mechanisms, we employed cluster analysis to identify distinct patient groups. This approach allowed us to scrutinize immune cell infiltration and to pinpoint specific genes with prognostic value, culminating in the development of a risk model tailored for IS. Comparative analysis of the transcriptomic profiles identified 39 DEGs associated with the PI3K signaling pathway when comparing IS patients to healthy controls. Of these, 26 genes were upregulated, and 13 were downregulated in the IS group. A detailed co-expression analysis among these genes highlighted significant interactions, particularly involving \u003cem\u003ePIN1\u003c/em\u003e with \u003cem\u003eCDK4\u003c/em\u003e and \u003cem\u003ePTEN\u003c/em\u003e with \u003cem\u003eCFL1\u003c/em\u003e. These findings underscored the heterogeneity within IS patients, suggesting that they could be classified into two subtypes based on gene expression patterns. 
By examining the immune infiltration profiles, we identified novel subtypes that may respond differently to therapeutic interventions. This classification could refine treatment strategies, offering a more personalized approach to IS management. Our findings thus introduce the PI3K gene set as a promising resource for both diagnosis and therapy in IS.\u003c/p\u003e \u003cp\u003eOur research further delves into the intricate dynamics of the immune response post-IS, with a particular focus on the adaptive immune system's role. Resting CD4\u0026thinsp;+\u0026thinsp;memory T cells, once activated by specific antigens, initiate a swift defense, highlighting their critical position within the immune response[\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e]. Their interaction with P-selectin and platelets through P-selectin glycoprotein ligand-1[\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e] intertwines the immune reaction with hemostasis, a significant consideration in IS. Furthermore, CD8\u0026thinsp;+\u0026thinsp;T lymphocytes have been reported to promote the proliferation of microglia and oligodendrocytes, modulating the brain's post-injury immune response[\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e]. The CD8\u0026thinsp;+\u0026thinsp;central memory T cell subset, with its long-term memory capacity, may play a role in clearing necrotic cells post-stroke and mediating regulatory responses to protect brain tissue under autoimmune conditions[\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e]. Plasma cells also contribute significantly by secreting memory B cells and cytokines, influencing the stroke's injury and recovery processes[\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e]. 
The early stages of cerebral infarction may benefit from smooth muscle cell proliferation, which can support angiogenesis and collateral circulation development[\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e]. However, excessive proliferation can drive macrophage-like differentiation, potentially promoting atherosclerosis and increasing the risk of recurrent vascular events. Th1 cells, secreting pro-inflammatory cytokines such as IFN-γ and TNF-α, are vital for combating infections but must be carefully balanced to prevent exacerbating brain damage[\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e, \u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e]. Conversely, a shift towards a Th2 phenotype post-CNS injury aids in wound healing and regeneration while guarding against autoimmune diseases within the CNS[\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e]. Current research indicates that unstable plaques harbor a higher proportion of M0 macrophages compared to stable ones[\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e]. The formation of neutrophil extracellular traps (NETs) could also significantly affect the severity of brain injury and patient prognosis[\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e]. Mast cells are implicated in intensifying inflammatory responses in CNS injuries, potentially contributing to blood-brain barrier (BBB) breakdown and associated complications[\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e]. Megakaryocytes, as platelet progenitors, are involved in thrombosis promotion, while brain microvascular endothelial cells protect the BBB, playing a crucial role in brain tissue and neuron preservation[\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e]. 
NKT cells have been associated with increased cerebral infarction volume and neurological deficits within the critical first 24 hours post-stroke[\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eIn our study, Group G-1's elevated levels of neutrophils, M0 macrophages, mast cells, megakaryocytes, and NKT cells hinted at a more severe prognosis, while Group G-2's higher counts of resting CD4 memory T cells, CD8\u0026thinsp;+\u0026thinsp;T cells, and other immune cells pointed to better neural repair and vascular regeneration, reducing the risk of complications. Our meticulous application of integrated bioinformatics and machine learning methods led to the identification of five key PI3K-associated genes\u0026mdash;\u003cem\u003ePIN1\u003c/em\u003e, \u003cem\u003eCDK2\u003c/em\u003e, \u003cem\u003eVAV3\u003c/em\u003e, \u003cem\u003eYWHAB\u003c/em\u003e, and \u003cem\u003eCFL1\u003c/em\u003e\u0026mdash;as potential diagnostic markers for IS. We developed a nomogram based on peripheral blood samples to evaluate these markers' diagnostic efficacy, offering an efficient and clinically relevant method for assessing the risk of IS. This approach enhances diagnostic precision and highlights the applicability of personalized medicine in stroke diagnostics, paving the way for tailored immunotherapies based on the specifics of a patient's immune profile.\u003c/p\u003e \u003cp\u003eExpanding on our exploration of personalized medicine in stroke diagnostics, our research has also concentrated on predicting gene-drug interactions, a critical step for identifying targeted treatments for IS. By employing Enrichr[\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e], we've mapped a range of drugs to five key PI3K-related biomarkers, enriching our understanding of potential therapeutic targets. 
Among these, embelin and okadaic acid have emerged as particularly promising candidates for further investigation[\u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eOur investigation into gene-drug interactions has led us to focus on Embelin, a benzoquinone compound extracted from Embelia ribes, known for its diverse pharmacological profiles. Embelin's molecular complexity underpins its multifaceted biological activities, including anti-inflammatory, antioxidant, and antibacterial actions, alongside potential analgesic, anxiolytic, and contraceptive effects. Its noted antitumor and immunosuppressive capabilities make it a standout candidate for therapeutic development. Most notably, Embelin's neuroprotective properties have attracted significant interest, particularly its ability to cross the blood-brain barrier, positioning it as a potential therapeutic agent in neurodegenerative diseases, including stroke[\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e]. While its role in stroke treatment is contested, with evidence suggesting both exacerbation of injury in some cases[\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e] and neuroprotection in others[\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e], the need for deeper exploration of Embelin's molecular mechanisms and pharmacological targets is clear. This will help clarify its potential as a stroke treatment, as illustrated in Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003e.\u003c/p\u003e \u003cp\u003eFollowing our examination of Embelin, we turn to Okadaic acid, a marine toxin with potent protein phosphatase inhibitory activity, notably against PP2A. 
Its role in neurodegenerative research, especially in Alzheimer's disease through tau protein phosphorylation, suggests potential relevance to post-stroke cellular recovery[\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e]. We hypothesize that Okadaic acid may positively modulate cellular signaling to reduce damage and promote neurological healing post-stroke, as illustrated in Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003e. However, these hypotheses require rigorous scientific validation and clinical trials to confirm efficacy and safety.\u003c/p\u003e \u003cp\u003eWhile our study provides valuable insights, we recognize its limitations. First, owing to the retrospective design, our biomarkers for IS require external validation. Second, wet-lab experiments are needed to validate our findings and clarify the underlying mechanisms. Third, the study does not address the unique advantages of traditional Chinese medicine (TCM) in IS treatment, which arise from its multi-component, multi-target, and multi-pathway effects. To leverage these benefits, Tian et al. developed the Integrated Traditional Chinese Medicine (ITCM) platform[\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e], the largest herb-based pharmacotranscriptomics database, along with the COIMMR framework for rapid screening of active compounds[\u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e]. 
We believe that employing pharmacotranscriptomic analysis to explore how TCM components and active agents contribute to IS treatment via the PI3K signaling pathway holds significant promise for enhancing therapeutic strategies and deepening our understanding of the underlying molecular mechanisms.\u003c/p\u003e"},{"header":"Conclusion","content":"\u003cp\u003eIn conclusion, our research underscores the pivotal role of the PI3K signaling pathway in post-ischemic cerebrovascular and neurorestorative processes, offering a foundation for understanding the underlying pathophysiology of IS. The refined stratification of IS patients through secondary clustering has elucidated distinct immune profiles, which are essential for the development of personalized immunotherapeutic approaches. Our identification of the PI3K-related genes \u003cem\u003ePIN1\u003c/em\u003e, \u003cem\u003eCDK2\u003c/em\u003e, \u003cem\u003eVAV3\u003c/em\u003e, \u003cem\u003eYWHAB\u003c/em\u003e, and \u003cem\u003eCFL1\u003c/em\u003e introduces a robust diagnostic framework with significant implications for early detection and prognosis. Additionally, the discovery of Embelin and Okadaic acid as potential therapeutic agents presents a novel avenue for intervention, highlighting the need for further exploration into their molecular mechanisms and clinical efficacy. 
Looking forward, the imperative for extensive clinical trials is clear, with the goal of validating our findings and translating them into effective, personalized stroke management strategies.\u003c/p\u003e"},{"header":"Abbreviations","content":"\u003cp\u003eIS \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; Ischemic stroke\u0026nbsp;\u003c/p\u003e\n\u003cp\u003ePI3K \u0026nbsp; \u0026nbsp; \u0026nbsp; Phosphoinositide 3-kinase\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eVEGF \u0026nbsp; \u0026nbsp; Vascular Endothelial Growth Factor\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eFAK \u0026nbsp; \u0026nbsp; \u0026nbsp; Focal Adhesion Kinase\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eCNS \u0026nbsp; \u0026nbsp; \u0026nbsp; Central nervous system\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eGEO \u0026nbsp; \u0026nbsp; \u0026nbsp; Gene Expression Omnibus\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eNCBI \u0026nbsp; \u0026nbsp; \u0026nbsp;National Center for Biotechnology Information\u003c/p\u003e\n\u003cp\u003eDEGs \u0026nbsp; \u0026nbsp; \u0026nbsp;Differentially Expressed Genes\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eGO \u0026nbsp; \u0026nbsp; \u0026nbsp; Gene Ontology\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eKEGG \u0026nbsp; \u0026nbsp;Kyoto Encyclopedia of Genes and Genomes\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eMF \u0026nbsp; \u0026nbsp; \u0026nbsp; Molecular Function\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eBP \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp;Biological Process\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eCC \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp;Cellular Component\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eIOBR \u0026nbsp; \u0026nbsp; \u0026nbsp; Immuno-Oncology-Biological-Research\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eCDF \u0026nbsp; \u0026nbsp; \u0026nbsp; Cumulative distribution function\u0026nbsp;\u003c/p\u003e\n\u003cp\u003ePCA \u0026nbsp; \u0026nbsp; \u0026nbsp; Principal component analysis\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eSVM \u0026nbsp; \u0026nbsp; \u0026nbsp; Support 
Vector Machine\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eRF \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; Random Forest\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eLASSO \u0026nbsp; \u0026nbsp; Least Absolute Shrinkage and Selection Operator\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eROC \u0026nbsp; \u0026nbsp; \u0026nbsp; Receiver Operating Characteristic\u0026nbsp;\u003c/p\u003e\n\u003cp\u003ePDB \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp;Protein Data Bank\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eNETs \u0026nbsp; \u0026nbsp; \u0026nbsp; Neutrophil extracellular traps\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eBBB \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; Blood-brain barrier\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eITCM \u0026nbsp; \u0026nbsp; \u0026nbsp; Integrated Traditional Chinese Medicine\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eTCM \u0026nbsp; \u0026nbsp; \u0026nbsp; \u0026nbsp; Traditional Chinese medicine\u0026nbsp;\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003eAcknowledgements\u003c/p\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e\n\u003cp\u003eAuthor contributions\u003c/p\u003e\n\u003cp\u003eLSS and HZX designed the study. WZZ, HKL, OYDD, and CZH conducted literature collection and summary. LSS drafted the manuscript. All authors critically revised the manuscript.\u003c/p\u003e\n\u003cp\u003eFunding\u003c/p\u003e\n\u003cp\u003eThis work was supported by the Science and Technology Program of Guangzhou, China [2024B03J0436], and the Research Funds of the Centre for Leading Medicine and Advanced Technologies of IHM [No. 2023IHM01052]. 
The funding sources had no role in study design, data collection, analysis, or interpretation.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAvailability of data and materials\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe datasets generated during the current study are available in the GEO repository, https://www.ncbi.nlm.nih.gov/geo/\u003c/p\u003e\n\u003cp\u003eEthics approval and consent to participate\u003c/p\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e\n\u003cp\u003eConsent for publication\u003c/p\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e\n\u003cp\u003eCompeting interests\u003c/p\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eFeigin VL, Owolabi MO. Pragmatic solutions to reduce the global burden of stroke: A world stroke organization-lancet neurology commission. Lancet Neurol. 2023;22:1160\u0026ndash;206. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/s1474-4422(23)00277-6\u003c/span\u003e\u003cspan address=\"10.1016/s1474-4422(23)00277-6\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFan J, Li X, Yu X, et al. Global burden, risk factor analysis, and prediction study of ischemic stroke, 1990\u0026ndash;2030. Neurology. 2023;101:e137\u0026ndash;50. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1212/wnl.0000000000207387\u003c/span\u003e\u003cspan address=\"10.1212/wnl.0000000000207387\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003edel Zoppo GJ. The neurovascular unit in the setting of stroke. J Intern Med. 2010;267:156\u0026ndash;71. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1111/j.1365-2796.2009.02199.x\u003c/span\u003e\u003cspan address=\"10.1111/j.1365-2796.2009.02199.x\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDoyle KP, Simon RP, Stenzel-Poore MP. Mechanisms of ischemic brain damage. Neuropharmacology. 2008;55:310\u0026ndash;8. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.neuropharm.2008.01.005\u003c/span\u003e\u003cspan address=\"10.1016/j.neuropharm.2008.01.005\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWoodruff TM, Thundyil J, Tang SC, et al. Pathophysiology, treatment, and animal and cellular models of human ischemic stroke. Mol neurodegeneration. 2011;6:11. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1186/1750-1326-6-11\u003c/span\u003e\u003cspan address=\"10.1186/1750-1326-6-11\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWang HJ, Ran HF, Yin Y, et al. Catalpol improves impaired neurovascular unit in ischemic stroke rats via enhancing vegf-pi3k/akt and vegf-mek1/2/erk1/2 signaling. Acta Pharmacol Sin. 2022;43:1670\u0026ndash;85. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41401-021-00803-4\u003c/span\u003e\u003cspan address=\"10.1038/s41401-021-00803-4\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eShi G, Chen J, Zhang C, et al. Astragaloside iv promotes cerebral angiogenesis and neurological recovery after focal ischemic stroke in mice via activating pi3k/akt/mtor signaling pathway. 
Heliyon. 2023;9:e22800. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.heliyon.2023.e22800\u003c/span\u003e\u003cspan address=\"10.1016/j.heliyon.2023.e22800\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKerr N, Dietrich DW, Bramlett HM, et al. Sexually dimorphic microglia and ischemic stroke. CNS Neurosci Ther. 2019;25:1308\u0026ndash;17. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1111/cns.13267\u003c/span\u003e\u003cspan address=\"10.1111/cns.13267\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eThe gene ontology resource: 20 years and still going strong. Nucleic Acids Res. 2019;47:D330\u0026ndash;8.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKanehisa M, Goto S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000;28:27\u0026ndash;30. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/nar/28.1.27\u003c/span\u003e\u003cspan address=\"10.1093/nar/28.1.27\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZeng D, Ye Z, Shen R, et al. Iobr: Multi-omics immuno-oncology biological research to decode tumor microenvironment and signatures. Front Immunol. 2021;12:687975. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3389/fimmu.2021.687975\u003c/span\u003e\u003cspan address=\"10.3389/fimmu.2021.687975\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eJaddoe VW, de Jonge LL, Hofman A, et al. 
First trimester fetal growth restriction and cardiovascular risk factors in school age children: Population based cohort study. BMJ (Clinical Res ed). 2014;348:g14. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1136/bmj.g14\u003c/span\u003e\u003cspan address=\"10.1136/bmj.g14\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRink C, Khanna S. Microrna in ischemic stroke etiology and pathology. Physiol Genom. 2011;43:521\u0026ndash;8. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1152/physiolgenomics.00158.2010\u003c/span\u003e\u003cspan address=\"10.1152/physiolgenomics.00158.2010\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKanki H, Matsumoto H, Togami Y, et al. Importance of micrornas by mrna-microrna integration analysis in acute ischemic stroke patients. J Stroke Cerebrovasc Dis. 2023;32:107277. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.jstrokecerebrovasdis.2023.107277\u003c/span\u003e\u003cspan address=\"10.1016/j.jstrokecerebrovasdis.2023.107277\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMarmelstein AM, Moreno J, Fiedler D. Chemical approaches to studying labile amino acid phosphorylation. Top Curr Chem (Cham). 2017;375:22. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s41061-017-0111-1\u003c/span\u003e\u003cspan address=\"10.1007/s41061-017-0111-1\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMilton CK, Self AJ, Clarke PA, et al. 
A genome-scale crispr screen identifies the erbb and mtor signaling networks as key determinants of response to pi3k inhibition in pancreatic cancer. Mol Cancer Ther. 2020;19:1423\u0026ndash;35. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1158/1535-7163.Mct-19-1131\u003c/span\u003e\u003cspan address=\"10.1158/1535-7163.Mct-19-1131\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eVanhaesebroeck B, Perry MWD, Brown JR, et al. Pi3k inhibitors are finally coming of age. Nat Rev Drug Discov. 2021;20:741\u0026ndash;69. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41573-021-00209-1\u003c/span\u003e\u003cspan address=\"10.1038/s41573-021-00209-1\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFruman DA, Chiu H, Hopkins BD, et al. The pi3k pathway in human disease. Cell. 2017;170:605\u0026ndash;35. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.cell.2017.07.029\u003c/span\u003e\u003cspan address=\"10.1016/j.cell.2017.07.029\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChu E, Mychasiuk R, Hibbs ML, et al. Dysregulated phosphoinositide 3-kinase signaling in microglia: Shaping chronic neuroinflammation. J Neuroinflamm. 2021;18:276. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1186/s12974-021-02325-6\u003c/span\u003e\u003cspan address=\"10.1186/s12974-021-02325-6\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChalmin F, Humblin E, Ghiringhelli F, et al. 
Transcriptional programs underlying cd4 t cell differentiation and functions. Int Rev cell Mol biology. 2018;341:1\u0026ndash;61. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/bs.ircmb.2018.07.002\u003c/span\u003e\u003cspan address=\"10.1016/bs.ircmb.2018.07.002\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSalas-Perdomo A, Mir\u0026oacute;-Mur F, Urra X, et al. T cells prevent hemorrhagic transformation in ischemic stroke by p-selectin binding. Arterioscler Thromb Vasc Biol. 2018;38:1761\u0026ndash;71. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1161/atvbaha.118.311284\u003c/span\u003e\u003cspan address=\"10.1161/atvbaha.118.311284\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKaya T, Mattugini N, Liu L, et al. Cd8(+) t cells induce interferon-responsive oligodendrocytes and microglia in white matter aging. Nat Neurosci. 2022;25:1446\u0026ndash;57. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41593-022-01183-6\u003c/span\u003e\u003cspan address=\"10.1038/s41593-022-01183-6\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSu B, Ng LG. Immunological modulation in health and disease. Cell Mol Immunol. 2023;20:981\u0026ndash;2. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41423-023-01066-1\u003c/span\u003e\u003cspan address=\"10.1038/s41423-023-01066-1\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWang R, Li H, Ling C, et al. 
A novel phenotype of b cells associated with enhanced phagocytic capability and chemotactic function after ischemic stroke. Neural regeneration Res. 2023;18:2413\u0026ndash;23. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.4103/1673-5374.371365\u003c/span\u003e\u003cspan address=\"10.4103/1673-5374.371365\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhai M, Gong S, Luan P, et al. Extracellular traps from activated vascular smooth muscle cells drive the progression of atherosclerosis. Nat Commun. 2022;13:7500. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41467-022-35330-1\u003c/span\u003e\u003cspan address=\"10.1038/s41467-022-35330-1\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePrass K, Meisel C, H\u0026ouml;flich C, et al. Stroke-induced immunodeficiency promotes spontaneous bacterial infections and is mediated by sympathetic activation reversal by poststroke t helper cell type 1-like immunostimulation. J Exp Med. 2003;198:725\u0026ndash;36. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1084/jem.20021098\u003c/span\u003e\u003cspan address=\"10.1084/jem.20021098\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGister\u0026aring; A, Hansson GK. The immunology of atherosclerosis. Nat Rev Nephrol. 2017;13:368\u0026ndash;80. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/nrneph.2017.51\u003c/span\u003e\u003cspan address=\"10.1038/nrneph.2017.51\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHendrix S, Nitsch R. The role of t helper cells in neuroprotection and regeneration. J Neuroimmunol. 2007;184:100\u0026ndash;12. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.jneuroim.2006.11.019\u003c/span\u003e\u003cspan address=\"10.1016/j.jneuroim.2006.11.019\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWang J, Kang Z, Liu Y, et al. Identification of immune cell infiltration and diagnostic biomarkers in unstable atherosclerotic plaques by integrated bioinformatics analysis and machine learning. Front Immunol. 2022;13:956078. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3389/fimmu.2022.956078\u003c/span\u003e\u003cspan address=\"10.3389/fimmu.2022.956078\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDenorme F, Portier I, Rustad JL, et al. Neutrophil extracellular traps regulate ischemic stroke brain injury. J Clin Investig. 2022;132. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1172/jci154225\u003c/span\u003e\u003cspan address=\"10.1172/jci154225\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eParrella E, Porrini V, Benarese M, et al. The role of mast cells in stroke. Cells. 2019;8. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3390/cells8050437\u003c/span\u003e\u003cspan address=\"10.3390/cells8050437\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLei W, Liu Z, Su Z, et al. Hyperhomocysteinemia potentiates megakaryocyte differentiation and thrombopoiesis via gh-pi3k-akt axis. J Hematol Oncol. 2023;16:84. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1186/s13045-023-01481-x\u003c/span\u003e\u003cspan address=\"10.1186/s13045-023-01481-x\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWang ZK, Xue L, Wang T, et al. Infiltration of invariant natural killer t cells occur and accelerate brain infarction in permanent ischemic stroke in mice. Neurosci Lett. 2016;633:62\u0026ndash;8. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.neulet.2016.09.010\u003c/span\u003e\u003cspan address=\"10.1016/j.neulet.2016.09.010\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKuleshov MV, Jones MR, Rouillard AD, et al. Enrichr: A comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 2016;44:W90\u0026ndash;7. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/nar/gkw377\u003c/span\u003e\u003cspan address=\"10.1093/nar/gkw377\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSheng Z, Ge S, Gao M, et al. Synthesis and biological activity of embelin and its derivatives: An overview. Mini Rev Med Chem. 2020;20:396\u0026ndash;407. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.2174/1389557519666191015202723\u003c/span\u003e\u003cspan address=\"10.2174/1389557519666191015202723\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAnika AR, Virendra SA, et al. Mechanistic study on the possible role of embelin in treating neurodegenerative disorders. CNS Neurol Disord Drug Target. 2024;23:55\u0026ndash;66. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.2174/1871527322666230119100053\u003c/span\u003e\u003cspan address=\"10.2174/1871527322666230119100053\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSiegel C, Li J, Liu F, et al. Mir-23a regulation of x-linked inhibitor of apoptosis (xiap) contributes to sex differences in the response to cerebral ischemia. Proc Natl Acad Sci USA. 2011;108:11662\u0026ndash;7. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1073/pnas.1102635108\u003c/span\u003e\u003cspan address=\"10.1073/pnas.1102635108\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eThippeswamy BS, Nagakannan P, Shivasharan BD, et al. Protective effect of embelin from embelia ribes burm. Against transient global ischemia-induced brain damage in rats. Neurotox Res. 2011;20:379\u0026ndash;86. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s12640-011-9258-7\u003c/span\u003e\u003cspan address=\"10.1007/s12640-011-9258-7\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKamat PK, Rai S, Swarnkar S, et al. 
Molecular and cellular mechanism of okadaic acid (oka)-induced neurotoxicity: A novel tool for alzheimer's disease therapeutic application. Mol Neurobiol. 2014;50:852\u0026ndash;65. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s12035-014-8699-4\u003c/span\u003e\u003cspan address=\"10.1007/s12035-014-8699-4\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTian S, Zhang J, Yuan S et al. Exploring pharmacological active ingredients of traditional chinese medicine by pharmacotranscriptomic map in itcm. Briefings in bioinformatics 2023;24. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/bib/bbad027\u003c/span\u003e\u003cspan address=\"10.1093/bib/bbad027\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTian S, Li Y, Xu J et al. Coimmr: A computational framework to reveal the contribution of herbal ingredients against human cancer via immune microenvironment and metabolic reprogramming. Briefings in bioinformatics 2023;24. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/bib/bbad346\u003c/span\u003e\u003cspan address=\"10.1093/bib/bbad346\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
Posted: March 13, 2025 (Version 1). Status: Under Review at Journal of Big Data.
Editorial timeline: submitted March 9, 2025; checks complete March 10, 2025; editor assigned April 6, 2025; reviewers invited April 11, 2025; revision requested September 22, 2025.