Identification of hub genes contributed to the negative correlation between the incidence of Alzheimer's disease and colorectal cancer via integrated bioinformatics analysis and machine learning

Research Square · Research Article

Wanchang Wang, Qianqian Yang, Menglan Zhang, Yuxuan Xu, Yanhong Yang, and 7 more

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-4806177/v1 This work is licensed under a CC BY 4.0 License. Status: Posted. Version 1.

Abstract

Alzheimer's disease (AD) and colorectal cancer (CRC) are two age-related diseases whose incidence is negatively correlated. In this study, we aimed to identify the hub genes and immune-associated biomarkers contributing to the inverse relationship between AD and CRC.
Gene expression data from public repositories were analyzed with integrated bioinformatics techniques, including differentially expressed gene (DEG) analysis, weighted gene co-expression network analysis (WGCNA), and machine learning algorithms, to screen for hub genes that are inversely expressed in AD and CRC. Immunohistochemistry (IHC) analysis was performed to validate the identified hub genes in cancer tissues from CRC patients or brain tissues from 5×FAD mice. We identified two hub genes for AD (EBNA1BP2 and PPA1) and four hub genes for CRC (CCT4, SLC39A10, RAN, and PPA1), which potentially play critical roles in the negative correlation between AD and CRC and might provide valuable insights for the diagnosis, therapy, and prognosis of AD or CRC. Functional enrichment analysis highlighted the immune system's crucial role in connecting AD and CRC processes. Moreover, the proportions of infiltrating immune cells in brain or colorectal tissues differed in patients with AD or CRC, offering insights for targeted immunotherapies. Finally, the expression of EBNA1BP2, PPA1 and SLC39A10 was validated to be downregulated in AD but upregulated in CRC. In conclusion, these results suggest that some hub genes, such as EBNA1BP2, PPA1 and SLC39A10, might contribute to the inverse relationship between AD and CRC, which lays a foundation for further investigation of the underlying mechanism, as well as for the development of novel diagnostic and therapeutic strategies for these two diseases.

Keywords: Alzheimer's disease, colorectal cancer, bioinformatics analysis, immune cell infiltration

Introduction

Alzheimer's disease (AD) and cancers represent two kinds of age-related diseases that pose significant challenges to global public health.
AD, a progressive neurodegenerative disorder, is characterized by the accumulation of β-amyloid plaques and hyperphosphorylated tau proteins, which form neurofibrillary tangles within the brain [1]. It is estimated that the number of individuals affected by AD will reach 131 million by 2050 [2]. Conversely, cancer is marked by abnormal cellular proliferation, spreading and metastasis within organ tissues under the persistent influence of harmful elements, making it one of the greatest threats to human health [3]. A growing number of epidemiologic studies indicate a negative correlation between the incidence of AD and multiple cancer types, wherein the risk of developing cancer is significantly reduced in AD patients and vice versa [4–8]. However, the underlying mechanisms involved in the inverse relationship between AD and colorectal cancer (CRC) remain unclear. Among the multiple cancer types negatively correlated with AD in incidence, CRC is the most prevalent malignant tumor of the digestive system, with high mortality rates worldwide [9]. According to GLOBOCAN 2020 estimates by the International Agency for Research on Cancer (IARC) under the World Health Organization, there were 1.93 million new CRC cases and 935,200 deaths globally in 2020, ranking third and second among all malignant tumors, respectively [10]. In light of the inverse relationship between AD and CRC, understanding the potential molecular mechanisms underlying this correlation and identifying key biomarkers contributing to their negative association is critical for the development of novel prevention and treatment strategies. There is evidence suggesting that immune regulation may be a factor linking AD and CRC [11]. However, few studies have used machine learning methods to systematically examine the immune-related genes that connect AD and CRC.
In this study, we applied a comprehensive bioinformatics analysis pipeline to identify hub genes involved in the negative relationship between AD and CRC. The gene microarray data were downloaded from the GEO database and analyzed using differential expression analysis, WGCNA, functional enrichment analysis, and machine learning algorithms, including LASSO, Random Forest and SVM-RFE[12]. Furthermore, the immune cell infiltration in CRC and AD was explored using the CIBERSORT algorithm. These results may help identify potential immune-associated diagnostic markers, advance our understanding of the interplay between the two diseases, and provide new perspectives for future diagnostic and therapeutic strategies for AD or CRC.

Materials and methods

Data download

In this study, the RStudio software (version 4.3.3; URL: https://www.r-project.org/) was used to download the GSE5281[13] and GSE132903[14] AD datasets, as well as the GSE113513[15] and GSE74602[16] CRC datasets, from the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/)[17]. All data processing and analyses were conducted in RStudio. During data processing, we performed missing value imputation, data cleaning, and standardization for the AD and CRC datasets. The GSE113513 dataset (GPL15207 platform) comprises 14 pairs of cancerous and matched non-cancerous tissues, while the GSE74602 CRC dataset (GPL6104 platform) consists of 30 CRC samples and 30 healthy control samples. The GSE5281 AD dataset (GPL570 platform) includes a total of 161 samples collected from six brain regions: entorhinal cortex, hippocampus, medial temporal gyrus, posterior cingulate, superior frontal gyrus, and primary visual cortex. For this study, we selected only three regions, the entorhinal cortex (EC), hippocampus (HIP), and posterior cingulate (PC), which comprise 29 AD patients (EC: 10, HIP: 10, PC: 9) and 29 healthy controls (EC: 10, HIP: 10, PC: 9). The GSE132903 dataset (GPL16699 platform) contains 97 AD patient samples and 98 healthy control samples.
A simplified workflow of the current investigation is illustrated in Fig. 1.

Differentially expressed genes (DEGs) analysis

In this study, we employed the 'limma' R package to screen the differentially expressed genes (DEGs) from the AD dataset (GSE5281) and the CRC dataset (GSE113513). DEGs in the GSE5281 dataset were selected with the criteria |log2 (fold change)| ≥ 1.5 and FDR < 0.01. We used the FDR instead of raw p-values to account for multiple hypothesis testing and to control the false discovery rate[18], which reduces the number of false positives and improves the robustness and stability of the identified DEGs. The CRC dataset (GSE113513) was processed and analyzed using the same criteria. The results were visualized using gene clustering heatmaps and volcano plots. A combined analysis of the DEGs from the GSE113513 and GSE5281 datasets was then performed, and Venn diagrams were generated to identify intersecting DEGs. We focused on the down-regulated DEGs in the AD group that overlapped with up-regulated DEGs in the CRC group, and on the up-regulated DEGs in the AD group that overlapped with down-regulated DEGs in the CRC group.

Weighted Gene Co-Expression Network Analysis (WGCNA) and the identification of key module genes

Weighted Gene Co-expression Network Analysis (WGCNA) is one of the most widely applied systems biology methods for describing the correlation patterns among genes across microarray samples. Genes can be grouped into modules based on their co-expression similarities across samples using the "WGCNA" R package[19]. Additionally, WGCNA can be used to relate modules to external clinical traits. In this way, relevant functional networks can be used to identify biomarkers and new molecules.
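The DEG thresholds and the inverse-overlap step from the DEG analysis subsection above can be illustrated with a small sketch. It assumes that per-gene log2 fold changes and raw p-values are already available (for example, exported from limma), applies a Benjamini-Hochberg adjustment as one common FDR method, and uses hypothetical gene names and values; it does not reproduce limma's moderated statistics.

```python
from typing import Dict, List, Set, Tuple

# gene -> (log2 fold change, raw p-value); assumed to come from an upstream test
GeneStats = Dict[str, Tuple[float, float]]


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values, in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values never decrease with rank.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def split_degs(stats: GeneStats, lfc_cut: float = 1.5,
               fdr_cut: float = 0.01) -> Tuple[Set[str], Set[str]]:
    """Return (up-regulated, down-regulated) genes passing both cutoffs."""
    genes = list(stats)
    fdr = benjamini_hochberg([stats[g][1] for g in genes])
    up: Set[str] = set()
    down: Set[str] = set()
    for gene, q in zip(genes, fdr):
        lfc = stats[gene][0]
        if q < fdr_cut and abs(lfc) >= lfc_cut:
            (up if lfc > 0 else down).add(gene)
    return up, down


def inverse_overlap(ad: Tuple[Set[str], Set[str]],
                    crc: Tuple[Set[str], Set[str]]) -> Dict[str, Set[str]]:
    """Keep only genes regulated in opposite directions in AD and CRC."""
    ad_up, ad_down = ad
    crc_up, crc_down = crc
    return {
        "AD_up_CRC_down": ad_up & crc_down,
        "AD_down_CRC_up": ad_down & crc_up,
    }


# Hypothetical toy statistics, for illustration only.
ad_stats: GeneStats = {
    "GENE_A": (-2.0, 1e-6), "GENE_B": (1.8, 1e-5),
    "GENE_C": (0.3, 0.5), "GENE_D": (-1.7, 2e-4),
}
crc_stats: GeneStats = {
    "GENE_A": (2.5, 1e-7), "GENE_B": (-1.9, 3e-6),
    "GENE_C": (2.0, 0.2), "GENE_D": (1.0, 1e-6),
}
inverse_genes = inverse_overlap(split_degs(ad_stats), split_degs(crc_stats))
print(inverse_genes)
```

In this toy input, GENE_D is significant in both conditions but misses the fold-change cutoff in CRC, so it is excluded from the inverse set.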
Normalized mRNA expression data (processed using the R package "limma") were used as input to WGCNA to identify gene co-expression modules and the correlation between gene modules and clinical characteristics (AD or CRC compared to control groups). For each disease group, the following steps were followed:
(1) Using the R package "gplots," hierarchical clustering analysis was performed to identify outlier samples[20].
(2) The "pickSoftThreshold" function was used to screen soft-thresholding powers ranging from 1 to 20[20].
(3) Using the most appropriate soft-thresholding power β, the correlation matrix was converted into an adjacency matrix and then into a topological overlap matrix (TOM).
(4) A hierarchical clustering tree was constructed by average linkage hierarchical clustering, and the dynamic tree cut algorithm (minModuleSize = 30) was then used to identify gene modules. Similar modules were merged at a cut height in each group.
(5) Gene modules and clinical phenotypes (CTRL and AD or CRC) were correlated using the Pearson correlation coefficient.

Functional enrichment analysis

To explore the biological functions and mechanisms of the DEGs involved in the negative correlation between AD and CRC, we carried out Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses by importing the genes into the DAVID database (https://david.ncifcrf.gov/)[21]. A threshold of p < 0.05 was regarded as significant enrichment. The results of the functional enrichment analysis were displayed as bubble diagrams and circos plots.

Machine learning

To screen the vital genes in AD or CRC, three well-established machine learning algorithms (LASSO: Least Absolute Shrinkage and Selection Operator; SVM-RFE: Support Vector Machine-Recursive Feature Elimination; RF: Random Forest) were utilized.
First, the 20 shared genes obtained previously were input into the LASSO algorithm separately for each disease group[22] to identify the relevant genes for each condition. Subsequently, we employed SVM-RFE to refine the feature selection. Using the "e1071" and "caret" packages for SVM modeling, SVM-RFE eliminated features in a backward selection process to identify the optimal hub genes, which were then incorporated into the SVM model for each disease group. The outcome of SVM-RFE was visualized as a plot, with the Root Mean Square Error (RMSE) as the evaluation metric in tenfold cross-validation[23]. This plot illustrates the relationship between the number of variables and the RMSE, indicating the optimal number of genes for maximum diagnostic performance. Finally, we applied Random Forest using the R package "randomForest" to identify the most important variables using an ensemble of decision trees[24]. After constructing a random forest model with 500 trees on the discovery cohorts and determining the optimal number of trees from the cross-validation errors, we ranked the genes by importance and selected the top 10 genes. The final result was obtained by taking the intersection of the results from the three algorithms.

Validation of the Expression of Hub Genes

All the identified hub genes were further validated in GSE132903 and GSE89076 to reduce false positives. The comparisons between the AD and healthy control groups, and between the CRC and normal control groups, were performed with the t-test. P < 0.05 was considered a significant difference between the groups.
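The final selection step of the Machine learning section, taking the top-ranked Random Forest genes and intersecting the three algorithms' outputs, can be sketched as follows. The gene names, importance scores, and the LASSO and SVM-RFE selections below are hypothetical placeholders; the sketch does not fit any model and only illustrates the ranking and intersection logic.

```python
from typing import Dict, Iterable, List, Set


def top_k(importance: Dict[str, float], k: int) -> Set[str]:
    """Return the k genes with the highest importance score."""
    ranked = sorted(importance, key=lambda gene: importance[gene], reverse=True)
    return set(ranked[:k])


def consensus(*selections: Iterable[str]) -> List[str]:
    """Return the genes chosen by every selector, sorted by name."""
    sets = [set(selection) for selection in selections]
    return sorted(set.intersection(*sets))


# Hypothetical selector outputs (not the study's actual gene lists).
lasso_selected = {"GENE_1", "GENE_2", "GENE_4", "GENE_5"}
rf_importance = {
    "GENE_1": 0.30, "GENE_2": 0.25, "GENE_3": 0.20,
    "GENE_4": 0.15, "GENE_5": 0.10,
}
rf_selected = top_k(rf_importance, k=3)  # the study keeps the top 10 of 20
svm_rfe_selected = {"GENE_1", "GENE_2", "GENE_3", "GENE_5"}

hub_candidates = consensus(lasso_selected, rf_selected, svm_rfe_selected)
print(hub_candidates)
```

Requiring agreement among all three selectors is a conservative choice: a gene favored by only one algorithm is dropped, which trades sensitivity for robustness.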
Receiver Operating Characteristic (ROC) Curves of the Hub Genes

To evaluate how well the identified critical genes predict disease, the accuracy of the hub genes was assessed by ROC analysis, and the area under the curve (AUC) values were calculated with an online tool (https://www.xiantao.love/products/). Efficacy evaluation: non-efficiency (AUC ≦ 0.5); modest-efficiency (0.5 < AUC ≦ 0.7).

Immune infiltration analysis

The "CIBERSORT" package was used to estimate immune cell infiltration from the CRC gene expression profile[25]. The abundance and proportion of infiltrating immune cells were presented for each sample as a bar plot using the "ggplot2" package. The differences in the proportions of 22 types of immune cells between CRC colorectal samples and control colorectal samples were compared using the Wilcoxon test, where P < 0.05 was regarded as statistically significant, and were displayed as a stacked histogram using the "ggplot2" package. Subsequently, the associations among the 22 types of infiltrating immune cells were shown using the "corrplot" package. Finally, Spearman's rank correlation coefficient was used for the correlation analysis between the expression of diagnostic biomarkers and the content of infiltrating immune cells, and P < 0.05 was considered statistically significant.

Sample collection of patients with CRC and AD mouse model

Paired tumor and adjacent non-tumor colorectal tissue samples were obtained from six CRC patients undergoing surgery at the First Hospital of Hebei Medical University, Hebei, China, yielding colorectal cancer tissues (n = 6) and matched adjacent non-tumor tissues (n = 6). Exclusion criteria were previous malignant tumors, hereditary colorectal cancer, or familial adenomatous polyposis. The patients' clinical information is provided in Table 1.
Approval for the study was granted by the Clinical Research Ethics Committee of the First Hospital of Hebei Medical University (No. 2023-00043). Experimental animals, including 3-month-old 5×FAD mice and their corresponding wild-type (WT) mice, were obtained from Shanghai Southern Model Biological Technology Co., Ltd. All mice were housed in an SPF-grade animal facility at Hebei Medical University under consistent conditions, including a regular light-dark cycle, controlled temperature (22–24℃), and humidity (50–60%). Mice were housed five to six per cage with ad libitum access to food and water. All experimental procedures involving animals were conducted in accordance with the ethical guidelines and approved by the Ethics Committee of the First Hospital of Hebei Medical University (No. 2023-00043).

Immunohistochemistry

For the IHC staining procedure, 3 µm thick sections were obtained from formalin-fixed paraffin-embedded tissues and placed onto poly-L-lysine-coated slides. The primary antibodies used and their respective techniques are detailed in Table 2, following the manufacturer's guidelines. Antigen retrieval was performed using the pressure cooker method, and diaminobenzidine (DAB) was employed as the chromogen. Quantification of IHC images was carried out using ImageJ software and the IHC Profiler plugin[26]. The software evaluates the average optical density (staining intensity) and the percentage of positively stained area (staining area) for each image, resulting in four levels of scoring: High positive (3+), Positive (2+), Low Positive (1+), and Negative (0).

Statistical analysis

The R software 3.6.5 and GraphPad Prism version 8.0.1 (GraphPad Software Inc., San Diego, CA, USA) were used for statistical analyses and visualization. For differences in gene expression levels or immunocyte fractions between clinical groups, a two-sided Wilcoxon test was performed. Correlation analysis was conducted using the Spearman test.
The p-value was adjusted by the FDR method for multiple hypothesis testing. Dichotomous variables were compared using the chi-square test. The unpaired Student's t-test was used to compare differences between two groups. P < 0.05 was regarded as statistically significant.

Results

Identification of potential key genes involved in the negative correlation between the incidence of AD and CRC

Using the limma method, we identified a total of 3821 DEGs in the AD dataset, including 2089 upregulated and 1732 downregulated genes. The heatmap and volcano plot of DEGs in AD are shown in Fig. 2 A and C. In the CRC dataset, we screened out 3274 DEGs, consisting of 1594 upregulated and 1680 downregulated genes (Fig. 2 B, D). Subsequently, we identified 144 genes at the intersection of upregulated DEGs in the AD group and downregulated DEGs in the CRC group, as well as 194 genes at the intersection of downregulated DEGs in the AD group and upregulated DEGs in the CRC group, which are potentially involved in the negative correlation between the incidence of AD and CRC (Fig. 2 E, F).

Weighted Gene Co-Expression Network Analysis and Key Module Identification

To investigate the correlation between the identified key genes and AD or CRC, we performed Weighted Gene Co-expression Network Analysis (WGCNA) in addition to analyzing the differential expression between the two groups. A co-expression network was constructed using the soft-thresholding approach, in which the power parameter β is chosen so that the network approximates a scale-free topology, as gene expression-based biological networks are most likely to be scale-free. Accordingly, in the AD group, a fitting index greater than 0.88 was taken to indicate scale-free topology, and β was set at 5 (Fig. 3 A). The adjacency matrix was generated using the adjacency function. As shown in Fig. 3 C, the hierarchical clustering tree was constructed using the TOM dissimilarity measure.
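The adjacency and TOM dissimilarity steps mentioned above can be sketched in a few lines. This is a minimal illustration of the standard unsigned WGCNA definitions, a_ij = |cor(x_i, x_j)|^β and TOM_ij = (Σ_u a_iu a_uj + a_ij) / (min(k_i, k_j) + 1 − a_ij), written in plain Python rather than with the WGCNA R package; the expression values are hypothetical, and whether the study used an unsigned or signed network is not stated, so the unsigned form is an assumption.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation, clamped to [-1, 1] against rounding error."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return max(-1.0, min(1.0, cov / (sd_x * sd_y)))


def adjacency(expr: Matrix, beta: float) -> Matrix:
    """Unsigned adjacency |cor|**beta; the diagonal is left at 0."""
    n = len(expr)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            value = abs(pearson(expr[i], expr[j])) ** beta
            adj[i][j] = adj[j][i] = value
    return adj


def tom_similarity(adj: Matrix) -> Matrix:
    """Topological overlap: shared neighbours plus the direct connection."""
    n = len(adj)
    connectivity = [sum(row) for row in adj]  # diagonal is 0, so self is excluded
    tom = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = sum(adj[i][u] * adj[u][j]
                         for u in range(n) if u != i and u != j)
            denom = min(connectivity[i], connectivity[j]) + 1.0 - adj[i][j]
            tom[i][j] = tom[j][i] = (shared + adj[i][j]) / denom
    return tom


# Hypothetical expression matrix: rows are genes, columns are samples.
expr = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],
    [5.0, 3.0, 4.0, 1.0, 2.0],
]
adj = adjacency(expr, beta=5)          # beta = 5 as reported for the AD group
tom = tom_similarity(adj)
dissimilarity = [[1.0 - v for v in row] for row in tom]  # input to clustering
```

The dissimilarity matrix (1 − TOM) is what average-linkage hierarchical clustering operates on before modules are cut from the tree.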
A total of nine co-expression modules were identified, and the modules with a P-value < 0.01 were regarded as the key modules. In the AD group, the green yellow, brown, black, and green modules showed a strong positive correlation with disease, while the blue and red modules showed a strong negative correlation with disease (Fig. 3 E). WGCNA was also applied to the CRC group, with β = 8 identified as the optimal soft-thresholding power (Fig. 3 B). In total, 22 modules were identified, of which the brown and saddle brown modules showed a strong positive correlation with disease, and the ivory, dark orange, green yellow, green, and medium purple 3 modules showed a strong negative correlation with disease (Fig. 3 D, F). Within the key modules identified in the AD and CRC groups, we selected hub genes using the criterion of an absolute module membership (|MM|) greater than 0.8. Subsequently, the intersection of genes from modules positively correlated with AD and modules negatively correlated with CRC yielded 15 candidate genes, while the intersection of genes from modules negatively correlated with AD and modules positively correlated with CRC yielded 28 genes (Fig. 3 G). These results suggest that the key modules identified above might play critical roles in the inverse relationship between AD and CRC.

Functional enrichment analysis of the differentially expressed genes that are potentially involved in the negative correlation between AD and CRC

To reveal the hub genes contributing to the negative correlation between AD and CRC, the intersection of the DEGs and the genes identified in the WGCNA modules was analyzed using Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses. Figure 4 A illustrates the overlap of 20 key genes, which represent potential candidates warranting further investigation in the context of AD and CRC associations.
The GO analysis identified several significantly enriched Biological Process terms, such as "mononuclear cell proliferation," "leukocyte proliferation," and "defense response to tumor cell," indicating that the immune response may play a crucial role in the negative correlation between AD and CRC (Fig. 4 B). Moreover, the analysis highlighted other relevant terms, including "cellular nitrogen compound biosynthetic process," "positive regulation of cellular metabolic process," "nucleotide binding," "leukocyte activation," "protein localization to nucleus," and "lymphocyte proliferation." The KEGG pathway analysis identified pathways such as "Aminoacyl-tRNA biosynthesis," "Ribosome biogenesis in eukaryotes," "Purine metabolism," "Oxidative phosphorylation," "RNA transport," "Collecting duct acid secretion," "Rheumatoid arthritis," "Phagosome," "Human T-cell leukemia virus 1 infection," and "Metabolic pathways" (Fig. 4 C). These findings shed light on the potential molecular mechanisms and pathways that underlie the differences between AD and CRC gene expression profiles, laying the groundwork for future research into potential therapeutic targets and drug development strategies.

Identification of candidate hub genes via machine learning

To further identify the most promising candidate diagnostic genes with significant discriminative value between disease and control groups, three machine learning algorithms (LASSO, Random Forest and SVM-RFE) were applied to the 20 genes identified in our previous analysis. In the AD group, we selected lambda.1se = 0.101163531730007 for the LASSO analysis, resulting in 7 genes with non-zero coefficients (Fig. 5 A, B). Using the Random Forest algorithm, we assessed the importance of all 20 genes and selected the 10 most important genes (Fig. 5 C, D). The SVM-RFE method identified 8 genes based on the lowest RMSE (Fig. 5 E).
By overlapping the results from the three algorithms, 6 shared biomarkers were identified in the AD group: ATP8B1, ENC1, GARS, CDK5, EBNA1BP2, and PPA1 (Fig. 5 F). Similarly, in the CRC group, we chose lambda.1se = 0.109850004658687 for the LASSO analysis, yielding 9 genes with non-zero coefficients (Fig. 5 G, H). The Random Forest algorithm evaluated the importance of all 20 genes, and the 10 most important genes were selected (Fig. 5 I, J). The SVM-RFE method found 8 genes based on the lowest RMSE (Fig. 5 K). Overlapping the results revealed 6 shared biomarkers in the CRC group: ENC1, CCT4, MRPL3, SLC39A10, RAN, and PPA1 (Fig. 5 L). These genes are the most promising candidate biomarkers for diagnosis and classification in the AD and CRC groups, paving the way for future investigations into potential therapeutic targets and drug development strategies.

Identification and validation of the diagnostic biomarkers for AD or CRC

To verify the reliability of the identified hub genes for AD and CRC, we assessed their expression in the discovery datasets (GSE5281 for AD and GSE113513 for CRC) as well as the validation datasets (GSE132903 for AD and GSE89076 for CRC). Significant differences were observed relative to the corresponding control groups. In the AD dataset GSE5281, the hub genes CCT4, GARS, SLC39A10, EBNA1BP2, RAN, and PPA1 were significantly downregulated (Fig. 6 A). In the AD validation dataset GSE132903, all genes were downregulated except ATP8B1 (Fig. 6 B). In contrast, all genes were upregulated in the CRC datasets GSE113513 and GSE89076 except ATP8B1 (Fig. 6 C, D). Based on these findings, the AD hub genes were determined to be GARS, EBNA1BP2, and PPA1, while the CRC hub genes were CCT4, SLC39A10, RAN, and PPA1.
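The AUC values reported in this section summarize how well a single gene's expression separates cases from controls. As a minimal sketch, the AUC can be computed as the probability that a randomly chosen case ranks above a randomly chosen control (ties count as one half). The expression values below are hypothetical, the `higher_in_cases` flag is an illustrative way of handling genes that are downregulated in disease, and this is not the implementation of the online tool used in the study.

```python
from typing import Sequence


def roc_auc(case_values: Sequence[float], control_values: Sequence[float],
            higher_in_cases: bool = True) -> float:
    """Rank-based AUC: P(case > control) + 0.5 * P(case == control)."""
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    auc = wins / (len(case_values) * len(control_values))
    # For a gene that is lower in cases, score the opposite direction.
    return auc if higher_in_cases else 1.0 - auc


def is_candidate(auc: float, cutoff: float = 0.7) -> bool:
    """Apply the AUC > 0.7 rule used for potential diagnostic biomarkers."""
    return auc > cutoff


# Hypothetical expression values for one gene that is lower in disease.
cases = [5.1, 5.4, 5.0, 6.2]
controls = [6.0, 6.5, 5.9, 6.8]
auc_value = roc_auc(cases, controls, higher_in_cases=False)
print(round(auc_value, 3), is_candidate(auc_value))
```

An AUC of 1.000 on a small discovery cohort, as seen for several CRC genes, means that cases and controls do not overlap at all for that gene in that dataset, which is why the validation cohort values are the more informative ones.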
Subsequently, ROC curves were plotted for these hub genes using their expression data from the GSE5281, GSE132903, GSE113513, and GSE89076 datasets. For AD in the GSE5281 dataset, GARS (AUC = 0.668), EBNA1BP2 (AUC = 0.733), and PPA1 (AUC = 0.715) demonstrated potential utility as biomarkers (Fig. 6 E). In the GSE132903 dataset, GARS (AUC = 0.702), EBNA1BP2 (AUC = 0.740), and PPA1 (AUC = 0.719) displayed potential utility as biomarkers (Fig. 6 F). For CRC in the GSE113513 dataset, CCT4 (AUC = 1.000), SLC39A10 (AUC = 1.000), RAN (AUC = 1.000), and PPA1 (AUC = 0.995) showed high AUC scores (Fig. 6 G). Finally, in the GSE89076 dataset, CCT4 (AUC = 0.942), SLC39A10 (AUC = 1.000), RAN (AUC = 0.928), and PPA1 (AUC = 0.957) exhibited similar potential as biomarkers (Fig. 6 H). Ultimately, genes with AUC > 0.7 in both datasets were deemed potential diagnostic biomarkers, namely EBNA1BP2 and PPA1 for AD, and CCT4, SLC39A10, RAN, and PPA1 for CRC.

Correlations between the hub genes and the immune cell infiltration in CRC or AD

The functional and pathway analysis of the DEGs potentially involved in the negative correlation between AD and CRC indicates a close link with inflammatory and immune processes. To further explore the role of the immune system in AD and CRC, the CIBERSORT algorithm was used to characterize immune cells, investigate immune regulation, and examine the relationship between diagnostic biomarkers and immune cell infiltration. Among the 22 types of immune cells in each sample, the relative proportions of 9 immune cell subpopulations in the AD group were significantly different from those in control samples (Fig. 7 A). Relative to the control group, AD patients displayed increased proportions of B cells naive, T cells CD4 memory resting, Macrophages M0, and Macrophages M1, and decreased proportions of Plasma cells, T cells gamma delta, Macrophages M2, and Dendritic cells activated (Fig. 7 B).
The correlation analysis of these 22 immune cell types showed that T cells CD4 memory activated were positively correlated with NK cells resting (r = 0.39, p < 0.001), and Macrophages M2 were positively associated with Eosinophils (r = 0.36, p < 0.01). Conversely, B cells naive had a negative association with B cells memory (r = -0.64, p < 0.001), and T cells CD4 naive were negatively correlated with T cells CD4 memory resting (r = -0.52, p < 0.001) (Fig. 7 C). Furthermore, an investigation of the relationship between the expression of the 2 hub genes and the proportions of differentially infiltrated immune cell types revealed limited associations between the hub genes EBNA1BP2 and PPA1 and immune cells in AD (Fig. 7 D). The proportions of the 22 immune cell types in each CRC and control colorectal sample are shown in Fig. 8 A, with significant differences in 9 immune cell subpopulations between the two groups. Relative to the control group, CRC patients demonstrated higher proportions of Plasma cells, T cells CD4 memory activated, T cells gamma delta, Macrophages M0, Macrophages M2, Dendritic cells activated, Mast cells resting, Mast cells activated, and Neutrophils, and decreased proportions of Plasma cells, T cells gamma delta, Macrophages M2, and Dendritic cells activated (Fig. 8 B). In addition, the correlation analysis of these 22 immune cell types in CRC revealed that NK cells resting were positively correlated with Macrophages M0 (r = 0.39, p < 0.001), while B cells naive had a positive association with T cells CD4 naive (r = 0.36, p < 0.01). In contrast, T cells follicular helper were negatively correlated with Mast cells activated (r = -0.64, p < 0.001), and Monocytes had a negative association with Macrophages M2 (r = -0.52, p < 0.001).
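The Methods name Spearman's rank correlation for relating hub-gene expression to immune cell fractions. A minimal sketch of that coefficient, computed as the Pearson correlation of average ranks so that ties are handled, is shown here with hypothetical expression values and cell fractions; it returns only the coefficient and does not compute the accompanying p-value.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho as the Pearson correlation of the rank vectors."""
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    n = len(rank_x)
    mean_x, mean_y = sum(rank_x) / n, sum(rank_y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rank_x, rank_y))
    var_x = sum((a - mean_x) ** 2 for a in rank_x)
    var_y = sum((b - mean_y) ** 2 for b in rank_y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-sample values: hub-gene expression and one cell fraction.
gene_expression = [7.2, 6.8, 8.1, 7.9, 6.5]
m0_macrophage_fraction = [0.12, 0.15, 0.22, 0.18, 0.09]
rho = spearman(gene_expression, m0_macrophage_fraction)
print(round(rho, 2))
```

Because the coefficient depends only on ranks, it is insensitive to the scale of the CIBERSORT fractions and to monotone transformations of the expression values.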
The examination of the association between the expression of the 4 hub genes and the proportions of differentially infiltrated immune cell types indicated that all hub genes (CCT4, SLC39A10, RAN, and PPA1) exhibited a significant correlation with immune cell accumulation in CRC (Fig. 8 D). The comparisons of immune cell infiltration revealed less infiltration in the brains of AD patients than in the intestines of CRC patients. This disparity might be attributed to the blood-brain barrier and the unique immune environments in AD or CRC. Understanding these differences is essential for comprehending the immune characteristics of the diseases and provides crucial information for targeted immunotherapies for AD or CRC.

Validation of hub genes by immunohistochemical analysis in hippocampal tissues of AD mouse and colorectal tissues of CRC patients

Immunohistochemical analysis was conducted to investigate the expression levels of hub genes in mouse hippocampal tissues and human colorectal tissues. The representative images of immunohistochemical staining for EBNA1BP2, PPA1, CCT4, SLC39A10, and RAN in the hippocampus of 5×FAD mice and WT mice are shown in Fig. 9 A. The quantification results demonstrated a significant decrease in the expression of EBNA1BP2, PPA1, and SLC39A10 in the hippocampal tissues of 5×FAD mice compared to WT mice, whereas no differences were observed for CCT4 and RAN (Fig. 9 B). Furthermore, the representative images of immunohistochemical staining for EBNA1BP2, PPA1, CCT4, SLC39A10, and RAN in cancerous and adjacent non-cancerous tissues are presented in Fig. 9 C. Quantification results depicted in Fig. 9 D indicate significantly increased expression of EBNA1BP2, PPA1, and SLC39A10 as well as significantly reduced expression of RAN in cancerous tissues compared to adjacent non-cancerous tissues. However, the expression trend for RAN is not consistent with our previous findings.
These results confirmed that the hub genes EBNA1BP2, PPA1, and SLC39A10 might be potential targets for further investigation.

Discussion

In recent years, an increasing body of epidemiological evidence has suggested an inverse biological relationship between Alzheimer's disease (AD) and various types of cancer[4–8]. Despite some progress in understanding this relationship, mechanistic studies exploring its basis remain limited. A deeper understanding of the molecular mechanisms underlying the negative correlation between AD and cancer, including the molecular pathways that might account for the negative association between AD and different cancer types, is essential for further investigation. Advances in biomedical technologies, extensive omics datasets, and powerful bioinformatics tools have now enabled a more comprehensive and large-scale examination of the complexity of these age-related diseases. In this study, a multifaceted bioinformatics approach was used to identify the hub genes contributing to the inverse relationship between AD and CRC. By integrating gene expression data from publicly available repositories and employing a range of bioinformatics techniques, including differential gene expression analysis, weighted gene co-expression network analysis (WGCNA), and machine learning algorithms, the key genes and immune-associated biomarkers were investigated in AD and CRC. Through our comprehensive analysis, we discovered two hallmark genes (EBNA1BP2 and PPA1) for AD and four pivotal genes (CCT4, SLC39A10, RAN, and PPA1) for CRC, shedding light on the intricate interconnections between these two age-related diseases. Furthermore, our validation by immunohistochemical analysis confirmed that three hub genes, EBNA1BP2, PPA1, and SLC39A10, hold substantial potential for further investigation and provide valuable insights into their roles in the inverse relationship between AD and CRC.
In AD, an imbalanced immune response could aggravate neuroinflammation and lead to cognitive decline[27]. Comparing immune cell infiltration between AD and CRC reveals that the brains of AD patients show lower infiltration than the colons of CRC patients. This difference might be partly due to the blood-brain barrier (BBB) and the unique immune environments of the two diseases[28]. However, the influence of the BBB on this discrepancy warrants further investigation. The BBB helps maintain brain homeostasis by protecting it from the systemic immune system and inflammation. In AD, a disrupted BBB could result in increased immune cell infiltration, intensifying neuroinflammation and contributing to cognitive decline. In contrast, in CRC, an appropriate level of immune cell infiltration in the intestinal immune environment could play a protective role in maintaining gut homeostasis and preventing tumor development[29]. This balance suggests that immune cells might have differential effects on disease progression in different tissue contexts. In the brain, excessive immune cell infiltration could exacerbate neurodegeneration and cognitive impairment, while in the gut, a balanced infiltration might be beneficial for CRC prevention.

The PPA1 gene, which encodes inorganic pyrophosphatase (PPA1), plays a crucial role in cellular energy metabolism, particularly in the context of phosphate metabolism. This enzyme catalyzes the conversion of inorganic pyrophosphate (PPi) to inorganic phosphate (Pi), providing an alternative energy source to ATP. Although no direct evidence currently links PPA1 to Alzheimer's disease (AD), its potential involvement in mechanisms that intersect with AD pathophysiology warrants exploration. For instance, the dysregulation of PPA1 could indirectly impact neuronal metabolic states, given the brain's high energy demands and the known energy metabolism disruptions in AD[30].
Furthermore, PPA1's participation in cellular stress responses, including those to oxidative and metabolic stress, positions it as a potential factor in the neuronal damage and death observed in AD, where oxidative stress is notably elevated[30]. In addition to its metabolic and stress response roles, PPA1's influence on cell signaling pathways such as JNK/p53, Wnt/β-catenin, and PI3K/AKT/GSK-3β, which are implicated in cell proliferation and migration, also suggests a connection to the neuroinflammatory processes and neuronal death associated with AD[30]. Moreover, PPA1's modulation of apoptosis pathways could indirectly affect neuronal survival in AD, where apoptosis is a significant contributor to neurodegeneration[31].

In oncology, PPA1's overexpression in various cancers, including colorectal cancer (CRC), and its involvement in tumorigenesis through the aforementioned signaling pathways indicate a role in cancer cell dynamics and an association with patient prognosis[30]. The correlation between PPA1 overexpression and reduced survival rates, along with its impact on tumor cell proliferation and invasiveness, highlights PPA1 as a potential therapeutic target in CRC. Recent studies have also unveiled the complex interplay of genomic alterations and epigenetic modifications in CRC pathogenesis, further emphasizing the multifaceted role of PPA1 in disease processes[32].

EBNA1BP2 is involved in rRNA processing, which is crucial for maintaining cellular functions because it affects protein synthesis. In AD, abnormal protein synthesis contributes to neurodegeneration[33], while in CRC, dysregulation of protein synthesis is linked to uncontrolled cell proliferation[34–35].

CCT4 plays a vital role in protein folding and quality control, which is essential for neuronal cell health[36].
In AD, the accumulation of misfolded proteins, such as β-amyloid and tau, contributes to neurodegeneration[37]. Moreover, CCT4 might play a part in the folding and stability of neurotransmitter receptors, which are crucial for neuronal signaling. Imbalances in the neurotransmitter system in AD could be linked to CCT4's function[38]. In CRC, although a direct link has not been established, previous research implies that CCT4 has a role in other cancer types[39] that share certain mechanisms with CRC. These shared mechanisms include protein folding, quality control, and regulation of the cell cycle, apoptosis, and tumor microenvironment signaling pathways[40].

SLC39A10, a zinc ion transporter belonging to the SLC39 family, plays a key role in regulating cellular zinc concentrations[41]. Although no direct evidence links SLC39A10 to AD, the critical role of zinc ions in the nervous system[42] suggests that SLC39A10 might participate in pathways relevant to AD. In CRC, although a direct link is yet to be established, the known involvement of SLC39A10 in other cancer types may offer clues to its potential association with CRC. For example, SLC39A10 is upregulated in gastric cancer, where it correlates with poor patient prognosis[43], and in liver cancer, where it is proposed to promote tumor invasiveness[44].

The major strengths of our study include the integrative bioinformatics approach, the application of machine learning techniques, and the experimental validation of hub genes through IHC analysis. However, the study has some limitations. First, expanding the experimental validation and the number of samples used for IHC analysis could enhance the robustness and generalizability of the findings.
Second, although the two datasets used were extensive, incorporating more diverse datasets might improve the validity and applicability of our findings to different populations and conditions.

In conclusion, our study contributes to a better understanding of the molecular basis linking AD and CRC, providing insights into the pathophysiology of these diseases. Our findings may facilitate the development of novel diagnostic biomarkers and targeted therapeutics for AD and CRC, ultimately benefiting patients suffering from these devastating conditions. Further investigations should be undertaken to validate the functional roles of these hub genes and to uncover additional molecular mechanisms.

Conclusion

In this study, we identified three hub genes (EBNA1BP2, PPA1, and SLC39A10) that may contribute to the inverse relationship between Alzheimer's disease (AD) and colorectal cancer (CRC). We also highlighted the significant role of the immune system in connecting AD and CRC processes and described the distinct immune cell infiltration levels in these diseases. Our findings lay the foundation for developing novel diagnostic and therapeutic strategies for AD and CRC.

Declarations

Ethics approval and consent to participate

The study, encompassing both human participants and animal experiments, was conducted under strict ethical guidelines. Approval for the research involving human subjects was obtained from the Clinical Research Ethics Committee of the First Hospital of Hebei Medical University (approval number 2023-00043). Informed consent was obtained from all individual participants before their inclusion in the study. The animal experiments were carried out following the ethical standards set by the same institution (approval number 20200683).

Consent for publication

Not applicable.
Availability of data and materials

The gene expression data used in this study were obtained from the Gene Expression Omnibus (GEO) database. The specific dataset identifiers will be provided in the manuscript upon acceptance.

Competing interests

The authors declare that they have no competing interests.

Funding

This work was supported by the Hebei Provincial Natural Science Foundation (Grant Nos. H2024206245 and H2024206280), Key Projects of Hebei Provincial Administration of Traditional Chinese Medicine (Grant No. Z2022015), Visiting Professor Collaboration Project (Grant No. 2024KZ03), and Outstanding Youth Science Foundation of XingHuo Project (Grant No. XH202401) of the First Hospital of Hebei Medical University.

Authors' contributions

Wanchang Wang performed the bioinformatics analysis, contributed to drafting the manuscript, and participated in interpretation of data. Qianqian Yang assisted in manuscript preparation, providing valuable input to the text. Menglan Zhang and Yuxuan Xu carried out the IHC experiments and analyzed the results. Yanhong Yang supplied the CRC patient samples, ensuring the quality of the specimens used in the study. Siyu Jiang, Lu Zhao, and Bingxin Li contributed to the experimental work and data acquisition. Zhaoyu Gao and Na Zhao provided guidance on methodology and experimental design and contributed to manuscript editing. Rui Zhang and Shunjiang Xu, as the principal investigators, supervised the study and provided critical input on the manuscript, ensuring it adhered to high scientific standards. All authors reviewed and approved the final manuscript.

Acknowledgements

Not applicable.

Authors' information

Not applicable.

References

1. Franzmeier, N, Dewenter, A, Frontzkowski, L, et al. Patient-centered connectivity-based prediction of tau pathology spread in Alzheimer's disease. Sci Adv. 2020; 6(48). doi: 10.1126/sciadv.abd1327
2. Estimation of the global prevalence of dementia in 2019 and forecasted prevalence in 2050: an analysis for the Global Burden of Disease Study 2019. Lancet Public Health. 2022; 7(2): e105-e125. doi: 10.1016/S2468-2667(21)00249-8
3. Siegel, RL, Miller, KD, Fuchs, HE, et al. Cancer statistics, 2022. CA Cancer J Clin. 2022; 72(1): 7-33. doi: 10.3322/caac.21708
4. Yarchoan, M, James, BD, Shah, RC, et al. Association of Cancer History with Alzheimer's Disease Dementia and Neuropathology. J Alzheimers Dis. 2017; 56(2): 699-706. doi: 10.3233/JAD-160977
5. Shafi, O. Inverse relationship between Alzheimer's disease and cancer, and other factors contributing to Alzheimer's disease: a systematic review. BMC Neurol. 2016; 16(1): 236. doi: 10.1186/s12883-016-0765-2
6. Dong, Z, Xu, M, Sun, X, et al. Mendelian randomization and transcriptomic analysis reveal an inverse causal relationship between Alzheimer's disease and cancer. J Transl Med. 2023; 21(1): 527. doi: 10.1186/s12967-023-04357-3
7. Karanth, SD, Katsumata, Y, Nelson, PT, et al. Cancer diagnosis is associated with a lower burden of dementia and less Alzheimer's-type neuropathology. Brain. 2022; 145(7): 2518-2527. doi: 10.1093/brain/awac035
8. Ospina-Romero, M, Glymour, MM, Hayes-Larson, E, et al. Association Between Alzheimer Disease and Cancer With Evaluation of Study Biases: A Systematic Review and Meta-analysis. JAMA Netw Open. 2020; 3(11): e2025515. doi: 10.1001/jamanetworkopen.2020.25515
9. Kazemi, E, Zayeri, F, Baghestani, A, et al. Trends of Colorectal Cancer Incidence, Prevalence and Mortality in Worldwide From 1990 to 2017. Iran J Public Health. 2023; 52(2): 436-445. doi: 10.18502/ijph.v52i2.11897
10. Ferlay, J, Ervik, M, Lam, F, et al. Global Cancer Observatory: Cancer Today. Lyon, France: International Agency for Research on Cancer; 2018.
11. Bhardwaj, A, Liyanage, SI, Weaver, DF. Cancer and Alzheimer's Inverse Correlation: an Immunogenetic Analysis. Mol Neurobiol. 2023; 60(6): 3086-3099. doi: 10.1007/s12035-023-03260-8
12. Yang, C, Delcher, C, Shenkman, E, et al. Machine learning approaches for predicting high cost high need patient expenditures in health care. Biomed Eng Online. 2018; 17(Suppl 1): 131. doi: 10.1186/s12938-018-0568-3
13. Liang, WS, Dunckley, T, Beach, TG, et al. Gene expression profiles in anatomically and functionally distinct regions of the normal aged human brain. Physiol Genomics. 2006; 28(3): 311-22. doi: 10.1152/physiolgenomics.00208.2006
14. Piras, IS, Krate, J, Delvaux, E, et al. Transcriptome Changes in the Alzheimer's Disease Middle Temporal Gyrus: Importance of RNA Metabolism and Mitochondria-Associated Membrane Genes. J Alzheimers Dis. 2019; 70(3): 691-713. doi: 10.3233/JAD-181113
15. Shen, A, Liu, L, Huang, Y, et al. Down-Regulating HAUS6 Suppresses Cell Proliferation by Activating the p53/p21 Pathway in Colorectal Cancer. Front Cell Dev Biol. 2022; 9: 772077. doi: 10.3389/fcell.2021.772077
16. Gao, P, He, M, Zhang, C, et al. Integrated analysis of gene expression signatures associated with colon cancer from three datasets. Gene. 2018; 654: 95-102. doi: 10.1016/j.gene.2018.02.007
17. Barrett, T, Wilhite, SE, Ledoux, P, et al. NCBI GEO: archive for functional genomics data sets--update. Nucleic Acids Res. 2012; 41(Database issue): D991-5. doi: 10.1093/nar/gks1193
18. Ritchie, ME, Phipson, B, Wu, D, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015; 43(7): e47. doi: 10.1093/nar/gkv007
19. Langfelder, P, Horvath, S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics. 2008; 9: 559. doi: 10.1186/1471-2105-9-559
20. Seo, J, Gordish-Dressman, H, Hoffman, EP. An interactive power analysis tool for microarray hypothesis testing and generation. Bioinformatics. 2006; 22(7): 808-14. doi: 10.1093/bioinformatics/btk052
21. Sherman, BT, Hao, M, Qiu, J, et al. DAVID: a web server for functional enrichment analysis and functional annotation of gene lists (2021 update). Nucleic Acids Res. 2022; 50(W1): W216-W221. doi: 10.1093/nar/gkac194
22. Friedman, J, Hastie, T, Tibshirani, R. Regularization Paths for Generalized Linear Models via Coordinate Descent. J Stat Softw. 2010; 33(1): 1-22. PMID: 20808728
23. Lin, X, Li, C, Zhang, Y, et al. Selecting Feature Subsets Based on SVM-RFE and the Overlapping Ratio with Applications in Bioinformatics. Molecules. 2017; 23(1). doi: 10.3390/molecules23010052
24. Liu, Y, Zhao, H. Variable importance-weighted Random Forests. Quant Biol. 2017; 5(4): 338-351. PMID: 30034909
25. Newman, AM, Liu, CL, Green, MR, et al. Robust enumeration of cell subsets from tissue expression profiles. Nat Methods. 2015; 12(5): 453-7. doi: 10.1038/nmeth.3337
26. Varghese, F, Bukhari, AB, Malhotra, R, et al. IHC Profiler: an open source plugin for the quantitative evaluation and automated scoring of immunohistochemistry images of human tissue samples. PLoS One. 2014; 9(5): e96801. doi: 10.1371/journal.pone.0096801
27. Lopez-Rodriguez, AB, Hennessy, E, Murray, CL, et al. Acute systemic inflammation exacerbates neuroinflammation in Alzheimer's disease: IL-1β drives amplified responses in primed astrocytes and neuronal network dysfunction. Alzheimers Dement. 2021; 17(10): 1735-1755. doi: 10.1002/alz.12341
28. McLarnon, JG. A Leaky Blood-Brain Barrier to Fibrinogen Contributes to Oxidative Damage in Alzheimer's Disease. Antioxidants (Basel). 2021; 11(1). doi: 10.3390/antiox11010102
29. Park, EM, Chelvanambi, M, Bhutiani, N, et al. Targeting the gut and tumor microbiota in cancer. Nat Med. 2022; 28(4): 690-703. doi: 10.1038/s41591-022-01779-2
30. Wang, S, Wei, J, Li, S, et al. PPA1, an energy metabolism initiator, plays an important role in the progression of malignant tumors. Front Oncol. 2022; 12: 1012090. doi: 10.3389/fonc.2022.1012090
31. Niu, H, Zhu, J, Qu, Q, et al. Crystallographic and modeling study of the human inorganic pyrophosphatase 1: A potential anti-cancer drug target. Proteins. 2021; 89(7): 853-865. doi: 10.1002/prot.26064
32. Menteş, M, Yandım, C. Identification of PPA1 inhibitor candidates for potential repurposing in cancer medicine. J Cell Biochem. 2023; 124(10): 1646-1663. doi: 10.1002/jcb.30475
33. Cozachenco, D, Ribeiro, FC, Ferreira, ST. Defective proteostasis in Alzheimer's disease. Ageing Res Rev. 2023; 85: 101862. doi: 10.1016/j.arr.2023.101862
34. Schmidt, S, Denk, S, Wiegering, A. Targeting Protein Synthesis in Colorectal Cancer. Cancers (Basel). 2020; 12(5). doi: 10.3390/cancers12051298
35. Liao, P, Wang, W, Shen, M, et al. A positive feedback loop between EBP2 and c-Myc regulates rDNA transcription, cell proliferation, and tumorigenesis. Cell Death Dis. 2014; 5: e1032. doi: 10.1038/cddis.2013.536
36. Boudiaf-Benmammar, C, Cresteil, T, Melki, R. The cytosolic chaperonin CCT/TRiC and cancer cell proliferation. PLoS One. 2013; 8(4): e60895. doi: 10.1371/journal.pone.0060895
37. Düzel, E, Ziegler, G, Berron, D, et al. Amyloid pathology but not APOE ε4 status is permissive for tau-related hippocampal dysfunction. Brain. 2022; 145(4): 1473-1485. doi: 10.1093/brain/awab405
38. Ghozlan, H, Cox, A, Nierenberg, D, et al. The TRiCky Business of Protein Folding in Health and Disease. Front Cell Dev Biol. 2022; 10: 906530. doi: 10.3389/fcell.2022.906530
39. Tang, C, Li, C, Chen, C, et al. LINC01234 promoted malignant behaviors of breast cancer cells via hsa-miR-30c-2-3p/CCT4/mTOR signaling pathway. Taiwan J Obstet Gynecol. 2024; 63(1): 46-56. doi: 10.1016/j.tjog.2023.09.019
40. Li, F, Liu, CS, Wu, P, et al. CCT4 suppression inhibits tumor growth in hepatocellular carcinoma by interacting with Cdc20. Chin Med J (Engl). 2021; 134(22): 2721-2729. doi: 10.1097/CM9.0000000000001851
41. He, X, Ge, C, Xia, J, et al. The Zinc Transporter SLC39A10 Plays an Essential Role in Embryonic Hematopoiesis. Adv Sci (Weinh). 2023; 10(17): e2205345. doi: 10.1002/advs.202205345
42. Kumar, V, Kumar, A, Singh, K, et al. Neurobiology of zinc and its role in neurogenesis. Eur J Nutr. 2021; 60(1): 55-64. doi: 10.1007/s00394-020-02454-3
43. Ren, X, Feng, C, Wang, Y, et al. SLC39A10 promotes malignant phenotypes of gastric cancer cells by activating the CK2-mediated MAPK/ERK and PI3K/AKT pathways. Exp Mol Med. 2023; 55(8): 1757-1769. doi: 10.1038/s12276-023-01062-5
44. Ma, Z, Li, Z, Wang, S, et al. SLC39A10 Upregulation Predicts Poor Prognosis, Promotes Proliferation and Migration, and Correlates with Immune Infiltration in Hepatocellular Carcinoma. J Hepatocell Carcinoma. 2021; 8: 899-912. doi: 10.2147/JHC.S320326
Author affiliation (all authors): The First Hospital of Hebei Medical University. Corresponding author: Shunjiang Xu.

Figure 1. Flow chart of this study design.

Figure 2. Identification of the differentially expressed genes and their intersection in AD and CRC. (A, B) Volcano plots illustrating the differentially expressed genes (DEGs) in the AD (A) or CRC (B) datasets. (C, D) Heatmaps showing the expression patterns of DEGs in the AD (C) or CRC (D) datasets. (E, F) Venn diagrams displaying the intersection of genes between upregulated DEGs in AD and downregulated DEGs in CRC (E) or between downregulated DEGs in AD and upregulated DEGs in CRC (F). AD: Alzheimer's disease; CRC: colorectal cancer; DEGs: differentially expressed genes.

Figure 3. The Weighted Gene Co-expression Network Analysis (WGCNA) of AD and CRC. (A, B) Determination of the soft-threshold power for AD (A) or CRC (B). (C, D) The cluster dendrograms of modules containing highly connected genes for AD (C) or CRC (D). (E, F) The relationships between modules and traits in AD (E) or in CRC (F). Correlations and P values are included in each cell. (G) Venn diagrams displaying the intersections of genes between the positively correlated modules for AD and the negatively correlated modules for CRC (left) or between the negatively correlated modules for AD and the positively correlated modules for CRC (right).

Figure 4. Functional enrichment analysis of the identified key genes that are inversely expressed in AD and CRC. (A) Venn diagram showing the overlap of key genes, representing the potential candidates for further investigation in the context of the AD and CRC relationship. (B) Bar plot displaying the significantly enriched Gene Ontology (GO) Biological Process pathways for the inversely expressed genes between AD and CRC. (C) Circular plot illustrating the enriched Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways for the inversely expressed genes between AD and CRC.

Figure 5. Identification of the candidate biomarkers for AD or CRC using machine learning algorithms. (A, G) Coefficient profiles of variables in the LASSO regression model for AD (A) or CRC (G). (B, H) Ten-fold cross-validation for tuning parameter selection in the LASSO regression model for AD (B) or CRC (H). (C, I) Genes ranked by importance score in the Random Forest (RF) algorithm for AD (C) or CRC (I). (D, J) Display of error in the RF algorithm for AD (D) or CRC (J). (E, K) The optimal root mean squared error (RMSE) of the SVM-RFE method for AD (E) or CRC (K). (F, L) Venn diagrams illustrating the intersection of biomarkers identified by the LASSO, RF, and SVM-RFE methods for AD (F) or CRC (L).

Figure 6. Selection and validation of the diagnostic biomarkers for AD or CRC. (A, C) Differential expression analysis of the identified hub genes in the AD discovery dataset (A: GSE5281) and CRC discovery dataset (C: GSE113513). (B, D) Differential expression evaluation of the hub genes in the AD validation dataset (B: GSE132903) and CRC validation dataset (D: GSE89076). (E, G) ROC curve comparison of the AD hub genes (GARS, EBNA1BP2, PPA1) in the discovery dataset (E: GSE5281) and the CRC hub genes (CCT4, SLC39A10, RAN, PPA1) in the discovery dataset (G: GSE113513). (F, H) ROC curve analysis of the AD hub genes (GARS, EBNA1BP2, PPA1) in the validation dataset (F: GSE132903) and the CRC hub genes (CCT4, SLC39A10, RAN, PPA1) in the validation dataset (H: GSE89076). AD, Alzheimer's disease; CRC, colorectal cancer; ROC, receiver operating characteristic. * p < 0.05; ** p < 0.01; *** p < 0.001; **** p < 0.0001; ns, not significant.

Figure 7. Immune cell infiltration analysis in AD. (A) Stacked histogram displaying the proportion of 22 types of immune cells in each sample, showing significant differences between AD and control samples in 9 immune cell subpopulations. (B) Box plot comparing 9 differentially infiltrated immune cell types between AD and control groups. (C) Heatmap revealing the correlation of infiltration among 22 types of immune cells, with a threshold of p < 0.05. (D) Correlation map representing the association between the expression of two hub genes (EBNA1BP2 and PPA1) and the proportion of differentially infiltrated immune cell types in AD. AD, Alzheimer's disease; * p < 0.05; ** p < 0.01; *** p < 0.001; **** p < 0.0001; ns, not significant.

Figure 8. Immune cell infiltration analysis in CRC. (A) Stacked histogram displaying the proportion of 22 types of immune cells in each sample, showing significant differences between CRC and control colorectal samples in 9 immune cell subpopulations. (B) Box plot comparing 9 differentially infiltrated immune cell types between CRC and control groups. (C) Heatmap revealing the correlation of infiltration among 22 types of immune cells, with a threshold of p < 0.05. (D) Correlation map representing the association between the expression of four hub genes (CCT4, SLC39A10, RAN, and PPA1) and the proportion of differentially infiltrated immune cell types in CRC. CRC, colorectal cancer; * p < 0.05; ** p < 0.01; *** p < 0.001; **** p < 0.0001; ns, not significant.

Figure 9. Validation of hub genes in AD model 5×FAD mice and patients with CRC by immunohistochemical analysis. (A, C) Representative images of immunohistochemical staining for EBNA1BP2, PPA1, CCT4, SLC39A10, and RAN in the hippocampus of 5×FAD mice and WT mice (A), as well as in cancerous and adjacent non-cancerous tissues of CRC patients (C). (B, D) Quantification of staining intensity in mouse hippocampal tissues (B) and human colorectal tissues (D). CRC, colorectal cancer; * p < 0.05; ** p < 0.01; *** p < 0.001; **** p < 0.0001; ns, not significant.
A growing number of epidemiologic studies indicate a negative correlation between the incidence of AD and multiple cancer types, wherein the risk of developing cancer in AD patients is significantly reduced and vice versa [4\u0026ndash;8]. However, the underlying mechanisms involved in the inverse relationship between AD and CRC remain unclear.\u003c/p\u003e \u003cp\u003eAmong the multiple cancer types negatively correlated with AD in incidence, colorectal cancer (CRC) is the most common malignant tumor of the digestive system, with high mortality rates worldwide [9]. According to GLOBOCAN 2020 estimates by the International Agency for Research on Cancer (IARC) under the World Health Organization, there were 1.93\u0026nbsp;million new CRC cases and 935,200 deaths globally in 2020, ranking third and second among all malignant tumors, respectively [10]. In light of the inverse relationship between AD and CRC, understanding the potential molecular mechanisms underlying this correlation and identifying key biomarkers contributing to their negative association is critical for the development of novel prevention and treatment strategies. There is evidence suggesting that immune regulation may be a factor linking AD and CRC [11]. However, few studies have systematically examined the immune-related genes that connect AD and CRC using machine learning methods.\u003c/p\u003e \u003cp\u003eIn this study, we applied a comprehensive bioinformatics analysis pipeline to identify hub genes involved in the negative relationship between AD and CRC. The gene microarray data were downloaded from the GEO database and analyzed using differential expression analysis, WGCNA, functional enrichment analysis, and machine learning algorithms, including LASSO, Random Forest and SVM-RFE[12]. Furthermore, the immune cell infiltration in CRC and AD was explored using the CIBERSORT algorithm. 
These results will identify the potential immune-associated diagnostic markers, advance our understanding of their interplay and provide new perspectives for future diagnosis and therapeutic strategies of AD or CRC.\u003c/p\u003e"},{"header":"Materials and methods","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003eData download\u003c/h2\u003e \u003cp\u003eIn this study, the RStudio software (version 4.3.3; URL: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.r-project\u003c/span\u003e\u003cspan address=\"https://www.r-project\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. org/) was used to download the GSE5281[13] and GSE132903[14] AD datasets, as well as the GSE113513[15] and GSE74602[16] CRC datasets from the Gene Expression Omnibus (GEO) database (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.ncbi.nlm.nih.gov/geo/\u003c/span\u003e\u003cspan address=\"https://www.ncbi.nlm.nih.gov/geo/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e)[17]. All data processing and analyses were conducted in RStudio. During the data processing, we performed missing value imputation, data cleaning, and standardization for both datasets. The GSE113513 dataset (GPL15207 platform) comprises 14 pairs of cancerous and matched non-cancerous tissues, while the GSE74602 CRC dataset (GPL6104 platform) consists of 30 CRC samples and 30 healthy control samples. The GSE5281 AD dataset (GPL570 platform) includes a total of 161 samples, collected from six regions: entorhinal cortex, hippocampus, medial temporal gyrus, posterior cingulate, superior frontal gyrus, and primary visual cortex. However, for this study, we only selected three regions: entorhinal cortex, hippocampus, and posterior cingulate, which comprise 29 AD patients (EC: 10, HIP: 10, PC: 9) and 29 healthy controls (EC: 10, HIP: 10, PC: 9). 
The GSE132903 dataset (GPL16699 platform) features 97 AD patient samples and 98 healthy control samples. A simplified workflow of the current investigation is illustrated in Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec4\" class=\"Section2\"\u003e \u003ch2\u003eDifferentially expressed genes (DEGs) analysis\u003c/h2\u003e \u003cp\u003eIn this study, we employed the 'limma' R package to screen for differentially expressed genes (DEGs) in the AD dataset (GSE5281) and the CRC dataset (GSE113513). DEGs in the GSE5281 dataset were selected with the criteria: |log2 (fold change)| ≥ 1.5 and FDR \u0026lt; 0.01. We used the FDR rather than raw \u003cem\u003ep\u003c/em\u003e-values to account for multiple hypothesis testing[18], which reduces the number of potential false positives and improves the robustness of the identified DEGs. Likewise, the CRC dataset (GSE113513) was processed and analyzed using the same criteria. The results were visualized using gene clustering heatmaps and volcano plots. A combined analysis of DEGs from the GSE113513 and GSE5281 datasets was performed, and Venn diagrams were generated to identify intersecting DEGs. 
We focused on the down-regulated DEGs in the AD group overlapping with up-regulated DEGs in the CRC group, and the up-regulated DEGs in the AD group overlapping with down-regulated DEGs in the CRC group.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec5\" class=\"Section2\"\u003e \u003ch2\u003eWeighted Gene Co-Expression Network Analysis (WGCNA) and the identification of key module genes\u003c/h2\u003e \u003cp\u003eUsing microarray specimens, Weighted Gene Co-expression Network Analysis (WGCNA) represents one of the most important and widely applied systems bioinformatics methods to describe the correlation patterns among genes. Genes can be grouped into modules based on their co-expression similarities across samples using the \"WGCNA\" R package[19]. Additionally, the WGCNA method can be used to connect modules to clinical elements outside the genome. In this way, relevant functional networks can be used to identify biomarkers and new molecules. As input files, normalized mRNA expression data (calculated using the R package \"limma\") were used to perform WGCNA to identify gene coexpression and the correlation between gene modules and clinical characteristics (AD or CRC compared to control groups). For each disease group, the following steps were followed: (1) By using the R package \"gplots,\" hierarchical clustering analysis was performed to identify outliers in the sample[20]. (2) The \"pickSoftThreshold\" package function was utilized to screen out soft-power parameters ranging from 1 to 20[20]. (3) A topological overlap matrix (TOM) is created by converting the matrix of correlations with the most appropriate b value to an adjacency matrix and then into a topological overlap matrix. (4) Based on the average linkage hierarchical clustering, a hierarchical clustering tree (linked gene best fit) was constructed, and then the dynamic tree cut algorithm (minModuleSize = 30) was used to find different gene modules. 
Similar modules were merged by a cut height in each group. (5) Gene modules and clinical phenotypes (CTRL and AD or CRC) were correlated using the Pearson correlation coefficient.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec6\" class=\"Section2\"\u003e \u003ch2\u003eFunctional enrichment analysis\u003c/h2\u003e \u003cp\u003eTo explore the biological function and concrete mechanism of the DEGs that are involved in the negative correlation between AD and CRC, we carried out Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis by importing the genes into the DAVID database (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://david.ncifcrf.gov/\u003c/span\u003e\u003cspan address=\"https://david.ncifcrf.gov/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e)[21]. A threshold of p \u0026lt; 0.05 was regarded to be significant enrichment. Additionally, the findings of functional enrichment analysis were displayed via bubble diagram and circos plot.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec7\" class=\"Section2\"\u003e \u003ch2\u003eMachine learning\u003c/h2\u003e \u003cp\u003eIn order to screen the vital genes in AD or CRC, three well-established machine learning algorithms (LASSO: Least Absolute Shrinkage and Selection Operator; SVM-RFE: Support Vector Machine- Recursive Feature Elimination; RF: Random Forest) were utilized. Firstly, the 20 shared genes obtained previously were input into the LASSO algorithm in separate disease groups[22]. The LASSO algorithm effectively identified relevant genes for each condition. Subsequently, we employed the Support Vector Machines Recursive Feature Elimination (SVM-RFE) to refine the feature selection. 
Utilizing the \"e1071\" and \"caret\" packages for SVM modeling, SVM-RFE systematically eliminated features in a backward regression process to identify the optimal hub genes, which were then incorporated into the SVM model for each disease group. The outcome of SVM-RFE was visualized as a plot, with the Root Mean Square Error (RMSE) as an evaluation metric in tenfold cross-validation.[23] This plot illustrated the relationship between the number of variables and RMSE, providing insights into the optimal number of genes for maximum diagnostic performance. Finally, we applied Random Forest using the R package “randomForest” to classify significant genes and identify the most important variables using a decision tree algorithm[24]. After constructing a random forest model with 500 trees on the discovery cohorts and determining the optimal number of trees using cross-validation errors, we ranked the genes by importance and selected the top 10 genes. We then obtained the final result by taking the intersection of the results from the three algorithms.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003eValidation of the Expression of Hub Genes\u003c/h2\u003e \u003cp\u003eAll the identified hub genes were further validated in GSE132903 and GSE89076 to avoid false positive rates. The comparison between the AD and healthy control groups, or CRC and normal control of the two sets was calculated with the T-test. 
\u003cem\u003eP\u003c/em\u003e \u0026lt; 0.05 was considered a significant difference between the groups.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec9\" class=\"Section2\"\u003e \u003ch2\u003eReceiver Operating Characteristic (ROC) Curves of the Hub Genes\u003c/h2\u003e \u003cp\u003eTo evaluate the predictive efficiency for disease of the identified critical genes, the accuracy of hub genes was evaluated by ROC validation and the area under the curve (AUC) values were calculated by an online website (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.xiantao.love/products/\u003c/span\u003e\u003cspan address=\"https://www.xiantao.love/products/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e). Efficacy evaluation: non-efficiency (AUC ≦ 0.5); modest-efficiency (0.5 \u0026lt; AUC \u0026lt; 0.7); high-efficiency (AUC \u0026gt; 0.7).\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec10\" class=\"Section2\"\u003e \u003ch2\u003eImmune infiltration analysis\u003c/h2\u003e \u003cp\u003eThe “CIBERSORT” package was executed to assess the number of the immune cell infiltration from the CRC gene expression profile[25]. The abundance and proportion of the immune cell infiltration were presented for each sample as barplot using the “ggplot2” package. The differences of the proportion of 22 types of immune cells between CRC colorectal samples and control colorectal samples were compared by adopting Wilcoxon test, where \u003cem\u003eP\u003c/em\u003e \u0026lt; 0.05 was regarded to be of statistical significance and was displayed by Stacked histogram based on the “ggplot2” package. Subsequently, the association of 22 types of invading immune cells was shown with the use of the “corrplot” package. 
Finally, Spearman’s rank correlation coefficient was adopted for the correlation analysis between the expression of diagnostic biomarkers and the content of infiltrated immune cells, and \u003cem\u003eP\u003c/em\u003e \u0026lt; 0.05 was thought to be of statistical significance.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003eSample collection of patients with CRC and AD mouse model\u003c/h2\u003e \u003cp\u003ePaired tumor and adjacent non-tumor colorectal tissue samples were obtained from six CRC patients undergoing surgery at the First Hospital of Hebei Medical University, Hebei, China. These samples comprised six cancerous tissues and their matched para-cancerous tissues. The following samples were collected: Colorectal cancer tissues (n = 6) and matched adjacent non-tumor tissues (n = 6). Exclusion criteria entailed cases with previous malignant tumors, hereditary colorectal cancer, or familial adenomatous polyposis. The patients' clinical information is provided in Table\u0026nbsp;1. Approval for the study was granted by the Clinical Research Ethics Committee of the First Hospital of Hebei Medical University (No. 2023-00043).\u003c/p\u003e \u003cp\u003eExperimental animals, including 3-month-old 5×FAD mice and their corresponding wild-type (WT) mice, were obtained from Shanghai Southern Model Biological Technology Co., Ltd. All mice were housed in an SPF-grade animal facility at Hebei Medical University under consistent conditions, including a regular light-dark cycle, controlled temperature (22–24℃), and humidity (50–60%). Five to six mice per cage were provided with ad libitum access to food and water. All experimental procedures involving animals were conducted in accordance with the ethical guidelines and approved by the Ethics Committee of the First Hospital of Hebei Medical University (No. 
2023-00043).\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec12\" class=\"Section2\"\u003e \u003ch2\u003eImmunohistochemistry\u003c/h2\u003e \u003cp\u003eFor the IHC staining procedure, 3 µm thick sections were obtained from formalin-fixed paraffin-embedded tissues and placed onto poly-L-lysine-coated slides. The primary antibodies used and their respective techniques are detailed in Table\u0026nbsp;2, following the manufacturer's guidelines. Antigen retrieval was performed using the pressure cooker method, and diaminobenzidine (DAB) was employed as the chromogen. Quantification of IHC images was carried out using ImageJ software and the IHC Profiler plugin[26]. The software evaluates the average optical density (staining intensity) and the percentage of positively stained area (staining area) for each image, resulting in four levels of scoring: High positive (3+), Positive (2+), Low Positive (1+), and Negative (0).\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec13\" class=\"Section2\"\u003e \u003ch2\u003eStatistical analysis\u003c/h2\u003e \u003cp\u003eThe R software 3.6.5 and GraphPad Prism version 8.0.1 (GraphPad Software Inc., San Diego, CA, USA) were used for statistical analyses and visualization. For differences of gene expression levels or immunocyte fractions between different clinical groups, a two-sided Wilcoxon test was performed. Correlation analysis was conducted using the Spearman test. The p-value was adjusted by the FDR method for multiple hypothesis testing. Dichotomous variables were compared using the chi-square test. Unpaired Student′s t-test was utilized to compare differences between the two groups. 
P \u0026lt; 0.05 was regarded as statistically significant.\u003c/p\u003e "},{"header":"Result","content":"\u003cp\u003e \u003cb\u003eIdentification of potential key genes involved in the negative correlation between the incidence of AD and CRC\u003c/b\u003e \u003c/p\u003e\u003cp\u003eUsing the limma package, we identified a total of 3821 DEGs in the AD dataset, including 2089 upregulated and 1732 downregulated genes. The heatmap and volcano plot of DEGs in AD are shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eA and C. In the CRC dataset, we identified 3274 DEGs, consisting of 1594 upregulated and 1680 downregulated genes (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eB, D). Subsequently, we identified 144 genes at the intersection of upregulated DEGs in the AD group and downregulated DEGs in the CRC group, as well as 194 genes at the intersection of downregulated DEGs in the AD group and upregulated DEGs in the CRC group, which are potentially involved in the negative correlation between the incidence of AD and CRC (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eE, F).\u003c/p\u003e\u003ch2\u003eWeighted Gene Co-Expression Network Analysis and Key Module Identification\u003c/h2\u003e\u003cp\u003eTo investigate the correlation of the identified key genes with AD or CRC, we performed Weighted Gene Co-expression Network Analysis (WGCNA) in addition to analyzing the differential expression between the two groups. Using the soft-thresholding approach, we constructed a co-expression network, in which the soft-threshold power b is chosen so that the network maintains a scale-free topology. Gene expression-based biological networks are most likely to be scale-free. 
Accordingly, in the AD group, a scale-free fit index greater than 0.88 was taken to indicate a scale-free topology, and b was set at 5 (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eA). The adjacency matrix was generated by using the adjacency function. As shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eC, a hierarchical clustering tree was constructed using the TOM dissimilarity measure. A total of nine co-expression modules were identified, and the modules that had a \u003cem\u003eP\u003c/em\u003e-value \u0026lt; 0.01 were regarded as the key modules. In the AD group, the green yellow, brown, black, and green modules showed a strong positive correlation with disease, while the blue and red modules showed a strong negative correlation with disease (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eE). WGCNA was also applied to the CRC group, with b = 8 identified as the optimal soft-threshold power (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eB). In total, 22 modules were identified, of which the brown and saddle brown modules showed a strong positive correlation with disease, and the ivory, dark orange, green yellow, green, and medium purple 3 modules showed a strong negative correlation with disease (Figs.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eD, F). In the key modules identified in both the AD and CRC groups, we pinpointed hub genes based on the criterion of having an absolute module membership (|MM|) greater than 0.8. 
Subsequently, an intersection of genes from positively correlated modules with AD and negatively correlated modules with CRC yielded 15 gene candidates, while the intersection of genes from negatively correlated modules with AD and positively correlated modules with CRC uncovered 28 distinct genes (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eG). These results suggested that the identified key modules above might play critical roles in the inverse relationship between AD and CRC.\u003c/p\u003e\u003cp\u003e \u003cb\u003eFunctional enrichment analysis of the differentially expressed genes that are potentially involved in the negative correlation between AD and CRC\u003c/b\u003e \u003c/p\u003e\u003cp\u003eIn an effort to reveal the hub genes contributed to the negative correlation between AD and CRC, the intersection of DEGs and genes identified in the WGCNA modules were analyzed using Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses. Figure\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eA illustrates the overlap of 20 key genes, which represent potential candidates warranting further investigation in the context of AD and CRC associations. The GO analysis identified several enriched pathways, with a few significantly enriched Biological Process pathways, such as \"mononuclear cell proliferation,\" \"leukocyte proliferation,\" and \"defense response to tumor cell,\" indicating that immune response may play a crucial role in the negative correlation between AD and CRC (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eB). 
Moreover, the analysis highlighted other relevant pathways, including \"cellular nitrogen compound biosynthetic process,\" \"positive regulation of cellular metabolic process,\" \"nucleotide binding,\" \"leukocyte activation,\" \"protein localization to nucleus,\" and \"lymphocyte proliferation.\" The KEGG pathway analysis unveiled pathways, such as \"Aminoacyl-tRNA biosynthesis,\" \"Ribosome biogenesis in eukaryotes,\" \"Purine metabolism,\" \"Oxidative phosphorylation,\" \"RNA transport,\" \"Collecting duct acid secretion,\" \"Rheumatoid arthritis,\" \"Phagosome,\" \"Human T-cell leukemia virus 1 infection,\" and \"Metabolic pathways\" (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eC). These insights shed light on the potential molecular mechanisms and pathways that underlie the differences between AD and CRC gene expression profiles, laying the groundwork for future research to investigate potential therapeutic targets and drug development strategies.\u003c/p\u003e\u003ch2\u003eIdentification of candidate hub genes via machine learning\u003c/h2\u003e\u003cp\u003eTo further identify the most promising candidate diagnostic gene targets with significant discriminative value between disease and control groups, three different machine learning algorithms (LASSO, Random Forest and SVM-RFE) were applied to the 20 genes identified in our previous analysis. In the AD group, we selected lambda.1se = 0.101163531730007 for the LASSO analysis, resulting in 7 genes with non-zero coefficients (Figs.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eA, B). Utilizing the Random Forest algorithm, we assessed the importance of all of 20 genes and selected the top 10 most important genes (Figs.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eC, D). 
Simultaneously, the SVM-RFE method identified 8 genes based on the lowest RMSE (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eE). By overlapping the results from the three algorithms, 6 shared biomarkers were identified in the AD group: ATP8B1, ENC1, GARS, CDK5, EBNA1BP2, and PPA1 (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eF).\u003c/p\u003e\u003cp\u003eSimilarly, in the CRC group, we chose lambda.1se = 0.109850004658687 for the LASSO analysis, yielding 9 genes with non-zero coefficients (Figs.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eG, H). The Random Forest algorithm evaluated the importance of all 20 genes, and the top 10 most important genes were selected (Figs.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eI, J). The SVM-RFE method identified 8 genes based on the lowest RMSE (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eK). Overlapping the results revealed 6 shared biomarkers in the CRC group: ENC1, CCT4, MRPL3, SLC39A10, RAN, and PPA1 (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eL). These findings present the most promising candidate biomarkers for diagnosis and classification in both the AD and CRC groups, paving the way for future investigations into potential therapeutic targets and drug development strategies.\u003c/p\u003e\u003ch2\u003eIdentification and validation of the diagnostic biomarkers for AD or CRC\u003c/h2\u003e\u003cp\u003eTo verify the reliability of the identified hub genes for AD and CRC, we assessed their expression in the discovery datasets (GSE5281 for AD and GSE113513 for CRC) as well as the validation datasets (GSE132903 for AD and GSE89076 for CRC). Significant differences were observed when compared to the corresponding control groups. 
In the AD dataset GSE5281, the hub genes CCT4, GARS, SLC39A10, EBNA1BP2, RAN, and PPA1 showed a significant downward trend (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003eA). In the AD validation dataset GSE132903, all genes exhibited a trend of down regulation, except for ATP8B1 (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003eB). In contrast, all genes demonstrated a trend of up regulation in the CRC datasets of GSE113513 and GSE89076, except for ATP8B1 (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003eC, D). Based on these findings, the AD hub genes were determined to be GARS, EBNA1BP2, and PPA1, while the CRC hub genes included CCT4, SLC39A10, RAN, and PPA1. Subsequently, the ROC curves were plotted for these hub genes using their expression data from the GSE5281, GSE132903, GSE113513, and GSE89076 datasets. For AD in the GSE5281 dataset, GARS (AUC = 0.668), EBNA1BP2 (AUC = 0.733), and PPA1 (AUC = 0.715) demonstrated potential utility as biomarkers (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003eE). In the GSE132903 dataset, GARS (AUC = 0.702), EBNA1BP2 (AUC = 0.740), and PPA1 (AUC = 0.719) displayed potential utility as biomarkers (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003eF). For CRC in the GSE113513 dataset, CCT4 (AUC = 1.000), SLC39A10 (AUC = 1.000), RAN (AUC = 1.000), and PPA1 (AUC = 0.995) showed high AUC scores (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003eG). Finally, in the GSE89076 dataset, CCT4 (AUC = 0.942), SLC39A10 (AUC = 1.000), RAN (AUC = 0.928), and PPA1 (AUC = 0.957) exhibited similar potential as biomarkers (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003eH). 
Ultimately, genes with AUC \u0026gt; 0.7 were deemed potential diagnostic biomarkers, including EBNA1BP2 and PPA1 for AD, and CCT4, SLC39A10, RAN, and PPA1 for CRC.\u003c/p\u003e\u003ch2\u003eCorrelations between the hub genes and the immune cell infiltration in CRC or AD\u003c/h2\u003e\u003cp\u003eThe functional and pathway analysis of DEGs that are potentially involved in the negative correlation between AD and CRC indicate a close link with inflammatory and immune processes. To further explore the crucial roles of the immune system in AD and CRC, the CIBERSORT algorithm was used to characterize immune cells, investigate immune regulation, and examine the relationship between diagnostic biomarkers and immune cell infiltration. Among the 22 types of immune cells in each sample, the relative percent of 9 immune cell subpopulations in AD group were significantly different from that in control samples (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003eA). Relative to the control group, AD patients display increased proportions of B cells naive, T cells CD4 memory resting, Macrophages M0, and Macrophages M1, and decreased proportions of Plasma cells, T cells gamma delta, Macrophages M2, and Dendritic cells activated (Figure. 7B). The correlation analysis of these 22 immune cell types showed that T cells CD4 memory activated were positively correlated with NK cells resting (r = 0.39, \u003cem\u003ep\u003c/em\u003e \u0026lt; 0.001), and Macrophages M2 were positively associated with Eosinophils (r = 0.36, \u003cem\u003ep\u003c/em\u003e \u0026lt; 0.01). Conversely, B cells naive had a negative association with B cells memory (r = -0.64, \u003cem\u003ep\u003c/em\u003e \u0026lt; 0.001), and T cells CD4 naive were negatively correlated with T cells CD4 memory resting (r = -0.52, \u003cem\u003ep\u003c/em\u003e \u0026lt; 0.001) (Figure. 7C). 
Furthermore, an investigation of the relationship between the expression of the 2 hub genes and the proportion of differentially infiltrated immune cell types revealed limited associations of the hub genes EBNA1BP2 and PPA1 with immune cells in AD (Fig.\u0026nbsp;7D).\u003c/p\u003e\u003cp\u003eThe proportions of 22 immune cell types in each CRC and control colorectal sample are shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e8\u003c/span\u003eA, with significant differences in 9 immune cell subpopulations between the two groups. Relative to the control group, CRC patients demonstrated higher proportions of Plasma cells, T cells CD4 memory activated, T cells gamma delta, Macrophages M0, Macrophages M2, Dendritic cells activated, Mast cells resting, Mast cells activated, and Neutrophils, and decreased proportions of Plasma cells, T cells gamma delta, Macrophages M2, and Dendritic cells activated (Fig.\u0026nbsp;8B). In addition, the correlation analysis of these 22 immune cell types in CRC revealed that NK cells resting were positively correlated with Macrophages M0 (r = 0.39, \u003cem\u003ep\u003c/em\u003e \u0026lt; 0.001), while B cells naive had a positive association with T cells CD4 naive (r = 0.36, \u003cem\u003ep\u003c/em\u003e \u0026lt; 0.01). In contrast, T cells follicular helper were negatively correlated with Mast cells activated (r = -0.64, \u003cem\u003ep\u003c/em\u003e \u0026lt; 0.001), and Monocytes had a negative association with Macrophages M2 (r = -0.52, \u003cem\u003ep\u003c/em\u003e \u0026lt; 0.001). The examination of the association between the expression of the 4 hub genes and the proportion of differentially infiltrated immune cell types indicated that all four hub genes (CCT4, SLC39A10, RAN, and PPA1) were significantly correlated with immune cell infiltration in CRC (Fig.\u0026nbsp;8D). 
The comparison of immune cell infiltration revealed less infiltration in the brains of AD patients than in the intestines of CRC patients. This disparity might be attributed to the blood-brain barrier and the distinct immune environments of AD and CRC. Understanding these differences is essential for characterizing the immune features of the two diseases and may inform targeted immunotherapies for AD or CRC.\u003c/p\u003e\u003cp\u003e \u003cb\u003eValidation of hub genes by immunohistochemical analysis in hippocampal tissues of AD mice and colorectal tissues of CRC patients\u003c/b\u003e \u003c/p\u003e\u003cp\u003eImmunohistochemical analysis was conducted to investigate the expression levels of the hub genes in mouse hippocampal tissues and human colorectal tissues. Representative images of immunohistochemical staining for EBNA1BP2, PPA1, CCT4, SLC39A10, and RAN in the hippocampus of 5×FAD mice and WT mice are shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e9\u003c/span\u003eA. Quantification demonstrated a significant decrease in the expression of EBNA1BP2, PPA1, and SLC39A10 in the hippocampal tissues of 5×FAD mice compared to WT mice, whereas no differences were observed for CCT4 and RAN (Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e9\u003c/span\u003eB). Representative images of immunohistochemical staining for EBNA1BP2, PPA1, CCT4, SLC39A10, and RAN in cancerous and adjacent non-cancerous tissues are presented in Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e9\u003c/span\u003eC. The quantification results in Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e9\u003c/span\u003eD indicate significantly increased expression of EBNA1BP2, PPA1, and SLC39A10, as well as significantly reduced expression of RAN, in cancerous tissues compared to adjacent non-cancerous tissues. 
However, the expression trend for RAN was not consistent with our previous findings. These results suggest that the hub genes EBNA1BP2, PPA1, and SLC39A10 might be potential targets for further investigation.\u003c/p\u003e"},{"header":"Discussion","content":"\u003cp\u003eIn recent years, an increasing body of epidemiological evidence has suggested an inverse relationship between Alzheimer's disease (AD) and various types of cancer[4\u0026ndash;8]. Despite some progress in understanding this relationship, mechanistic studies exploring its basis remain limited. A deeper understanding of the molecular mechanisms underlying the negative correlation between AD and cancer, including the molecular pathways that might account for the negative association between AD and different cancer types, is essential for further investigation. Advances in biomedical technologies, extensive omics datasets, and powerful bioinformatics tools have now enabled a more comprehensive and large-scale examination of the complexity of these age-related diseases. In this study, a multifaceted bioinformatics approach was used to identify the hub genes contributing to the inverse relationship between AD and CRC. By integrating gene expression data from publicly available repositories and employing a range of bioinformatics techniques, including differential gene expression analysis, weighted gene co-expression network analysis (WGCNA), and machine learning algorithms, the key genes and immune-associated biomarkers were investigated in AD and CRC. Through this analysis, we identified two hub genes (EBNA1BP2 and PPA1) for AD and four hub genes (CCT4, SLC39A10, RAN, and PPA1) for CRC, shedding light on the interconnections between these two age-related diseases. 
Furthermore, our results of validation using immunohistochemical analysis confirmed that three hub genes, including EBNA1BP2, PPA1, and SLC39A10, hold substantial potential for further investigation and provide valuable insights into their roles in the inverse relationship between AD and CRC.\u003c/p\u003e \u003cp\u003eIn AD, an imbalanced immune response could aggravate neuroinflammation and lead to cognitive decline[27]. Comparing immune cell infiltration between AD and CRC reveals that the brains of AD patients show lower infiltration than the colons of CRC patients. This difference might be partly due to the blood-brain barrier (BBB) and the unique immune environments of the two diseases[28]. However, the influence of the BBB on this discrepancy warrants further investigation. The BBB helps maintain brain homeostasis by protecting it from the systemic immune system and inflammation. In AD, a disrupted BBB could result in increased immune cell infiltration, intensifying neuroinflammation and contributing to cognitive decline. In contrast, in CRC, an appropriate level of immune cell infiltration in the intestinal immune environment could play a protective role in maintaining gut homeostasis and preventing tumor development[29]. This balance suggests that immune cells might have differential effects on disease progression in different tissue contexts. In the brain, excessive immune cell infiltration could exacerbate neurodegeneration and cognitive impairment, while in the gut, a balanced infiltration might be beneficial for CRC prevention.\u003c/p\u003e \u003cp\u003eThe PPA1 gene, which encodes inorganic pyrophosphatase (PPA1), plays a crucial role in cellular energy metabolism, particularly in the context of phosphate metabolism. This enzyme catalyzes the conversion of inorganic pyrophosphate (PPi) to inorganic phosphate (Pi), providing an alternative energy source to ATP. 
Although no direct evidence currently links PPA1 to AD, its potential involvement in mechanisms that intersect with AD pathophysiology warrants exploration. For instance, dysregulation of PPA1 could indirectly affect neuronal metabolic states, given the brain's high energy demands and the known disruptions of energy metabolism in AD[30]. Furthermore, PPA1's participation in cellular stress responses, including those to oxidative and metabolic stress, positions it as a potential factor in the neuronal damage and death observed in AD, where oxidative stress is notably elevated[30]. In addition to its metabolic and stress response roles, PPA1's influence on cell signaling pathways such as JNK/p53, Wnt/β-catenin, and PI3K/AKT/GSK-3β, which are implicated in cell proliferation and migration, also suggests a connection to the neuroinflammatory processes and neuronal death associated with AD[30]. Moreover, PPA1's modulation of apoptosis pathways could indirectly affect neuronal survival in AD, where apoptosis is a significant contributor to neurodegeneration[31]. In oncology, the overexpression of PPA1 in various cancers, including CRC, and its involvement in tumorigenesis through the aforementioned signaling pathways indicate its role in cancer cell dynamics and its association with patient prognosis[30]. The correlation between PPA1 overexpression and reduced survival rates, along with its impact on tumor cell proliferation and invasiveness, highlights PPA1 as a potential therapeutic target in CRC. Recent studies have also revealed a complex interplay of genomic alterations and epigenetic modifications in CRC pathogenesis, further emphasizing the multifaceted role of PPA1 in disease processes[32].\u003c/p\u003e \u003cp\u003eEBNA1BP2 is involved in rRNA processing, which is crucial for maintaining cellular functions because it affects protein synthesis. 
In Alzheimer's Disease (AD), abnormal protein synthesis contributes to neurodegeneration[33], while in Colorectal Cancer (CRC), dysregulation of protein synthesis is linked to uncontrolled cell proliferation[34\u0026ndash;35]. Besides, CCT4 plays a vital role in protein folding and quality control, which is essential for neuronal cell health[36]. In AD, the accumulation of misfolded proteins, such as amyloid precursor proteins and tau protein, contributes to neurodegeneration[37]. Moreover, CCT4 might play a part in the folding and stability of neurotransmitter receptors, which are crucial for neuronal signaling. Imbalances in the neurotransmitter system in AD could be linked to CCT4's function[38]. In the case of CRC, while direct links have not been established, previous research implies that CCT4 could have a role in other cancer types[39], sharing certain mechanisms with CRC. These shared mechanisms include protein folding, quality control, and regulation of the cell cycle, apoptosis, and tumor microenvironment signaling pathways[40].\u003c/p\u003e \u003cp\u003eSLC39A10, a zinc ion transporter belonging to the SLC39 family, plays a key role in regulating cellular zinc concentrations[41]. Though no direct evidence links SLC39A10 to Alzheimer's Disease (AD), the critical role of zinc ions in the nervous system[42] suggests that SLC39A10 might share common pathways or functions with AD. In the case of Colorectal Cancer (CRC), while direct links are yet to be established, the known involvement of SLC39A10 in other cancer types may offer clues for its potential association with CRC. 
These shared mechanisms might involve SLC39A10's function in gastric cancer[43], where it has been found to be upregulated and correlated with poor patient prognosis, as well as liver cancer[44], where it is proposed to promote tumor invasiveness.\u003c/p\u003e \u003cp\u003eThe major strengths of our study include the integrative bioinformatics approach, the application of machine learning techniques, and the experimental validation of hub genes through IHC analysis. However, the study faces some limitations. First, expanding the experimental validation and number of samples used for IHC analysis could enhance the robustness and generalizability of the findings. Second, although the two datasets used were extensive, incorporating more diverse datasets might improve the validity and applicability of our findings to different populations and conditions.\u003c/p\u003e \u003cp\u003eIn conclusion, our study contributes to a better understanding of the molecular basis linking AD and CRC, providing valuable insights into the pathophysiology of these diseases. Our findings may facilitate the development of novel diagnostic biomarkers and targeted therapeutics for AD and CRC, ultimately benefiting patients suffering from these devastating conditions. Further investigations should be undertaken to validate these hub genes' functional roles and uncover additional molecular mechanisms.\u003c/p\u003e"},{"header":"Conclusion","content":"\u003cp\u003eIn this study, we identified three hub genes (EBNA1BP2, PPA1, and SLC39A10) that may contribute to the inverse relationship between Alzheimer's disease (AD) and colorectal cancer (CRC). We also highlighted the significant role of the immune system in connecting AD and CRC processes and described the distinct immune cell infiltration levels in these diseases. 
Our findings lay the foundation for developing novel diagnostic and therapeutic strategies for AD and CRC.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eEthics approval and consent to participate\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe study, encompassing both human participants and animal experiments, was conducted under strict ethical guidelines. Approval for the research involving human subjects was obtained from the Clinical Research Ethics Committee of the First Hospital of Hebei Medical University, with the assigned approval number 2023-00043. Prior to their inclusion in the study, informed consent was secured from all individual participants. Concurrently, the animal experiments were carried out following the ethical standards set by the same institution, with the procedures receiving approval under the number No.20200683.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConsent for publication\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAvailability of data and materials\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe gene expression data used in this study were obtained from the Gene Expression Omnibus (GEO) database. The specific dataset identifiers will be provided in the manuscript upon acceptance.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare that they have no competing interests.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis work was supported by the Hebei Provincial Natural Science Foundation (Grant Nos. H2024206245 and H2024206280), Key Projects of Hebei Provincial Administration of Traditional Chinese Medicine (Grant No. Z2022015), Visiting Professor Collaboration Project (Grant No. 2024KZ03), and Outstanding Youth Science Foundation of XingHuo Project (Grant No. 
XH202401) of the First Hospital of Hebei Medical University.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthors' contributions\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWanchang Wang performed the bioinformatics analysis, contributed to drafting the manuscript, and participated in interpretation of data. Qianqian Yang assisted in manuscript preparation, providing valuable input to the text. Menglan Zhang and Yuxuan Xu carried out the IHC experiments and analyzed the results. Yanhong Yang supplied the CRC patient samples, ensuring the quality of the specimens used in the study. Siyu Jiang, Lu Zhao, and Bingxin Li contributed to the experimental work and data acquisition. Zhaoyu Gao and Na Zhao provided guidance on methodology and experimental design, as well as contributed to manuscript editing. Rui Zhang and Shunjiang Xu, as the principal investigators, supervised the study and provided critical input on the manuscript, ensuring it adhered to high scientific standards. All authors reviewed and approved the final manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAcknowledgements\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthors' information\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eFranzmeier, N, Dewenter, A, Frontzkowski, L, et al. Patient-centered connectivity-based prediction of tau pathology spread in Alzheimer\u0026apos;s disease. Sci Adv. 2020; 6 (48): doi: 10.1126/sciadv.abd1327\u003c/li\u003e\n\u003cli\u003eEstimation of the global prevalence of dementia in 2019 and forecasted prevalence in 2050: an analysis for the Global Burden of Disease Study 2019. Lancet Public Health. 2022; 7 (2): e105-e125. doi: 10.1016/S2468-2667(21)00249-8\u003c/li\u003e\n\u003cli\u003eSiegel, RL, Miller, KD, Fuchs, HE, et al. Cancer statistics, 2022. CA-CANCER J CLIN. 2022; 72 (1): 7-33. 
doi: 10.3322/caac.21708\u003c/li\u003e\n\u003cli\u003eYarchoan, M, James, BD, Shah, RC, et al. Association of Cancer History with Alzheimer\u0026apos;s Disease Dementia and Neuropathology. J ALZHEIMERS DIS. 2017; 56 (2): 699-706. doi: 10.3233/JAD-160977\u003c/li\u003e\n\u003cli\u003eShafi, O. Inverse relationship between Alzheimer\u0026apos;s disease and cancer, and other factors contributing to Alzheimer\u0026apos;s disease: a systematic review. BMC Neurol. 2016; 16 (1): 236. doi: 10.1186/s12883-016-0765-2\u003c/li\u003e\n\u003cli\u003eDong, Z, Xu, M, Sun, X, et al. Mendelian randomization and transcriptomic analysis reveal an inverse causal relationship between Alzheimer\u0026apos;s disease and cancer. J Transl Med. 2023; 21 (1): 527. doi: 10.1186/s12967-023-04357-3\u003c/li\u003e\n\u003cli\u003eKaranth, SD, Katsumata, Y, Nelson, PT, et al. Cancer diagnosis is associated with a lower burden of dementia and less Alzheimer\u0026apos;s-type neuropathology. BRAIN. 2022; 145 (7): 2518-2527. doi: 10.1093/brain/awac035\u003c/li\u003e\n\u003cli\u003eOspina-Romero, M, Glymour, MM, Hayes-Larson, E, et al. Association Between Alzheimer Disease and Cancer With Evaluation of Study Biases: A Systematic Review and Meta-analysis. JAMA Netw Open. 2020; 3 (11): e2025515. doi: 10.1001/jamanetworkopen.2020.25515\u003c/li\u003e\n\u003cli\u003eKazemi, E, Zayeri, F, Baghestani, A, et al. Trends of Colorectal Cancer Incidence, Prevalence and Mortality in Worldwide From 1990 to 2017. IRAN J PUBLIC HEALTH. 2023; 52 (2): 436-445. doi: 10.18502/ijph.v52i2.11897\u003c/li\u003e\n\u003cli\u003eFerlay J, Ervik M, Lam F, et al. Global cancer Observatory: cancer today. Lyon, France: international agency for research on cancer[J]. 2018.\u003c/li\u003e\n\u003cli\u003eBhardwaj, A, Liyanage, SI, Weaver, DF. Cancer and Alzheimer\u0026apos;s Inverse Correlation: an Immunogenetic Analysis. MOL NEUROBIOL. 2023; 60 (6): 3086-3099. 
doi: 10.1007/s12035-023-03260-8\u003c/li\u003e\n\u003cli\u003eYang, C, Delcher, C, Shenkman, E, et al. Machine learning approaches for predicting high cost high need patient expenditures in health care. Biomed Eng Online. 2018; 17 (Suppl 1): 131. doi: 10.1186/s12938-018-0568-3\u003c/li\u003e\n\u003cli\u003eLiang, WS, Dunckley, T, Beach, TG, et al. Gene expression profiles in anatomically and functionally distinct regions of the normal aged human brain. PHYSIOL GENOMICS. 2006; 28 (3): 311-22. doi: 10.1152/physiolgenomics.00208.2006\u003c/li\u003e\n\u003cli\u003ePiras, IS, Krate, J, Delvaux, E, et al. Transcriptome Changes in the Alzheimer\u0026apos;s Disease Middle Temporal Gyrus: Importance of RNA Metabolism and Mitochondria-Associated Membrane Genes. J ALZHEIMERS DIS. 2019; 70 (3): 691-713. doi: 10.3233/JAD-181113\u003c/li\u003e\n\u003cli\u003eShen, A, Liu, L, Huang, Y, et al. Down-Regulating HAUS6 Suppresses Cell Proliferation by Activating the p53/p21 Pathway in Colorectal Cancer. Front Cell Dev Biol. 2022; 9 772077. doi: 10.3389/fcell.2021.772077\u003c/li\u003e\n\u003cli\u003eGao, P, He, M, Zhang, C, et al. Integrated analysis of gene expression signatures associated with colon cancer from three datasets. GENE. 2018; 654 95-102. doi: 10.1016/j.gene.2018.02.007\u003c/li\u003e\n\u003cli\u003eBarrett, T, Wilhite, SE, Ledoux, P, et al. NCBI GEO: archive for functional genomics data sets--update. NUCLEIC ACIDS RES. 2012; 41 (Database issue): D991-5. doi: 10.1093/nar/gks1193\u003c/li\u003e\n\u003cli\u003eRitchie, ME, Phipson, B, Wu, D, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. NUCLEIC ACIDS RES. 2015; 43 (7): e47. doi: 10.1093/nar/gkv007\u003c/li\u003e\n\u003cli\u003eLangfelder, P, Horvath, S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics. 2008; 9 559. doi: 10.1186/1471-2105-9-559\u003c/li\u003e\n\u003cli\u003eSeo, J, Gordish-Dressman, H, Hoffman, EP. 
An interactive power analysis tool for microarray hypothesis testing and generation. BIOINFORMATICS. 2006; 22 (7): 808-14. doi: 10.1093/bioinformatics/btk052\u003c/li\u003e\n\u003cli\u003eSherman, BT, Hao, M, Qiu, J, et al. DAVID: a web server for functional enrichment analysis and functional annotation of gene lists (2021 update). NUCLEIC ACIDS RES. 2022; 50 (W1): W216-W221. doi: 10.1093/nar/gkac194\u003c/li\u003e\n\u003cli\u003eFriedman, J, Hastie, T, Tibshirani, R. Regularization Paths for Generalized Linear Models via Coordinate Descent. J Stat Softw. 2010; 33 (1): 1-22. PMID: 20808728\u003c/li\u003e\n\u003cli\u003eLin, X, Li, C, Zhang, Y, et al. Selecting Feature Subsets Based on SVM-RFE and the Overlapping Ratio with Applications in Bioinformatics. Molecules. 2017; 23 (1): doi: 10.3390/molecules23010052\u003c/li\u003e\n\u003cli\u003eLiu, Y, Zhao, H. Variable importance-weighted Random Forests. QUANT BIOL. 2017; 5 (4): 338-351. PMID: 30034909\u003c/li\u003e\n\u003cli\u003eNewman, AM, Liu, CL, Green, MR, et al. Robust enumeration of cell subsets from tissue expression profiles. NAT METHODS. 2015; 12 (5): 453-7. doi: 10.1038/nmeth.3337\u003c/li\u003e\n\u003cli\u003eVarghese, F, Bukhari, AB, Malhotra, R, et al. IHC Profiler: an open source plugin for the quantitative evaluation and automated scoring of immunohistochemistry images of human tissue samples. PLoS One. 2014; 9 (5): e96801. doi: 10.1371/journal.pone.0096801\u003c/li\u003e\n\u003cli\u003eLopez-Rodriguez, AB, Hennessy, E, Murray, CL, et al. Acute systemic inflammation exacerbates neuroinflammation in Alzheimer\u0026apos;s disease: IL-1\u0026beta; drives amplified responses in primed astrocytes and neuronal network dysfunction. ALZHEIMERS DEMENT. 2021; 17 (10): 1735-1755. doi: 10.1002/alz.12341\u003c/li\u003e\n\u003cli\u003eMcLarnon, JG. A Leaky Blood-Brain Barrier to Fibrinogen Contributes to Oxidative Damage in Alzheimer\u0026apos;s Disease. Antioxidants (Basel). 
2021; 11 (1): doi: 10.3390/antiox11010102\u003c/li\u003e\n\u003cli\u003ePark, EM, Chelvanambi, M, Bhutiani, N, et al. Targeting the gut and tumor microbiota in cancer. NAT MED. 2022; 28 (4): 690-703. doi: 10.1038/s41591-022-01779-2\u003c/li\u003e\n\u003cli\u003eWang, S, Wei, J, Li, S, et al. PPA1, an energy metabolism initiator, plays an important role in the progression of malignant tumors. Front Oncol. 2022; 12 1012090. doi: 10.3389/fonc.2022.1012090\u003c/li\u003e\n\u003cli\u003eNiu, H, Zhu, J, Qu, Q, et al. Crystallographic and modeling study of the human inorganic pyrophosphatase 1: A potential anti-cancer drug target. PROTEINS. 2021; 89 (7): 853-865. doi: 10.1002/prot.26064\u003c/li\u003e\n\u003cli\u003eMenteş, M, Yandım, C. Identification of PPA1 inhibitor candidates for potential repurposing in cancer medicine. J CELL BIOCHEM. 2023; 124 (10): 1646-1663. doi: 10.1002/jcb.30475\u003c/li\u003e\n\u003cli\u003eCozachenco, D, Ribeiro, FC, Ferreira, ST. Defective proteostasis in Alzheimer\u0026apos;s disease. AGEING RES REV. 2023; 85 101862. doi: 10.1016/j.arr.2023.101862\u003c/li\u003e\n\u003cli\u003eSchmidt, S, Denk, S, Wiegering, A. Targeting Protein Synthesis in Colorectal Cancer. Cancers (Basel). 2020; 12 (5): doi: 10.3390/cancers12051298\u003c/li\u003e\n\u003cli\u003eLiao, P, Wang, W, Shen, M, et al. A positive feedback loop between EBP2 and c-Myc regulates rDNA transcription, cell proliferation, and tumorigenesis. Cell Death Dis. 2014; 5 e1032. doi: 10.1038/cddis.2013.536\u003c/li\u003e\n\u003cli\u003eBoudiaf-Benmammar, C, Cresteil, T, Melki, R. The cytosolic chaperonin CCT/TRiC and cancer cell proliferation. PLoS One. 2013; 8 (4): e60895. doi: 10.1371/journal.pone.0060895\u003c/li\u003e\n\u003cli\u003eD\u0026uuml;zel, E, Ziegler, G, Berron, D, et al. Amyloid pathology but not APOE \u0026epsilon;4 status is permissive for tau-related hippocampal dysfunction. BRAIN. 2022; 145 (4): 1473-1485. 
doi: 10.1093/brain/awab405\u003c/li\u003e\n\u003cli\u003eGhozlan, H, Cox, A, Nierenberg, D, et al. The TRiCky Business of Protein Folding in Health and Disease. Front Cell Dev Biol. 2022; 10 906530. doi: 10.3389/fcell.2022.906530\u003c/li\u003e\n\u003cli\u003eTang, C, Li, C, Chen, C, et al. LINC01234 promoted malignant behaviors of breast cancer cells via hsa-miR-30c-2-3p/CCT4/mTOR signaling pathway. TAIWAN J OBSTET GYNE. 2024; 63 (1): 46-56. doi: 10.1016/j.tjog.2023.09.019\u003c/li\u003e\n\u003cli\u003eLi, F, Liu, CS, Wu, P, et al. CCT4 suppression inhibits tumor growth in hepatocellular carcinoma by interacting with Cdc20. CHINESE MED J-PEKING. 2021; 134 (22): 2721-2729. doi: 10.1097/CM9.0000000000001851\u003c/li\u003e\n\u003cli\u003eHe, X, Ge, C, Xia, J, et al. The Zinc Transporter SLC39A10 Plays an Essential Role in Embryonic Hematopoiesis. Adv Sci (Weinh). 2023; 10 (17): e2205345. doi: 10.1002/advs.202205345\u003c/li\u003e\n\u003cli\u003eKumar, V, Kumar, A, Singh, K, et al. Neurobiology of zinc and its role in neurogenesis. EUR J NUTR. 2021; 60 (1): 55-64. doi: 10.1007/s00394-020-02454-3\u003c/li\u003e\n\u003cli\u003eRen, X, Feng, C, Wang, Y, et al. SLC39A10 promotes malignant phenotypes of gastric cancer cells by activating the CK2-mediated MAPK/ERK and PI3K/AKT pathways. EXP MOL MED. 2023; 55 (8): 1757-1769. doi: 10.1038/s12276-023-01062-5\u003c/li\u003e\n\u003cli\u003eMa, Z, Li, Z, Wang, S, et al. SLC39A10 Upregulation Predicts Poor Prognosis, Promotes Proliferation and Migration, and Correlates with Immune Infiltration in Hepatocellular Carcinoma. J Hepatocell Carcinoma. 2021; 8 899-912. 
doi: 10.2147/JHC.S320326\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"Alzheimer's disease, Colorectal cancer, bioinformatics analysis, immune cells infiltration","lastPublishedDoi":"10.21203/rs.3.rs-4806177/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-4806177/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eAlzheimer's disease (AD) and colorectal cancer (CRC) are two kind of age-related diseases with a negative correlation in risk of prevalence. In this study, we aimed to identify the hub genes and immune-associated biomarkers contributing to the inverse relationship between AD and CRC. The gene expression data from public repositories and the bioinformatics techniques, including differentially expressed genes (DEGs) analysis, weighted gene co-expression network analysis (WGCNA), and machine learning algorithms, were integrated to screen the hub genes that are inversely expressed in AD and CRC. 
The immunohistochemistry (IHC) analysis was performed to validate the identified hub genes in the cancer tissues from CRC patients or brain tissues from 5\u0026times;FAD mice. We have identified 6 hub genes, including EBNA1BP2, PPA1, CCT4, SLC39A10, RAN, and PPA1, which potentially play critical roles in the negative correlation between AD and CRC and might provide valuable insights for the diagnosis, therapy, and prognosis of AD or CRC. Functional enrichment analysis highlighted the immune system's crucial roles in connecting AD and CRC processes. Moreover, the percent of immune cell infiltration in brain or colorectal tissues were different in patients with AD or CRC, offering insights for targeted immunotherapies. Finally, the expression of EBNA1BP2, PPA1 and SLC39A10 were validated to be downregulated in AD, but upregulated in CRC. In conclusion, these results suggested that some hub genes, such as EBNA1BP2, PPA1 and SLC39A10, might contribute to the inverse relationship between AD and CRC, which lay a foundation for further investigating the underlying mechanism, as well as for the development of novel diagnostic and therapeutic strategies for this two diseases.\u003c/p\u003e","manuscriptTitle":"Identification of hub genes contributed to the negative correlation between the incidence of Alzheimer's disease and colorectal cancer via integrated bioinformatics analysis and machine learning","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2024-09-02 18:58:15","doi":"10.21203/rs.3.rs-4806177/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research 
Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"f7195a15-95f2-4fad-83a1-110141460595","owner":[],"postedDate":"September 2nd, 2024","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2024-09-27T12:10:34+00:00","versionOfRecord":[],"versionCreatedAt":"2024-09-02 18:58:15","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-4806177","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-4806177","identity":"rs-4806177","version":["v1"]},"buildId":"qtupq5eGEP_6zYnWcrvyt","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
