A breast tissue-specific epigenetic clock provides accurate chronological age predictions and reveals de-correlation of age and DNA methylation in tumor-adjacent and tumor samples

Research Square

Miguel Quintela-Fandino, Leonardo Garma, Sonia Pernas, David Vicente Baz, and 7 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-6222303/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted (Version 1)

Abstract

Epigenetic clocks have been widely used to estimate biological age across various tissues, but their accuracy in breast tissue remains suboptimal. Pan-tissue models, such as Horvath’s and Hannum’s clocks, perform poorly in predicting chronological age in breast tissue, underscoring the need for a tissue-specific approach.
In this study, we introduce a Breast Tissue-specific Epigenetic Clock (BTEC), developed using DNA methylation data from 553 healthy breast tissue samples across seven different studies. BTEC significantly outperformed pan-tissue clocks, demonstrating superior correlation with chronological age (r = 0.88) and lower prediction errors (MAE = 3.27 years) without requiring dataset-specific regression adjustments. BTEC’s chronological age predictions for tumor-adjacent samples showed distortions, with an average deviation of -1.76 years, which was even more pronounced in tumor samples, where the average difference between predicted and chronological age was -12.29 years. When analyzed by molecular subtype, the distortion was greater in the more aggressive HER2+ and TNBC tumors compared to HR+ tumors. The probes used by BTEC were associated with known oncogenes and with genes involved in estrogen metabolism, cadherin binding and fibroblast growth factor binding. Despite the general rejuvenation observed in tumor tissue compared to normal breast, the correlation between BTEC’s predictions and cancer-related survival indicated that TNBC tumors with increased epigenetic ages had significantly lower survival.

Biological sciences/Molecular biology/Epigenetics/DNA methylation
Biological sciences/Cancer/Breast cancer
Health sciences/Oncology/Cancer/Breast cancer

Introduction

Aging is a major risk factor for numerous diseases 1 , 2 , and the largest risk factor for cancer 3 . As women age, their risk of developing breast cancer increases substantially, with incidence rising sharply until menopause and continuing to climb at a slower rate thereafter 4 . This age-related risk is driven by a combination of factors, including accumulated genetic mutations, prolonged exposure to hormones, and changes in breast tissue 5 .
Epigenetic changes have been found to have a strong relationship with aging 6 , and some researchers have even suggested that epigenetic modifications drive the aging process 7 . This relationship led to the development of epigenetic clocks or epiclocks 8 – 10 , which are mathematical models that predict chronological age based on the methylation state of specific regions in the DNA 11 . The difference or deviation between the predicted (also called biological or epigenetic) age and the actual chronological age (i.e., the prediction error) is commonly referred to as epigenetic age acceleration (EAA) and is often interpreted as a measure of aging rate or biological aging. Epigenetic clocks have been used to link EAA to multiple features and pathologies, from cancer 12 , 13 to psychiatric disorders 14 – 16 or socio-economic status 17 , 18 . However, recent research 19 , 20 suggests that epigenetic clocks designed to estimate chronological age can be effectively constructed using blood DNA methylation patterns even at random CpG sites. In contrast, clocks aimed at capturing specific biological traits associated with aging would need to be based on measurement of non-stochastic, biologically regulated methylation events. Therefore, we hypothesize that accurately measuring biological age in breast tissue and uncovering the molecular changes associated with aging require the development of a novel, tissue-specific epigenetic clock. An accurate breast-specific biological aging clock could be valuable in several clinical contexts, such as refining risk assessment in screening programs, predicting the risk of breast cancer development, or contributing to a better understanding of breast aging and hormonal influence. Popular epigenetic clocks, such as Hannum’s or Horvath’s models, were intended as multi-organ models 9 , 10 .
Yet, their developers observed low performances on breast tissue: Horvath reported a correlation coefficient of 0.73 between the epigenetic age calculated by its model and chronological age in 23 normal breast tissue samples of the training set of his model, and noted that the model is poorly calibrated in breast tissue. Likewise, Hannum’s clock reached a correlation of 0.72 when tested on data from 83 tumor-adjacent breast tissue samples from the TCGA, and the authors suggested that the intercept and slope of the model needed to be re-adjusted for breast. Subsequent studies testing multiple clocks reported correlations between 0.35 and 0.78, confirming the limited capability of these models to work with breast tissue data 21 – 23 . Despite these limitations, these models have been applied to breast-tissue data (regressing them on age to provide a better fit) in order to address several clinical questions of relevance, such as whether DNA methylation age is or not elevated compared to other tissues in women, if there is epigenetic acceleration or not in the tumor or in the tumor-adjacent compartment compared to tumor-distant tissue, or determining whether breast tumors are epigenetically older or younger than the patient herself. To date, it has been suggested that DNA methylation age is elevated in breast tissue of healthy women 24 ; that there is significant epigenetic age acceleration (EAA) in tumor-adjacent compared to normal breast tissue 22 ; and that tumor tissue presents higher epigenetic age 25 than the host. Recently, a deep-learning pan-tissue epigenetic clock demonstrated improved accuracy in predicting chronological age when applied to breast tissue samples. However, its performance varied significantly across datasets, with r values ranging from − 0.703 to 0.858 across 5 test datasets comprising 100 samples 26 . 
This model found that breast tumors exhibited significantly increased EAA compared to normal breast tissue, with an average acceleration of 4.542 years. Castle et al. developed a tissue-specific model with high performance (r = 0.88, MAE = 4.2 years) in a test set of 91 healthy donor samples 27 . While they observed no difference in EAA between normal and tumor-adjacent samples, they reported increased EAA in tumor samples. Notably, they reported that late-stage tumors exhibited a non-significant negative EAA. Unfortunately, their model was developed using genome-wide DNA methylation data, making it incompatible with most publicly available data, which were generated using methylation microarray platforms. Here, we evaluated the performance of various pan-tissue epigenetic clocks on a large and diverse DNA methylation dataset from breast tissue of healthy donors, consisting of 553 samples across 7 different studies. After all models produced low-accuracy chronological age predictions, we developed the Breast Tissue-specific Epigenetic Clock (BTEC), which significantly improved chronological age predictions for healthy tissue samples. When applied to tumor and tumor-adjacent samples, BTEC’s predictions were notably distorted, generally showing reduced epigenetic age acceleration (EAA) and contradicting findings from previous models. This distortion was more pronounced in tumor samples than in tumor-adjacent tissues, with the greatest effect observed in triple-negative breast cancer (TNBC) tumors, followed by HER2+ tumors, and the least in hormone receptor-positive (HR+) cases. Despite the overall decrease in EAA in tumors compared to normal breast tissue, the relationship between BTEC’s predictions and cancer survival indicated that TNBC tumors with increased epigenetic ages had significantly lower survival.
Results

Breast tissue DNA methylation data

Previous studies examining epigenetic age in breast tissue have relied on relatively small datasets, ranging from 23 to 200 subjects 9 , 22 , 26 , limiting the robustness and generalizability of the models. This limitation is evident in the varying correlations between predicted and chronological age reported both in the original studies and in subsequent applications. To improve upon these models, we compiled a larger dataset by integrating breast tissue DNA methylation data from multiple sources, generated using Illumina’s 450k and EPIC DNA methylation arrays. Our dataset includes samples from female donors from 13 different studies, comprising 553 normal (healthy donor) samples, 362 tumor-adjacent samples, and 1,108 tumor samples (Fig. 1 A). The age of the healthy sample donors ranged from 13 to 90 years, while the tumor and tumor-adjacent samples were obtained from subjects with ages between 24 and 93 years. Out of the 1,108 tumor samples, 700 were labeled with a specific molecular subtype: 256 hormone-receptor positive (HR+), 82 HER2-positive (HER2+) and 362 triple-negative breast cancer (TNBC) samples (Fig. 1 C). Detailed information about the data sources and the characteristics of the corresponding cohorts is provided in Supplementary Table 1.

Performance of multi-organ epiclocks

We used the combined dataset of 553 samples from healthy donors to evaluate the accuracy of chronological age predictions from four pan-tissue epigenetic clocks: Hannum 25 , Horvath 9 , PhenoAge 28 and AltumAge 26 . Each of these models predicted the age of the donors based on breast tissue DNA methylation data (“pan-to-breast” prediction). For comparison, we used two baseline models: a naïve model that always predicted the average age of the donors (41.01 years) and a random model which made random predictions within the entire age range of the dataset (between 13 and 90 years).
The results showed that the four epigenetic clocks produced predictions that were correlated with chronological age (r values ranging from 0.5 to 0.84; Fig. 2 A-D). However, all four models exhibited considerable root-mean-squared error (RMSE) values, ranging from 9.17 to 17.58 years (Table 1 ). AltumAge performed the best, with the lowest prediction error and highest correlation, followed by Horvath’s model in terms of correlation. However, both Horvath’s model and PhenoAge showed larger errors than the naïve model, in terms of both RMSE and median absolute error (MAE), indicating their relatively low performance. All models outperformed the random predictor.

Table 1. Root-mean-squared error (RMSE), median absolute error (MAE) and Pearson’s correlation coefficient (r) for each model on the healthy donor samples. The naïve model predicts the average age for all samples (41.01 years); the random model predicts random values between 13 and 90 years.

Model       RMSE     MAE      r
Hannum      11.78    7.12     0.69
Horvath     14.77    13.63    0.82
PhenoAge    17.58    11.88    0.5
AltumAge    9.17     6.57     0.84
Naïve       13.93    9.01     N/A
Random      28.10    20.14    -0.06

A Breast Tissue-specific Epigenetic Clock outperforms pan-tissue models

To address the limitations of pan-tissue models, we developed the Breast Tissue-specific Epigenetic Clock (BTEC). For training this model, we utilized 406 healthy donor samples from five studies as the training set, and reserved data from two additional studies, comprising 147 samples, for validation (Supplementary Table 2). We employed linear and quadratic terms in constructing the model, using elastic net regression. To select the appropriate CpG probes, we focused on those conserved across the 450k, EPICv1, and EPICv2 microarrays that showed significant (corrected p-value < 0.05) linear (N = 201,527) or quadratic (N = 179,062) correlations with chronological age in the training set.
From the resulting 380,589 linear and quadratic features, we selected the top 0.5% based on the highest absolute Pearson correlation with chronological age (|r| > 0.382, N = 1,962) to train the model using elastic net (Supplementary Table 3). We set the L1 ratio to 0.5 and optimized the alpha value to 0.0013 through 5-fold cross-validation on augmented training data (see Methods). Using these optimized parameters, we applied a Leave-One-Out Cohort (LOOC) approach, where models were trained on 4 out of 5 datasets (N-1) and used to predict chronological age on the excluded cohort. This approach resulted in a strong correlation (greater than 0.89 in all cases) between the predicted and actual chronological ages. The RMSE was below 7.5 years in all but one case, where a systematic error was observed (Fig. 3 A, Supplementary Table 4, Supplementary Fig. 1). We then re-trained the model using the same parameters and the entire training set. The resulting BTEC included 637 linear and 549 quadratic terms derived from 799 CpG probes, along with an intercept value (Supplementary Table 5). When applied to the validation data, BTEC produced epigenetic age predictions that were highly correlated with chronological age (r = 0.88) and demonstrated high accuracy (RMSE = 6.31, MAE = 3.27; Fig. 3 B). In the same validation data (N = 147), the prediction error distribution from BTEC was the only one that did not deviate significantly from zero (one-sample t-test, p < 0.01) (Fig. 3 C). Consequently, the absolute prediction error was significantly lower for BTEC compared to the other four models (Fig. 3 D). The correlation with chronological age, along with the MAE and RMSE values indicated that BTEC outperformed the other models overall (Table 2 ). 
Table 2. Results from each model on the validation dataset (N = 147, healthy donor samples).

Model       RMSE     MAE      r
Hannum      11.09    7.09     0.71
Horvath     15.16    13.56    0.82
PhenoAge    16.03    10.65    0.54
AltumAge    7.28     4.26     0.87
BTEC        6.31     3.27     0.88

Epigenetic age alterations in tumor-adjacent and tumor samples

EAA is typically calculated as the difference between the age predicted by an epigenetic clock and the chronological age. However, due to the poor performance of pan-tissue epigenetic clocks on breast tissue DNA methylation data, several studies have instead defined EAA as the residuals from a linear regression of the predicted epigenetic age on chronological age 10 , 21 – 23 . This ad-hoc solution introduces a bias, as the regression is computed on a per-study or per-dataset basis, making it difficult to compare results across different studies. Since BTEC produced accurate age predictions with prediction errors centered around zero for healthy tissue (i.e., it did not detect accelerated or decelerated aging in normal samples), we chose to compute EAA directly as the difference between predicted and chronological ages. The predictions for cancer patient samples were generally lower than the chronological ages (Fig. 4 A). Both tumor-adjacent and tumor samples had significantly lower epigenetic ages compared to normal samples (Fig. 4 B). The average EAAs were -1.76 years (range: -41.74 to 30.91) for tumor-adjacent samples and -12.29 years (range: -98.86 to 56.76) for tumor samples, whereas for healthy samples, the average EAA was 0.82 years. Contrary to other studies, BTEC’s results therefore indicate that tumors are “rejuvenated” relative to the host.
When tumor samples were categorized by molecular subtype, HER2 + samples showed significantly lower EAA (median = -18.13 years, range: -56.03 to + 56.18) compared to HR + samples (median = -4.99 years, range: -66.28 to + 51.56), with TNBC cases exhibiting the largest negative EAA (median = -18.54 years, range: -71.86 to + 43.8; Figs. 4 C and D). To better understand the differences between tumors with accelerated (EAA > 0) or decelerated (EAA < 0) epigenetic ages as predicted by BTEC, we analyzed the transcriptomic data from Terunuma et al.'s study 29 . The differential expression analysis revealed that tumors with EAA < 0 over-expressed 69 genes (p-value 1), including several collagens ( COL8A1 , COL1A1 , COL12A1 , COL1A2 , COL11A1 , COL10A1 , CTHRC1 ) and fibronectins ( FN1 , FNDC1 ), which are involved in the epithelium-mesenchymal transition, as well as other proteins related to extracellular matrix remodeling ( CEMIP , MMP11 , BGN , SULF1 , FAP ) (Supplementary Table 6). These findings suggest that tumors with EAA 0 were enriched in 145 genes (p-value 1), including PI3k-Akt signaling kinases ( NRTK2 , PRKAA2 , KIT ) and genes with prognostic value ( EGFR , MET , SHC4 , PGR ) (Supplementary Table 6). Overall, this gene expression profile indicates that tumors with EAA > 0 exhibit patterns typically associated with poor prognosis. Tumor epigenetic age is not determined by the replication rate Previous studies have shown that tumors exhibit accelerated biological aging 25 . To explore a possible explanation for our contrasting findings, we investigated whether the biological age of tumors predicted by BTEC correlates with the number of cell replications, as proposed by Horvath 9 . The expression of Ki-67, a well-known biomarker of cell proliferation 30 , is widely used in oncology to assess tumor aggressiveness 31 . 
In breast cancer specifically, Ki-67 levels are used to classify hormone receptor-positive (HR+) tumors into the Luminal A and Luminal B subtypes 32 . Here, we used the Ki-67 annotations available in two of the datasets we collected (GSE69914 33 and GSE141441 34 ) to explore whether this biomarker was related to the chronological age of the donors, their epigenetic age predicted by BTEC, or their EAA. Ki67 data were not reported individually in the study from Gao et al. 33 ; instead, patients were grouped into two categories based on their Ki-67 status: low (patients with Ki-67 below 14%) and high (patients with Ki-67 at or above 14%). This classification, based on a 13% cut-off, comes from a previous large study in which luminal-type breast cancer showed distinct clinical courses and endocrine sensitivity depending on this threshold 35 . We observed that the distribution of chronological ages did not significantly differ between the groups (p > 0.05, two-sided Wilcoxon test). However, patients with higher Ki-67 values exhibited significantly lower epigenetic ages and EAAs (p < 0.05, two-sided Wilcoxon test; Fig. 5 A). Contrary to the assumption that accelerated biological aging is a result of repeated replication cycles, these findings suggest the opposite: tumors with lower epigenetic ages and EAAs may retain the highest replicative potential. In the dataset from Fackler et al. 34 , which provided detailed Ki-67 values for all subjects, we observed low correlations between Ki-67 value and age (r = 0.15), epigenetic age (r = 0.2), and EAA (r = 0.29) (Fig. 5 B). When we grouped subjects using the same Ki-67 threshold as in Gao et al.'s study (14%), the only significant difference (p < 0.05, two-sided Wilcoxon test) was observed in the EAA values. Specifically, the group with Ki-67 ≥ 14% exhibited higher EAA values (data not shown). 
The inconsistent and contradictory results in these two datasets suggest that there is no clear relationship between the epigenetic age or the EAA determined by BTEC and the Ki-67 values. This implies that the tumors’ biological age predicted by BTEC is not directly determined by the number of replications the tumor cells have undergone.

BTEC’s EAA magnitude is associated with long-term prognosis in TNBC

To investigate whether the tumor aging rates assigned by BTEC are linked to clinical outcomes, we analyzed data from two large datasets of TNBC patients: GSE141441 34 , which includes relapse-free times for 164 patients, and GSE78754 36 , which provides survival times for 63 TNBC patients. Our findings showed that patients with EAA > 0 based on BTEC’s predictions had an increased but statistically non-significant relapse hazard (mean HR = 1.58, p > 0.05 likelihood ratio test; Fig. 5 C). However, they exhibited a significantly lower survival probability (mean HR = 3.07, p < 0.05 likelihood ratio test; p < 0.05 log-rank test; Fig. 5 D). In contrast, classifying patients by accelerated or decelerated epigenetic age using the Horvath or AltumAge models did not yield significant differences in either relapse or survival outcomes. These results suggest that, unlike pan-tissue models, BTEC’s predictions from tumor tissue may have prognostic value (Fig. 5 C, D).

Biological basis of BTEC predictions

To gain insight into the differences observed in the epigenetic age predictions for healthy tissue, tumor-adjacent, and tumor samples, we examined the methylation state of the probes used by BTEC across the different sample types. Overall, the average methylation levels on the 799 probes were significantly higher in the tumor-adjacent and tumor samples compared to the healthy tissue samples (p < 0.05, two-sided Wilcoxon test; Fig. 6 A).
Upon examining individual probes, we found that 751 probes in the tumor-adjacent samples and 663 probes in the tumor samples had significantly different methylation states when compared to the samples from healthy donors (adjusted p-value < 0.05, two-sided Wilcoxon test, Bonferroni corrected; Supplementary Table 7). Moving beyond the absolute methylation values, we observed that the relationship between the methylation states of BTEC's probes and chronological age was very different across the three sample groups. The feature pre-selection and the regularization applied during BTEC's training resulted in probes which had an average absolute correlation (|r|) with age of 0.4 in the healthy donor samples. However, these correlations were considerably weaker in the tumor-adjacent (mean |r| = 0.15) and tumor samples (mean |r| = 0.09; Fig. 6 B, Supplementary Table 8). In these altered tissues, the model’s probes lost the correlations with age that they exhibited in the healthy donor samples (Fig. 6 C). This loss of correlation likely explains the distortions observed in BTEC's age predictions for tumor-adjacent and tumor samples. We then explored the genes related to BTEC’s probes and their associated molecular biological functions. The 424 probes that were positively correlated with age mapped to a total of 300 genes (Supplementary Table 9). This gene set was enriched in 10 KEGG terms 37 , all of them due to the presence of UDP-glucuronosyltransferases (UGTs), except for the term "Chronic myeloid leukemia". Enrichment in this last term was due to the presence of multiple known oncogenes in the set (TP53, MYC, AKT1, ABL1). The set of 300 genes was enriched in 5 molecular functions, all related to UGTs (Fig. 6 D). The deregulation of UGTs in breast tissue has a well-established connection with breast cancer, as it interferes with the cell’s ability to properly manage estrogen metabolism 38 , 39 .
Among the 300 genes in this set, only two, TP53 and nuclear receptor corepressor 2 (NCOR2), have direct evidence linking them to aging in mammalian models, according to the GenAge database 40 . The 376 probes negatively correlated with age mapped to 261 different genes (Supplementary Table 9). This gene set was not enriched in any KEGG terms; however, it was enriched in two molecular functions: Cadherin Binding and Fibroblast Growth Factor Binding (Fig. 6 E). These functions are known to be implicated in breast cancer 41 , 42 and their deregulation could be expected in transformed tissue. Among the genes in this set, only one, ARNTL, has direct evidence of involvement in aging, according to the GenAge database 40 . These observations suggest that BTEC predominantly relies on genes with no direct link to aging and, instead, involves genes related to estrogen metabolism and cancer pathways. Finally, we examined the RNA expression data from the dataset GSE102088, generated in the breast tissue DNA methylation study from Song et al. 43 , to identify genes involved in BTEC’s predictions that also exhibited expression changes correlated with the subjects' age. We identified a total of 1,712 genes that showed a significant and non-weak (p 0.25) correlation between expression and age (Supplementary Table 10). Among these, 24 genes were associated with probes that had positive coefficients in BTEC (Supplementary Table 10). This set of 24 genes was enriched in phosphatase binding molecular functions (GO:0019903 and GO:0051721), with notable genes such as TP53 , KCNQ1 , and MFHAS1 . Another 37 genes matched those associated with probes that had negative coefficients in BTEC (Supplementary Table 10). This gene set was enriched in Cadherin Binding (GO:0045296) and Fibroblast Growth Factor Binding (GO:0017134) molecular functions, which mirrored the enrichment observed when analyzing all the genes associated with BTEC's negative coefficients.
Although the overlap between the genes associated with age in BTEC and those identified in the expression data was relatively small, as expected due to the complex relationship between DNA methylation and gene expression 44 – 46 , both data sources highlighted similar pathways and proteins. These included oncogenes (e.g., TP53 , NXN , TRIM59 ), phosphatase binding proteins (e.g., KCNQ1 , MFHAS1 ), cadherin binding proteins (e.g., PAK6 , RPL6 ), and FGF binding proteins (e.g., RPS2 , FGFR2 ). These findings suggest a connection between these molecular functions and aging in breast tissue, further supporting the relevance of BTEC’s predictions in the context of breast cancer and aging. Discussion Epigenetic clocks have emerged as valuable tools to estimate biological age based on DNA methylation patterns 11 , 47 . However, pan-tissue epigenetic clocks have demonstrated poor performance in breast tissue: early models reported correlations below 0.75 between predicted and chronological ages and large chronological age prediction errors 9 , 25 ; and the accuracy of state-of-the-art epiclocks exhibited a wide range of variation across datasets, with reported r values between − 0.703 and 0.858 26 . One possible explanation for this limitation is the scarcity of breast tissue-specific DNA methylation data, which has likely hindered the development of models that accurately capture the epigenetic aging process in this tissue. Yet, despite their shortcomings, pan-tissue models continue to be widely used in studies involving both normal and pathological breast samples 21 – 24 . To address the limitations of existing models, we compiled a large and diverse dataset from 13 different studies, representing the most comprehensive collection of breast tissue DNA methylation data to date. Using this dataset, we tested four pan-tissue epigenetic clocks. 
Our results confirmed that these models exhibited poor predictive performance and systematic errors, with some models performing worse than a naïve approach that predicts a constant age. This highlights the limitations of applying generalized epigenetic clocks to breast tissue and underscores the necessity of tissue-specific models. To overcome these issues, we developed BTEC, a breast tissue-specific epigenetic clock, which significantly outperformed all tested pan-tissue models. BTEC provided age predictions in normal breast tissue with lower errors than any of the pan-tissue models, without the need for ad-hoc, dataset-specific regressions. The results indicate a lack of intrinsic age acceleration, in contrast with the results reported by Sehl et al. 24 . When applied to tumor-adjacent and breast tumor samples, BTEC detected a significantly decreased epigenetic age acceleration (EAA) in both conditions compared to normal breast tissue. The decreased EAA implies that, unlike what was observed based on classical epiclocks 21 – 23 , BTEC determined that tumors are “rejuvenated” with respect to the host. While the tissue-specific model by Castle et al. indicated that epigenetic age is generally accelerated in tumors but decelerates in late-stage cases 27 , our findings show that BTEC consistently detects a strong trend toward tumor rejuvenation. Notably, when comparing different tumor subtypes, we observed that HER2+ tumors exhibited lower EAA than HR+ tumors, and triple-negative breast cancer (TNBC) tumors displayed even lower EAA than HER2+ tumors. This suggests that tumor subtype influences epigenetic aging patterns in breast tissue. Our analysis of the methylation states of the probes included in the BTEC model revealed that it relies on only three genes with strong evidence of involvement in aging in mammals ( ARNTL , NCOR2 , TP53 ).
Instead, BTEC predominantly utilizes the methylation state of genes and pathways linked to breast cancer, including UGTs, known oncogenes ( TP53 , MYC , AKT1 , ABL1 ), as well as FGF-binding and cadherin-binding proteins. This suggests that the aging process in breast tissue differs from patterns observed in other tissues. The distinction underscores the necessity of tissue-specific epigenetic clocks and implies that epigenetic aging in breast tissue may be more closely linked to oncogenic processes than to conventional aging pathways. We observed that the methylation states of the sites included in BTEC were significantly altered in both tumor-adjacent and tumor tissues. Overall, methylation levels were elevated, and the expected correlations between methylation beta values and chronological age were lost. The overlap between age-correlated probes in healthy tissue and probes located within cancer-related genes likely explains BTEC’s sensitivity to disease status. To further explore the biological relevance of these findings, we examined transcriptomic data. Although there was limited overlap between genes whose expression correlates with age and those associated with BTEC probes, our results suggest that age-related changes in DNA methylation and gene expression impact oncogenes, phosphatase-binding proteins, cadherin-binding proteins, and fibroblast growth factor (FGF)-binding proteins in breast tissue. Interestingly, BTEC’s EAA predictions were not associated with Ki-67, a widely used proliferation marker 30 , 31 , suggesting that epigenetic aging in breast tumors operates independently of proliferation rates. Tumors exhibiting accelerated epigenetic age displayed transcriptomic patterns indicative of poor prognosis. In contrast, tumors with decelerated epigenetic age appeared to undergo dedifferentiation.
Clinically, TNBC patients whose tumors exhibited accelerated epigenetic aging had significantly lower survival rates, suggesting that epigenetic aging profiles could have prognostic value in this breast cancer subtype. The "rejuvenation" observed in breast tumors, as indicated by the decelerated epigenetic age and lower EAA, raises the possibility of targeted therapeutic interventions. The fact that tumors seem to undergo de-differentiation processes may suggest a potential avenue for treatments aimed at reversing such changes. Anti-aging therapies, such as senescence modulation or telomere uncapping, could hold promise in this context. Senescence is a known response to DNA damage and stress, often acting as a barrier to tumor progression. By targeting the senescent cells within tumors, it may be possible to influence their epigenetic age and potentially improve treatment outcomes. Similarly, telomere uncapping therapies could counteract the rejuvenation seen in tumors, as telomere length and maintenance are closely associated with aging and cellular senescence. Future studies should investigate how such interventions could be combined with traditional treatments to improve efficacy, particularly in the context of aggressive breast cancer subtypes like TNBC. Given that we have demonstrated that the epigenetic age of both healthy and tumor tissue has significant value, an interesting next step would be to explore the integration of epigenetic age with other risk factors, such as breast density. Breast density has long been recognized as an important risk factor for breast cancer, but it currently remains an incomplete predictor of individual risk 48 , 49 . Combining epigenetic age data with breast density measurements could significantly improve risk models, providing a more accurate picture of a patient’s cancer risk profile. This could prove valuable for both early detection and personalized screening strategies. 
Further research is needed to determine the best way to incorporate these combined biomarkers into routine clinical practice for risk stratification and monitoring.

Taken together, our findings establish BTEC as an epigenetic clock capable of producing reliable chronological age predictions in healthy breast tissue. Its application to tumor-adjacent and tumor samples provides new insights into the epigenetic alterations associated with breast cancer. While breast cancer generally induces systemic increases in EAA, our results suggest that breast tumors and their surrounding tissue experience epigenetic changes in the opposite direction. Additionally, our findings emphasize the role of breast tissue-specific genes in the aging process and highlight BTEC’s potential utility as a prognostic tool. Future research should further explore the molecular mechanisms underlying these epigenetic alterations and assess the broader clinical implications of BTEC’s predictions.

Methods

Study cohorts

We used publicly available methylation data from female subjects from 13 previous studies, which we retrieved from the GEO database 50, 51. Supplementary Table 1 lists, for each study, the GEO accession number, the number of samples per condition and tumor subtype (when available), and the minimum, maximum, and median ages of the subjects.

Data collection and preprocessing

Methylation data from previous studies were obtained from the GEO database. In all cases, the data were preprocessed: we used the beta values provided by the original authors when available; otherwise, we calculated the beta value from the methylated and unmethylated signals using the following formula:

$$\beta = \frac{M}{M + U + 100}$$

where M and U are the intensities of the methylated and unmethylated signals, respectively.
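The formula above can be illustrated with a minimal Python sketch. The function name `beta_value` and the `offset` parameter are hypothetical names introduced here for illustration; only the expression M / (M + U + 100) comes from the text.

```python
def beta_value(methylated, unmethylated, offset=100.0):
    """Compute a methylation beta value as M / (M + U + offset).

    `methylated` (M) and `unmethylated` (U) are signal intensities.
    The default offset of 100 matches the formula in the text; it keeps
    the ratio defined and stable when both intensities are close to zero.
    """
    if methylated < 0 or unmethylated < 0:
        raise ValueError("signal intensities must be non-negative")
    return methylated / (methylated + unmethylated + offset)


# Hypothetical intensities for three probes: (M, U) pairs.
probe_signals = [(900.0, 0.0), (450.0, 450.0), (0.0, 500.0)]
betas = [beta_value(m, u) for m, u in probe_signals]
```

Because of the offset in the denominator, the result always lies in the interval [0, 1) and never reaches exactly 1, even for a fully methylated probe.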
We modified the original sample labels as follows: we labeled as “Tumor adjacent” all samples with the annotations “adjacent normal” and “ipsilateral normal”; samples with “reduction mammoplasty” and “prophylactic” were relabeled as “Healthy donor”; and samples annotated as “DCIS” were labeled “Tumor”. Samples labeled “controlateral” were discarded. In the tumor samples, 4.43% of the beta values from the 366,863 cross-platform (450k, EPIC, EPICv2) probes were missing in at least one sample. We imputed the missing values with the median value across all samples for each probe. In the non-tumor samples, we imputed 4.4% of the beta values using the same strategy.

Epigenetic age prediction using existing models

We applied four pan-tissue epigenetic clocks (Horvath 9, Hannum 25, PhenoAge 28, and AltumAge 26) to DNA methylation data from breast tissue of healthy donors to predict chronological age, using the PyAging Python library 52.

BTEC training

We used elastic net to train a chronological age predictor based on the beta values of cross-platform (450k, EPIC, EPICv2) probes that exhibited high correlation with age in the set of healthy samples. Specifically, we considered the probes with the top 0.5% r values among all those that had a significant (corrected p-value < 0.05) linear or quadratic correlation with chronological age (a total of 1,962 features; Supplementary Table 3). We used 406 samples from healthy subjects across five datasets for training the model. To reduce the effect of the imbalance between datasets, we augmented the training data by resampling each dataset to a total of 200 samples. On the resulting 1,000 samples, we optimized the alpha and L1 ratio values to 0.0013 and 0.5, respectively, using 5-fold cross-validation. The performance of the model on the training set was then assessed using a LOOC approach, and the model was then retrained on the whole augmented training data.
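The per-probe median imputation and the per-dataset resampling described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names `impute_median` and `resample_per_dataset` are hypothetical, missing values are represented as `None`, and resampling is assumed to be done with replacement (the text does not say, but with 406 samples across five datasets at least some datasets must hold fewer than 200 samples).

```python
import random
from statistics import median


def impute_median(rows):
    """Fill missing beta values (None) with the per-probe median.

    `rows` is a list of samples; each sample is a list of beta values,
    one per probe. The median of each probe (column) is computed over
    the samples in which that probe was observed.
    """
    n_probes = len(rows[0])
    medians = []
    for j in range(n_probes):
        observed = [row[j] for row in rows if row[j] is not None]
        medians.append(median(observed))
    return [
        [medians[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]


def resample_per_dataset(samples, dataset_ids, n_per_dataset=200, seed=0):
    """Draw `n_per_dataset` samples from each dataset (assumption: with replacement).

    Every dataset contributes the same number of rows to the augmented
    training set, regardless of its original size.
    """
    rng = random.Random(seed)
    by_dataset = {}
    for sample, dataset in zip(samples, dataset_ids):
        by_dataset.setdefault(dataset, []).append(sample)
    balanced = []
    for dataset in sorted(by_dataset):
        balanced.extend(rng.choices(by_dataset[dataset], k=n_per_dataset))
    return balanced
```

With five datasets and `n_per_dataset=200`, the second function returns the 1,000 samples mentioned in the text. The balanced set would then be passed to an elastic net regressor configured with the reported alpha (0.0013) and L1 ratio (0.5); scikit-learn's `ElasticNet` is one possible implementation, although the text does not state which one was used.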
The resulting model, consisting of 1,187 parameters (an intercept value, 637 linear and 549 quadratic coefficients), was used for the age predictions on the validation dataset as well as on the tumor-adjacent and tumor samples.

Transcriptomic data analysis

We used transcriptomic data from two datasets to 1) compare the transcriptomic profiles of tumors with EAA > 0 and EAA < 0 and 2) analyze the correlations between gene expression and age. To compare tumors in different EAA groups, we used the dataset GSE37754 from Terunuma et al.’s study 29, which contained microarray expression data (Affymetrix Human Gene 1.0 ST Array) from 108 tumor samples, of which 55 matched tumor samples from the same study with available methylation data (GSE37751). The data were mean-centered and scaled to unit variance, and the two groups were then compared using a two-sided Wilcoxon test. To study the correlation between gene expression and aging in breast tissue, we used the transcriptomic data from Song et al.’s study 43 (GSE102088), which included 104 samples from healthy donors. We used the normalized gene expression matrix provided by the authors to compute the Pearson correlation coefficient between the expression of each gene and chronological age.

Gene set enrichment analysis

In all cases, we used the GSEApy Python library 53 to perform gene set enrichment analysis through the Enrichr 54 API. We included the KEGG 2021, GO Molecular Functions 2023 and GO Biological Process 2023 gene sets to perform the analysis on human genes, with no specific background and an adjusted p-value cutoff of 0.05.

Survival analysis

The hazard ratio estimations were obtained using a Cox proportional hazards model through the lifelines Python library 55. Kaplan-Meier survival curves were obtained using the same library.

Declarations

Data availability

All the data used in this study are publicly available. Data sources are detailed in Supplementary Table 1.
Code availability

This study did not generate any new computational methods. The specific Python libraries used are detailed in the Methods section above. Any additional information required to reanalyze the data reported in this paper is available from the lead contact, Miguel Quintela-Fandino, upon request.

References

1. Guo, J. et al. Aging and aging-related diseases: from molecular mechanisms to interventions and treatments. Signal Transduct. Target. Ther. 7, 391 (2022).
2. Niccoli, T. & Partridge, L. Ageing as a risk factor for disease. Curr. Biol. 22, R741–R752 (2012).
3. Laconi, E., Marongiu, F. & DeGregori, J. Cancer as a disease of old age: changing mutational and microenvironmental landscapes. Br. J. Cancer 122, 943–952 (2020).
4. Benz, C. C. Impact of aging on the biology of breast cancer. Crit. Rev. Oncol. Hematol. 66, 65–74 (2008).
5. Sun, Y.-S. et al. Risk factors and preventions of breast cancer. Int. J. Biol. Sci. 13, 1387 (2017).
6. Pal, S. & Tyler, J. K. Epigenetics and aging. Sci. Adv. 2, e1600584 (2016).
7. Yang, J.-H. et al. Loss of epigenetic information as a cause of mammalian aging. Cell 186, 305–326 (2023).
8. Horvath, S. & Raj, K. DNA methylation-based biomarkers and the epigenetic clock theory of ageing. Nat. Rev. Genet. 19, 371–384 (2018).
9. Horvath, S. DNA methylation age of human tissues and cell types. Genome Biol. 14, 1–20 (2013).
10. Hannum, G. et al. Genome-wide methylation profiles reveal quantitative views of human aging rates. Mol. Cell 49, 359–367 (2013).
11. Kabacik, S. et al. The relationship between epigenetic age and the hallmarks of aging in human cells. Nat. Aging 2, 484–493 (2022).
12. Zheng, Y. et al. Blood epigenetic age may predict cancer incidence and mortality. EBioMedicine 5, 68–73 (2016).
13. Perna, L. et al. Epigenetic age acceleration predicts cancer, cardiovascular, and all-cause mortality in a German case cohort. Clin. Epigenetics 8, 1–7 (2016).
14. Wu, X., Ye, J., Wang, Z. & Zhao, C. Epigenetic age acceleration was delayed in schizophrenia. Schizophr. Bull. 47, 803–811 (2021).
15. Han, L. K. et al. Epigenetic aging in major depressive disorder. Am. J. Psychiatry 175, 774–782 (2018).
16. Jeremian, R. et al. Epigenetic age dysregulation in individuals with bipolar disorder and schizophrenia. Psychiatry Res. 315, 114689 (2022).
17. Fiorito, G. et al. Social adversity and epigenetic aging: a multi-cohort study on socioeconomic differences in peripheral blood DNA methylation. Sci. Rep. 7, 16266 (2017).
18. Hughes, A. et al. Socioeconomic position and DNA methylation age acceleration across the life course. Am. J. Epidemiol. 187, 2346–2354 (2018).
19. Tong, H. et al. Quantifying the stochastic component of epigenetic aging. Nat. Aging 1–16 (2024).
20. Meyer, D. H. & Schumacher, B. Aging clocks based on accumulating stochastic variation. Nat. Aging 1–15 (2024).
21. Hofstatter, E. W. et al. Increased epigenetic age in normal breast tissue from luminal breast cancer patients. Clin. Epigenetics 10, 1–11 (2018).
22. Rozenblit, M. et al. Evidence of accelerated epigenetic aging of breast tissues in patients with breast cancer is driven by CpGs associated with polycomb-related genes. Clin. Epigenetics 14, 30 (2022).
23. Koka, H. et al. DNA methylation age in paired tumor and adjacent normal breast tissue in Chinese women with breast cancer. Clin. Epigenetics 15, 55 (2023).
24. Sehl, M. E., Henry, J. E., Storniolo, A. M., Ganz, P. A. & Horvath, S. DNA methylation age is elevated in breast tissue of healthy women. Breast Cancer Res. Treat. 164, 209–219 (2017).
25. Hannum, G. et al. Genome-wide methylation profiles reveal quantitative views of human aging rates. Mol. Cell 49, 359–367 (2013).
26. de Lima Camillo, L. P., Lapierre, L. R. & Singh, R. A pan-tissue DNA-methylation epigenetic clock based on deep learning. Npj Aging 8, 4 (2022).
27. Castle, J. R. et al. Estimating breast tissue-specific DNA methylation age using next-generation sequencing data. Clin. Epigenetics 12, 1–14 (2020).
28. Levine, M. E. et al. An epigenetic biomarker of aging for lifespan and healthspan. Aging 10, 573 (2018).
29. Terunuma, A. et al. MYC-driven accumulation of 2-hydroxyglutarate is associated with breast cancer prognosis. J. Clin. Invest. 124, 398–412 (2014).
30. Scholzen, T. & Gerdes, J. The Ki-67 protein: from the known and the unknown. J. Cell. Physiol. 182, 311–322 (2000).
31. Uxa, S. et al. Ki-67 gene expression. Cell Death Differ. 28, 3357–3370 (2021).
32. Goldhirsch, A. et al. Strategies for subtypes—dealing with the diversity of breast cancer: highlights of the St Gallen international expert consensus on the primary therapy of early breast cancer 2011. Ann. Oncol. 22, 1736–1747 (2011).
33. Gao, Y. et al. The integrative epigenomic-transcriptomic landscape of ER positive breast cancer. Clin. Epigenetics 7, 1–16 (2015).
34. Fackler, M. J. et al. DNA methylation markers predict recurrence-free interval in triple-negative breast cancer. NPJ Breast Cancer 6, 3 (2020).
35. Cheang, M. C. et al. Ki67 index, HER2 status, and prognosis of patients with luminal B breast cancer. JNCI J. Natl. Cancer Inst. 101, 736–750 (2009).
36. Mathe, A. et al. DNA methylation profile of triple negative breast cancer-specific genes comparing lymph node positive patients to lymph node negative patients. Sci. Rep. 6, 33435 (2016).
37. Kanehisa, M. & Goto, S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 28, 27–30 (2000).
38. Zhou, X. et al. Disturbance of mammary UDP-glucuronosyltransferase represses estrogen metabolism and exacerbates experimental breast cancer. J. Pharm. Sci. 106, 2152–2162 (2017).
39. Guillemette, C., Bélanger, A. & Lépine, J. Metabolic inactivation of estrogens in breast tissue by UDP-glucuronosyltransferase enzymes: an overview. Breast Cancer Res. 6, 1–9 (2004).
40. de Magalhães, J. P. et al. Human Ageing Genomic Resources: updates on key databases in ageing research. Nucleic Acids Res. 52, D900–D908 (2024).
41. Cowin, P., Rowlands, T. M. & Hatsell, S. J. Cadherins and catenins in breast cancer. Curr. Opin. Cell Biol. 17, 499–508 (2005).
42. Dickson, C., Spencer-Dene, B., Dillon, C. & Fantl, V. Tyrosine kinase signalling in breast cancer: fibroblast growth factors and their receptors. Breast Cancer Res. 2, 1–6 (2000).
43. Song, M.-A. et al. Landscape of genome-wide age-related DNA methylation in breast tissue. Oncotarget 8, 114648 (2017).
44. Bhasin, J. M. et al. Methylome-wide sequencing detects DNA hypermethylation distinguishing indolent from aggressive prostate cancer. Cell Rep. 13, 2135–2146 (2015).
45. Moarii, M., Boeva, V., Vert, J.-P. & Reyal, F. Changes in correlation between promoter methylation and gene expression in cancer. BMC Genomics 16, 1–14 (2015).
46. Itai, Y., Rappoport, N. & Shamir, R. Integration of gene expression and DNA methylation data across different experiments. Nucleic Acids Res. 51, 7762–7776 (2023).
47. Duan, R., Fu, Q., Sun, Y. & Li, Q. Epigenetic clock: A promising biomarker and practical tool in aging. Ageing Res. Rev. 81, 101743 (2022).
48. Bodewes, F., Van Asselt, A., Dorrius, M., Greuter, M. & De Bock, G. Mammographic breast density and the risk of breast cancer: A systematic review and meta-analysis. The Breast 66, 62–68 (2022).
49. Boyd, N. F., Martin, L. J., Yaffe, M. J. & Minkin, S. Mammographic density and breast cancer risk: current understanding and future prospects. Breast Cancer Res. 13, 1–12 (2011).
50. Barrett, T. et al. NCBI GEO: archive for functional genomics data sets—update. Nucleic Acids Res. 41, D991–D995 (2012).
51. Edgar, R., Domrachev, M. & Lash, A. E. Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res. 30, 207–210 (2002).
52. de Lima Camillo, L. P. pyaging: a Python-based compendium of GPU-optimized aging clocks. Bioinforma. Oxf. Engl. btae200 (2024).
53. Fang, Z., Liu, X. & Peltz, G. GSEApy: a comprehensive package for performing gene set enrichment analysis in Python. Bioinformatics 39, btac757 (2023).
54. Xie, Z. et al. Gene set knowledge discovery with Enrichr. Curr. Protoc. 1, e90 (2021).
55. Davidson-Pilon, C. lifelines: survival analysis in Python. J. Open Source Softw. 4, 1317 (2019).

Additional Declarations

There is NO Competing Interest.

Supplementary Files

Supplementaryfigures.docx (Supplementary Figures)
Supplementarytables.xlsx (Supplementary Tables)
Figure 1. A. Number of samples per study and type included in the dataset. B. Distribution of subject ages per sample type. C. Distribution of tumor samples per breast cancer type.

Figure 2. A-D. Chronological age predictions for the breast tissue samples from healthy donors produced by the four models tested: Horvath's 2013 model, Hannum's clock, PhenoAge and AltumAge. Each dot indicates the actual and predicted age values of one sample, and the identity line is shown in red.

Figure 3. A. LOOC results; scatterplot of r vs MAE/MSE colored by dataset. B. BTEC results on validation data (scatterplot, colored by dataset). C. Distribution of prediction error on the validation dataset, colored by dataset. The boxes represent the quartiles of the distribution while the whiskers extend to the largest and the smallest datapoint within 1.5 times the inter-quartile range (Q3-Q1) above Q3 or below Q1 respectively. D. Distribution of absolute prediction error on the validation dataset. The boxplot elements have the same interpretation as in C.

Figure 4. A. Epigenetic age predictions from BTEC on breast tissue samples from healthy donors (left), tumor-adjacent samples (middle) and breast tumors (right). B. Distribution of epigenetic age accelerations in the three sample types. The boxes represent the quartiles of the distribution while the whiskers extend to the largest and the smallest datapoint within 1.5 times the inter-quartile range above Q3 or below Q1 respectively. C. BTEC's epigenetic age predictions on samples from hormone receptor positive (HR+), HER2 positive (HER2+) and triple negative (TNBC) tumors. D. Distribution of epigenetic age accelerations predicted by BTEC, broken down by tumor type. The boxplot elements have the same interpretation as in B.

Figure 5. A. Distribution of patients’ age (left), epigenetic age predicted by BTEC (middle) and EAA predicted by BTEC (right) for patients with Ki-67 values below or above 14% from the dataset of Gao et al.’s study 33. The boxes represent the quartiles of the distribution while the whiskers extend to the largest and the smallest datapoint within 1.5 times the inter-quartile range above Q3 or below Q1 respectively. B. Distribution of patients’ age (left), epigenetic age predicted by BTEC (middle) and EAA predicted by BTEC (right) and their corresponding Ki-67 values for patients from Fackler et al.’s study 34. C. Relapse hazard ratio for patients with EAA > 0 according to BTEC, Horvath’s model or AltumAge in Fackler et al.’s study (left). Kaplan-Meier relapse curves for the subjects with EAA above and below zero according to BTEC for the patients in the same study (right). D. Death hazard ratio for patients with EAA > 0 according to BTEC, Horvath’s model or AltumAge in Mathe et al.’s study 36 (left). Kaplan-Meier survival curve for the subjects with EAA above and below zero according to BTEC for the patients in the same study (right).

Figure 6. A. Distribution of average methylation beta values on the 799 probes used by BTEC in the three sample types. The boxes represent the quartiles of the distribution while the whiskers extend to the largest and the smallest datapoint within 1.5 times the inter-quartile range above Q3 or below Q1 respectively. B. Distribution of correlations between the beta values of the 799 probes used by BTEC and chronological age, colored by sample type. C. Examples of probes with divergent methylation-age correlations in the different sample types. Each dot represents a sample, indicating the beta value of the probe and the subject's age. The overlaid lines are the corresponding linear regressions. The left panel shows cg10729426, which has an r value of 0.71 in the healthy samples but only 0.32 and 0.22 in the tumor-adjacent and tumor samples respectively. The right panel shows cg09650907, with r = -0.41 in the healthy donor samples and r = -0.12 and r = -0.08 in the tumor-adjacent and tumor samples. D. KEGG (top) and GO Molecular Function (bottom) terms enriched in the genes related to the probes with positive coefficients in BTEC. E. GO Molecular Function terms enriched in the genes related to the probes with negative coefficients in BTEC.
Interest.","formattedTitle":"A breast tissue-specific epigenetic clock provides accurate chronological age predictions and reveals de-correlation of age and DNA methylation in tumor-adjacent and tumor samples","fulltext":[{"header":"Introduction","content":"\u003cp\u003eAging is a major risk factor for numerous diseases\u003csup\u003e\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e,\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e\u003c/sup\u003e, and the largest risk factor for cancer\u003csup\u003e\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e\u003c/sup\u003e. As women age, their risk of developing breast cancer increases substantially, with incidence rising sharply until menopause and continuing to climb at a slower rate thereafter\u003csup\u003e\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e\u003c/sup\u003e. This age-related risk is driven by a combination of factors, including accumulated genetic mutations, prolonged exposure to hormones, and changes in breast tissue\u003csup\u003e\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eEpigenetic changes have been found to have a strong relationship with aging\u003csup\u003e\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e\u003c/sup\u003e, and some researchers have even suggested that epigenetic modifications drive the aging process\u003csup\u003e\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e\u003c/sup\u003e. 
This relationship led to the development of epigenetic clocks or \u003cem\u003eepiclocks\u003c/em\u003e\u003csup\u003e\u003cspan additionalcitationids=\"CR9\" citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e\u003c/sup\u003e, which are mathematical models that predict chronological age based on the methylation state of specific regions in the DNA\u003csup\u003e\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e\u003c/sup\u003e. The difference or deviation between the predicted (or \u003cem\u003ebiological\u003c/em\u003e or \u003cem\u003eepigenetic\u003c/em\u003e) age and the actual chronological age (i.e., the prediction error) is commonly referred to as epigenetic age acceleration (EAA) and is often interpreted as a measure of aging rate or biological aging. Epigenetic clocks have been used to link EAA to multiple features and pathologies, from cancer\u003csup\u003e1213\u003c/sup\u003e to psychiatric disorders\u003csup\u003e\u003cspan additionalcitationids=\"CR15\" citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e\u003c/sup\u003e or socio-economic status\u003csup\u003e\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e,\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e\u003c/sup\u003e. However, recent research\u003csup\u003e\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e,\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e\u003c/sup\u003e suggests that epigenetic clocks designed to estimate chronological age can be effectively constructed using blood DNA methylation patterns even at random CpG sites. 
In contrast, clocks aimed at capturing specific biological traits associated with aging would need to be based on measurement of non-stochastic, biologically regulated methylation events. Therefore, we hypothesize that accurately measuring biological age in breast tissue and uncovering the molecular changes associated with aging require the development of a novel, tissue-specific epigenetic clock. An accurate breast-specific biological aging clock could be valuable in several clinical contexts such as refining risk assessment in screening programs, predicting risk of breast cancer development or contributing to a better understanding of breast aging and hormonal influence.\u003c/p\u003e \u003cp\u003ePopular epigenetic clocks, such as Hannum\u0026rsquo;s or Horvath\u0026rsquo;s models, were intended as multi-organ models\u003csup\u003e\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e,\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e\u003c/sup\u003e. Yet, their developers observed low performances on breast tissue: Horvath reported a correlation coefficient of 0.73 between the epigenetic age calculated by its model and chronological age in 23 normal breast tissue samples of the training set of his model, and noted that the model is poorly calibrated in breast tissue. Likewise, Hannum\u0026rsquo;s clock reached a correlation of 0.72 when tested on data from 83 tumor-adjacent breast tissue samples from the TCGA, and the authors suggested that the intercept and slope of the model needed to be re-adjusted for breast. 
Subsequent studies testing multiple clocks reported correlations between 0.35 and 0.78, confirming the limited capability of these models to work with breast tissue data\u003csup\u003e\u003cspan additionalcitationids=\"CR22\" citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eDespite these limitations, these models have been applied to breast-tissue data (after regressing their predictions on age to improve the fit) to address several clinically relevant questions: whether DNA methylation age is elevated compared to other tissues in women, whether there is epigenetic age acceleration in the tumor or the tumor-adjacent compartment compared to tumor-distant tissue, and whether breast tumors are epigenetically older or younger than the patient herself. To date, it has been suggested that DNA methylation age is elevated in breast tissue of healthy women\u003csup\u003e\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e\u003c/sup\u003e; that there is significant epigenetic age acceleration (EAA) in tumor-adjacent compared to normal breast tissue\u003csup\u003e\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e\u003c/sup\u003e; and that tumor tissue presents higher epigenetic age\u003csup\u003e\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e\u003c/sup\u003e than the host.\u003c/p\u003e \u003cp\u003eRecently, a deep-learning pan-tissue epigenetic clock demonstrated improved accuracy in predicting chronological age when applied to breast tissue samples. 
However, its performance varied significantly across datasets, with r values ranging from \u0026minus;\u0026thinsp;0.703 to 0.858 across 5 test datasets comprising 100 samples\u003csup\u003e\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e\u003c/sup\u003e. This model found that breast tumors exhibited significantly increased EAA compared to normal breast tissue, with an average acceleration of 4.542 years.\u003c/p\u003e \u003cp\u003eCastle et al. developed a tissue-specific model with high performance (r\u0026thinsp;=\u0026thinsp;0.88, MAE\u0026thinsp;=\u0026thinsp;4.2 years) in a test set of 91 healthy donor samples\u003csup\u003e\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e\u003c/sup\u003e. While they observed no difference in EAA between normal and tumor-adjacent samples, they reported increased EAA in tumor samples. Notably, they claimed that late-stage tumors exhibited a non-significant negative EAA. Unfortunately, their model was developed using genome-wide DNA methylation data, making it incompatible with most publicly available data, which were generated using methylation microarray platforms.\u003c/p\u003e \u003cp\u003eHere, we evaluated the performance of various pan-tissue epigenetic clocks on a large and diverse DNA methylation dataset from breast tissue of healthy donors, consisting of 553 samples across 7 different studies. After all models produced low-accuracy chronological age predictions, we developed the Breast Tissue-specific Epigenetic Clock (BTEC), which significantly improved chronological age predictions for healthy tissue samples. When applied to tumor and tumor-adjacent samples, BTEC\u0026rsquo;s predictions were notably distorted, generally showing reduced epigenetic age acceleration (EAA) and contradicting findings from previous models. 
This distortion was more pronounced in tumor samples than in tumor-adjacent tissues, with the greatest effect observed in triple-negative breast cancer (TNBC) tumors, followed by HER2\u0026thinsp;+\u0026thinsp;tumors, and the least in hormone receptor-positive (HR+) cases. Despite the overall decrease in EAA in tumors compared to normal breast tissue, the relationship between BTEC\u0026rsquo;s predictions and cancer survival indicated that TNBC tumors with increased epigenetic ages had significantly lower survival.\u003c/p\u003e"},{"header":"Results","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003eBreast tissue DNA methylation data\u003c/h2\u003e \u003cp\u003ePrevious studies examining epigenetic age in breast tissue have relied on relatively small datasets, ranging from 23 to 200\u003csup\u003e9,22,26\u003c/sup\u003e subjects, limiting the robustness and generalizability of the models. This limitation is evident in the varying correlations between predicted and chronological age reported both in the original studies and in subsequent applications.\u003c/p\u003e \u003cp\u003eTo improve upon these models, we compiled a larger dataset by integrating breast tissue DNA methylation data from multiple sources, generated using Illumina\u0026rsquo;s 450k and EPIC DNA methylation arrays. Our dataset includes samples from female donors across 13 different studies, comprising 553 normal (healthy donor) samples, 362 tumor-adjacent samples, and 1,108 tumor samples (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eA).\u003c/p\u003e \u003cp\u003eThe age of the healthy sample donors ranged from 13 to 90 years, while the tumor and tumor-adjacent samples were obtained from subjects with ages between 24 and 93 years. 
Out of the 1,108 tumor samples, 700 were labeled with a specific molecular subtype: 256 hormone-receptor positive (HR+), 82 \u003cem\u003eHER2\u003c/em\u003e-positive (HER2+) and 362 triple-negative breast cancer (TNBC) samples (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eC). Detailed information about the data sources and the characteristics of the corresponding cohorts is provided in Supplementary Table\u0026nbsp;1.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003ePerformance of multi-organ epiclocks\u003c/h3\u003e\n\u003cp\u003eWe used the combined dataset of 553 samples from healthy donors to evaluate the accuracy of chronological age predictions from four pan-tissue epigenetic clocks: Hannum\u003csup\u003e\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e\u003c/sup\u003e, Horvath\u003csup\u003e\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e\u003c/sup\u003e, PhenoAge\u003csup\u003e\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e\u003c/sup\u003e and AltumAge\u003csup\u003e\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e\u003c/sup\u003e. Each of these models predicted the age of the donors based on breast tissue DNA methylation data (\u0026ldquo;pan-to-breast\u0026rdquo; prediction). For comparison, we included two baseline models: a na\u0026iuml;ve model that always predicted the average age of the donors (41.01 years) and a random model that made random predictions within the entire age range of the dataset (between 13 and 90 years).\u003c/p\u003e \u003cp\u003eThe results showed that the four epigenetic clocks produced predictions that were correlated with chronological age (r values ranging from 0.5 to 0.84; Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eA-D). 
However, all four models exhibited considerable root-mean-square error (RMSE) values, ranging from 9.17 to 17.58 years (Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e). AltumAge performed the best, with the lowest prediction error and highest correlation. Horvath\u0026rsquo;s model reached the second-highest correlation but, like PhenoAge, showed larger errors than the na\u0026iuml;ve model in terms of both RMSE and median absolute error (MAE), indicating relatively low accuracy. All models outperformed the random predictor.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eRoot-mean-square error (RMSE), median absolute error (MAE) and Pearson\u0026rsquo;s correlation coefficient (r) for each model on the healthy donor samples\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"4\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eModel\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eRMSE\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eMAE\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003er\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd 
align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eHannum\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e11.78\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e7.12\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0.69\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eHorvath\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e14.77\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e13.63\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0.82\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003ePhenoAge\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e17.58\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e11.88\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0.5\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eAltumAge\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e9.17\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e6.57\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0.84\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eNa\u0026iuml;ve (predict the average age for all samples, 41.01 years)\u003c/b\u003e\u003c/p\u003e 
\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e13.93\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e9.01\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eN/A\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eRandom (random values between 13 and 90 years)\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e28.10\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e20.14\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e-0.06\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e\n\u003ch3\u003eA Breast Tissue-specific Epigenetic Clock outperforms pan-tissue models\u003c/h3\u003e\n\u003cp\u003eTo address the limitations of pan-tissue models, we developed the Breast Tissue-specific Epigenetic Clock (BTEC). To train this model, we used 406 healthy donor samples from five studies as the training set, and reserved data from two additional studies, comprising 147 samples, for validation (Supplementary Table\u0026nbsp;2).\u003c/p\u003e \u003cp\u003eWe employed linear and quadratic terms in constructing the model, using elastic net regression. To select the appropriate CpG probes, we focused on those conserved across the 450k, EPICv1, and EPICv2 microarrays that showed significant (corrected p-value\u0026thinsp;\u0026lt;\u0026thinsp;0.05) linear (N\u0026thinsp;=\u0026thinsp;201,527) or quadratic (N\u0026thinsp;=\u0026thinsp;179,062) correlations with chronological age in the training set. 
From the resulting 380,589 linear and quadratic features, we selected the top 0.5% based on the highest absolute Pearson correlation with chronological age (|r| \u0026gt; 0.382, N\u0026thinsp;=\u0026thinsp;1,962) to train the model using elastic net (Supplementary Table\u0026nbsp;3).\u003c/p\u003e \u003cp\u003eWe set the L1 ratio to 0.5 and optimized the alpha value to 0.0013 through 5-fold cross-validation on augmented training data (see Methods). Using these optimized parameters, we applied a Leave-One-Out Cohort (LOOC) approach, where models were trained on 4 out of 5 datasets (N-1) and used to predict chronological age on the excluded cohort. This approach resulted in a strong correlation (greater than 0.89 in all cases) between the predicted and actual chronological ages. The RMSE was below 7.5 years in all but one case, where a systematic error was observed (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eA, Supplementary Table\u0026nbsp;4, Supplementary Fig.\u0026nbsp;1).\u003c/p\u003e \u003cp\u003eWe then re-trained the model using the same parameters and the entire training set. The resulting BTEC included 637 linear and 549 quadratic terms derived from 799 CpG probes, along with an intercept value (Supplementary Table\u0026nbsp;5). 
When applied to the validation data, BTEC produced epigenetic age predictions that were highly correlated with chronological age (r\u0026thinsp;=\u0026thinsp;0.88) and demonstrated high accuracy (RMSE\u0026thinsp;=\u0026thinsp;6.31, MAE\u0026thinsp;=\u0026thinsp;3.27; Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eB).\u003c/p\u003e \u003cp\u003eIn the same validation data (N\u0026thinsp;=\u0026thinsp;147), BTEC was the only model whose prediction error distribution did not deviate significantly from zero, whereas the errors of the other four models did (one-sample t-test, p\u0026thinsp;\u0026lt;\u0026thinsp;0.01) (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eC). Consequently, the absolute prediction error was significantly lower for BTEC compared to the other four models (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eD). The correlation with chronological age, along with the MAE and RMSE values, indicated that BTEC outperformed the other models overall (Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eResults from each model on the validation dataset (N\u0026thinsp;=\u0026thinsp;147, healthy donor samples)\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"4\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" 
colnum=\"4\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eModel\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eRMSE\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eMAE\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003er\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eHannum\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e11.09\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e7.09\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.71\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eHorvath\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e15.16\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e13.56\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.82\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003ePhenoAge\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e16.03\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e10.65\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.54\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e 
\u003cp\u003e\u003cb\u003eAltumAge\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e7.28\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e4.26\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.87\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eBTEC\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e6.31\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e3.27\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.88\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e\n\u003ch3\u003eEpigenetic age alterations in tumor-adjacent and tumor samples\u003c/h3\u003e\n\u003cp\u003eEAA is typically calculated as the difference between the age predicted by an epigenetic clock and the chronological age. However, due to the poor performance of pan-tissue epigenetic clocks on breast tissue DNA methylation data, several studies have instead defined EAA as the residuals from a linear regression of the predicted epigenetic age on chronological age\u003csup\u003e\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e,\u003cspan additionalcitationids=\"CR22\" citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e\u003c/sup\u003e. This \u003cem\u003ead-hoc\u003c/em\u003e solution introduces a bias, as the regression is computed on a per-study or per-dataset basis, making it difficult to compare results across different studies. 
Since BTEC produced accurate age predictions with prediction errors centered around zero for healthy tissue (i.e., it did not detect accelerated or decelerated aging in normal samples), we chose to compute EAA directly as the difference between predicted and chronological ages.\u003c/p\u003e \u003cp\u003eThe predictions for cancer patient samples were generally lower than the chronological ages (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eA). Both tumor-adjacent and tumor samples had significantly lower epigenetic ages compared to normal samples (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eB). The average EAAs were \u0026minus;\u0026thinsp;1.76 years (range: -41.74 to 30.91) for tumor-adjacent samples and \u0026minus;\u0026thinsp;12.29 years (range: -98.86 to 56.76) for tumor samples, whereas for healthy samples, the average EAA was 0.82 years. Contrary to other studies, BTEC\u0026rsquo;s results therefore suggest that tumors are \u0026ldquo;rejuvenated\u0026rdquo; relative to the host. 
When tumor samples were categorized by molecular subtype, HER2\u0026thinsp;+\u0026thinsp;samples showed significantly lower EAA (median = -18.13 years, range: -56.03 to +\u0026thinsp;56.18) compared to HR\u0026thinsp;+\u0026thinsp;samples (median = -4.99 years, range: -66.28 to +\u0026thinsp;51.56), with TNBC cases exhibiting the largest negative EAA (median = -18.54 years, range: -71.86 to +\u0026thinsp;43.8; Figs.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eC and D).\u003c/p\u003e \u003cp\u003eTo better understand the differences between tumors with accelerated (EAA\u0026thinsp;\u0026gt;\u0026thinsp;0) or decelerated (EAA\u0026thinsp;\u0026lt;\u0026thinsp;0) epigenetic ages as predicted by BTEC, we analyzed the transcriptomic data from Terunuma et al.'s study\u003csup\u003e\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e\u003c/sup\u003e. The differential expression analysis revealed that tumors with EAA\u0026thinsp;\u0026lt;\u0026thinsp;0 over-expressed 69 genes (p-value\u0026thinsp;\u0026lt;\u0026thinsp;0.05, two-sided Wilcoxon test, log(fold change)\u0026thinsp;\u0026gt;\u0026thinsp;1), including several collagens (\u003cem\u003eCOL8A1\u003c/em\u003e, \u003cem\u003eCOL1A1\u003c/em\u003e, \u003cem\u003eCOL12A1\u003c/em\u003e, \u003cem\u003eCOL1A2\u003c/em\u003e, \u003cem\u003eCOL11A1\u003c/em\u003e, \u003cem\u003eCOL10A1\u003c/em\u003e, \u003cem\u003eCTHRC1\u003c/em\u003e) and fibronectins (\u003cem\u003eFN1\u003c/em\u003e, \u003cem\u003eFNDC1\u003c/em\u003e), which are involved in the epithelial-mesenchymal transition, as well as other proteins related to extracellular matrix remodeling (\u003cem\u003eCEMIP\u003c/em\u003e, \u003cem\u003eMMP11\u003c/em\u003e, \u003cem\u003eBGN\u003c/em\u003e, \u003cem\u003eSULF1\u003c/em\u003e, \u003cem\u003eFAP\u003c/em\u003e) (Supplementary Table\u0026nbsp;6). 
These findings suggest that tumors with EAA\u0026thinsp;\u0026lt;\u0026thinsp;0 may be undergoing de-differentiation processes. In contrast, tumors with EAA\u0026thinsp;\u0026gt;\u0026thinsp;0 over-expressed 145 genes (p-value\u0026thinsp;\u0026lt;\u0026thinsp;0.05, two-sided Wilcoxon test, log(fold change)\u0026thinsp;\u0026gt;\u0026thinsp;1), including PI3K-Akt signaling kinases (\u003cem\u003eNTRK2\u003c/em\u003e, \u003cem\u003ePRKAA2\u003c/em\u003e, \u003cem\u003eKIT\u003c/em\u003e) and genes with prognostic value (\u003cem\u003eEGFR\u003c/em\u003e, \u003cem\u003eMET\u003c/em\u003e, \u003cem\u003eSHC4\u003c/em\u003e, \u003cem\u003ePGR\u003c/em\u003e) (Supplementary Table\u0026nbsp;6). Overall, this gene expression profile indicates that tumors with EAA\u0026thinsp;\u0026gt;\u0026thinsp;0 exhibit patterns typically associated with poor prognosis.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e\n\u003ch3\u003eTumor epigenetic age is not determined by the replication rate\u003c/h3\u003e\n\u003cp\u003ePrevious studies have shown that tumors exhibit accelerated biological aging\u003csup\u003e\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e\u003c/sup\u003e. To explore a possible explanation for our contrasting findings, we investigated whether the biological age of tumors predicted by BTEC correlates with the number of cell replications, as proposed by Horvath\u003csup\u003e\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e\u003c/sup\u003e. The expression of Ki-67, a well-known biomarker of cell proliferation\u003csup\u003e\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e\u003c/sup\u003e, is widely used in oncology to assess tumor aggressiveness\u003csup\u003e\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e\u003c/sup\u003e. 
In breast cancer specifically, Ki-67 levels are used to classify hormone receptor-positive (HR+) tumors into the Luminal A and Luminal B subtypes\u003csup\u003e\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eHere, we used the Ki-67 annotations available in two of the datasets we collected (GSE69914\u003csup\u003e33\u003c/sup\u003e and GSE141441\u003csup\u003e34\u003c/sup\u003e) to explore whether this biomarker was related to the chronological age of the donors, their epigenetic age predicted by BTEC, or their EAA. Ki-67 data were not reported individually in the study from Gao et al.\u003csup\u003e\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e\u003c/sup\u003e; instead, patients were grouped into two categories based on their Ki-67 status: low (patients with Ki-67 below 14%) and high (patients with Ki-67 at or above 14%). This classification, based on a 13% cut-off, comes from a previous large study in which luminal-type breast cancer showed distinct clinical courses and endocrine sensitivity depending on this threshold\u003csup\u003e\u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eWe observed that the distribution of chronological ages did not significantly differ between the groups (p\u0026thinsp;\u0026gt;\u0026thinsp;0.05, two-sided Wilcoxon test). However, patients with higher Ki-67 values exhibited significantly lower epigenetic ages and EAAs (p\u0026thinsp;\u0026lt;\u0026thinsp;0.05, two-sided Wilcoxon test; Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eA). 
Contrary to the assumption that accelerated biological aging is a result of repeated replication cycles, these findings suggest the opposite: tumors with lower epigenetic ages and EAAs may retain the highest replicative potential.\u003c/p\u003e \u003cp\u003eIn the dataset from Fackler et al.\u003csup\u003e\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e\u003c/sup\u003e, which provided detailed Ki-67 values for all subjects, we observed low correlations between Ki-67 value and age (r\u0026thinsp;=\u0026thinsp;0.15), epigenetic age (r\u0026thinsp;=\u0026thinsp;0.2), and EAA (r\u0026thinsp;=\u0026thinsp;0.29) (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eB). When we grouped subjects using the same Ki-67 threshold as in Gao et al.'s study (14%), the only significant difference (p\u0026thinsp;\u0026lt;\u0026thinsp;0.05, two-sided Wilcoxon test) was observed in the EAA values. Specifically, the group with Ki-67\u0026thinsp;\u0026ge;\u0026thinsp;14% exhibited higher EAA values (data not shown).\u003c/p\u003e \u003cp\u003eThe inconsistent and contradictory results from these two datasets suggest that there is no clear relationship between the epigenetic age or the EAA determined by BTEC and the Ki-67 values. 
This implies that the tumors\u0026rsquo; biological age predicted by BTEC is not directly determined by the number of replications the tumor cells have undergone.\u003c/p\u003e \u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003eBTEC\u0026rsquo;s EAA magnitude is associated with long-term prognosis in TNBC\u003c/h2\u003e \u003cp\u003eTo investigate whether the tumor aging rates assigned by BTEC are linked to clinical outcomes, we analyzed data from two large datasets of TNBC patients: GSE141441\u003csup\u003e34\u003c/sup\u003e, which includes relapse-free times for 164 patients, and GSE78754\u003csup\u003e36\u003c/sup\u003e, which provides survival times for 63 TNBC patients.\u003c/p\u003e \u003cp\u003eOur findings showed that patients with EAA\u0026thinsp;\u0026gt;\u0026thinsp;0 based on BTEC\u0026rsquo;s predictions had an increased but statistically non-significant relapse hazard (mean HR\u0026thinsp;=\u0026thinsp;1.58, p\u0026thinsp;\u0026gt;\u0026thinsp;0.05 likelihood ratio test; Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eC). However, they exhibited a significantly lower survival probability (mean HR\u0026thinsp;=\u0026thinsp;3.07, p\u0026thinsp;\u0026lt;\u0026thinsp;0.05 likelihood ratio test; p\u0026thinsp;\u0026lt;\u0026thinsp;0.05 log-rank test; Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eD). In contrast, classifying patients by accelerated or decelerated epigenetic age using the Horvath or AltumAge models did not yield significant differences in either relapse or survival outcomes. 
These results suggest that, unlike pan-tissue models, BTEC\u0026rsquo;s predictions from tumor tissue may have prognostic value (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eC, D).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eBiological basis of BTEC predictions\u003c/h3\u003e\n\u003cp\u003eTo gain insight into the differences observed in the epigenetic age predictions for healthy tissue, tumor-adjacent, and tumor samples, we examined the methylation state of the probes used by BTEC across the different sample types. Overall, the average methylation levels on the 799 probes were significantly higher in the tumor-adjacent and tumor samples compared to the healthy tissue samples (p\u0026thinsp;\u0026lt;\u0026thinsp;0.05, two-sided Wilcoxon test; Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003eA). Upon examining individual probes, we found that 751 probes in the tumor-adjacent samples and 663 probes in the tumor samples had significantly different methylation states when compared to the samples from healthy donors (adjusted p-value\u0026thinsp;\u0026lt;\u0026thinsp;0.05, two-sided Wilcoxon test, Bonferroni corrected; Supplementary Table\u0026nbsp;7).\u003c/p\u003e \u003cp\u003eMoving beyond the absolute methylation values, we observed that the relationship between the methylation states of BTEC's probes and chronological age was very different across the three sample groups. The feature pre-selection and the regularization applied during BTEC's training resulted in probes which had an average absolute correlation (|r|) with age of 0.4 in the healthy donor samples. However, these correlations were considerably weaker in the tumor-adjacent (mean |r|=0.15) and tumor samples (mean |r|=0.09; Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003eB, Supplementary Table\u0026nbsp;8). 
In these altered tissues, the model\u0026rsquo;s probes lost the correlations with age that they exhibited in the healthy donor samples (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003eC). This loss of correlation likely explains the distortions observed in BTEC's age predictions for tumor-adjacent and tumor samples.\u003c/p\u003e \u003cp\u003eWe then explored the genes related to BTEC\u0026rsquo;s probes and their associated molecular biological functions. The 424 probes that were positively correlated with age mapped to a total of 300 genes (Supplementary Table\u0026nbsp;9). This gene set was enriched in 10 KEGG terms\u003csup\u003e\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e\u003c/sup\u003e, all of them due to the presence of UDP-glucuronosyltransferases (UGTs), except for the term \"Chronic myeloid leukemia\". Enrichment in this last term was due to the presence of multiple known oncogenes in the set (TP53, MYC, AKT1, ABL1). The set of 300 genes was enriched in 5 molecular functions, all related to UGTs (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eD). The deregulation of UGTs in breast tissue has a well-established connection with breast cancer, as it interferes with the cell\u0026rsquo;s ability to properly manage estrogen metabolism\u003csup\u003e\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e,\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e\u003c/sup\u003e. 
Among the 300 genes in this set, only two\u0026mdash;TP53 and nuclear receptor corepressor 2 (NCOR2)\u0026mdash;have direct evidence linking them to aging in mammalian models, according to the GenAge database\u003csup\u003e\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eThe 376 probes negatively correlated with age mapped to 261 different genes (Supplementary Table\u0026nbsp;9). This gene set was not enriched in any KEGG terms; however, it was enriched in two molecular functions: Cadherin binding and Fibroblast Growth Factor Binding (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eE). These functions are known to be implicated in breast cancer\u003csup\u003e\u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e,\u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e42\u003c/span\u003e\u003c/sup\u003e, and their deregulation could be expected in transformed tissue. Among the genes in this set, only one\u0026mdash;ARNTL\u0026mdash;has direct evidence of involvement in aging, according to the GenAge database\u003csup\u003e\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e\u003c/sup\u003e. These observations suggest that BTEC predominantly relies on genes with no direct link to aging and, instead, involves genes related to estrogen metabolism and cancer pathways.\u003c/p\u003e \u003cp\u003eFinally, we examined the RNA expression data from the dataset GSE102088, generated in the breast tissue DNA methylation study from Song et al.\u003csup\u003e\u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e43\u003c/span\u003e\u003c/sup\u003e, to identify genes involved in BTEC\u0026rsquo;s predictions that also exhibited expression changes correlated with the subjects' age.
We identified a total of 1712 genes that showed a significant and non-weak (p\u0026thinsp;\u0026lt;\u0026thinsp;0.05, |r|\u0026gt;0.25) correlation between expression and age (Supplementary Table\u0026nbsp;10). Among these, 24 genes were associated with probes that had positive coefficients in BTEC (Supplementary Table\u0026nbsp;10). This set of 24 genes was enriched in phosphatase binding molecular functions (GO:0019903 and GO:0051721), with notable genes such as \u003cem\u003eTP53\u003c/em\u003e, \u003cem\u003eKCNQ1\u003c/em\u003e, and \u003cem\u003eMFHAS1\u003c/em\u003e. Another 37 genes matched those associated with probes that had negative coefficients in BTEC (Supplementary Table\u0026nbsp;10). This gene set was enriched in Cadherin Binding (GO:0045296) and Fibroblast Growth Factor Binding (GO:0017134) molecular functions, which mirrored the enrichment observed when analyzing all the genes associated with BTEC's negative coefficients. Although the overlap between the genes associated with age in BTEC and those identified in the expression data was relatively small, as expected due to the complex relationship between DNA methylation and gene expression\u003csup\u003e\u003cspan additionalcitationids=\"CR45\" citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e46\u003c/span\u003e\u003c/sup\u003e, both data sources highlighted similar pathways and proteins. These included oncogenes (e.g., \u003cem\u003eTP53\u003c/em\u003e, \u003cem\u003eNXN\u003c/em\u003e, \u003cem\u003eTRIM59\u003c/em\u003e), phosphatase binding proteins (e.g., \u003cem\u003eKCNQ1\u003c/em\u003e, \u003cem\u003eMFHAS1\u003c/em\u003e), cadherin binding proteins (e.g., \u003cem\u003ePAK6\u003c/em\u003e, \u003cem\u003eRPL6\u003c/em\u003e), and FGF binding proteins (e.g., \u003cem\u003eRPS2\u003c/em\u003e, \u003cem\u003eFGFR2\u003c/em\u003e).
These findings suggest a connection between these molecular functions and aging in breast tissue, further supporting the relevance of BTEC\u0026rsquo;s predictions in the context of breast cancer and aging.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e"},{"header":"Discussion","content":"\u003cp\u003eEpigenetic clocks have emerged as valuable tools to estimate biological age based on DNA methylation patterns\u003csup\u003e\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e,\u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e47\u003c/span\u003e\u003c/sup\u003e. However, pan-tissue epigenetic clocks have demonstrated poor performance in breast tissue: early models reported correlations below 0.75 between predicted and chronological ages and large chronological age prediction errors\u003csup\u003e\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e,\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e\u003c/sup\u003e; and the accuracy of state-of-the-art epiclocks exhibited a wide range of variation across datasets, with reported r values between −0.703 and 0.858\u003csup\u003e26\u003c/sup\u003e. One possible explanation for this limitation is the scarcity of breast tissue-specific DNA methylation data, which has likely hindered the development of models that accurately capture the epigenetic aging process in this tissue.
Yet, despite their shortcomings, pan-tissue models continue to be widely used in studies involving both normal and pathological breast samples\u003csup\u003e\u003cspan additionalcitationids=\"CR22 CR23\" citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e–\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eTo address the limitations of existing models, we compiled a large and diverse dataset from 13 different studies, representing the most comprehensive collection of breast tissue DNA methylation data to date. Using this dataset, we tested four pan-tissue epigenetic clocks. Our results confirmed that these models exhibited poor predictive performance and systematic errors, with some models performing worse than a naïve approach that predicts a constant age. This highlights the limitations of applying generalized epigenetic clocks to breast tissue and underscores the necessity of tissue-specific models. To overcome these issues, we developed BTEC, a breast tissue-specific epigenetic clock, which significantly outperformed all tested pan-tissue models.\u003c/p\u003e \u003cp\u003eBTEC provided age predictions in normal breast tissue with lower errors than any of the pan-tissue models, without the need for ad-hoc, dataset-specific regressions. The results indicate a lack of intrinsic age acceleration, in contrast with the results reported by Sehl et al.\u003csup\u003e\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e\u003c/sup\u003e. When applied to tumor-adjacent and breast tumor samples, BTEC detected a significantly decreased epigenetic age acceleration (EAA) in both conditions compared to normal breast tissue. 
The decreased EAA implies that, unlike what was observed with classical epiclocks\u003csup\u003e\u003cspan additionalcitationids=\"CR22\" citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e–\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e\u003c/sup\u003e, BTEC identifies tumors as “rejuvenated” with respect to the host. While the tissue-specific model by Castle et al. indicated that epigenetic age is generally accelerated in tumors but decelerates in late-stage cases\u003csup\u003e\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e\u003c/sup\u003e, our findings show that BTEC consistently detects a strong trend toward tumor rejuvenation.\u003c/p\u003e \u003cp\u003eNotably, when comparing different tumor subtypes, we observed that HER2+ tumors exhibited lower EAA than HR+ tumors, and triple-negative breast cancer (TNBC) tumors displayed even lower EAA than HER2+ tumors. This suggests that tumor subtype influences epigenetic aging patterns in breast tissue.\u003c/p\u003e \u003cp\u003eOur analysis of the methylation states of the probes included in the BTEC model revealed that it relies on only three genes with strong evidence of involvement in aging in mammals (\u003cem\u003eARNTL\u003c/em\u003e, \u003cem\u003eNCOR2\u003c/em\u003e, \u003cem\u003eTP53\u003c/em\u003e). Instead, BTEC predominantly utilizes the methylation state of genes and pathways linked to breast cancer, including UGTs, known oncogenes (\u003cem\u003eTP53\u003c/em\u003e, \u003cem\u003eMYC\u003c/em\u003e, \u003cem\u003eAKT1\u003c/em\u003e, \u003cem\u003eABL1\u003c/em\u003e), as well as FGF-binding and cadherin-binding proteins. This suggests that the aging process in breast tissue differs from patterns observed in other tissues.
The distinction underscores the necessity of tissue-specific epigenetic clocks and implies that epigenetic aging in breast tissue may be more closely linked to oncogenic processes than to conventional aging pathways.\u003c/p\u003e \u003cp\u003eWe observed that the methylation states of the sites included in BTEC were significantly altered in both tumor-adjacent and tumor tissues. Overall, methylation levels were elevated, and the expected correlations between methylation beta values and chronological age were lost. The overlap between age-correlated probes in healthy tissue and probes located within cancer-related genes likely explains BTEC’s sensitivity to disease status.\u003c/p\u003e \u003cp\u003eTo further explore the biological relevance of these findings, we examined transcriptomic data. Although there was limited overlap between genes whose expression correlates with age and those associated with BTEC probes, our results suggest that age-related changes in DNA methylation and gene expression impact oncogenes, phosphatase-binding proteins, cadherin-binding proteins, and fibroblast growth factor (FGF)-binding proteins in breast tissue. Interestingly, BTEC’s EAA predictions were not associated with Ki-67, a widely used proliferation marker\u003csup\u003e\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e,\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e\u003c/sup\u003e, suggesting that epigenetic aging in breast tumors operates independently of proliferation rates.\u003c/p\u003e \u003cp\u003eTumors exhibiting accelerated epigenetic age displayed transcriptomic patterns indicative of poor prognosis. In contrast, tumors with decelerated epigenetic age appeared to undergo dedifferentiation. 
Clinically, TNBC patients whose tumors exhibited accelerated epigenetic aging had significantly lower survival rates, suggesting that epigenetic aging profiles could have prognostic value in this breast cancer subtype.\u003c/p\u003e \u003cp\u003eThe \"rejuvenation\" observed in breast tumors, as indicated by the decelerated epigenetic age and lower EAA, raises the possibility of targeted therapeutic interventions. The fact that tumors seem to undergo de-differentiation processes may suggest a potential avenue for treatments aimed at reversing such changes. Anti-aging therapies, such as senescence modulation or telomere uncapping, could hold promise in this context. Senescence is a known response to DNA damage and stress, often acting as a barrier to tumor progression. By targeting the senescent cells within tumors, it may be possible to influence their epigenetic age and potentially improve treatment outcomes. Similarly, telomere uncapping therapies could counteract the rejuvenation seen in tumors, as telomere length and maintenance are closely associated with aging and cellular senescence. Future studies should investigate how such interventions could be combined with traditional treatments to improve efficacy, particularly in the context of aggressive breast cancer subtypes like TNBC.\u003c/p\u003e \u003cp\u003eGiven that we have demonstrated that the epigenetic age of both healthy and tumor tissue has significant value, an interesting next step would be to explore the integration of epigenetic age with other risk factors, such as breast density. Breast density has long been recognized as an important risk factor for breast cancer, but it currently remains an incomplete predictor of individual risk\u003csup\u003e\u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e48\u003c/span\u003e,\u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e49\u003c/span\u003e\u003c/sup\u003e. 
Combining epigenetic age data with breast density measurements could significantly improve risk models, providing a more accurate picture of a patient’s cancer risk profile. This could prove valuable for both early detection and personalized screening strategies. Further research is needed to determine the best way to incorporate these combined biomarkers into routine clinical practice for risk stratification and monitoring.\u003c/p\u003e \u003cp\u003eTaken together, our findings establish BTEC as an epigenetic clock capable of producing reliable chronological age predictions in healthy breast tissue. Its application to tumor-adjacent and tumor samples provides new insights into the epigenetic alterations associated with breast cancer. While breast cancer generally induces systemic increases in EAA, our results suggest that breast tumors and their surrounding tissue experience epigenetic changes in the opposite direction. Additionally, our findings emphasize the role of breast tissue-specific genes in the aging process and highlight BTEC’s potential utility as a prognostic tool. Future research should further explore the molecular mechanisms underlying these epigenetic alterations and assess the broader clinical implications of BTEC’s predictions.\u003c/p\u003e \u003cdiv id=\"Sec12\" class=\"Section2\"\u003e \u003cp\u003e\u003c/p\u003e \u003c/div\u003e "},{"header":"Methods","content":"\u003ch2\u003eStudy cohorts\u003c/h2\u003e\u003cp\u003eWe used publicly available methylation data from female subjects from 13 previous studies, which we retrieved from the GEO database\u003csup\u003e\u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e50\u003c/span\u003e,\u003cspan citationid=\"CR51\" class=\"CitationRef\"\u003e51\u003c/span\u003e\u003c/sup\u003e. 
We include the GEO accession number, the number of samples per condition and tumor subtype (when available), as well as the minimum, maximum, and median ages of the subjects in Supplementary Table\u0026nbsp;1.\u003c/p\u003e\u003ch2\u003eData collection and preprocessing\u003c/h2\u003e\u003cp\u003eMethylation data from previous studies were obtained from the GEO database. In all cases, the data were preprocessed: we used the beta values provided by the original authors when available; otherwise, we calculated the beta value from the methylated and unmethylated signals using the following formula:\u003c/p\u003e\u003cdiv id=\"Equa\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equa\" name=\"EquationSource\"\u003e\n$$\\beta=\\frac{M}{M+U+100}$$\u003c/div\u003e\u003c/div\u003e\u003cp\u003ewhere M and U are the intensities of the methylated and unmethylated signals, respectively.\u003c/p\u003e\u003cp\u003eWe modified the original sample labels as follows: we labeled as “Tumor adjacent” all samples with the annotations “adjacent normal” and “ipsilateral normal”; samples with “reduction mammoplasty” and “prophylactic” were relabeled as “Healthy donor”; and samples annotated as “DCIS” were labeled “Tumor”. Samples labeled “controlateral” were discarded.\u003c/p\u003e\u003cp\u003eIn the tumor samples, 4.43% of the beta values from the 366863 cross-platform (450k, EPIC, EPICv2) probes were missing in at least one sample. We imputed the missing values with the median value across all samples for each probe.
In the non-tumor samples, we imputed 4.4% of the beta values using the same strategy.\u003c/p\u003e\u003ch2\u003eEpigenetic age prediction using existing models\u003c/h2\u003e\u003cp\u003eUsing the PyAging Python library\u003csup\u003e\u003cspan citationid=\"CR52\" class=\"CitationRef\"\u003e52\u003c/span\u003e\u003c/sup\u003e, we applied four pan-tissue epigenetic clocks to predict chronological age from DNA methylation data of breast tissue from healthy donors: Horvath\u003csup\u003e\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e\u003c/sup\u003e, Hannum\u003csup\u003e\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e\u003c/sup\u003e, PhenoAge\u003csup\u003e\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e\u003c/sup\u003e, and AltumAge\u003csup\u003e\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003ch2\u003eBTEC training\u003c/h2\u003e\u003cp\u003eWe used elastic net to train a chronological age predictor based on the beta values of cross-platform (450k, EPIC, EPICv2) probes which exhibited high correlation with age in the set of healthy samples. Specifically, we considered the probes with the top 0.5% r values among all those which had a significant (corrected p-value \u0026lt; 0.05) linear or quadratic correlation with chronological age (a total of 1,962 features; Supplementary Table\u0026nbsp;3).\u003c/p\u003e\u003cp\u003eWe used 406 samples from healthy subjects across five datasets for training the model. To reduce the effect of the imbalance by dataset, we augmented the training data by resampling each dataset to a total of 200 samples.
On the resulting 1000 samples, we tuned the alpha and L1 ratio using 5-fold cross-validation, obtaining optimal values of 0.0013 and 0.5, respectively.\u003c/p\u003e\u003cp\u003eThe performance of the model on the training set was then assessed using a LOOC approach, and the model was then retrained on the whole augmented training data. The resulting model, consisting of 1187 parameters (an intercept value, 637 linear and 549 quadratic coefficients), was used for the age predictions on the validation dataset as well as on the tumor-adjacent and tumor samples.\u003c/p\u003e\u003ch2\u003eTranscriptomic data analysis\u003c/h2\u003e\u003cp\u003eIn total, we used transcriptomic data from two datasets to 1) compare the transcriptomic profiles of tumors with EAA \u0026gt; 0 and EAA \u0026lt; 0 and 2) analyze the correlations between gene expression and age.\u003c/p\u003e\u003cp\u003eTo compare tumors in different EAA groups, we used the dataset GSE37754 from the study by Terunuma et al.\u003csup\u003e\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e\u003c/sup\u003e, which contained microarray expression data (Affymetrix Human Gene 1.0 ST Array) from 108 tumor samples, of which 55 matched tumor samples from the same study with available methylation data (GSE37751). The data were mean-centered and scaled to unit variance, and the two groups were then compared using a two-sided Wilcoxon test.\u003c/p\u003e\u003cp\u003eTo study the correlation between gene expression and aging in breast tissue, we used the transcriptomic data from the study by Song et al.\u003csup\u003e\u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e43\u003c/span\u003e\u003c/sup\u003e (GSE102088), which included 104 samples from healthy donors.
We used the normalized gene expression matrix provided by the authors to compute Pearson’s correlation coefficient between the expression of each gene and chronological age.\u003c/p\u003e\u003ch2\u003eGene set enrichment analysis\u003c/h2\u003e\u003cp\u003eIn all cases, we used the GSEAPY Python library\u003csup\u003e\u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e53\u003c/span\u003e\u003c/sup\u003e to perform gene set enrichment analysis through the Enrichr\u003csup\u003e\u003cspan citationid=\"CR54\" class=\"CitationRef\"\u003e54\u003c/span\u003e\u003c/sup\u003e API. We included the KEGG 2021, GO Molecular Functions 2023 and GO Biological Process 2023 gene sets to perform the analysis on human genes, with no specific background and an adjusted p-value cutoff of 0.05.\u003c/p\u003e\u003ch2\u003eSurvival analysis\u003c/h2\u003e\u003cp\u003eThe hazard ratio estimations were obtained using a Cox proportional hazard model through the lifelines Python library\u003csup\u003e\u003cspan citationid=\"CR55\" class=\"CitationRef\"\u003e55\u003c/span\u003e\u003c/sup\u003e. Kaplan-Meier survival curves were obtained using the same library.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003eData availability\u003c/p\u003e\n\u003cp\u003eAll the data used in this study is publicly available. Data sources are detailed in Supplementary Table 1.\u003c/p\u003e\n\u003cp\u003eCode availability\u003c/p\u003e\n\u003cp\u003eThis study did not generate any new computational methods. The specific Python libraries used are detailed in the Methods section above. Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request: Miguel Quintela-Fandino (
[email protected]).\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n \u003cli\u003eGuo, J. \u003cem\u003eet al.\u003c/em\u003e Aging and aging-related diseases: from molecular mechanisms to interventions and treatments. \u003cem\u003eSignal Transduct. Target. Ther.\u003c/em\u003e \u003cstrong\u003e7\u003c/strong\u003e, 391 (2022).\u003c/li\u003e\n \u003cli\u003eNiccoli, T. \u0026amp; Partridge, L. Ageing as a risk factor for disease. \u003cem\u003eCurr. Biol.\u003c/em\u003e \u003cstrong\u003e22\u003c/strong\u003e, R741\u0026ndash;R752 (2012).\u003c/li\u003e\n \u003cli\u003eLaconi, E., Marongiu, F. \u0026amp; DeGregori, J. Cancer as a disease of old age: changing mutational and microenvironmental landscapes. \u003cem\u003eBr. J. Cancer\u003c/em\u003e \u003cstrong\u003e122\u003c/strong\u003e, 943\u0026ndash;952 (2020).\u003c/li\u003e\n \u003cli\u003eBenz, C. C. Impact of aging on the biology of breast cancer. \u003cem\u003eCrit. Rev. Oncol. Hematol.\u003c/em\u003e \u003cstrong\u003e66\u003c/strong\u003e, 65\u0026ndash;74 (2008).\u003c/li\u003e\n \u003cli\u003eSun, Y.-S. \u003cem\u003eet al.\u003c/em\u003e Risk factors and preventions of breast cancer. \u003cem\u003eInt. J. Biol. Sci.\u003c/em\u003e \u003cstrong\u003e13\u003c/strong\u003e, 1387 (2017).\u003c/li\u003e\n \u003cli\u003ePal, S. \u0026amp; Tyler, J. K. Epigenetics and aging. \u003cem\u003eSci. Adv.\u003c/em\u003e \u003cstrong\u003e2\u003c/strong\u003e, e1600584 (2016).\u003c/li\u003e\n \u003cli\u003eYang, J.-H. \u003cem\u003eet al.\u003c/em\u003e Loss of epigenetic information as a cause of mammalian aging. \u003cem\u003eCell\u003c/em\u003e \u003cstrong\u003e186\u003c/strong\u003e, 305\u0026ndash;326 (2023).\u003c/li\u003e\n \u003cli\u003eHorvath, S. \u0026amp; Raj, K. DNA methylation-based biomarkers and the epigenetic clock theory of ageing. \u003cem\u003eNat. Rev. 
Genet.\u003c/em\u003e \u003cstrong\u003e19\u003c/strong\u003e, 371\u0026ndash;384 (2018).\u003c/li\u003e\n \u003cli\u003eHorvath, S. DNA methylation age of human tissues and cell types. \u003cem\u003eGenome Biol.\u003c/em\u003e \u003cstrong\u003e14\u003c/strong\u003e, 1\u0026ndash;20 (2013).\u003c/li\u003e\n \u003cli\u003eHannum, G. \u003cem\u003eet al.\u003c/em\u003e Genome-wide methylation profiles reveal quantitative views of human aging rates. \u003cem\u003eMol. Cell\u003c/em\u003e \u003cstrong\u003e49\u003c/strong\u003e, 359\u0026ndash;367 (2013).\u003c/li\u003e\n \u003cli\u003eKabacik, S. \u003cem\u003eet al.\u003c/em\u003e The relationship between epigenetic age and the hallmarks of aging in human cells. \u003cem\u003eNat. Aging\u003c/em\u003e \u003cstrong\u003e2\u003c/strong\u003e, 484\u0026ndash;493 (2022).\u003c/li\u003e\n \u003cli\u003eZheng, Y. \u003cem\u003eet al.\u003c/em\u003e Blood epigenetic age may predict cancer incidence and mortality. \u003cem\u003eEBioMedicine\u003c/em\u003e \u003cstrong\u003e5\u003c/strong\u003e, 68\u0026ndash;73 (2016).\u003c/li\u003e\n \u003cli\u003ePerna, L. \u003cem\u003eet al.\u003c/em\u003e Epigenetic age acceleration predicts cancer, cardiovascular, and all-cause mortality in a German case cohort. \u003cem\u003eClin. Epigenetics\u003c/em\u003e \u003cstrong\u003e8\u003c/strong\u003e, 1\u0026ndash;7 (2016).\u003c/li\u003e\n \u003cli\u003eWu, X., Ye, J., Wang, Z. \u0026amp; Zhao, C. Epigenetic age acceleration was delayed in schizophrenia. \u003cem\u003eSchizophr. Bull.\u003c/em\u003e \u003cstrong\u003e47\u003c/strong\u003e, 803\u0026ndash;811 (2021).\u003c/li\u003e\n \u003cli\u003eHan, L. K. \u003cem\u003eet al.\u003c/em\u003e Epigenetic aging in major depressive disorder. \u003cem\u003eAm. J. Psychiatry\u003c/em\u003e \u003cstrong\u003e175\u003c/strong\u003e, 774\u0026ndash;782 (2018).\u003c/li\u003e\n \u003cli\u003eJeremian, R. 
\u003cem\u003eet al.\u003c/em\u003e Epigenetic age dysregulation in individuals with bipolar disorder and schizophrenia. \u003cem\u003ePsychiatry Res.\u003c/em\u003e \u003cstrong\u003e315\u003c/strong\u003e, 114689 (2022).\u003c/li\u003e\n \u003cli\u003eFiorito, G. \u003cem\u003eet al.\u003c/em\u003e Social adversity and epigenetic aging: a multi-cohort study on socioeconomic differences in peripheral blood DNA methylation. \u003cem\u003eSci. Rep.\u003c/em\u003e \u003cstrong\u003e7\u003c/strong\u003e, 16266 (2017).\u003c/li\u003e\n \u003cli\u003eHughes, A. \u003cem\u003eet al.\u003c/em\u003e Socioeconomic position and DNA methylation age acceleration across the life course. \u003cem\u003eAm. J. Epidemiol.\u003c/em\u003e \u003cstrong\u003e187\u003c/strong\u003e, 2346\u0026ndash;2354 (2018).\u003c/li\u003e\n \u003cli\u003eTong, H. \u003cem\u003eet al.\u003c/em\u003e Quantifying the stochastic component of epigenetic aging. \u003cem\u003eNat. Aging\u003c/em\u003e 1\u0026ndash;16 (2024).\u003c/li\u003e\n \u003cli\u003eMeyer, D. H. \u0026amp; Schumacher, B. Aging clocks based on accumulating stochastic variation. \u003cem\u003eNat. Aging\u003c/em\u003e 1\u0026ndash;15 (2024).\u003c/li\u003e\n \u003cli\u003eHofstatter, E. W. \u003cem\u003eet al.\u003c/em\u003e Increased epigenetic age in normal breast tissue from luminal breast cancer patients. \u003cem\u003eClin. Epigenetics\u003c/em\u003e \u003cstrong\u003e10\u003c/strong\u003e, 1\u0026ndash;11 (2018).\u003c/li\u003e\n \u003cli\u003eRozenblit, M. \u003cem\u003eet al.\u003c/em\u003e Evidence of accelerated epigenetic aging of breast tissues in patients with breast cancer is driven by CpGs associated with polycomb-related genes. \u003cem\u003eClin. Epigenetics\u003c/em\u003e \u003cstrong\u003e14\u003c/strong\u003e, 30 (2022).\u003c/li\u003e\n \u003cli\u003eKoka, H. \u003cem\u003eet al.\u003c/em\u003e DNA methylation age in paired tumor and adjacent normal breast tissue in Chinese women with breast cancer. 
\u003cem\u003eClin. Epigenetics\u003c/em\u003e \u003cstrong\u003e15\u003c/strong\u003e, 55 (2023).\u003c/li\u003e\n \u003cli\u003eSehl, M. E., Henry, J. E., Storniolo, A. M., Ganz, P. A. \u0026amp; Horvath, S. DNA methylation age is elevated in breast tissue of healthy women. \u003cem\u003eBreast Cancer Res. Treat.\u003c/em\u003e \u003cstrong\u003e164\u003c/strong\u003e, 209\u0026ndash;219 (2017).\u003c/li\u003e\n \u003cli\u003eHannum, G. \u003cem\u003eet al.\u003c/em\u003e Genome-wide methylation profiles reveal quantitative views of human aging rates. \u003cem\u003eMol. Cell\u003c/em\u003e \u003cstrong\u003e49\u003c/strong\u003e, 359\u0026ndash;367 (2013).\u003c/li\u003e\n \u003cli\u003ede Lima Camillo, L. P., Lapierre, L. R. \u0026amp; Singh, R. A pan-tissue DNA-methylation epigenetic clock based on deep learning. \u003cem\u003eNpj Aging\u003c/em\u003e \u003cstrong\u003e8\u003c/strong\u003e, 4 (2022).\u003c/li\u003e\n \u003cli\u003eCastle, J. R. \u003cem\u003eet al.\u003c/em\u003e Estimating breast tissue-specific DNA methylation age using next-generation sequencing data. \u003cem\u003eClin. Epigenetics\u003c/em\u003e \u003cstrong\u003e12\u003c/strong\u003e, 1\u0026ndash;14 (2020).\u003c/li\u003e\n \u003cli\u003eLevine, M. E. \u003cem\u003eet al.\u003c/em\u003e An epigenetic biomarker of aging for lifespan and healthspan. \u003cem\u003eAging\u003c/em\u003e \u003cstrong\u003e10\u003c/strong\u003e, 573 (2018).\u003c/li\u003e\n \u003cli\u003eTerunuma, A. \u003cem\u003eet al.\u003c/em\u003e MYC-driven accumulation of 2-hydroxyglutarate is associated with breast cancer prognosis. \u003cem\u003eJ. Clin. Invest.\u003c/em\u003e \u003cstrong\u003e124\u003c/strong\u003e, 398\u0026ndash;412 (2014).\u003c/li\u003e\n \u003cli\u003eScholzen, T. \u0026amp; Gerdes, J. The Ki-67 protein: from the known and the unknown. \u003cem\u003eJ. Cell. Physiol.\u003c/em\u003e \u003cstrong\u003e182\u003c/strong\u003e, 311\u0026ndash;322 (2000).\u003c/li\u003e\n \u003cli\u003eUxa, S. 
\u003cem\u003eet al.\u003c/em\u003e Ki-67 gene expression. \u003cem\u003eCell Death Differ.\u003c/em\u003e \u003cstrong\u003e28\u003c/strong\u003e, 3357\u0026ndash;3370 (2021).\u003c/li\u003e\n \u003cli\u003eGoldhirsch, A. \u003cem\u003eet al.\u003c/em\u003e Strategies for subtypes\u0026mdash;dealing with the diversity of breast cancer: highlights of the St Gallen International Expert Consensus on the primary therapy of early breast cancer 2011. \u003cem\u003eAnn. Oncol.\u003c/em\u003e \u003cstrong\u003e22\u003c/strong\u003e, 1736\u0026ndash;1747 (2011).\u003c/li\u003e\n \u003cli\u003eGao, Y. \u003cem\u003eet al.\u003c/em\u003e The integrative epigenomic-transcriptomic landscape of ER positive breast cancer. \u003cem\u003eClin. Epigenetics\u003c/em\u003e \u003cstrong\u003e7\u003c/strong\u003e, 1\u0026ndash;16 (2015).\u003c/li\u003e\n \u003cli\u003eFackler, M. J. \u003cem\u003eet al.\u003c/em\u003e DNA methylation markers predict recurrence-free interval in triple-negative breast cancer. \u003cem\u003eNPJ Breast Cancer\u003c/em\u003e \u003cstrong\u003e6\u003c/strong\u003e, 3 (2020).\u003c/li\u003e\n \u003cli\u003eCheang, M. C. \u003cem\u003eet al.\u003c/em\u003e Ki67 index, HER2 status, and prognosis of patients with luminal B breast cancer. \u003cem\u003eJNCI J. Natl. Cancer Inst.\u003c/em\u003e \u003cstrong\u003e101\u003c/strong\u003e, 736\u0026ndash;750 (2009).\u003c/li\u003e\n \u003cli\u003eMathe, A. \u003cem\u003eet al.\u003c/em\u003e DNA methylation profile of triple negative breast cancer-specific genes comparing lymph node positive patients to lymph node negative patients. \u003cem\u003eSci. Rep.\u003c/em\u003e \u003cstrong\u003e6\u003c/strong\u003e, 33435 (2016).\u003c/li\u003e\n \u003cli\u003eKanehisa, M. \u0026amp; Goto, S. KEGG: Kyoto Encyclopedia of Genes and Genomes. \u003cem\u003eNucleic Acids Res.\u003c/em\u003e \u003cstrong\u003e28\u003c/strong\u003e, 27\u0026ndash;30 (2000).\u003c/li\u003e\n \u003cli\u003eZhou, X.
\u003cem\u003eet al.\u003c/em\u003e Disturbance of mammary UDP-glucuronosyltransferase represses estrogen metabolism and exacerbates experimental breast cancer. \u003cem\u003eJ. Pharm. Sci.\u003c/em\u003e \u003cstrong\u003e106\u003c/strong\u003e, 2152\u0026ndash;2162 (2017).\u003c/li\u003e\n \u003cli\u003eGuillemette, C., B\u0026eacute;langer, A. \u0026amp; L\u0026eacute;pine, J. Metabolic inactivation of estrogens in breast tissue by UDP-glucuronosyltransferase enzymes: an overview. \u003cem\u003eBreast Cancer Res.\u003c/em\u003e \u003cstrong\u003e6\u003c/strong\u003e, 1\u0026ndash;9 (2004).\u003c/li\u003e\n \u003cli\u003ede Magalh\u0026atilde;es, J. P. \u003cem\u003eet al.\u003c/em\u003e Human Ageing Genomic Resources: updates on key databases in ageing research. \u003cem\u003eNucleic Acids Res.\u003c/em\u003e \u003cstrong\u003e52\u003c/strong\u003e, D900\u0026ndash;D908 (2024).\u003c/li\u003e\n \u003cli\u003eCowin, P., Rowlands, T. M. \u0026amp; Hatsell, S. J. Cadherins and catenins in breast cancer. \u003cem\u003eCurr. Opin. Cell Biol.\u003c/em\u003e \u003cstrong\u003e17\u003c/strong\u003e, 499\u0026ndash;508 (2005).\u003c/li\u003e\n \u003cli\u003eDickson, C., Spencer-Dene, B., Dillon, C. \u0026amp; Fantl, V. Tyrosine kinase signalling in breast cancer: fibroblast growth factors and their receptors. \u003cem\u003eBreast Cancer Res.\u003c/em\u003e \u003cstrong\u003e2\u003c/strong\u003e, 1\u0026ndash;6 (2000).\u003c/li\u003e\n \u003cli\u003eSong, M.-A. \u003cem\u003eet al.\u003c/em\u003e Landscape of genome-wide age-related DNA methylation in breast tissue. \u003cem\u003eOncotarget\u003c/em\u003e \u003cstrong\u003e8\u003c/strong\u003e, 114648 (2017).\u003c/li\u003e\n \u003cli\u003eBhasin, J. M. \u003cem\u003eet al.\u003c/em\u003e Methylome-wide sequencing detects DNA hypermethylation distinguishing indolent from aggressive prostate cancer. 
\u003cem\u003eCell Rep.\u003c/em\u003e \u003cstrong\u003e13\u003c/strong\u003e, 2135\u0026ndash;2146 (2015).\u003c/li\u003e\n \u003cli\u003eMoarii, M., Boeva, V., Vert, J.-P. \u0026amp; Reyal, F. Changes in correlation between promoter methylation and gene expression in cancer. \u003cem\u003eBMC Genomics\u003c/em\u003e \u003cstrong\u003e16\u003c/strong\u003e, 1\u0026ndash;14 (2015).\u003c/li\u003e\n \u003cli\u003eItai, Y., Rappoport, N. \u0026amp; Shamir, R. Integration of gene expression and DNA methylation data across different experiments. \u003cem\u003eNucleic Acids Res.\u003c/em\u003e \u003cstrong\u003e51\u003c/strong\u003e, 7762\u0026ndash;7776 (2023).\u003c/li\u003e\n \u003cli\u003eDuan, R., Fu, Q., Sun, Y. \u0026amp; Li, Q. Epigenetic clock: A promising biomarker and practical tool in aging. \u003cem\u003eAgeing Res. Rev.\u003c/em\u003e \u003cstrong\u003e81\u003c/strong\u003e, 101743 (2022).\u003c/li\u003e\n \u003cli\u003eBodewes, F., Van Asselt, A., Dorrius, M., Greuter, M. \u0026amp; De Bock, G. Mammographic breast density and the risk of breast cancer: A systematic review and meta-analysis. \u003cem\u003eThe Breast\u003c/em\u003e \u003cstrong\u003e66\u003c/strong\u003e, 62\u0026ndash;68 (2022).\u003c/li\u003e\n \u003cli\u003eBoyd, N. F., Martin, L. J., Yaffe, M. J. \u0026amp; Minkin, S. Mammographic density and breast cancer risk: current understanding and future prospects. \u003cem\u003eBreast Cancer Res.\u003c/em\u003e \u003cstrong\u003e13\u003c/strong\u003e, 1\u0026ndash;12 (2011).\u003c/li\u003e\n \u003cli\u003eBarrett, T. \u003cem\u003eet al.\u003c/em\u003e NCBI GEO: archive for functional genomics data sets\u0026mdash;update. \u003cem\u003eNucleic Acids Res.\u003c/em\u003e \u003cstrong\u003e41\u003c/strong\u003e, D991\u0026ndash;D995 (2012).\u003c/li\u003e\n \u003cli\u003eEdgar, R., Domrachev, M. \u0026amp; Lash, A. E. Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. 
\u003cem\u003eNucleic Acids Res.\u003c/em\u003e \u003cstrong\u003e30\u003c/strong\u003e, 207\u0026ndash;210 (2002).\u003c/li\u003e\n \u003cli\u003ede Lima Camillo, L. P. pyaging: a Python-based compendium of GPU-optimized aging clocks. \u003cem\u003eBioinforma. Oxf. Engl.\u003c/em\u003e btae200 (2024).\u003c/li\u003e\n \u003cli\u003eFang, Z., Liu, X. \u0026amp; Peltz, G. GSEApy: a comprehensive package for performing gene set enrichment analysis in Python. \u003cem\u003eBioinformatics\u003c/em\u003e \u003cstrong\u003e39\u003c/strong\u003e, btac757 (2023).\u003c/li\u003e\n \u003cli\u003eXie, Z. \u003cem\u003eet al.\u003c/em\u003e Gene set knowledge discovery with Enrichr. \u003cem\u003eCurr. Protoc.\u003c/em\u003e \u003cstrong\u003e1\u003c/strong\u003e, e90 (2021).\u003c/li\u003e\n \u003cli\u003eDavidson-Pilon, C. lifelines: survival analysis in Python. \u003cem\u003eJ. Open Source Softw.\u003c/em\u003e \u003cstrong\u003e4\u003c/strong\u003e, 1317 (2019).\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":true,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"","lastPublishedDoi":"10.21203/rs.3.rs-6222303/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-6222303/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eEpigenetic clocks have been widely used to estimate biological age across various tissues, but their accuracy in breast tissue remains suboptimal. Pan-tissue models such as Horvath\u0026rsquo;s and Hannum\u0026rsquo;s clocks, perform poorly in predicting chronological age in breast tissue, underscoring the need for a tissue-specific approach. In this study, we introduce a Breast Tissue-specific Epigenetic Clock (BTEC), developed using DNA methylation data from 553 healthy breast tissue samples across seven different studies. BTEC significantly outperformed pan-tissue clocks, demonstrating superior correlation with chronological age (r\u0026thinsp;=\u0026thinsp;0.88) and lower prediction errors (MAE\u0026thinsp;=\u0026thinsp;3.27 years) without requiring for dataset-specific regressions adjustments. BTEC\u0026rsquo;s chronological age predictions for tumor-adjacent samples showed distortions, with an average deviation of -1.76 years, which was even more pronounced in tumor samples, where the average difference between predicted and chronological age was \u0026minus;\u0026thinsp;12.29 years. When analyzed by molecular subtype, the distortion was greater in the more aggressive HER2\u0026thinsp;+\u0026thinsp;and TNBC tumors compared to HR\u0026thinsp;+\u0026thinsp;tumors. 
The probes used by BTEC were associated with known oncogenes, genes involved in estrogen metabolism, cadherin binding and fibroblast growth factor binding. Despite the general rejuvenation observed in tumor tissue compared to normal breast, the correlation between BTEC\u0026rsquo;s predictions and cancer-related survival indicated that TNBC tumors with increased epigenetic ages had significant lower survival.\u003c/p\u003e","manuscriptTitle":"A breast tissue-specific epigenetic clock provides accurate chronological age predictions and reveals de-correlation of age and DNA methylation in tumor-adjacent and tumor samples","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-04-03 06:30:59","doi":"10.21203/rs.3.rs-6222303/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"adfd29b3-c7cd-4054-9d7c-cbf82c8153e0","owner":[],"postedDate":"April 3rd, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[{"id":46540960,"name":"Biological sciences/Molecular biology/Epigenetics/DNA methylation"},{"id":46540961,"name":"Biological sciences/Cancer/Breast cancer"},{"id":46540962,"name":"Health sciences/Oncology/Cancer/Breast cancer"}],"tags":[],"updatedAt":"2026-01-15T17:12:45+00:00","versionOfRecord":[],"versionCreatedAt":"2025-04-03 06:30:59","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-6222303","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-6222303","identity":"rs-6222303","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}