Sex-specific non-linear DNA methylation aging trajectories reveal biomarkers of cancer risk and inflammation

Research Article
Robin Grolaux, Macsue Jacques, Bernadette Jones-Freeman, Steve Horvath, and 2 more
This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-7516867/v1
This work is licensed under a CC BY 4.0 License.
Status: Published. Journal publication published 04 Feb, 2026; the published version appears in Genome Biology.
Version 1 (latest preprint version).

Abstract

Background: Aging is a multi-modal process, leaving distinct molecular signatures across the epigenome. DNA methylation is among the most robust biomarkers of biological aging, yet most studies assume linear age relationships and analyze mixed-sex cohorts, overlooking known sex differences. Such approaches risk obscuring critical non-linear transitions and sex-specific trajectories.
Results: We developed SNITCH, a computational framework to detect complex non-linear methylation trajectories and disentangle shared from sex-divergent patterns. Applied to deconvoluted whole-blood methylomes from 252 females and 246 males (ages 19–90 years), SNITCH revealed convergent and divergent epigenetic aging pathways independent of immune cell composition. Non-linear trajectories were enriched for developmental transcription factor motifs, including NF1/CTF and REST, with known oncogenic roles. Importantly, a female-specific non-linear cluster was prospectively associated with cancer onset and systemic inflammation in an independent cohort, nominating clinically relevant biomarkers.

Conclusion: Our results uncover sex-specific, non-linear aging programs that capture the dynamics of epigenetic change beyond linear models. These findings provide candidate biomarkers for early disease risk and advance understanding of how aging trajectories diverge between sexes, with potential applications across multi-omic studies of aging.

Keywords: Aging, Non-linear, DNA methylation, Sex differences, Epigenetic, Biomarkers, Computational biology

Background

Biological aging is often modeled as a linear process; yet many molecular and physiological changes accelerate, decelerate, or shift phases with age rather than progressing uniformly 1 . This concept of biological non-linearity, changes that deviate from a constant rate over time, is increasingly supported across diverse molecular and physiological domains. Telomere attrition, a hallmark of aging, follows a non-linear trajectory, with faster shortening in early life and slower decline in later decades 2 . Transcriptomics studies in mice have identified cross-tissue non-linear gene expression patterns that correlate with protein expression 3 , as well as late-life shifts in skeletal muscle expression consistent with the ‘elbow’ (i.e., inflection point) typical of non-linear functions 4 .
In humans, age-related non-linear patterns have been reported for circulating microRNAs 5 , and undulating changes have been observed in the proteome 6 , transcriptome 7 , and metabolome 1 . Together, these findings suggest that many molecular processes follow complex trajectories across the lifespan, challenging the notion of steady, uniform aging. DNA methylation (DNAm) is a primary hallmark of aging, and one of the most reliable molecular surrogates for estimating biological age 8 , 9 . While many epigenetic clocks appear to have a linear relationship with chronological age, their underlying regression models often assume a more complex, log-linear relationship: notably, the Horvath 2013 pan-tissue clock and the Lu 2023 pan-mammalian clock were built by regressing DNA methylation data on a log-transformed version of age, which suggests that methylation changes rapidly in early life and then slows to a constant rate after adulthood 10 , 11 . While these regression modeling approaches recognize the non-linear nature of methylation aging, they constrain trajectories to simple monotonic forms and cannot capture more complex patterns such as U-shaped curves, multi-phase dynamics, or abrupt inflection points. Indeed, lifespan analyses have documented non-linear DNAm changes at specific CpG sites 12 – 15 . Measurements of increased variance, such as variably methylated positions, also exhibit non-linear age-related patterns, highlighting that both the mean and variability of DNAm can shift in complex ways over time 13 , 16 , 17 . Despite this, most DNAm studies continue to rely on linear regression or other monotonic models, which preferentially detect features that increase or decrease steadily, while potentially missing, or mischaracterizing, non-linear signals 18 . Sex-specific differences in aging are widely recognized at physiological and clinical levels 19 – 21 , yet remain underexplored in the context of DNAm. 
When considered, sex is most often modeled as an interaction term in linear frameworks, limiting the detection of patterns that differ in shape, timing, or magnitude between sexes 1 , 14 , 20 , 22 . Non-linear approaches are particularly well-suited to uncover such differences, as they can reveal age windows of abrupt divergence, sex-specific inflection points, or distinct multi-phase dynamics. Here, we investigated sex-specific non-linear aging patterns of DNAm across the adult human lifespan. To enable this, we developed SNITCH (Semi-supervised Non-linear Identification and Trajectory Clustering for High-dimensional data), a robust framework for detecting and clustering CpG sites with shared linear and non-linear age-related changes. We first validated SNITCH using simulated data, then applied it to a high-quality whole-blood DNAm dataset (n = 238 males, and 256 females; 18–90 years old), accounting for immune cell composition. This analysis identified both sex-dependent and sex-independent non-linear aging trajectories. Replication in a large independent cohort confirmed that specific non-linear clusters are predictive of inflammation and cancer onset in a sex-specific manner. Our findings demonstrate the value of systematic non-linear analysis for uncovering previously hidden dimensions of epigenetic aging and provide new candidate biomarkers for disease risk stratification. Results SNITCH: Semi-supervised approach to cluster CpGs based on their aging pattern Given the widespread evidence of non-linear aging dynamics and the shortcomings of linear analyses, there is a clear need for new methods to detect and characterize non-linear patterns of aging. Understanding aging’s complex trajectory requires analytical approaches that can capture inflection points, accelerations/decelerations, and multiphase changes in biological data. By moving beyond linear models, researchers can unveil previously hidden aging signals. 
Previous attempts to identify non-linear changes in DNAm have relied on the binning of age categories 14 or a priori shape of aging patterns 13 . However, systematic tools to scan for arbitrary non-linear trajectories, without pre-specifying a particular model, are needed to truly let the data reveal aging’s patterns. The closest attempt to answer this need has been described by Okada et al. 18 where functional data analysis was used to cluster CpGs as linear increasing (LI), linear decreasing (LD), non-correlated (NC), or non-linear (NL). Nevertheless, this method remained limited in its ability to discriminate between non-linear trajectories and identify increased variance (VI). To address this gap, we developed and applied SNITCH (Semi-supervised Non-linear Identification and Trajectory Clustering for High-dimensional data), a heuristic-based statistical framework that distinguishes between linear, non-linear, variable, and non-correlated methylation trajectories. The method leverages both generalized linear modeling and generalized additive modeling to identify CpGs exhibiting distinct age-associated patterns (NC, LI, LD, NL, and VI) while controlling for potential confounders. Unsupervised clustering is then applied on functional principal components of the non-linear positions to highlight CpGs sharing similar non-linear trajectories (Fig. 1 A). We benchmarked the classification accuracy of SNITCH, in combination with clustering of non-linear patterns, against three stand-alone unsupervised clustering algorithms and the functional trajectory-based DICNAP method 18 . Performance was evaluated using Adjusted Rand Index (ARI) and Adjusted Mutual Information (AMI), using simulated methylation data (i.e., bound by 0–1) from a highly varied pool of distributions with known ground-truth labels (Fig. 1 B, Methods). We found that SNITCH outperformed stand-alone unsupervised clustering algorithms and DICNAP (Fig. 1 D & Supp. Figure 1 A). 
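For readers unfamiliar with the benchmark metric, the Adjusted Rand Index can be computed from the contingency table of predicted versus ground-truth labels. The sketch below is a minimal pure-Python illustration and not the benchmarking code used for SNITCH; the function name `adjusted_rand_index` and the toy labels are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Pair-counting Adjusted Rand Index between two labelings."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    n = len(labels_true)
    if n < 2:
        return 1.0  # no pairs to compare
    # Contingency counts n_ij and their row/column marginals.
    pairs = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)
    sum_ij = sum(comb(c, 2) for c in pairs.values())
    sum_a = sum(comb(c, 2) for c in rows.values())
    sum_b = sum(comb(c, 2) for c in cols.values())
    expected = sum_a * sum_b / comb(n, 2)  # agreement expected by chance
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:  # degenerate case, e.g. one cluster in both
        return 1.0
    return (sum_ij - expected) / (max_index - expected)


# Hypothetical toy example using the trajectory labels named in the text.
truth = ["LI", "LI", "LD", "LD", "NL", "NL"]
pred = ["a", "a", "b", "b", "c", "c"]  # same partition, different label names
ari_same = adjusted_rand_index(truth, pred)
# One true group split in two by the prediction lowers the score below 1.
ari_split = adjusted_rand_index([0, 0, 1, 1], [0, 0, 1, 2])
```

Because the index depends only on which items are grouped together, relabeling clusters does not change the score, which is why it suits comparisons between unsupervised cluster ids and named ground-truth classes.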
The best results were achieved by using SNITCH + Fuzzy/HDBSCAN (ARI: 0.97; AMI: 0.98). In this setting, we observed a robust concordance between predicted and ground truth labels, with the main misclassifications occurring between the logarithmic decreasing and linear decreasing groups (Fig. 1 C & Supp. Figure 1 B).

Sex-specific Non-Linear aging patterns in blood

We then applied SNITCH to a high-quality dataset in blood (GSE246337) containing 238 males and 256 females. Blood DNAm is highly influenced by immune cell heterogeneity, and adjustment for cell type composition is essential for the identification of epigenetic modifications independent of the immune profile 23 . Immune cell fractions of whole blood can be effectively estimated by the use of DNAm-based deconvolution methods 24 . Thus, in both females and males, we built three different models of aging trajectories accounting for an increasing number of immune cell types. Our baseline model did not include any cell fractions. The 7-cell model was corrected for B, NK, CD4T, and CD8T cells, monocytes, neutrophils, and eosinophils. Finally, the 12-cell model discriminated between naive and mature B, CD4T, and CD8T cells, and added T-regulatory cells and basophils. This rigorous approach allowed us to identify DNAm aging patterns occurring independently from changes in immune cell fractions. Overall, CpG classification was highly stable across models in both sexes: in females, 95.9% (N = 534,132) retained identical cluster assignments across all three models. The percentage was nearly identical in males (95.6%). This stability was mainly driven by CpGs classified as Non-Correlated (NC) with age (94% in both males and females). Among the remaining CpGs that changed classification, transitions were most frequently directed toward the NC cluster upon inclusion of immune covariates in both females and males (Fig. 2 A, Supp.
Table 1), highlighting how linear and non-linear trajectories are both confounded by changes in the immune profile. The following analyses describe the results from the 12-cell model unless otherwise stated. In both males and females, the majority of CpGs showed no correlation with age (N_fem = 543,972 (97.7%); N_m = 548,935 (98.6%)). In contrast, the number of Non-Linear (NL) CpGs differed markedly between females (N = 1,305) and males (N = 155) (Fig. 2 B, Supp. Figure 1 C). These results were surprising in light of a recent meta-analysis showing that almost half of the blood CpGs showed differential methylation with age 17 . To test whether our analysis was underpowered, we combined male and female participants, effectively doubling the size of the cohort, and included sex as a covariate in the model. Consistent with our sex-specific analysis, we found that 93% of the CpGs (N = 516,836) remained non-correlated with age, suggesting that sample size plays a negligible part in our results. These contrasting findings may be explained by our strict QC and significance thresholds (Methods), the high quality of our dataset, and the correction for 12 immune cell fractions against the 5 included in the meta-analysis. To evaluate the concordance of CpG aging trajectory classifications between sexes, we compared SNITCH-assigned trajectory labels in males and females using a Chi-square test of independence. The analysis revealed a highly significant association (χ² = 403,770, df = 16, p < 2.2 × 10⁻¹⁶), indicating that CpGs were classified into the same trajectory category in both sexes more frequently than expected by chance. Out of the total CpGs assessed, only 9,938 CpGs (~1.78%) changed trajectory class between males and females, confirming a high degree of overall consistency. This was further supported by the standardized residuals (Fig.
2 D), which showed strong positive values along the diagonal of the contingency matrix, reflecting substantial overlap in classification across sexes. This included the NL–NL cell, indicating an enrichment of the 39 CpGs classified as non-linear in both sexes despite the discrepancy in the total number of NL CpGs identified. Conversely, the most pronounced negative residuals were observed in off-diagonal cells where CpGs were classified as age-associated (LD or LI) in one sex but NC in the other. This pattern suggests a subset of CpGs with potential sex-specific sensitivity to age-related methylation changes. In addition, NL CpGs showed moderate positive residuals when aligned with LD (+140.3) and LI (+36.0) in the opposite sex, suggesting that a subset of CpGs classified as non-linear in one sex may appear more linear in the other. Finally, the NL–NC cells exhibited a residual of −138.4 in females and −61.2 in males, indicating that CpGs classified as non-linear in one sex were rarely non-correlated in the other, further supporting their functional relevance. To identify clusters of CpGs sharing similar aging trajectories, we applied the last step of our pipeline by performing unsupervised clustering on the functional components of the NL CpGs (Methods). This revealed similar-shaped trajectories in males and females, with 4 primary clusters identified in females and 5 in males (Supp. Figure 2). To avoid redundancy between similar clusters, we merged clusters with a Spearman correlation coefficient > 0.9 (Supp. Figure 3). This resulted in a final number of 4 principal NL patterns in both females and males (Fig. 2 C). Within those, we observed different inflection points, marking a change of pace in the methylation trajectory. In females, clusters 3, 11, and 12 showed an elbow between 70 and 80 years, whereas cluster 2 showed it earlier, around 50 years.
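The cluster-merging step mentioned above (Spearman correlation coefficient > 0.9) can be sketched as follows. This is a minimal illustration that assumes each cluster is summarized by a mean trajectory evaluated on a common age grid; the function names, the toy trajectories, and the union-style grouping are assumptions for illustration, not the SNITCH implementation.

```python
def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of ranks.

    Assumes neither trajectory is constant (non-zero rank variance).
    """
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def merge_clusters(trajectories, threshold=0.9):
    """Group cluster ids whose trajectories correlate above the threshold."""
    ids = list(trajectories)
    parent = {c: c for c in ids}

    def find(c):
        while parent[c] != c:
            c = parent[c]
        return c

    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if spearman(trajectories[a], trajectories[b]) > threshold:
                parent[find(b)] = find(a)
    groups = {}
    for c in ids:
        groups.setdefault(find(c), []).append(c)
    return sorted(groups.values())


# Hypothetical mean methylation trajectories on a shared age grid.
ages = list(range(20, 91, 10))  # 20, 30, ..., 90
toy = {
    "c1": [0.30 + 0.004 * (a - 20) for a in ages],         # steady gain
    "c2": [0.30 + 0.00005 * (a - 20) ** 2 for a in ages],  # accelerating gain
    "c3": [0.50 + 0.0002 * (a - 55) ** 2 for a in ages],   # U-shape
}
merged = merge_clusters(toy)
```

Note that a rank-based threshold groups any two monotonic trajectories with the same direction, regardless of curvature, so in this toy case the steady and accelerating gains merge while the U-shaped cluster stays separate.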
In males, the inflection points appeared around 60 years for clusters 1 and 3 and 50 years for clusters 2 and 4. This analysis highlighted the similar aging patterns seen in males and females, but hinted at different temporalities for the inflection points. Functional analysis of Age-related trajectories Trajectories of clocks’ CpGs The first step we took to understand the functional role of the clusters we identified was to investigate the classification of CpGs previously used to train epigenetic clocks. A ubiquitous tool in the field of aging, epigenetic clocks are biomarkers that show relevance in assessing the onset of several age-related conditions as well as the utility of therapeutic strategies. Most epigenetic clocks are machine-learning models based on linear regression 25 (e.g., elastic-net). By construction, this limits the granularity of the aging trajectories they capture by either overlooking non-linear patterns or over-simplifying them as linear, thus limiting their biological interpretability. Further complicating their interpretation, blood-based epigenetic clocks often ignore cell-type heterogeneity when considering age-related DNAm changes. Within our two cohorts, we looked at the classification labels of the CpGs underlying 9 of the most common clocks 11 , 26 – 31 across our 3 models (Fig. 3 A). As expected, we found that the proportions of CpGs classified as NC increased between each model iteration, reflecting that part of the signal captured by these clocks arises from age-driven changes in immune-cell fractions 8 , 25 . In addition, our results highlighted that the majority of the clocks’ CpGs remained classified as NC in our baseline model. Notably, the Hannum clock was the only clock that showed a higher proportion of CpG associated with age (LI & LD) compared to NC across all three models. 
This finding is consistent with the inherent designs of each clock: the 2nd-generation clocks (PhenoAge, GrimAge, DunedinPace, Zhang_10, and YingAge) were trained on phenotypic age or mortality risk 26 , 28 – 32 , overlooking chronological CpGs, and Horvath’s clock was trained across multiple tissues 11 , where Hannum’s clock was built to estimate chronological age in blood 27 . Finally, we observed that most of the clocks captured NL and VI CpGs in males and females. To complement this analysis, we examined a published set of 350 age-associated CpGs shared across immune cell types 33 . Applying the same approach to these loci yielded classifications of hyper- and hypo-methylation that aligned with those reported in the original study, reinforcing the validity of our modeling strategy, which incorporates immune cell composition (Supp. Figure 4 A). Moreover, our results revealed that a subset of these CpGs exhibited non-linear associations with age. Functional analyses of sex-specific aging patterns After successfully identifying sex-specific ageing methylation patterns, we performed separate functional analysis in males and females for all age-associated clusters. Chromatin enrichment analysis We performed chromatin state enrichment analysis in males and females to determine the epigenetic context of age-associated methylation clusters. Each cluster was tested for enrichment across 15 chromatin states from the Roadmap Epigenomics Project 34 using the NC cluster as a reference (Methods). Overall enrichment results were highly similar between males and females in the LD, LI, and VI clusters (Fig. 3 B, Supp. Table 2). Those similarities are consistent with the overlap of CpGs observed in those clusters between males and females (Fig. 2 C & 2 D). 
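The enrichment of a cluster in a given chromatin state relative to the NC reference can be expressed as a log₂ odds ratio from a 2×2 table. The sketch below assumes a plain odds-ratio formulation with a 0.5 continuity correction applied only when a cell is empty; the exact test is described in the Methods, which are not reproduced here, and all counts are made up for illustration.

```python
from math import log2


def log2_odds_ratio(cluster_in, cluster_total, ref_in, ref_total):
    """log2 odds ratio of a chromatin state in a cluster vs. a reference set.

    cluster_in / ref_in are the numbers of CpGs overlapping the state;
    cluster_total / ref_total are the sizes of the two CpG sets.
    """
    a = cluster_in
    b = cluster_total - cluster_in
    c = ref_in
    d = ref_total - ref_in
    if min(a, b, c, d) < 0:
        raise ValueError("overlap counts cannot exceed set sizes")
    if 0 in (a, b, c, d):
        # Continuity correction (an assumption) to avoid division by zero.
        a, b, c, d = (v + 0.5 for v in (a, b, c, d))
    return log2((a * d) / (b * c))


# Hypothetical counts: 20 of 100 cluster CpGs vs. 10 of 90 reference CpGs.
enriched = log2_odds_ratio(20, 100, 10, 90)   # odds 0.25 vs 0.125
depleted = log2_odds_ratio(10, 90, 20, 100)   # the same comparison reversed
```

Positive values indicate that the state is over-represented in the cluster compared with the reference, negative values indicate depletion, and zero corresponds to equal odds.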
Among the remarkable trends, we observed a notable enrichment of the Repressed polycomb and bivalent chromatin states in clusters characterized by increased methylation or age-related variance (LI, VI, NL11, NL12 in females & LI, VI, NL0, NL1 in males). This observation is consistent with previous findings that age-related hypermethylation occurs preferentially in those domains 35 – 37 . Overall, the enrichment of chromatin states at NL clusters seemed mostly driven by the directionality of the trajectories (gain vs. loss of methylation) rather than by the individual patterns, as they were concordant with the LI or LD clusters, respectively. A noticeable deviation from this was the significant enrichment of “weak transcription” states in NL3 in females (hypomethylated with age) compared to its depletion in LD (NL3 log₂ OR = 0.7, LD log₂ OR = −0.14). This is supported by previous studies highlighting that hypomethylated sites are enriched in active or transcribed genomic regions, including the weak transcription chromatin state 38 , 39 . In particular, this suggests a breaking point around 60 years old in females, where erosion of DNA methylation in weakly transcribed or formerly silent chromatin regions could lead to leaky transcription, supporting the theory of increased transcriptional noise in aging 40 .

Pathway enrichment analysis

Next, we performed pathway enrichment analysis for our non-linear clusters using different databases (the Gene Ontology (GO) Molecular Function (MF) and Biological Process (BP), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Reactome). Enrichment analyses in DNA methylation datasets are inherently biased toward long genes and are further affected by CpGs that map to multiple genes 41 , 42 . To account for both issues, we used missMethyl, a method that addresses these biases by leveraging prior probabilities 42 , 43 .
Across all NL clusters, only NL12 in females was enriched for the term “Neuroactive ligand signaling” in KEGG (FDR = 0.023); no enrichment was observed for NL clusters in males (Supp. Table 3).

Motif enrichment analysis

In addition to the local chromatin context, the underlying DNA sequence also contains biologically relevant information that can help understand the function of a particular set of CpGs. Although the mechanisms have not been fully elucidated yet, it is widely accepted that the methylation context at a specific motif can positively or negatively impact the binding affinity of transcription factors (TFs) 44 , 45 . Focusing on the NL CpGs, we performed cluster-wise motif enrichment analysis to identify the presence of TF binding sites (TFBs) in their vicinity (Methods). In females, only NL12 and NL3 showed significantly enriched motifs (Supp. Figure 4 B, Supp. Table 3). NL12 was enriched for the ZNF652 binding site (qval = 0.0249), whereas NL3 was enriched for both the Nuclear Factor 1 (NF1/CTF) half-site (qval = 1 × 10⁻⁵) and full motif (qval = 0.0026), as well as Hoxc9 (qval = 0.0411) and Gata6 (qval = 0.0476). In males, only NL1 and NL4 were enriched in TFBs (Supp. Figure 4 C, Supp. Table 4). Similar to NL3 in females, the NL4 cluster in males was enriched for the NF1 half-site (qval = 0.0063) and full site (qval = 1 × 10⁻⁴), along with REST/NRSF (qval = 1 × 10⁻⁵). The NF1-CTF family of transcription factors was shown to regulate cell development in the central nervous system, is associated with cancer, and plays a key role in the regulation of transcription 46 – 48 . Similarly, ZNF652 acts as a potent tumor suppressor in breast cancer by repressing the transcription of oncogenes 49 , 50 , while both Gata6 and Hoxc9 are involved in embryogenesis and have oncogenic properties 51 – 54 .
Nonlinear clusters as biomarkers of diseases Cancer risk DNA methylation, or surrogate measures such as epigenetic age-acceleration, have been used as biomarkers to predict the onset of or diagnose a wide range of pathological conditions, ranging from rare developmental diseases to cancer 55 – 58 . Given the presence of transcription factor binding motifs known to regulate oncogenic pathways within the methylation clusters, we hypothesized that these epigenetic modules may capture pre-diagnostic signals of cancer risk. To evaluate the predictive value of the NL clusters for cancer onset, we assessed the association between their eigenvalues and cancer development using three complementary approaches: Cox proportional hazards models, Kaplan–Meier survival analysis, and logistic regression. Analyses were conducted in the EPIC-Italy cohort 59 , a large prospective study in which blood DNA methylation was profiled at baseline in healthy participants, with up to 15 years of follow-up for incident cancer diagnoses. The cohort includes time-to-diagnosis information for several cancer types, with breast cancer (C50) and colorectal cancer (C18) being the most prevalent. We first assessed associations across all cancer types. Stratifying by sex revealed a clear sex-specific predictive capacity. In females, NL3 emerged as the most robust predictor of cancer risk across all analyses. In Cox regression, NL3 eigenvalues were significantly associated with a shorter time to cancer diagnosis (HR 1.020, 95% CI: 1.007–1.032, FDR = 0.0058) (Supp. Table 4). This was supported by Kaplan–Meier survival analysis, where NL3 tertiles showed clear separation in cancer-free survival (log-rank p = 0.003, Fig. 3 C). Logistic regression confirmed the predictive effect (OR = 1.032, 95% CI: 1.012–1.052, FDR = 0.0083). The NL11 cluster also showed a potential association with cancer risk. 
While its Cox result was borderline (HR = 1.039, FDR = 0.063), logistic regression yielded a significant effect (OR = 1.065, 95% CI: 1.008–1.126, FDR = 0.049), suggesting a possible predictive role that warrants further investigation. No associations were detected for NL2 or NL12 in any models. In males, none of the tested clusters showed significant associations. Next, we stratified the cohort by cancer type to determine whether these associations were driven by specific tumor types. In the breast cancer sub-cohort (female participants only), NL3 remained a robust predictor of disease onset in both models (Cox HR = 1.023, 95% CI: 1.009–1.037, FDR = 0.0045; Logistic OR = 1.035, 95% CI: 1.013–1.058, FDR = 0.0051) (Supp. Table 4). Likewise, NL11 showed consistent predictive value (Cox HR = 1.065, 95% CI: 1.019–1.113, FDR = 0.0097; Logistic OR = 1.101, 95% CI: 1.034–1.172, FDR = 0.0051). These results suggest that NL3 and NL11 capture biologically relevant, pre-diagnostic methylation signals specifically associated with breast cancer development. In contrast, no non-linear cluster was significantly associated with colorectal cancer (C18) in either males or females. Both Cox and logistic regression models yielded non-significant associations across all clusters, indicating that the epigenetic trajectories captured by these modules do not predict colorectal cancer onset within this cohort. Together, these findings highlight the sex- and cancer-type-specificity of non-linear DNA methylation aging patterns. While NL3 and NL11 showed significant biomarker potential for breast cancer risk in females, their lack of association with colorectal cancer suggests these epigenetic trajectories reflect tissue-specific aging dynamics rather than generalized cancer susceptibility. Inflammation - CRP levels Senescence of the immune system is termed “inflammaging”, with chronic inflammation being one of the hallmarks of aging 60 , 61 . 
Because this hallmark is particularly relevant to our blood-based analysis, we examined whether the age-related clusters we identified were also associated with inflammation. Separately in males and females, we calculated each cluster’s eigenvalues and examined their association with estimated C-reactive protein (CRP) levels (Methods), a well-established marker of inflammation 62 . Three nested models were used to assess these associations while adjusting for age and NC CpGs (Methods). The analysis revealed sex-specific patterns of association, with several eigenvalues significantly predicting CRP independently of chronological age. In females, the full linear model including eigenvalues from all age-related clusters significantly improved CRP prediction compared to age + NC alone (adjusted R² = 0.422; ANOVA p < 8.2 × 10⁻¹¹). Several modules showed robust associations (Fig. 3 D, Supp. Table 4). The NL3 module displayed the strongest positive association with CRP (β = 0.0221, p = 2.06 × 10⁻⁶), suggesting that methylation patterns in this cluster closely align with inflammatory status. The LD module was significantly negatively associated with CRP (β = −0.0098, p = 1.34 × 10⁻⁴), indicating a potential protective or anti-inflammatory methylation pattern. NL11 also revealed a significant negative association (β = −0.0200, p = 0.023), while VI showed a significant positive association (β = 0.0179, p = 6.9 × 10⁻⁴). Notably, the NC eigenvalue was not significantly associated with CRP (p = 0.266), although it contained 87% of the CpGs used in CRP estimations (Supp. Figure 4 D). This result likely stems from the fact that we accounted for 12 immune cell fractions in our model, including naïve and mature T cells, which the CRP signature relies on 62 , and further shows that the NL patterns captured methylation variation independently of the immune profile.
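The comparison between nested linear models reported above (age + NC versus the full model with all cluster eigenvalues) can be read as a partial F-test, assuming the ANOVA refers to the standard F-test for nested ordinary least squares models. The arithmetic is sketched below; the residual sums of squares, sample size, and coefficient counts are placeholder values chosen for illustration, not results from this study.

```python
def partial_f(rss_reduced, rss_full, p_reduced, p_full, n):
    """F statistic comparing a reduced OLS model with a full nested model.

    p_reduced and p_full count fitted coefficients including the intercept.
    Returns (F, numerator df, denominator df).
    """
    df_num = p_full - p_reduced
    df_den = n - p_full
    if df_num <= 0 or df_den <= 0:
        raise ValueError("full model must add terms and leave residual df")
    f_stat = ((rss_reduced - rss_full) / df_num) / (rss_full / df_den)
    return f_stat, df_num, df_den


def adjusted_r2(rss, tss, p, n):
    """Adjusted R-squared for a model with p coefficients (incl. intercept)."""
    return 1.0 - (rss / (n - p)) / (tss / (n - 1))


# Placeholder numbers: reduced model = intercept + age + NC (3 coefficients),
# full model adds 5 hypothetical cluster eigenvalues (8 coefficients).
n_samples = 103
f_stat, df_num, df_den = partial_f(60.0, 40.0, 3, 8, n_samples)
adj_full = adjusted_r2(40.0, 100.0, 8, n_samples)
```

A p-value would follow from the upper tail of the F distribution with those degrees of freedom, for example via `scipy.stats.f.sf(f_stat, df_num, df_den)` if SciPy is available.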
Together, these results suggest that inflammation in females is selectively captured by distinct age-related methylation patterns, some of which are positively associated with inflammatory burden (e.g., VI, NL3) and others inversely associated (e.g., LD, NL11), reflecting complex and potentially compensatory epigenetic dynamics. In contrast, males exhibited a more restricted profile of significant associations. While the full model remained statistically significant (adjusted R² = 0.423; ANOVA p = 4.82 × 10⁻⁵), only the LI module showed a significant positive association with estimated CRP levels (β = 0.0059, p = 0.0037) (Fig. 3 D, Supp. Table 4). Associations with other clusters, such as NL1, NL2, and LD, did not reach significance, although some trends were observed. The NC module was again not predictive (p = 0.65), reinforcing the specificity of the signal to age-related modules. These sex-specific patterns suggest that while methylation-based inflammation signatures exist in both sexes, females display a more modular and interpretable architecture. In contrast, the inflammatory signal in males may be either more diffuse across the methylome or less tightly coupled to the cluster structure identified in the present analysis. Together, these findings indicate that methylation clusters derived from non-linear age-related trajectories carry relevant information about the inflammatory burden in blood.

Different waves of regulation in males and females

In the analysis above, we identified functional clusters of age-related CpGs sharing similar linear or non-linear patterns. Although this analysis hinted at specific ages at which disruption of methylation trajectories occurs (i.e., inflection points), another analytical method has been previously used to address this question directly 1 , 6 . We applied the same method used by Shen et al. 1 to reveal peaks of disruption of methylation patterns in our male and female populations (Methods).
Our analysis focused on age-related CpGs identified at the last step and revealed distinct waves of dysregulation in males and females (Fig. 4 A). Those peaks were deemed robust as they passed different significance thresholds (Fig. 4 B). In females, we observed three peaks at ages 33, 51, and 73, which were consistent with 2 of the inflection points observed in the previous analysis. In contrast, males displayed only two peaks, at 47 and 63 years old, again consistent with the clustering analysis. The number of significantly dysregulated CpGs markedly differed between males and females, with females showing an overall higher number, similar to the discrepancy observed between their respective numbers of NL CpGs. Notably, the peaks at 33 and 51 in women are supported by previous findings highlighting those decades as transitional periods in women’s aging 63 , and previous research on protein expression levels identified three peaks at 34, 60, and 78 years old 6 . Recent findings in a multi-omics setting also identified crests of dysregulation around 40 and 60 years old. Both of those analyses were performed in a joint cohort of men and women and might represent a combination of the sex-specific peaks we identified. Thus, our results support previous claims that aging occurs in waves across multiple modalities rather than unique molecular components. In addition, we highlighted a distinct temporality in the aging process across sexes, as underlined by the earlier onset of the crests in males. Integrating this analysis with the clusters we identified previously revealed that the main CpG overlaps in males and females occur with the linear clusters LI and LD (Fig. 4 C). Most of the NL CpGs in females intersected with wave 3, whereas in males, they were equally represented in waves 1 and 2. This analysis further highlighted conserved CpGs across waves (N_males = 2027 & N_females = 1453) and across sexes (N = 725).
Motif enrichment analysis of those sets confirmed the enrichment of NF1 full- and half-binding motifs at age-related CpGs, and revealed the presence of CTCF and NRSF/REST binding sites, two consistent marks of epigenetic aging 10, 33, 64–67, among other TFBs (Fig. 4D). To understand the specific pathways that could be affected, we repeated our previous enrichment analysis, focusing on each peak's unique set of CpGs (Methods). In females, waves 1, 2, and 3 showed, respectively, 2057, 873, and 2077 uniquely dysregulated CpGs. Wave 1 showed striking enrichment for neurodevelopmental processes, including nervous system development (FDR = 3.9 × 10⁻⁴), neuron differentiation (FDR = 8.5 × 10⁻⁴), and generation of neurons (FDR = 1.8 × 10⁻³) (Supp. Figure 5). Additionally, terms related to transcriptional control and chromatin binding, such as DNA-binding transcription factor activity (FDR = 5.3 × 10⁻⁴) and double-stranded DNA binding (FDR = 4.2 × 10⁻³), were significantly enriched. Wave 2 yielded a single enriched term: regulation of execution phase of apoptosis (FDR = 0.036) (Supp. Table). These findings recapitulate previous observations of neurodevelopmental and transcriptional gene dysregulation with age 10, 68. Interestingly, in males, among the unique CpGs in wave 1 (N = 1250) and wave 2 (N = 2426), no significant enrichment was found. Likewise, the 725 CpGs common to both sexes showed no detectable enrichment. This highlights sex-specific epigenetic changes in females, with broader and more functionally coherent methylation changes. Notably, REST, a transcriptional repressor enriched in our motif analysis, is known to silence neuronal genes in non-neuronal lineages and is a key regulator of the epigenetic program that maintains cellular identity 67, 69.
Its involvement, alongside the dysregulation of neurodevelopmental pathways, supports a model in which age-related methylation drift compromises the fidelity of cell-type–specific gene regulation, allowing partial re-expression of lineage-inappropriate developmental programs. This aligns with emerging evidence from aging transcriptomes and methylomes indicating that immune cells in older individuals exhibit loss of identity and ectopic activation of developmental gene networks, including those tied to nervous system formation 70, 71.

Discussion

By leveraging a novel analysis pipeline and a high-quality DNA methylation (DNAm) dataset, we identified sex-specific non-linear aging trajectories in blood. Fine-tuning our clustering pipeline on simulated data enabled us to establish robust heuristics facilitating the identification of diverging linear and non-linear patterns (Figs. 1A & 1B). We showed that our new pipeline, SNITCH, outperforms stand-alone unsupervised clustering methods in discriminating between Variance Increase (VI), Linear Increasing and Decreasing (LI & LD), Non-Correlated (NC), and various Non-Linear (NL) functions (Figs. 1C & 1D). Nevertheless, we observed misclassification between the linear and logarithmic patterns due to their close resemblance. By providing SNITCH as a user-friendly framework, our approach is well-positioned to uncover non-linear, tissue- or context-specific aging dynamics in large cross-sectional or longitudinal datasets, including EWAS, transcriptomic time series, or multi-omic aging studies. The immune profile reshapes with age, and analyses in whole blood are particularly sensitive to those changes 72. Applying our pipeline to a blood DNA methylation dataset while accounting for an increasing number of immune cell types highlighted the confounding effect of the immune profile on age-related DNAm changes (Fig. 2A).
Stratifying our analyses across sexes revealed a notably higher number of NL CpGs identified in Females (Fig. 2 B). Nevertheless, CpGs generally showed concordant aging trajectories across sexes, with significant enrichment for matching classifications (Fig. 2 D). A surprising result was the small number of age-related CpGs identified (< 4%), seemingly contradicting findings of a meta-analysis of age-related DNAm in blood 17 . We argue that our results are a conservative but robust representation of true changes occurring during aging. We used a gold-standard reference DNAm dataset that minimises batch effects, covers an equivalent number of males and females, and has a homogeneous age distribution 73 . This, combined with our strict quality control and FDR threshold and the inclusion of 12 immune cell types as covariates in our model, ensured the identification of robust changes and explains those numbers. However, we acknowledge that due to our limited sample size, we might have missed smaller effects or population-specific changes. Our study helped characterise CpGs underlying epigenetic clocks. We assessed the distribution of clock CpGs within our different aging clusters and found that most models relied on CpGs classified as NC in our model (Fig. 3 A). Although counterintuitive, we focused on changes independent of the immune profile, where clocks are susceptible to cell fraction changes. This suggests that many clocks rely on CpGs that are stable across age when accounting for immune composition, potentially reflecting their role as proxies for cellular fraction rather than methylation-only age-related change. This was confirmed by our baseline model (not accounting for immune cells), which showed fewer CpGs classified as non-correlated. Additional explanation comes from the data used to train the clocks. 
Indeed, among the 9 models we investigated, only the Hannum clock was trained solely on blood DNAm using chronological age as the target variable, whereas the other clocks were trained on multiple tissues (Horvath's) or phenotypic variables (PhenoAge, GrimAge, DunedinPACE, Zhang_10, and YingAge). We found that most of the clocks included VI and NL CpGs. Further analyses should investigate the weight associated with those in each model to understand the dependency of the clocks on non-linear trajectories, and potentially fine-tune them to reflect the non-linear phases of aging. Our analysis confirmed the enrichment of Polycomb Repressed chromatin states at hypermethylated CpGs 35–37. Overall chromatin state enrichments were dependent on the gain or loss of methylation, irrespective of sex or linearity (Fig. 4B). Only the NL3 cluster in females differed from this trend and showed a small enrichment for the weak transcription state. Consistent with other results, the motif enrichment analysis revealed an enrichment of the REST-NRSF motif 66 in the male hypermethylated NL1 cluster. The loss of REST is associated with cognitive impairment and Alzheimer's disease 67, and hypermethylation at its binding sites could disrupt its function. The relevance of this finding in blood should be investigated, as should the biomarker potential of the male NL1 cluster for Alzheimer's disease. This analysis also identified new enriched motifs at NL CpGs for the NF1-CTF family, both in males and females, suggesting a potential disruption of their function. Additional transcription factors (TFs) with enriched binding sites included the STAT family in males (STAT3 & STAT4), pointing to a potential disruption of the immune system 74, and developmental TFs in females (HOXC9, GATA6) 51, 52. Notably, the enrichment of motifs from developmental TFs at hypomethylated sites could indicate the loss of a safeguarding mechanism consistent with the epigenetic drift view.
Nevertheless, our analyses suggest that this drift is not constant but suffers from episodes of dysregulation. The functional role of our NL clusters was further explored by looking at their relationship with inflammation and predictive capabilities for cancer onset in an independent cohort. With a robust statistical model accounting for immune cell fractions, we identified NL3 in females as being strongly associated with both inflammation (Fig. 3 D) and cancer development probabilities (Fig. 3 C), and the female NL11 cluster as mildly associated. The functional specificity of those clusters was underscored by the absence of predictive power from the NC module across sexes. Those results showed the relevance of our identified NL clusters by validating their functionality in an independent cohort. Furthermore, the more robust and multifaceted associations in females strongly support the argument for sex-stratified biomarkers, particularly in the context of aging, cancer, and inflammation. Our motif enrichment analysis laid the ground for further investigating the mechanisms underlying the predictive capacity of females NL3. Notably, NF1/CTF, Hoxc9, and Gata6 binding motifs were overrepresented in this cluster. Gata6 was recently shown to be part of a central mechanism promoting cancer-associated fibroblasts in breast cancer 54 . The study found that the expression of Gata6 was enhanced by the action of TET1, a protein that removes methylation. Evidence shows that binding of members of the Gata family is disrupted by methylation marks 75 . Thus, the hypomethylation observed in NL3 could be associated with increased Gata6 oncogenic activity. Both Hoxc9 and members of the NF1 family have also been associated with breast cancer, although the mechanisms have not been fully elucidated and are not well documented 53 , 76 , 77 . Nevertheless, how the occurrence of these epigenetic marks in immune cells may affect cancer onset remains to be investigated. 
Overall, our clustering analysis revealed sex-specific functional clusters of NL CpGs. They hinted at sex-specific peaks of dysregulation by showing inflection points around the 75-year (NL3, NL11, NL12) and 50-year (NL2) marks in females, against the 50-year (NL2, NL4) and 60-year (NL1, NL30) marks in males. We used the previously described DEswan analysis 1 to formally identify peaks of dysregulation in a sex-specific manner. The DEswan results reinforced the findings of our first analysis by revealing peaks of dysregulation at 51 and 73 years old in females, and 47 and 63 years old in males (Figs. 4A & 4B), plus an earlier peak at 33 years old in females. Those results are strongly supported by the literature, with studies in proteomics and multi-omics identifying peaks at similar ages 1, 6, 63. The sex-specificity of our findings indicates an earlier onset of dysregulation in males, with the last wave occurring a decade before the one in females. This also suggests a peak of dysregulation in mid-adulthood occurring in females but not males, although this last finding should be tempered by the minimum age of 25 in our cohort, which probably misses prior peaks during adolescence 78. In both males and females, CpGs identified at those peaks showed a partial overlap with the clusters (Fig. 4C). In addition, both males and females showed CpGs consistently dysregulated at all peaks, with some of those conserved across sexes. Motif enrichment for sex-specific and non-sex-specific conserved CpGs supported our previous findings, with NF1/CTF and REST amongst the most enriched binding sites. Consistent with other findings 65, 66 but absent from our cluster analysis, the CTCF motif was also enriched. Our functional analysis for the waves of dysregulation pointed toward a larger and more coherent dysregulation in females, with previously reported pathways related to DNA binding and neurogenesis being affected.

Our study presents several limitations.
First, the datasets lacked critical covariates such as BMI or smoking status, preventing us from fully disentangling lifestyle effects from intrinsic aging-related methylation changes. Additionally, both datasets were predominantly composed of individuals of European ancestry, which may limit the generalizability of our findings to more diverse populations. Despite covering a wide age range (25–90 years old), we lacked samples from infancy and adolescence, stages known to be associated with widespread changes in DNAm 78–80. As a result, our characterization of methylation dynamics during early development remains incomplete, potentially missing early-life inflection points critical to the trajectory of aging. Another limitation is the cross-sectional nature of our study. This is a typical limitation in the field due to the lack of large longitudinal datasets. Thus, our analysis relies on the hypothesis that DNAm changes are conserved to some degree across different individuals, a hypothesis supported by the large body of literature on DNAm in aging. Nevertheless, adopting a longitudinal approach is an important step towards personalised medicine and is a direction the field should embrace. In this work, we chose to work with a smaller, high-quality dataset to reduce noise and uncover true age-related changes. While this strategy facilitated the detection of robust changes validated in an independent cohort, our relatively small sample size (~240 individuals per sex) may have limited our ability to detect subtle but biologically relevant effects, which is reflected in the small number of age-related CpGs we found. Finally, DNAm is intrinsically difficult to functionally link to changes in the phenotype, in part due to the variety of effects that gain or loss of methylation can have depending on their location. Consequently, we emphasize that our findings are correlative and should be interpreted as surrogate markers rather than direct drivers of phenotypic changes.
In the future, we aim to further explore non-linear patterns in other tissues 81 and extend our methodology to other omics, such as transcriptomics and proteomics, where the increase of variance is often omitted from clustering analyses. Our results, along with prior work, reveal enrichment of developmental transcription factor motifs (e.g., REST, NF1/CTF) at age-dysregulated CpGs 36, 45. This observation remains underexplored, particularly regarding how such TF-DNAm interactions shift with age and impact transcriptional regulation. Future studies integrating these findings with transcriptomic or chromatin accessibility data could elucidate the functional consequences of these methylation changes. In particular, we will focus on the female NL3 cluster that showed biomarker potential, working to characterise and validate it in other cancer cohorts.

Conclusion

Altogether, our findings support a paradigm shift in the way we conceptualize aging: not as a steady, linear decline, but as a non-linear and entropic process characterized by temporally confined windows of epigenetic instability. By uncovering sex-specific, wave-like patterns of DNA methylation changes that mirror inflection points reported in other omics layers, our study strengthens the case for integrated, temporal frameworks of aging biology. The critical decades we identified, particularly around the 30s, 50s, and 70s, may represent biologically vulnerable windows, where intervention could yield the most impact. Beyond these insights, our findings underscore the need to move beyond one-size-fits-all models and adopt analytical strategies that reflect the inherent heterogeneity, non-linearity, and sex-specific nature of biological aging. However, to fully harness the translational potential of these observations, a pressing need for molecular validation remains.
Future work should prioritize characterising the regulatory mechanisms underpinning these methylation dynamics, including their interaction with chromatin architecture, transcription factor networks, and other epigenetic layers, to distinguish causality from correlation and to illuminate actionable pathways of aging and disease.

Methods

Cohort and DNA methylation preprocessing

Two separate cohorts were used in this study. GSE246337 73 was used in the identification of ageing methylation patterns, and GSE51032 was used to identify biomarkers of cancer. For both datasets, the raw IDAT files from the Illumina Infinium HumanMethylationEPIC v2 BeadChip array (EPICv2) or the 450K array, respectively, were retrieved from GEO and processed using the sesame R/Bioconductor package (v1.18.1) 82. To harmonize probe identifiers and accommodate the EPICv2 platform structure, prefix collapsing was enabled during data import. Initial preprocessing followed the "QCDPB" pipeline within sesame, incorporating: (i) probe quality filtering using the pOOBAH method to remove probes with poor detection p-values, (ii) dye-bias correction to normalize type I and II probe discrepancies, (iii) masking of probes known to be problematic due to non-specific hybridization or SNP interference as defined by Zhou et al. 83, and (iv) exclusion of probes supported by fewer than four beads. The resulting beta value matrix was further filtered to improve data integrity. Probes with missing values in more than 1% of samples were discarded. Subsequently, samples with more than 1% missing beta values across retained CpGs were also removed, as were probes on the sex chromosomes. The remaining missing values were imputed by k-nearest neighbors using the impute.knn function from the impute package, keeping the default parameters. Beta values were rounded to three decimals for downstream analyses.
For the GSE246337 dataset, we only kept probes present on the EPICv1 array, as most tools used in the downstream analysis are not yet compatible with the EPICv2 array. The final number of probes used in the downstream analyses was 556,811, and the final numbers of samples were 256 females and 238 males for GSE246337. For the GSE51032 cohort, the final number of probes was 382,717, for 651 females and 186 males.

Description of the SNITCH analysis pipeline

The SNITCH pipeline involved three main steps: Heuristic-based classification, Functional Principal Components Analysis (FPCA), and unsupervised clustering.

Heuristic-based classification

For each CpG site, methylation beta values are modeled as a function of chronological age. The core procedure included the following steps:

Model Construction: Linear models (LMs) are fitted using ordinary least squares regression with age as a continuous predictor (lm function from base R). In parallel, generalized additive models (GAMs) are fitted using restricted maximum likelihood (REML) and thin plate regression splines (s(Age, k = 5)), enabling the detection of non-linear trends (gam function, mgcv 84). When relevant, covariates are included consistently across both models to preserve interpretability and comparability.

Model Comparison and Heteroscedasticity Testing: The explanatory performance of GAMs relative to LMs is evaluated using the Bayesian Information Criterion (BIC) (BIC function from base R). CpGs with ΔBIC = BIC(LM) − BIC(GAM) > 2 are considered to favor the non-linear model. To characterize potential violations of homoscedasticity that may underlie complex aging dynamics (VMP), White's test is applied to each LM, with heteroscedastic CpGs flagged based on a 1% FDR-adjusted significance threshold.
Prediction and Effect Size Estimation: For CpGs favoring a GAM fit, DNAm values are predicted across a continuous age grid ranging from the minimum age of the cohort to the maximum with a one-year step increase, holding covariates constant at reference levels (medians for numeric or first level for categorical variables). This effectively smoothed the trajectories for subsequent steps. For linear trajectories, the direction and significance of age-associated change are inferred from the LM coefficient and p-value, respectively.

Multiple Testing and Classification: P-values from LM, GAM, and heteroscedasticity tests are corrected using the Benjamini-Hochberg method. CpGs are classified as:

LI (Linear Increase): Significant linear association (adj. p(LM) ≤ 0.01) with a positive slope.
LD (Linear Decrease): Significant linear association with a negative slope.
NL (Non-linear): ΔBIC > 2 and adj. p(GAM) ≤ 0.01.
VI (Variance-Increasing): No linear association (adj. p(LM) > 0.01) but significant heteroscedasticity (adj. p(White) ≤ 0.01).
NC (Non-correlated): CpGs not meeting any of the above criteria.

This classification strategy allows SNITCH to robustly detect a spectrum of epigenetic aging signatures, from canonical linear changes to more complex non-linear patterns. The SNITCH method is available and can be installed as a user-friendly R package (https://github.com/fishrscale/SNITCH), thus facilitating access to non-linear trajectory analyses.

Functional Principal Component Analysis (FPCA)

To further dissect heterogeneity within non-linear (NL) CpG methylation trajectories, we implemented a two-stage dimensionality reduction and clustering procedure based on functional principal component analysis (FPCA) followed by density-based unsupervised classification. This enabled the grouping of NL CpGs into discrete functional subclusters based on the shape of their smoothed age-associated methylation trajectories.
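The heuristic classification rules described above (Benjamini-Hochberg correction followed by the LI/LD/NL/VI/NC decision) can be sketched in a few lines. This is an illustrative Python sketch, not the SNITCH R implementation: the function names `benjamini_hochberg` and `classify_cpg` are hypothetical, and the order in which overlapping criteria are checked (NL before LI/LD) is an assumption, since the text does not state a precedence.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify_cpg(adj_p_lm, slope, adj_p_gam, delta_bic, adj_p_white,
                 alpha=0.01, bic_cutoff=2.0):
    """Assign one CpG to LI, LD, NL, VI or NC from its per-CpG statistics.

    delta_bic is BIC(LM) - BIC(GAM). Checking NL first is an assumption
    of this sketch; the source only lists the criteria per class.
    """
    if delta_bic > bic_cutoff and adj_p_gam <= alpha:
        return "NL"
    if adj_p_lm <= alpha:
        return "LI" if slope > 0 else "LD"
    if adj_p_white <= alpha:
        return "VI"
    return "NC"
```

For example, `classify_cpg(0.5, 0.0, 0.5, -3.0, 0.001)` describes a CpG with no linear or non-linear age association but significant heteroscedasticity, and is labelled VI under these rules.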
For all CpG sites previously classified as NL by SNITCH, we use the smoothed beta values predicted by the GAM. FPCA is then applied using the fpca.face() function from the refund R package, with age as the functional domain. The number of knots is set dynamically based on the number of time points (min(35, floor(0.8 * timepoints))), and the proportion of variance explained (PVE) threshold is conservatively fixed at 99.99% to retain fine-grained trajectory information. This process decomposes the complex, high-dimensional nonlinear patterns into a reduced set of orthogonal functional basis scores.

Unsupervised Clustering

The resulting FPCA scores are used as input for unsupervised clustering using either the HDBSCAN (dbscan), fuzzy clustering (Mfuzz), or k-means (base R) algorithm. This procedure allowed data-driven identification of methylation trajectory subtypes without requiring prior knowledge of cluster number or shape.

Simulated Data

A total of 3,000 synthetic CpG sites were simulated across 300 individuals, each assigned a random age between 1 and 100 years. Fifteen trajectory archetypes were implemented to represent a range of biologically plausible methylation patterns. These included: non-correlated, linear trajectories (increasing, decreasing), quadratic trends (increasing, decreasing), logarithmic transitions (increasing, decreasing), sigmoidal dynamics (increasing, decreasing) with inflection points at ages 25, 40, and 80, variance-increasing profiles, and non-monotonic patterns. For each of the 15 trajectory classes, 200 CpGs were simulated. For all the functions except the variance-increasing one, the age-specific methylation expectation (µ) was passed to a Beta distribution (via rbeta) to introduce variability while maintaining biological constraints (bounded between 0 and 1).
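As an illustration of the Beta-based simulation step described above, the sketch below draws methylation values whose expectation follows a linear trajectory in age. It is written in Python rather than R, and several details are assumptions not given in the text: the mean-precision parameterisation (alpha = µ × precision, beta = (1 − µ) × precision), the precision value of 50, and the start and end methylation levels. The helper names `beta_draw` and `simulate_linear_increase` are hypothetical.

```python
import random


def beta_draw(mu, precision, rng):
    """Draw one value in [0, 1] from a Beta distribution with mean mu.

    Mean-precision parameterisation (an assumption of this sketch):
    alpha = mu * precision, beta = (1 - mu) * precision.
    """
    alpha = mu * precision
    beta = (1.0 - mu) * precision
    return rng.betavariate(alpha, beta)


def simulate_linear_increase(ages, start=0.2, end=0.8, precision=50.0, seed=1):
    """Simulate one linearly increasing CpG across the given ages.

    The expectation rises linearly from `start` at age 1 to `end` at
    age 100 (illustrative values), then Beta noise is added per sample.
    """
    rng = random.Random(seed)
    youngest, oldest = 1, 100
    values = []
    for age in ages:
        mu = start + (end - start) * (age - youngest) / (oldest - youngest)
        values.append(beta_draw(mu, precision, rng))
    return values
```

Because the noise comes from a Beta distribution, every simulated value stays within the [0, 1] range of a methylation beta value without any clipping.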
For the variance-increasing function, age-dependent Gaussian noise was added directly to a mean of 0.5, with standard deviation increasing linearly from 0.01 (age < 25) to 0.3 (age = 100), mimicking stochastic methylation drift. Details are available in Supp. File.

Benchmarking SNITCH

The benchmarking was done using the simulated patterns and their associated labels as ground truth. Fuzzy c-means clustering was performed using the Mfuzz package. The fuzzification parameter m was estimated using the function mestimate, and the optimal number of clusters was set to 11 after determining the minimum centroid distance across 3 repeated runs. Final cluster assignments were defined by the highest membership value. K-means was run with 25 restarts and centers = 10, chosen by the elbow method based on the total within-cluster sum of squares (WCSS) with k.max = 20. HDBSCAN was applied using the dbscan package, with the minimum cluster size set to 5. The DICNAP pipeline was implemented as originally described 18, except for the maximal number of clusters for K-means, which was set to 20. SNITCH was run as previously described. The FPCA scores were subsequently used for unsupervised classification by the same three algorithms. The c parameter was set to 11 for fuzzy clustering, and centers = 10 for K-means. Each method's cluster assignments were compared to ground-truth labels to compute the Adjusted Rand Index (ARI) and Adjusted Mutual Information (AMI) using the functions of the same name from aricode. The results from the two best-performing methods were evaluated by a confusion matrix.

Identification of sex-specific CpG Methylation Trajectories Using SNITCH

To evaluate aging-associated methylation trajectories independently of immune cell composition, we constructed three models differing in their treatment of cellular heterogeneity. In the Baseline (BL) model, we ran SNITCH with default parameters and without correcting for covariates.
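For readers unfamiliar with the Adjusted Rand Index used in the benchmarking above, the following is a from-scratch Python sketch of the standard pair-counting formula. It is not the aricode implementation used in the study, and the function name `adjusted_rand_index` is chosen here for illustration only.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(truth, predicted):
    """Adjusted Rand Index between two labelings of the same items."""
    n = len(truth)
    if n != len(predicted) or n < 2:
        raise ValueError("need two labelings of equal length >= 2")
    # Contingency table cells and its row/column totals.
    cells = Counter(zip(truth, predicted))
    rows = Counter(truth)
    cols = Counter(predicted)
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:
        # Degenerate case (e.g. a single cluster in both labelings).
        return 1.0
    return (sum_cells - expected) / (max_index - expected)
```

The index equals 1 when the clustering reproduces the ground-truth partition up to a relabeling of clusters, is close to 0 for random assignments, and can be negative when agreement is worse than chance.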
In the 7-cell and 12-cell models, we accounted for cell type proportions estimated using the EpiDISH package with the centDHSbloodDMC.m and cent12CT.m reference matrix, respectively, using the Robust Partial Correlations (RPC) method 24 . The 7 cell-matrix contained information on B-, NK-, CD4T, and CD8T-cells, Monocytes, Neutrophils, and Eosinophils. Building on the 7-cell reference matrix, the 12-cell model discriminated between naive and mature B-, CD4T-, and CD8T-cells, and added T-regulatory cells and Basophils. To identify groups of CpGs with similar aging dynamics, we applied both Mfuzz and HDBSCAN to the SNITCH-classified non-linear trajectories. Clustering results were assessed visually, and we retained only the HDBSCAN clusters for downstream analyses due to their superior trajectory homogeneity. Similarly, the optimal minPts parameter for HDBSCAN was determined by visual inspection of cluster resolution. The selected minPts values for the female BL, 7-cell, and 12-cell models were 5, 5, and 7, respectively. For the male models, they were 5, 8, and 4. Notably, HDBSCAN designates sparse or noisy data points as cluster 0. Upon reviewing the trajectories assigned to this group in the female 12-cell model, we observed two distinct patterns within cluster NL0. We therefore re-ran HDBSCAN on this subset to refine the classification, resulting in three clusters: NL10, NL11, and NL12. Then, for each NL cluster, we performed principal component analysis (PCA) and extracted PC1 to represent the dominant methylation pattern. A correlation matrix was then computed across all cluster PC1 using Spearman’s rank correlation. Clusters with correlations exceeding 0.90 were merged to reduce redundancy and capture shared underlying dynamics. 
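The cluster-merging step described above (Spearman correlation between cluster PC1 vectors, merging above 0.90) can be sketched as follows. This Python sketch makes two assumptions that the text does not specify: the signed correlation is used rather than its absolute value, and merging is transitive (if A correlates with B and B with C, all three are merged). The function names are hypothetical, and constant PC1 vectors are not handled.

```python
def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            result[order[k]] = mean_rank
        start = end + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


def merge_clusters(pc1_by_cluster, threshold=0.90):
    """Group cluster names whose PC1 vectors correlate above the threshold."""
    names = list(pc1_by_cluster)
    parent = {name: name for name in names}

    def find(name):
        # Union-find lookup with path compression.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if spearman(pc1_by_cluster[a], pc1_by_cluster[b]) > threshold:
                parent[find(b)] = find(a)
    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())
```

Because the sign of a principal component is arbitrary, whether two clusters with strongly anti-correlated PC1 vectors should also be merged is a design decision; this sketch leaves them separate.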
Functional analysis

Classification of clocks and immune cells CpGs

CpGs were retrieved for the following clocks from the Biolearn resource 73: Horvath v1, Hannum, PhenoAge, GrimAgeV2, DunedinPACE, YingCausAge, YingDamAge, YingAdaptAge, and Zhang_10. Only unique CpG identifiers were retained. We counted the occurrence of each clock's CpGs in the different categories identified by SNITCH across our 3 models (BL, 7-cell, 12-cell) in both males and females. A similar analysis was performed with age-related CpGs (169 hypermethylated, 181 hypomethylated) shared across immune cells, retrieved from Roy et al. 33.

Chromatin state enrichment

Chromatin states profiled in peripheral blood mononuclear cells (PBMCs) were obtained from the Roadmap Epigenomics Project 34 and mapped to CpG probe IDs. Fisher's exact tests were conducted to evaluate the enrichment of each chromatin state across aging trajectory classes against NC CpGs in both sexes. P-values were adjusted using the Benjamini-Hochberg method.

Pathway enrichment analysis

Both the gsameth and gometh functions from the missMethyl 43 R package were used to perform Over Representation Analysis (ORA). These functions account for the differing numbers of CpG probes per gene on the Illumina EPIC array, thereby reducing bias inherent to standard enrichment tools when applied to DNAm data. The gsameth function allows testing a list of CpGs against a custom gene set. We retrieved 2 custom gene sets from the Molecular Signatures Database (MSigDB v2024.1/v2025.1): Human Phenotype Ontology 85 (HPO; c5.hpo.v2024.1.Hs.entrez.gmt) and Reactome Pathways 86 (c2.cp.reactome.v2025.1.Hs.entrez.gmt). The Gene Ontology 87 and KEGG 88 databases were tested with the gometh function. We used the set of CpG sites corresponding to each DNA methylation cluster, excluding the non-correlated "NC" sites, as the test set, and the full set of tested CpG sites across all clusters served as the background.
Analyses were conducted independently for male and female clusters. The analysis was replicated for the CpGs of each peak identified during the DEswan analysis.

Motif enrichment analysis

The motif enrichment tool from EWAS datahub 89 was used to perform motif enrichment analysis on each cluster identified in females and males. This tool uses a 500 bp window centered on the CpG of interest to perform Hypergeometric Optimization of Motif EnRichment (HOMER) 90. Motifs with a q-value < 0.05 were considered significant. The analysis was replicated for the CpGs of each peak identified during the DEswan analysis.

Trait association - Cancer

The cohort used to assess the sex-specific biomarker potential of cluster eigenvalues has been described elsewhere. Briefly, the EPIC-Italy 59 study includes DNA methylation data collected at baseline and linked to up to 14 years of prospective follow-up. Cancer cases were annotated with time to diagnosis and cancer type, enabling time-to-event analyses. We performed sex-stratified analyses using DNAm of patients preprocessed as described above. Individuals without a cancer event were treated as right-censored. For these, the time to diagnosis was imputed using the maximum observed follow-up time among cases. For each patient, we computed their NL cluster eigenvalues as described above. Importantly, this step was restricted to the CpGs common to both the EPIC and 450K arrays, resulting in smaller sets of CpGs per cluster (Supp. Table). Immune cell-type proportions were estimated from whole blood methylation profiles using the EpiDISH 24 algorithm (RPC method) with a 12-cell reference panel. The coxph function from the survival package was used to fit Cox proportional hazards models using "Surv(time_to_diagnosis, status)" as the outcome, with eigenvalues, age, and immune cell proportions as predictors. Logistic regression models (glm(family = binomial)) using cancer status as the binary outcome were also fit for comparison.
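The censoring rule described above, where individuals without a cancer event receive the maximum follow-up time observed among cases, can be made concrete with a short sketch. This Python example is illustrative only: the record layout (keys `id`, `cancer`, `time_to_diagnosis`) and the function name `build_survival_table` are hypothetical, and the resulting (time, status) pairs correspond to what would be passed to a survival model.

```python
def build_survival_table(records):
    """Return (id, time, status) tuples for a time-to-event analysis.

    records: list of dicts with hypothetical keys 'id', 'cancer' (bool)
    and 'time_to_diagnosis' (years; None for individuals without an event).
    Non-cases are right-censored (status 0) at the maximum time to
    diagnosis observed among cases, following the rule in the text.
    """
    case_times = [r["time_to_diagnosis"] for r in records if r["cancer"]]
    if not case_times:
        raise ValueError("at least one case is needed to set the censoring time")
    censor_time = max(case_times)
    table = []
    for r in records:
        if r["cancer"]:
            table.append((r["id"], r["time_to_diagnosis"], 1))
        else:
            table.append((r["id"], censor_time, 0))
    return table
```

One consequence of this rule is that every censored individual shares the same follow-up time, so the censored group contributes to the risk set at every event time.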
Model coefficients were exponentiated to yield hazard ratios (HR) and odds ratios (OR), with 95% confidence intervals. To assess survival differences, samples were stratified into tertiles based on the significant eigenvalues (Low, Mid, High). Kaplan–Meier curves were generated using the survfit function and plotted with ggsurvplot() (log-rank p-value and 95% CI shown) from the survminer package. Significance was considered as FDR < 0.05.

Trait association - Inflammation

Separately in males and females, we computed the "eigenvalue" of each cluster, corresponding to that cluster's first principal component (PC1), by performing principal component analysis (PCA) on centered and scaled methylation beta values across CpGs within that cluster. For each sample, we used the ComputeCRPscore 62 function, based on a predefined set of weighted CpGs, to generate an estimate of CRP protein levels as a surrogate for inflammation 91. To test the independent contribution of age-associated methylation modules to CRP variation, we constructed a series of nested linear models:

Model 1 included age as the only predictor: CRP ~ age
Model 2 included age and the eigenvalue of the NC (non-correlated) CpGs: CRP ~ age + NC
Model 3 (full model) included age, the NC eigenvalue, and eigenvalues of the VI, LI, LD, and NL clusters: CRP ~ age + NC + VI + LI + LD + NL_i + … + NL_n

We compared models using ANOVA to evaluate whether the addition of SNITCH-derived modules significantly improved the explanation of CRP variability beyond age and non-correlated methylation patterns. Model fit and the significance of individual predictors were assessed using standard linear modeling statistics, with p-values < 0.05 considered significant.

Wave of aging – DEswan analysis

We performed a DEswan analysis as previously described 1, 6. DEswan works as a sliding window: at each specified time point, taken as the center of the window, the samples on either side of the center are compared for each CpG using a Wilcoxon test for differential methylation.
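The sliding-window comparison described above can be illustrated with a minimal sketch for a single CpG. This is a Python illustration under stated assumptions, not the DEswan R code: the window geometry (`half_width` years on each side of the center, with the center itself assigned to the older group) is a hypothetical choice, covariates are ignored, and only the Mann-Whitney U statistic underlying the Wilcoxon rank-sum test is computed, without a p-value. All function names are invented for the example.

```python
def split_window(ages, center, half_width):
    """Indices of samples just below and just above a window center."""
    below = [i for i, a in enumerate(ages) if center - half_width <= a < center]
    above = [i for i, a in enumerate(ages) if center <= a <= center + half_width]
    return below, above


def mann_whitney_u(x, y):
    """U statistic: number of (x, y) pairs with x > y, ties counted as 0.5."""
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def window_statistics(ages, betas, centers, half_width):
    """Per-center probability that an older sample has a higher beta value.

    Returns {center: U / (n_above * n_below)}; 0.5 means no shift between
    the two sides of the window. Centers with an empty side are skipped.
    """
    results = {}
    for center in centers:
        below, above = split_window(ages, center, half_width)
        if not below or not above:
            continue
        u = mann_whitney_u([betas[i] for i in above], [betas[i] for i in below])
        results[center] = u / (len(above) * len(below))
    return results
```

Scanning a grid of centers and counting, at each one, how many CpGs pass a significance threshold is what produces the peaks, or waves, of dysregulation; in practice the p-value would come from a statistics library rather than from this bare statistic.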
We restricted analyses to the age-associated CpGs identified previously (excluding NC CpGs). We used windows centered from 31 to 78 years in 2-year steps with a 15‑year bucket size. To account for cell composition, we estimated per‑sample blood cell‑type proportions with EpiDISH using a 12‑cell reference (cent12CT) and included these estimates as covariates in all models (as described above). P-values were adjusted for multiple testing using the Benjamini-Hochberg method, and results were considered significant at FDR < 0.05. To ensure the robustness of the findings, we retained only peaks that persisted at the stricter thresholds of FDR < 0.01 and FDR < 0.001 (Fig. 4B). We adapted the initial R script to allow parallelization across CpGs using parLapply.

Abbreviations

DNA methylation – DNAm
Linear Increasing – LI
Linear Decreasing – LD
Variance Increasing – VI
Non-Correlated – NC
Non-Linear – NL
Adjusted Rand index – RI
Adjusted Mutual Information – AMI
Gene Ontology – GO
Kyoto Encyclopedia of Genes and Genomes – KEGG
Transcription Factor – TF
Transcription Factor Binding site – TFB
C-Reactive Protein – CRP

Declarations

Ethics approval and consent to participate
Not applicable

Consent for publication
Not applicable

Availability of data and materials
All data analysed during this study are publicly available in the GEO repository under accession numbers GSE246337 and GSE51032. The SNITCH method and scripts used in this study are accessible at: https://github.com/fishrscale/

Competing interests
The Regents of the University of California are the sole owners of patents and patent applications directed at epigenetic biomarkers for which Steve Horvath is a named inventor; SH is a founder and paid consultant of the non-profit Epigenetic Clock Development Foundation that licenses these patents. SH is a Principal Investigator at Altos Labs, Cambridge Institute of Science. The other authors declare no conflict of interest.
Funding
Not applicable

Authors' contributions
RG performed data analysis and interpretation. RG, AT, and NE conceived and designed the study. AT provided methodological and computational modelling expertise. AT and SH provided guidance on DNA methylation analysis and interpretation of the results. RG drafted the manuscript. MJ and BJF contributed to ideas and revisions on the manuscript. All authors approved the final version of the manuscript.

Acknowledgements
Not applicable

References

1. Shen X, et al. Nonlinear dynamics of multi-omics profiles during human aging. Nat Aging. 2024;4:1619–34.
2. Ye Q, et al. Telomere length and chronological age across the human lifespan: A systematic review and meta-analysis of 414 study samples including 743,019 individuals. Ageing Res Rev. 2023;90:102031.
3. Schaum N, et al. Ageing hallmarks exhibit organ-specific temporal signatures. Nature. 2020;583:596–602.
4. Kang Y-K, Min B, Eom J, Park JS. Different phases of aging in mouse old skeletal muscle. Aging. 14:143–60.
5. Fehlmann T, et al. Common diseases alter the physiological age-related blood microRNA profile. Nat Commun. 2020;11:5958.
6. Lehallier B, et al. Undulating changes in human plasma proteome profiles across the lifespan. Nat Med. 2019;25:1843–50.
7. Aramillo Irizar P, et al. Transcriptomic alterations during ageing reflect the shift from cancer to degenerative diseases in the elderly. Nat Commun. 2018;9:327.
8. Horvath S, Raj K. DNA methylation-based biomarkers and the epigenetic clock theory of ageing. Nat Rev Genet. 2018;19:371–84.
9. Seale K, Horvath S, Teschendorff A, Eynon N, Voisin S. Making sense of the ageing methylome. Nat Rev Genet. 2022;23:585–605.
10. Lu AT, et al. Universal DNA methylation age across mammalian tissues. Nat Aging. 2023;3:1144–66.
11. Horvath S. DNA methylation age of human tissues and cell types. Genome Biol. 2013;14:3156.
12. Johnson ND, et al. Non-linear patterns in age-related DNA methylation may reflect CD4+ T cell differentiation. Epigenetics. 2017;12:492–503.
13. Vershinina O, Bacalini MG, Zaikin A, Franceschi C, Ivanchenko M. Disentangling age-dependent DNA methylation: deterministic, stochastic, and nonlinear. Sci Rep. 2021;11:9201.
14. Olecka M, et al. Nonlinear DNA methylation trajectories in aging male mice. Nat Commun. 2024;15:3074.
15. Luo Q, et al. A meta-analysis of immune-cell fractions at high resolution reveals novel associations with common phenotypes and health outcomes. Genome Med. 2023;15:59.
16. Slieker RC, et al. Age-related accrual of methylomic variability is linked to fundamental ageing mechanisms. Genome Biol. 2016;17:191.
17. Seale K, Teschendorff A, Reiner AP, Voisin S, Eynon N. A comprehensive map of the aging blood methylome in humans. Genome Biol. 2024;25:240.
18. Okada D, Cheng JH, Zheng C, Kumaki T, Yamada R. Data-driven identification and classification of nonlinear aging patterns reveals the landscape of associations between DNA methylation and aging. Hum Genomics. 2023;17:8.
19. Phyo AZZ, et al. Sex differences in biological aging and the association with clinical measures in older adults. Geroscience. 2024;46:1775–88.
20. Sampathkumar NK, et al. Widespread sex dimorphism in aging and age-related diseases. Hum Genet. 2020;139:333–56.
21. Reicher L, et al. Phenome-wide associations of human aging uncover sex-specific dynamics. Nat Aging. 2024;4:1643–55.
22. Yusipov I, et al. Age-related DNA methylation changes are sex-specific: a comprehensive assessment. Aging. 12:24057–80.
23. Qi L, Teschendorff AE. Cell-type heterogeneity: Why we should adjust for it in epigenome and biomarker studies. Clin Epigenetics. 2022;14:31.
24. Teschendorff AE, Breeze CE, Zheng SC, Beck S. A comparison of reference-based algorithms for correcting cell-type heterogeneity in Epigenome-Wide Association Studies. BMC Bioinformatics. 2017;18:105.
25. Teschendorff AE, Horvath S. Epigenetic ageing clocks: statistical methods and emerging computational challenges. Nat Rev Genet. 2025. doi:10.1038/s41576-024-00807-w.
26. Belsky DW, et al. DunedinPACE, a DNA methylation biomarker of the pace of aging. Elife. 2022;11.
27. Hannum G, et al. Genome-wide Methylation Profiles Reveal Quantitative Views of Human Aging Rates. Mol Cell. 2013;49:359–67.
28. Levine ME, et al. An epigenetic biomarker of aging for lifespan and healthspan. Aging. 10:573–91.
29. Zhang Y, et al. DNA methylation signatures in peripheral blood strongly predict all-cause mortality. Nat Commun. 2017;8:14617.
30. Lu AT, et al. DNA methylation GrimAge version 2. Aging. 14:9484–9549.
31. Ying K, et al. Causality-enriched epigenetic age uncouples damage and adaptation. Nat Aging. 2024;4:231–46.
32. Lu AT, et al. DNA methylation GrimAge strongly predicts lifespan and healthspan. Aging. 11:303–27.
33. Roy R, et al. Epigenetic signature of human immune aging in the GESTALT study. Elife. 2023;12:e86136.
34. Bernstein BE, et al. The NIH Roadmap Epigenomics Mapping Consortium. Nat Biotechnol. 2010;28:1045–8.
35. Moqri M, et al. PRC2-AgeIndex as a universal biomarker of aging and rejuvenation. Nat Commun. 2024;15:5956.
36. Teschendorff AE, et al. Age-dependent DNA methylation of genes that are suppressed in stem cells is a hallmark of cancer. Genome Res. 2010;20:440–6.
37. Rakyan VK, et al. Human aging-associated DNA hypermethylation occurs preferentially at bivalent chromatin domains. Genome Res. 2010;20:434–9.
38. Slieker RC, Relton CL, Gaunt TR, Slagboom PE, Heijmans BT. Age-related DNA methylation changes are tissue-specific with ELOVL2 promoter methylation as exception. Epigenetics Chromatin. 2018;11:25.
39. Jain N, et al. DNA methylation correlates of chronological age in diverse human tissue types. Epigenetics Chromatin. 2024;17:25.
40. Bartz J, Jung H, Wasiluk K, Zhang L, Dong X. Progress in Discovering Transcriptional Noise in Aging. Int J Mol Sci. 2023;24.
41. Zhao K, Rhee SY. Interpreting omics data with pathway enrichment analysis. Trends Genet. 2023;39:308–19.
42. Maksimovic J, Oshlack A, Phipson B. Gene set enrichment analysis for genome-wide DNA methylation data. Genome Biol. 2021;22:173.
43. Phipson B, Maksimovic J, Oshlack A. missMethyl: an R package for analyzing data from Illumina’s HumanMethylation450 platform. Bioinformatics. 2016;32:286–8.
44. Rimoldi M, et al. DNA methylation patterns of transcription factor binding regions characterize their functional and evolutionary contexts. Genome Biol. 2024;25:146.
45. Yin Y, et al. Impact of cytosine methylation on DNA binding specificities of human transcription factors. Science. 2017;356:eaaj2239.
46. Malaymar Pinar D, et al. Nuclear Factor I Family Members are Key Transcription Factors Regulating Gene Expression. Mol Cell Proteom. 2025;24:100890.
47. Chen K-S, Lim JWC, Richards LJ, Bunt J. The convergent roles of the nuclear factor I transcription factors in development and cancer. Cancer Lett. 2017;410:124–38.
48. Mason S, Piper M, Gronostajski RM, Richards LJ. Nuclear Factor One Transcription Factors in CNS Development. Mol Neurobiol. 2009;39:10–23.
49. Liu Y, et al. PD-L1-mediated immune evasion in triple-negative breast cancer is linked to the loss of ZNF652. Cell Rep. 2023;42.
50. Kumar R, et al. ZNF652, A Novel Zinc Finger Protein, Interacts with the Putative Breast Tumor Suppressor CBFA2T3 to Repress Transcription. Mol Cancer Res. 2006;4:655–65.
51. Tiyaboonchai A, et al. GATA6 Plays an Important Role in the Induction of Human Definitive Endoderm, Development of the Pancreas, and Functionality of Pancreatic β Cells. Stem Cell Reports. 2017;8:589–604.
52. Wang X, et al. HOXC9 directly regulates distinct sets of genes to coordinate diverse cellular processes during neuronal differentiation. BMC Genomics. 2013;14:830.
53. Hur H, et al. HOXC9 Induces Phenotypic Switching between Proliferation and Invasion in Breast Cancer Cells. J Cancer. 2016;7:768–73.
54. Ghazimoradi MH, Babashah S. The transcriptional regulators GATA6 and TET1 regulate the TGF-β pathway in cancer-associated fibroblasts to promote breast cancer progression. Cell Death Discov. 2025;11:164.
55. Giuili E, et al. Comprehensive evaluation of the implementation of episignatures for diagnosis of neurodevelopmental disorders (NDDs). Hum Genet. 2023;142:1721–35.
56. Grolaux R, et al. Identification of differentially methylated regions in rare diseases from a single-patient perspective. Clin Epigenetics. 2022;14:174.
57. Draškovič T, Hauptman N. Discovery of novel DNA methylation biomarker panels for the diagnosis and differentiation between common adenocarcinomas and their liver metastases. Sci Rep. 2024;14:3095.
58. Wang T, et al. A multiplex blood-based assay targeting DNA methylation in PBMCs enables early detection of breast cancer. Nat Commun. 2023;14:4724.
59. Riboli E. The European Prospective Investigation into Cancer and Nutrition (EPIC): Plans and Progress. J Nutr. 2001;131:S170–5.
60. López-Otín C, Blasco MA, Partridge L, Serrano M, Kroemer G. Hallmarks of aging: An expanding universe. Cell. 2023;186:243–78.
61. Campisi J, et al. From discoveries in ageing research to therapeutics for healthy ageing. Nature. 2019;571:183–92.
62. Guo X, Teschendorff AE. Epigenetic clocks and inflammaging: pitfalls caused by ignoring cell-type heterogeneity. Geroscience. 2025;47:2707–19.
63. Li J, et al. Determining a multimodal aging clock in a cohort of Chinese women. Med. 2023;4:825–848.e13.
64. Reynolds LM, et al. Age-related variations in the methylome associated with gene expression in human monocytes and T cells. Nat Commun. 2014;5:5366.
65. Wang Y, et al. Epigenetic influences on aging: a longitudinal genome-wide methylation study in old Swedish twins. Epigenetics. 2018;13:975–87.
66. Yuan T, et al. An Integrative Multi-scale Analysis of the Dynamic DNA Methylation Landscape in Aging. PLoS Genet. 2015;11:e1004996.
67. Lu T, et al. REST and stress resistance in ageing and Alzheimer’s disease. Nature. 2014;507:448–54.
68. Welsh H, et al. Age-related changes in DNA methylation in a sample of elderly Brazilians. Clin Epigenetics. 2025;17:17.
69. Ballas N, Grunseich C, Lu DD, Speh JC, Mandel G. REST and Its Corepressors Mediate Plasticity of Neuronal Gene Chromatin throughout Neurogenesis. Cell. 2005;121:645–57.
70. dos Santos GA, Chatsirisupachai K, Avelar RA, de Magalhães JP. Transcriptomic analysis reveals a tissue-specific loss of identity during ageing and cancer. BMC Genomics. 2023;24:644.
71. Martinez-Jimenez CP, et al. Aging increases cell-to-cell transcriptional variability upon immune stimulation. Science. 2017;355:1433–6.
72. Mogilenko DA, Shchukina I, Artyomov MN. Immune ageing at single-cell resolution. Nat Rev Immunol. 2022;22:484–98.
73. Moqri M, et al. Integrative epigenetics and transcriptomics identify aging genes in human blood. bioRxiv. 2024. doi:10.1101/2024.05.30.596713.
74. Awasthi N, Liongue C, Ward AC. STAT proteins: a kaleidoscope of canonical and non-canonical functions in immunity and cancer. J Hematol Oncol. 2021;14:198.
75. Yang L, et al. Methylation of a CGATA element inhibits binding and regulation by GATA-1. Nat Commun. 2020;11:2560.
76. Perumal N, et al. Nuclear factor I/B: Duality in action in cancer pathophysiology. Cancer Lett. 2025;609:217349.
77. Ma H-Y, et al. NFIX suppresses breast cancer cell proliferation by delaying mitosis through downregulation of CDK1 expression. Cell Death Discov. 2025;11:77.
78. Han L, et al. Changes in DNA methylation from pre- to post-adolescence are associated with pubertal exposures. Clin Epigenetics. 2019;11:176.
79. Martino DJ, et al. Evidence for age-related and individual-specific changes in DNA methylation profile of mononuclear cells during early immune development in humans. Epigenetics. 2011;6:1085–94.
80. Wikenius E, Moe V, Smith L, Heiervang ER, Berglund A. DNA methylation changes in infants between 6 and 52 weeks. Sci Rep. 2019;9:17587.
81. Jacques M, et al. DNA Methylation Ageing Atlas Across 17 Human Tissues. bioRxiv. 2025. doi:10.1101/2025.07.21.665830.
82. Zhou W, Triche TJ Jr, Laird PW, Shen H. SeSAMe: reducing artifactual detection of DNA methylation by Infinium BeadChips in genomic deletions. Nucleic Acids Res. 2018;46:e123.
83. Zhou W, Laird PW, Shen H. Comprehensive characterization, annotation and innovative use of Infinium DNA methylation BeadChip probes. Nucleic Acids Res. 2017;45:e22.
84. Wood SN. Generalized Additive Models: An Introduction with R. 2017.
85. Liberzon A, et al. The Molecular Signatures Database Hallmark Gene Set Collection. Cell Syst. 2015;1:417–25.
86. Milacic M, et al. The Reactome Pathway Knowledgebase 2024. Nucleic Acids Res. 2024;52:D672–8.
87. Consortium TGO, et al. The Gene Ontology knowledgebase in 2023. Genetics. 2023;224:iyad031.
88. Kanehisa M, Goto S. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 2000;28:27–30.
89. Xiong Z, et al. EWAS Open Platform: integrated data, knowledge and toolkit for epigenome-wide association study. Nucleic Acids Res. 2022;50:D1004–9.
90. Heinz S, et al. Simple Combinations of Lineage-Determining Transcription Factors Prime cis-Regulatory Elements Required for Macrophage and B Cell Identities. Mol Cell. 2010;38:576–89.
91. Wielscher M, et al. DNA methylation signature of chronic low-grade inflammation and its role in cardio-respiratory diseases. Nat Commun. 2022;13:2408.

Additional Declarations
No competing interests reported.
Supplementary Files

SuppTable1.xlsx
SuppTable2.csv
SuppTable3.xlsx
SuppTable4.xlsx
SuppTable5.xlsx
SuppFig.pdf

Editorial decision: Revision requested, 13 Oct, 2025
Reviews received at journal, 06 Oct, 2025
Reviews received at journal, 24 Sep, 2025
Reviewers agreed at journal, 17 Sep, 2025
Reviewers agreed at journal, 17 Sep, 2025
Reviewers invited by journal, 04 Sep, 2025
Editor assigned by journal, 04 Sep, 2025
Submission checks completed at journal, 03 Sep, 2025
First submitted to journal, 02 Sep, 2025
Authors and affiliations: Robin Grolaux (Monash University); Macsue Jacques (Monash University); Bernadette Jones-Freeman (Monash University); Steve Horvath (Altos Labs); Andrew Teschendorff (Shanghai Institute of Nutrition and Health); Nir Eynon (Monash University, corresponding author).

Published version (Genome Biology, 04 Feb, 2026): https://doi.org/10.1186/s13059-026-03952-z

Figure 1. A, SNITCH pipeline. B, Distribution of the simulated aging patterns. C, Confusion matrix between the ground truth and predicted clusters. D, Benchmark of SNITCH compared to stand-alone unsupervised clustering methods and DICNAP. GAM = Generalized Additive Model; lm = Linear Model; BIC = Bayesian Information Criterion; FDR = False Discovery Rate; LI = Linear Increasing; LD = Linear Decreasing; VI = Variance Increasing; NC = Non-Correlated; NL = Non-Linear.

Figure 2. A, Distribution of the number of CpGs among clusters in females and males. B, Conserved CpGs between male and female clusters. C, Non-linear clusters identified in females (top) and males (bottom). Beta values were centered and scaled prior to FPCA and unsupervised clustering and are used to better illustrate the patterns. D, Chi-square test for the conservation of CpGs among female and male clusters. LI = Linear Increasing; LD = Linear Decreasing; VI = Variance Increasing; NC = Non-Correlated; NL = Non-Linear. Note: In A and B, CpGs that were NC across all 3 models were removed for visualization purposes.

Figure 3. A, Enrichment of CpG labels across 9 epigenetic clocks. B, Chromatin state enrichment analysis across age-related clusters identified in females and males. C, Kaplan–Meier survival curves stratified by NL3 cluster tertiles in females. Participants with high NL3 scores had a significantly shorter cancer-free survival period compared to those in the low and mid tertiles (log-rank p = 0.003). The dashed lines represent the median time before diagnosis. Shaded regions indicate 95% confidence intervals. D, Top: Regression model of the female NL3 cluster eigenvalue against estimated CRP levels. Bottom: Coefficient from a linear regression model predicting estimated CRP levels with each cluster's eigenvalues in males and females. Abbreviations in B correspond to the classes from the Roadmap Epigenomics project 34. CRP: C-Reactive Protein.

Figure 4. A, DEswan analysis in males and females. Each point represents the number of CpGs dysregulated between the 15-year windows on each side of the specified age at FDR < 0.05. B, DEswan analysis at different FDR thresholds. C, The overlapping CpGs between waves of dysregulation and NL clusters in females (left), males (center), and between male and female waves (right). D, Motif enrichment analysis among common CpGs identified across waves of dysregulation in females (left), males (center), and across sex (right).
We benchmarked the classification accuracy of SNITCH, in combination with clustering of non-linear patterns, against three stand-alone unsupervised clustering algorithms and the functional trajectory-based DICNAP method\u003csup\u003e\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e\u003c/sup\u003e. Performance was evaluated with the Adjusted Rand Index (ARI) and Adjusted Mutual Information (AMI) on simulated methylation data (i.e., bounded between 0 and 1) drawn from a highly varied pool of distributions with known ground-truth labels (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eB, Methods). We found that SNITCH outperformed stand-alone unsupervised clustering algorithms and DICNAP (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eD \u0026amp; Supp. Figure\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eA). The best results were achieved by using SNITCH\u0026thinsp;+\u0026thinsp;Fuzzy/HDBSCAN (ARI: 0.97; AMI: 0.98). In this setting, we observed a robust concordance between predicted and ground-truth labels, with the main misclassifications occurring between the logarithmic decreasing and linear decreasing groups (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eC \u0026amp; Supp. Figure\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eB).\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003c/div\u003e\n\u003ch3\u003eSex-specific Non-Linear aging patterns in blood\u003c/h3\u003e\n\u003cp\u003eWe then applied SNITCH to a high-quality dataset in blood (GSE246337) containing 238 males and 256 females. 
Blood DNAm is highly influenced by immune cell heterogeneity, and adjustment for cell type composition is essential for the identification of epigenetic modifications independent of the immune profile\u003csup\u003e\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e\u003c/sup\u003e. Immune cell fractions of whole blood can be effectively estimated using DNAm-based deconvolution methods\u003csup\u003e\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e\u003c/sup\u003e. Thus, in both females and males, we built three different models of aging trajectories accounting for an increasing number of immune cell types. Our baseline model did not include any cell fractions. The 7-cell model was corrected for B-, NK-, CD4T-, and CD8T-cells, Monocytes, Neutrophils, and Eosinophils. Finally, the 12-cell model discriminated between naive and mature B-, CD4T-, and CD8T-cells, and added T-regulatory cells and Basophils. This rigorous approach allowed us to identify DNAm aging patterns occurring independently from changes in immune cell fractions.\u003c/p\u003e\u003cp\u003eOverall, CpG classification was highly stable across models in both sexes: in females, 95.9% of CpGs (N\u0026thinsp;=\u0026thinsp;534,132) retained identical cluster assignments across all three models, and the percentage was nearly identical in males (95.6%). This stability was mainly driven by CpGs classified as Non-Correlated (NC) with age (94% in both males and females). Among the remaining CpGs that changed classification, transitions were most frequently directed toward the NC cluster upon inclusion of immune covariates in both females and males (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eA, Supp. Table\u0026nbsp;1), highlighting how linear and non-linear trajectories are both confounded by changes in the immune profile. The following analyses describe the results from the 12-cell model unless otherwise stated. 
In both males and females, the majority of CpGs showed no correlation with age (N_fem\u0026thinsp;=\u0026thinsp;543,972 (97.7%); N_m\u0026thinsp;=\u0026thinsp;548,935 (98.6%)). In contrast, the number of Non-Linear (NL) CpGs differed markedly between females (N\u0026thinsp;=\u0026thinsp;1305) and males (N\u0026thinsp;=\u0026thinsp;155) (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eB, Supp. Figure\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eC). These results were surprising in light of a recent meta-analysis showing that almost half of the blood CpGs showed differential methylation with age\u003csup\u003e\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e\u003c/sup\u003e. To test whether our analysis was underpowered, we combined male and female participants, effectively doubling the size of the cohort, and included sex as a covariate in the model. Consistent with our sex-specific analysis, we found that 93% of the CpGs (N\u0026thinsp;=\u0026thinsp;516,836) stayed non-correlated with age, suggesting that sample size plays a negligible part in our results. The contrast with the meta-analysis may be explained by our strict QC and significance thresholds (Methods), the high quality of our dataset, and the correction for 12 immune cell fractions against the 5 included in the meta-analysis.\u003c/p\u003e\u003cp\u003eTo evaluate the concordance of CpG aging trajectory classifications between sexes, we compared SNITCH-assigned trajectory labels in males and females using a Chi-square test of independence. The analysis revealed a highly significant association (χ\u0026sup2; = 403,770, df\u0026thinsp;=\u0026thinsp;16, p-value\u0026thinsp;\u0026lt;\u0026thinsp;2.2 \u0026times; 10⁻\u0026sup1;⁶), indicating that CpGs were classified into the same trajectory category in both sexes more frequently than expected by chance. 
Out of the total CpGs assessed, only 9,938 CpGs (~\u0026thinsp;1.78%) changed trajectory class between males and females, confirming a high degree of overall consistency. This was further supported by the standardized residuals (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eD), which showed strong positive values along the diagonal of the contingency matrix, reflecting substantial overlap in classification across sexes, including in the NL\u0026ndash;NL cell, indicating an enrichment of the 39 CpGs classified as non-linear in both sexes despite the discrepancy in the total number of NL CpGs identified. Conversely, the most pronounced negative residuals were observed in off-diagonal cells where CpGs were classified as age-associated (LD or LI) in one sex but NC in the other. This pattern suggests a subset of CpGs with potential sex-specific sensitivity to age-related methylation changes. In addition, NL CpGs showed moderate positive residuals when aligned with LD (+\u0026thinsp;140.3) and LI (+\u0026thinsp;36.0) in the opposite sex, suggesting that a subset of CpGs classified as non-linear in one sex may appear more linear in the other. Finally, the NL\u0026ndash;NC cells exhibited a residual of \u0026minus;\u0026thinsp;138.4 in females and \u0026minus;\u0026thinsp;61.2 in males, indicating that CpGs classified as non-linear in one sex were rarely non-correlated in the other, further supporting their functional relevance. To identify clusters of CpGs sharing similar aging trajectories, we applied the last step of our pipeline by performing unsupervised clustering on the functional components of the NL CpGs (Methods). This revealed similar-shaped trajectories in males and females, with 4 primary clusters identified in females and 5 in males (Supp. Figure\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e). 
To avoid redundancy between similar clusters, we merged clusters with a Spearman correlation coefficient\u0026thinsp;\u0026gt;\u0026thinsp;0.9 (Supp. Figure\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e). This resulted in a final number of 4 principal NL patterns in both females and males (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eC). Within those, we observed different inflection points, marking a change of pace in the methylation trajectory. In females, clusters 3, 11, and 12 showed an elbow between 70 and 80 years, whereas cluster 2 showed it earlier, at around 50 years. In males, the inflection points appeared around 60 years for clusters 1 and 3 and 50 years for clusters 2 and 4. This analysis highlighted the similar aging patterns seen in males and females, but hinted at different timing for the inflection points.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\n\u003ch3\u003eFunctional analysis of Age-related trajectories\u003c/h3\u003e\n\u003cdiv id=\"Sec6\" class=\"Section2\"\u003e\u003ch2\u003eTrajectories of clocks\u0026rsquo; CpGs\u003c/h2\u003e\u003cp\u003eTo understand the functional role of the clusters we identified, we first investigated the classification of CpGs previously used to train epigenetic clocks. A ubiquitous tool in the field of aging, epigenetic clocks are biomarkers that show relevance in assessing the onset of several age-related conditions as well as the utility of therapeutic strategies. Most epigenetic clocks are machine-learning models based on linear regression\u003csup\u003e\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e\u003c/sup\u003e (e.g., elastic-net). By construction, this limits the granularity of the aging trajectories they capture by either overlooking non-linear patterns or over-simplifying them as linear, thus limiting their biological interpretability. 
Further complicating their interpretation, blood-based epigenetic clocks often ignore cell-type heterogeneity when considering age-related DNAm changes. Within our two cohorts, we looked at the classification labels of the CpGs underlying 9 of the most common clocks\u003csup\u003e\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e,\u003cspan additionalcitationids=\"CR27 CR28 CR29 CR30\" citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e\u003c/sup\u003e across our 3 models (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eA). As expected, we found that the proportion of CpGs classified as NC increased with each model iteration, reflecting that part of the signal captured by these clocks arises from age-driven changes in immune-cell fractions\u003csup\u003e\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e,\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e\u003c/sup\u003e. In addition, our results highlighted that the majority of the clocks\u0026rsquo; CpGs remained classified as NC in our baseline model. Notably, the Hannum clock was the only clock that showed a higher proportion of CpGs associated with age (LI \u0026amp; LD) compared to NC across all three models. 
This finding is consistent with the inherent design of each clock: the 2nd-generation clocks (PhenoAge, GrimAge, DunedinPACE, Zhang_10, and YingAge) were trained on phenotypic age or mortality risk\u003csup\u003e\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e,\u003cspan additionalcitationids=\"CR29 CR30 CR31\" citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e\u003c/sup\u003e, overlooking chronological CpGs, and Horvath\u0026rsquo;s clock was trained across multiple tissues\u003csup\u003e\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e\u003c/sup\u003e, whereas Hannum\u0026rsquo;s clock was built to estimate chronological age in blood\u003csup\u003e\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e\u003c/sup\u003e. Finally, we observed that most of the clocks captured NL and VI CpGs in males and females. To complement this analysis, we examined a published set of 350 age-associated CpGs shared across immune cell types\u003csup\u003e\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e\u003c/sup\u003e. Applying the same approach to these loci yielded classifications of hyper- and hypo-methylation that aligned with those reported in the original study, reinforcing the validity of our modeling strategy, which incorporates immune cell composition (Supp. Figure\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eA). 
Moreover, our results revealed that a subset of these CpGs exhibited non-linear associations with age.\u003c/p\u003e\u003c/div\u003e\n\u003ch3\u003eFunctional analyses of sex-specific aging patterns\u003c/h3\u003e\n\u003cp\u003eAfter successfully identifying sex-specific aging methylation patterns, we performed separate functional analyses in males and females for all age-associated clusters.\u003c/p\u003e\u003cdiv id=\"Sec8\" class=\"Section2\"\u003e\u003ch2\u003eChromatin enrichment analysis\u003c/h2\u003e\u003cp\u003eWe performed chromatin state enrichment analysis in males and females to determine the epigenetic context of age-associated methylation clusters. Each cluster was tested for enrichment across 15 chromatin states from the Roadmap Epigenomics Project\u003csup\u003e\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e\u003c/sup\u003e using the NC cluster as a reference (Methods). Overall enrichment results were highly similar between males and females in the LD, LI, and VI clusters (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eB, Supp. Table\u0026nbsp;2). Those similarities are consistent with the overlap of CpGs observed in those clusters between males and females (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eC \u0026amp; \u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eD). Among the most striking trends was an enrichment of the Repressed polycomb and bivalent chromatin states in clusters characterized by increased methylation or age-related variance (LI, VI, NL11, NL12 in females \u0026amp; LI, VI, NL0, NL1 in males). 
This observation is consistent with previous findings that age-related hypermethylation occurs preferentially in those domains\u003csup\u003e\u003cspan additionalcitationids=\"CR36\" citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e\u003c/sup\u003e. Overall, the enrichment of chromatin states at NL clusters seemed mostly driven by the directionality of the trajectories (gain vs. loss of methylation) rather than by the individual patterns, as they were concordant with the LI or LD clusters, respectively. A noticeable deviation from this was the significant enrichment of \u0026ldquo;weak transcription\u0026rdquo; states in NL3 in females (hypomethylated with age) compared to its depletion in LD (NL3 log₂ OR\u0026thinsp;=\u0026thinsp;0.7, LD log₂ OR = -0.14). This is supported by previous studies highlighting that hypomethylated sites are enriched in active or transcribed genomic regions, including the weak transcription chromatin state\u003csup\u003e\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e,\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e\u003c/sup\u003e. In particular, this suggests a breaking point around 60 years old in females, where erosion of DNA methylation in weakly transcribed or formerly silent chromatin regions could lead to leaky transcription, supporting the theory of increased transcriptional noise in aging\u003csup\u003e\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003c/div\u003e\n\u003ch3\u003ePathway enrichment analysis\u003c/h3\u003e\n\u003cp\u003eNext, we performed pathway enrichment analysis for our non-linear clusters using different databases (the Gene Ontology (GO) Molecular Function (MF) and Biological Process (BP), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Reactome). 
Enrichment analyses of DNA methylation datasets are inherently biased, both toward long genes and because some CpGs map to multiple genes\u003csup\u003e\u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e,\u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e42\u003c/span\u003e\u003c/sup\u003e. To account for both, we used \u003cem\u003emissMethyl\u003c/em\u003e, a method that addresses these biases by leveraging prior probabilities\u003csup\u003e\u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e42\u003c/span\u003e,\u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e43\u003c/span\u003e\u003c/sup\u003e. Across all NL clusters, only NL12 in females was enriched for the term \u0026ldquo;Neuroactive ligand signaling\u0026rdquo; in KEGG (FDR\u0026thinsp;=\u0026thinsp;0.023); no enrichment was observed for NL clusters in males (Supp. Table\u0026nbsp;3).\u003c/p\u003e\n\u003ch3\u003eMotif enrichment analysis\u003c/h3\u003e\n\u003cp\u003eIn addition to the local chromatin context, the underlying DNA sequence also contains biologically relevant information that can help understand the function of a particular set of CpGs. Although the mechanisms have not been fully elucidated yet, it is widely accepted that the methylation context at a specific motif can positively or negatively impact the binding affinity of transcription factors (TFs)\u003csup\u003e\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e,\u003cspan citationid=\"CR45\" class=\"CitationRef\"\u003e45\u003c/span\u003e\u003c/sup\u003e. Focusing on the NL CpGs, we performed cluster-wise motif enrichment analysis to identify the presence of TF binding sites (TFBs) in their vicinity (Methods). In females, only NL12 and NL3 showed significantly enriched motifs (Supp. Figure\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eB, Supp. Table\u0026nbsp;3). 
NL12 was enriched for the ZNF652 binding site (qval\u0026thinsp;=\u0026thinsp;0.0249), whereas NL3 was enriched for both the Nuclear Factor 1 (NF1/CTF) half-site (qval\u0026thinsp;=\u0026thinsp;1 \u0026times; 10\u003csup\u003e\u0026minus;\u0026thinsp;5\u003c/sup\u003e) and full motif (qval\u0026thinsp;=\u0026thinsp;0.0026), Hoxc9 (qval\u0026thinsp;=\u0026thinsp;0.0411) and Gata6 (qval\u0026thinsp;=\u0026thinsp;0.0476). In males, only NL1 and NL4 were enriched in TFBs (Supp. Figure\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eC, Supp. Table\u0026nbsp;4). Similar to NL3 in females, the NL4 cluster in males was enriched for the NF1 half-site (qval\u0026thinsp;=\u0026thinsp;0.0063) and full site (qval\u0026thinsp;=\u0026thinsp;1 \u0026times; 10\u003csup\u003e\u0026minus;\u0026thinsp;4\u003c/sup\u003e) along with REST/NRSF (qval\u0026thinsp;=\u0026thinsp;1 \u0026times; 10\u003csup\u003e\u0026minus;\u0026thinsp;5\u003c/sup\u003e). The NF1-CTF family of transcription factors has been shown to regulate cell development in the central nervous system, is associated with cancer, and plays a key role in the regulation of transcription\u003csup\u003e\u003cspan additionalcitationids=\"CR47\" citationid=\"CR46\" class=\"CitationRef\"\u003e46\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e48\u003c/span\u003e\u003c/sup\u003e. 
Similarly, ZNF652 acts as a potent tumor suppressor in breast cancer by repressing the transcription of oncogenes\u003csup\u003e\u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e49\u003c/span\u003e,\u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e50\u003c/span\u003e\u003c/sup\u003e, while both Gata6 and Hoxc9 are involved in embryogenesis and have oncogenic properties\u003csup\u003e\u003cspan additionalcitationids=\"CR52 CR53\" citationid=\"CR51\" class=\"CitationRef\"\u003e51\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR54\" class=\"CitationRef\"\u003e54\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003cdiv id=\"Sec11\" class=\"Section2\"\u003e\u003ch2\u003eNonlinear clusters as biomarkers of diseases\u003c/h2\u003e\u003cdiv id=\"Sec12\" class=\"Section3\"\u003e\u003ch2\u003eCancer risk\u003c/h2\u003e\u003cp\u003eDNA methylation, or surrogate measures such as epigenetic age-acceleration, have been used as biomarkers to predict the onset of or diagnose a wide range of pathological conditions, ranging from rare developmental diseases to cancer\u003csup\u003e\u003cspan additionalcitationids=\"CR56 CR57\" citationid=\"CR55\" class=\"CitationRef\"\u003e55\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR58\" class=\"CitationRef\"\u003e58\u003c/span\u003e\u003c/sup\u003e. Given the presence of transcription factor binding motifs known to regulate oncogenic pathways within the methylation clusters, we hypothesized that these epigenetic modules may capture pre-diagnostic signals of cancer risk. To evaluate the predictive value of the NL clusters for cancer onset, we assessed the association between their eigenvalues and cancer development using three complementary approaches: Cox proportional hazards models, Kaplan\u0026ndash;Meier survival analysis, and logistic regression. 
Analyses were conducted in the EPIC-Italy cohort\u003csup\u003e\u003cspan citationid=\"CR59\" class=\"CitationRef\"\u003e59\u003c/span\u003e\u003c/sup\u003e, a large prospective study in which blood DNA methylation was profiled at baseline in healthy participants, with up to 15 years of follow-up for incident cancer diagnoses. The cohort includes time-to-diagnosis information for several cancer types, with breast cancer (C50) and colorectal cancer (C18) being the most prevalent. We first assessed associations across all cancer types. Stratifying by sex revealed a clear sex-specific predictive capacity. In females, NL3 emerged as the most robust predictor of cancer risk across all analyses. In Cox regression, NL3 eigenvalues were significantly associated with a shorter time to cancer diagnosis (HR 1.020, 95% CI: 1.007\u0026ndash;1.032, FDR\u0026thinsp;=\u0026thinsp;0.0058) (Supp. Table\u0026nbsp;4). This was supported by Kaplan\u0026ndash;Meier survival analysis, where NL3 tertiles showed clear separation in cancer-free survival (log-rank p\u0026thinsp;=\u0026thinsp;0.003, Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eC). Logistic regression confirmed the predictive effect (OR\u0026thinsp;=\u0026thinsp;1.032, 95% CI: 1.012\u0026ndash;1.052, FDR\u0026thinsp;=\u0026thinsp;0.0083). The NL11 cluster also showed a potential association with cancer risk. While its Cox result was borderline (HR\u0026thinsp;=\u0026thinsp;1.039, FDR\u0026thinsp;=\u0026thinsp;0.063), logistic regression yielded a significant effect (OR\u0026thinsp;=\u0026thinsp;1.065, 95% CI: 1.008\u0026ndash;1.126, FDR\u0026thinsp;=\u0026thinsp;0.049), suggesting a possible predictive role that warrants further investigation. No associations were detected for NL2 or NL12 in any models. In males, none of the tested clusters showed significant associations. 
Next, we stratified the cohort by cancer type to determine whether these associations were driven by specific tumor types. In the breast cancer sub-cohort (female participants only), NL3 remained a robust predictor of disease onset in both models (Cox HR\u0026thinsp;=\u0026thinsp;1.023, 95% CI: 1.009\u0026ndash;1.037, FDR\u0026thinsp;=\u0026thinsp;0.0045; Logistic OR\u0026thinsp;=\u0026thinsp;1.035, 95% CI: 1.013\u0026ndash;1.058, FDR\u0026thinsp;=\u0026thinsp;0.0051) (Supp. Table\u0026nbsp;4). Likewise, NL11 showed consistent predictive value (Cox HR\u0026thinsp;=\u0026thinsp;1.065, 95% CI: 1.019\u0026ndash;1.113, FDR\u0026thinsp;=\u0026thinsp;0.0097; Logistic OR\u0026thinsp;=\u0026thinsp;1.101, 95% CI: 1.034\u0026ndash;1.172, FDR\u0026thinsp;=\u0026thinsp;0.0051). These results suggest that NL3 and NL11 capture biologically relevant, pre-diagnostic methylation signals specifically associated with breast cancer development. In contrast, no non-linear cluster was significantly associated with colorectal cancer (C18) in either males or females. Both Cox and logistic regression models yielded non-significant associations across all clusters, indicating that the epigenetic trajectories captured by these modules do not predict colorectal cancer onset within this cohort. Together, these findings highlight the sex- and cancer-type-specificity of non-linear DNA methylation aging patterns. 
While NL3 and NL11 showed significant biomarker potential for breast cancer risk in females, their lack of association with colorectal cancer suggests these epigenetic trajectories reflect tissue-specific aging dynamics rather than generalized cancer susceptibility.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv id=\"Sec13\" class=\"Section2\"\u003e\u003ch2\u003eInflammation - CRP levels\u003c/h2\u003e\u003cp\u003eSenescence of the immune system is termed \u0026ldquo;inflammaging\u0026rdquo;, with chronic inflammation being one of the hallmarks of aging\u003csup\u003e\u003cspan citationid=\"CR60\" class=\"CitationRef\"\u003e60\u003c/span\u003e,\u003cspan citationid=\"CR61\" class=\"CitationRef\"\u003e61\u003c/span\u003e\u003c/sup\u003e. Because inflammation is particularly relevant to our blood-based analysis, we explored whether some of the age-related clusters we identified were also predictive of inflammatory status. Separately in males and females, we calculated each cluster\u0026rsquo;s eigenvalues and looked at their association with estimated C-reactive protein (CRP) levels (Methods), a well-established marker of inflammation\u003csup\u003e\u003cspan citationid=\"CR62\" class=\"CitationRef\"\u003e62\u003c/span\u003e\u003c/sup\u003e. Three nested models were used to assess these associations while adjusting for age and NC CpGs (Methods). The analysis revealed sex-specific patterns of association, with several eigenvalues significantly predicting CRP independently of chronological age. 
In females, the full linear model including eigenvalues from all age-related clusters significantly improved CRP prediction compared to age\u0026thinsp;+\u0026thinsp;NC alone (\u003cem\u003eadjusted R\u0026sup2;\u003c/em\u003e = 0.422; ANOVA \u003cem\u003ep\u003c/em\u003e\u0026thinsp;\u0026lt;\u0026thinsp;8.2 \u0026times; 10\u003csup\u003e\u0026minus;\u0026thinsp;11\u003c/sup\u003e). Several modules showed robust associations (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eD, Supp. Table\u0026nbsp;4). The NL3 module displayed the strongest positive association with CRP (β\u0026thinsp;=\u0026thinsp;0.0221, \u003cem\u003ep\u003c/em\u003e\u0026thinsp;=\u0026thinsp;2.06 \u0026times; 10⁻⁶), suggesting that methylation patterns in this cluster closely align with inflammatory status. The LD module was significantly negatively associated with CRP (β = \u0026minus;\u0026thinsp;0.0098, \u003cem\u003ep\u003c/em\u003e\u0026thinsp;=\u0026thinsp;1.34 \u0026times; 10⁻⁴), indicating a potential protective or anti-inflammatory methylation pattern. NL11 also revealed a significant negative association (β = \u0026minus;\u0026thinsp;0.0200, \u003cem\u003ep\u003c/em\u003e\u0026thinsp;=\u0026thinsp;0.023), while VI showed a significant positive association (β\u0026thinsp;=\u0026thinsp;0.0179, \u003cem\u003ep\u003c/em\u003e\u0026thinsp;=\u0026thinsp;6.9 \u0026times; 10⁻⁴). Notably, the NC eigenvalue was not significantly associated with CRP (\u003cem\u003ep\u003c/em\u003e\u0026thinsp;=\u0026thinsp;0.266), although it contained 87% of the CpGs used in CRP estimations (Supp. Figure\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eD). 
This result likely stems from the fact that we accounted for 12 immune cell fractions in our model, including na\u0026iuml;ve and mature T cells, on which the CRP signature relies\u003csup\u003e\u003cspan citationid=\"CR62\" class=\"CitationRef\"\u003e62\u003c/span\u003e\u003c/sup\u003e, and further shows that the NL patterns captured methylation variation independently of the immune profile. Together, these results suggest that inflammation in females is selectively captured by distinct age-related methylation patterns, some of which are positively associated with inflammatory burden (e.g., VI, NL3) and others inversely associated (e.g., LD, NL11), reflecting complex and potentially compensatory epigenetic dynamics. In contrast, males exhibited a more restricted profile of significant associations. While the full model remained statistically significant (\u003cem\u003eadjusted R\u0026sup2;\u003c/em\u003e = 0.423; ANOVA \u003cem\u003ep\u003c/em\u003e\u0026thinsp;=\u0026thinsp;4.82 \u0026times; 10\u003csup\u003e\u0026minus;\u0026thinsp;5\u003c/sup\u003e), only the LI module showed a significant positive association with estimated CRP levels (β\u0026thinsp;=\u0026thinsp;0.0059, \u003cem\u003ep\u003c/em\u003e\u0026thinsp;=\u0026thinsp;0.0037) (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eD, Supp. Table\u0026nbsp;4). Associations with other clusters, such as NL1, NL2, and LD, did not reach significance, although some trends were observed. The NC module was again not predictive (\u003cem\u003ep\u003c/em\u003e\u0026thinsp;=\u0026thinsp;0.65), reinforcing the specificity of the signal to age-related modules. These sex-specific patterns suggest that while methylation-based inflammation signatures exist in both sexes, females display a more modular and interpretable architecture. 
In contrast, the inflammatory signal in males may be either more diffuse across the methylome or less tightly coupled to the cluster structure identified in the present analysis. Together, these findings indicate that methylation clusters derived from non-linear age-related trajectories carry relevant information about the inflammatory burden in blood.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec14\" class=\"Section2\"\u003e\u003ch2\u003eDifferent waves of regulation in males and females\u003c/h2\u003e\u003cp\u003eIn the analysis above, we identified functional clusters of age-related CpGs sharing similar linear or non-linear patterns. Although this analysis hinted at specific ages at which disruption of methylation trajectories occurs (i.e., inflection points), another analytical method has been previously used to address this question directly\u003csup\u003e\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e,\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e\u003c/sup\u003e. We applied the same method used by Shen \u0026amp; al.\u003csup\u003e\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e\u003c/sup\u003e to reveal peaks of disruption of methylation patterns in our male and female populations (Methods). Our analysis focused on age-related CpGs identified at the last step and revealed distinct waves of dysregulation in males and females (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eA). Those peaks were deemed robust as they passed different significance thresholds (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eB). In females, we observed three peaks at age 33, 51, and 73 that were consistent with 2 of the inflection points observed in the previous analysis. Whereas males only displayed two peaks at 47 and 63 years old, again consistent with the clustering analysis. 
The number of significantly dysregulated CpGs markedly differed between males and females, with females showing an overall higher number, similar to the discrepancy observed between their respective numbers of NL CpGs. Notably, the peaks at 33 and 51 in women are supported by previous findings highlighting those decades as transitional periods in women\u0026rsquo;s aging\u003csup\u003e\u003cspan citationid=\"CR63\" class=\"CitationRef\"\u003e63\u003c/span\u003e\u003c/sup\u003e and previous research in protein expression levels identified three peaks at 34, 60, and 78 years old\u003csup\u003e\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e\u003c/sup\u003e. Recent findings in a multi-omics setting also identified crests of dysregulation around 40 and 60 years old. Both those analyses were performed in a joint cohort of men and women and might represent a combination of the sex-specific peaks we identified. Thus, our results support previous claims that aging occurs in waves across multiple modalities rather than unique molecular components. In addition, we highlighted a distinct temporality in the aging process across sex, as underlined by the earlier onset of the crests in males. Integrating this analysis with the clusters we identified previously revealed that the main CpG overlaps in males and females occur with the linear clusters LI and LD (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eC). Most of the NL CpGs in females intersected with wave 3, whereas in males, they were equally represented in waves 1 and 2. This analysis further highlighted conserved CpGs across waves (N_males\u0026thinsp;=\u0026thinsp;2027 \u0026amp; N_females\u0026thinsp;=\u0026thinsp;1453) and across sexes (N\u0026thinsp;=\u0026thinsp;725). 
Motif enrichment analysis of those sets confirmed the enrichment of NF1 full- and half-binding motifs at age-related CpGs, and revealed the presence of CTCF and NRSF/REST binding sites, two consistent marks of epigenetic aging\u003csup\u003e\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e,\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e,\u003cspan additionalcitationids=\"CR65 CR66\" citationid=\"CR64\" class=\"CitationRef\"\u003e64\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR67\" class=\"CitationRef\"\u003e67\u003c/span\u003e\u003c/sup\u003e, among other TFBs (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eD). To understand the specific pathways that could be affected, we repeated our previous enrichment analysis, focusing on each peak\u0026rsquo;s unique set of CpGs (Methods). In females, waves 1, 2, and 3 showed, respectively, 2057, 873, and 2077 uniquely dysregulated CpGs. Wave 1 showed striking enrichment for neurodevelopmental processes, including \u003cem\u003enervous system development\u003c/em\u003e (FDR\u0026thinsp;=\u0026thinsp;3.9 \u0026times; 10\u003csup\u003e\u0026minus;\u0026thinsp;4\u003c/sup\u003e), \u003cem\u003eneuron differentiation\u003c/em\u003e (FDR\u0026thinsp;=\u0026thinsp;8.5 \u0026times; 10\u003csup\u003e\u0026minus;\u0026thinsp;4\u003c/sup\u003e), and \u003cem\u003egeneration of neurons\u003c/em\u003e (FDR\u0026thinsp;=\u0026thinsp;1.8 \u0026times; 10\u003csup\u003e\u0026minus;\u0026thinsp;3\u003c/sup\u003e) (Supp. Figure\u0026nbsp;5). 
Additionally, terms related to transcriptional control and chromatin binding, such as \u003cem\u003eDNA-binding transcription factor activity\u003c/em\u003e (FDR\u0026thinsp;=\u0026thinsp;5.3 \u0026times; 10\u003csup\u003e\u0026minus;\u0026thinsp;4\u003c/sup\u003e), and \u003cem\u003edouble-stranded DNA binding\u003c/em\u003e (FDR\u0026thinsp;=\u0026thinsp;4.2 \u0026times; 10\u003csup\u003e\u0026minus;\u0026thinsp;3\u003c/sup\u003e) were significantly enriched. Wave 2 yielded a single enriched term: \u003cem\u003eregulation of execution phase of apoptosis\u003c/em\u003e (FDR\u0026thinsp;=\u0026thinsp;0.036) (Supp. Table). These findings recapitulate previous observations of neurodevelopmental and transcriptional gene dysregulation with age\u003csup\u003e\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e,\u003cspan citationid=\"CR68\" class=\"CitationRef\"\u003e68\u003c/span\u003e\u003c/sup\u003e. Interestingly, in males, among the unique CpGs in wave 1 (N\u0026thinsp;=\u0026thinsp;1250) and wave 2 (N\u0026thinsp;=\u0026thinsp;2426), no significant enrichment was found. Likewise, the 725 CpGs common to both sexes showed no detectable enrichment. This highlights sex-specific epigenetic changes in females, with broader and more functionally coherent methylation changes. Notably, REST, a transcriptional repressor enriched in our motif analysis, is known to silence neuronal genes in non-neuronal lineages and is a key regulator of the epigenetic program that maintains cellular identity\u003csup\u003e\u003cspan citationid=\"CR67\" class=\"CitationRef\"\u003e67\u003c/span\u003e,\u003cspan citationid=\"CR69\" class=\"CitationRef\"\u003e69\u003c/span\u003e\u003c/sup\u003e. 
Its involvement, alongside the dysregulation of neurodevelopmental pathways, supports a model in which age-related methylation drift compromises the fidelity of cell-type\u0026ndash;specific gene regulation, allowing partial re-expression of lineage-inappropriate developmental programs. This aligns with emerging evidence from aging transcriptomes and methylomes indicating that immune cells in older individuals exhibit loss of identity and ectopic activation of developmental gene networks, including those tied to nervous system formation\u003csup\u003e\u003cspan citationid=\"CR70\" class=\"CitationRef\"\u003e70\u003c/span\u003e,\u003cspan citationid=\"CR71\" class=\"CitationRef\"\u003e71\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003c/div\u003e"},{"header":"Discussion","content":"\u003cp\u003eBy leveraging a novel analysis pipeline and a high-quality DNA methylation (DNAm) dataset, we identified sex-specific non-linear aging trajectories in blood. Fine-tuning our clustering pipeline on simulated data enabled us to establish robust heuristics facilitating the identification of diverging linear and non-linear patterns (Figs.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eA \u0026amp; \u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eB). We showed that our new pipeline, SNITCH, outperforms stand-alone unsupervised clustering methods in discriminating between Variance Increase (VI), Linear Increasing and Decreasing (LI \u0026amp; LD), Non-Correlated (NC), and various Non-Linear (NL) functions (Figs.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eC \u0026amp; \u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eD). Nevertheless, we observed misclassification between the linear and logarithmic patterns due to their close resemblance. 
By providing SNITCH as a user-friendly framework, our approach is well-positioned to uncover non-linear, tissue- or context-specific aging dynamics in large cross-sectional or longitudinal datasets, including EWAS, transcriptomic time series, or multi-omic aging studies.\u003c/p\u003e\u003cp\u003eThe immune profile reshapes with age, and analyses in whole blood are particularly sensitive to those changes\u003csup\u003e\u003cspan citationid=\"CR72\" class=\"CitationRef\"\u003e72\u003c/span\u003e\u003c/sup\u003e. Applying our pipeline to a blood DNA methylation dataset while accounting for an increasing number of immune cell types highlighted the confounding effect of the immune profile on age-related DNAm changes (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eA). Stratifying our analyses across sexes revealed a notably higher number of NL CpGs identified in Females (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eB). Nevertheless, CpGs generally showed concordant aging trajectories across sexes, with significant enrichment for matching classifications (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eD). A surprising result was the small number of age-related CpGs identified (\u0026lt; 4%), seemingly contradicting findings of a meta-analysis of age-related DNAm in blood\u003csup\u003e\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e\u003c/sup\u003e. We argue that our results are a conservative but robust representation of true changes occurring during aging. We used a gold-standard reference DNAm dataset that minimises batch effects, covers an equivalent number of males and females, and has a homogeneous age distribution\u003csup\u003e\u003cspan citationid=\"CR73\" class=\"CitationRef\"\u003e73\u003c/span\u003e\u003c/sup\u003e. 
This, combined with our strict quality control and FDR threshold and the inclusion of 12 immune cell types as covariates in our model, ensured the identification of robust changes and explains those numbers. However, we acknowledge that due to our limited sample size, we might have missed smaller effects or population-specific changes.\u003c/p\u003e\u003cp\u003eOur study helped characterise CpGs underlying epigenetic clocks. We assessed the distribution of clock CpGs within our different aging clusters and found that most models relied on CpGs classified as NC in our model (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eA). Although counterintuitive, this finding is expected: we focused on changes independent of the immune profile, whereas clocks are susceptible to changes in cell fractions. This suggests that many clocks rely on CpGs that are stable across age when accounting for immune composition, potentially reflecting their role as proxies for cellular fraction rather than methylation-only age-related change. This was confirmed by our baseline model (not accounting for immune cells), which showed fewer CpGs classified as non-correlated. An additional explanation comes from the data used to train the clocks. Indeed, among the 9 models we investigated, only the Hannum clock was trained solely on blood DNAm using chronological age as the target variable, whereas the other clocks were trained on multiple tissues (Horvath’s) or phenotypic variables (PhenoAge, GrimAge, DunedinPace, Zhang_10, and YingAge). We found that most of the clocks included VI and NL CpGs. 
Further analyses should investigate the weight associated with those in each model to understand the dependency of the clocks on non-linear trajectories, and potentially fine-tune them to reflect the non-linear phases of aging.\u003c/p\u003e\u003cp\u003eOur analysis confirmed the enrichment of Polycomb Repressed chromatin states at hypermethylated CpGs\u003csup\u003e\u003cspan additionalcitationids=\"CR36\" citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e–\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e\u003c/sup\u003e. Overall chromatin state enrichments were dependent on the gain or loss of methylation, irrespective of sex or linearity (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eB). Only the NL3 cluster in females differed from this trend and showed a small enrichment for the \u003cem\u003eweak transcription\u003c/em\u003e state. Consistent with other results, the motif enrichment analysis revealed enrichment of the REST-NRSF motif\u003csup\u003e\u003cspan citationid=\"CR66\" class=\"CitationRef\"\u003e66\u003c/span\u003e\u003c/sup\u003e in the male hypermethylated NL1 cluster. The loss of REST is associated with cognitive impairment and Alzheimer's disease\u003csup\u003e\u003cspan citationid=\"CR67\" class=\"CitationRef\"\u003e67\u003c/span\u003e\u003c/sup\u003e, and hypermethylation at its binding sites could disrupt its function. The relevance of this finding in blood should be investigated, as should the biomarker potential of the male NL1 cluster for Alzheimer’s disease. This analysis also identified new enriched motifs at NL CpGs for the NF1-CTF family, both in males and females, suggesting a potential disruption of their function. 
Additional transcription factors (TFs) with enriched binding sites included the STAT family in males (STAT3 \u0026amp; STAT4), pointing to a potential disruption of the immune system\u003csup\u003e\u003cspan citationid=\"CR74\" class=\"CitationRef\"\u003e74\u003c/span\u003e\u003c/sup\u003e, and developmental TFs in females (HOXC9, GATA6)\u003csup\u003e\u003cspan citationid=\"CR51\" class=\"CitationRef\"\u003e51\u003c/span\u003e,\u003cspan citationid=\"CR52\" class=\"CitationRef\"\u003e52\u003c/span\u003e\u003c/sup\u003e. Notably, the enrichment of motifs from developmental TFs at hypomethylated sites could indicate the loss of a safeguarding mechanism consistent with the epigenetic drift view. Nevertheless, our analyses suggest that this drift is not constant but is punctuated by episodes of dysregulation.\u003c/p\u003e\u003cp\u003eThe functional role of our NL clusters was further explored by looking at their relationship with inflammation and predictive capabilities for cancer onset in an independent cohort. With a robust statistical model accounting for immune cell fractions, we identified NL3 in females as being strongly associated with both inflammation (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eD) and cancer development probabilities (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eC), and the female NL11 cluster as mildly associated. The functional specificity of those clusters was underscored by the absence of predictive power from the NC module across sexes. Those results showed the relevance of our identified NL clusters by validating their functionality in an independent cohort. Furthermore, the more robust and multifaceted associations in females strongly support the argument for sex-stratified biomarkers, particularly in the context of aging, cancer, and inflammation. 
Our motif enrichment analysis laid the ground for further investigating the mechanisms underlying the predictive capacity of the female NL3 cluster. Notably, NF1/CTF, Hoxc9, and Gata6 binding motifs were overrepresented in this cluster. Gata6 was recently shown to be part of a central mechanism promoting cancer-associated fibroblasts in breast cancer\u003csup\u003e\u003cspan citationid=\"CR54\" class=\"CitationRef\"\u003e54\u003c/span\u003e\u003c/sup\u003e. The study found that the expression of Gata6 was enhanced by the action of TET1, a protein that removes methylation. Evidence shows that binding of members of the Gata family is disrupted by methylation marks\u003csup\u003e\u003cspan citationid=\"CR75\" class=\"CitationRef\"\u003e75\u003c/span\u003e\u003c/sup\u003e. Thus, the hypomethylation observed in NL3 could be associated with increased Gata6 oncogenic activity. Both Hoxc9 and members of the NF1 family have also been associated with breast cancer, although the mechanisms have not been fully elucidated and are not well documented\u003csup\u003e\u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e53\u003c/span\u003e,\u003cspan citationid=\"CR76\" class=\"CitationRef\"\u003e76\u003c/span\u003e,\u003cspan citationid=\"CR77\" class=\"CitationRef\"\u003e77\u003c/span\u003e\u003c/sup\u003e. Nevertheless, how the occurrence of these epigenetic marks in immune cells may affect cancer onset remains to be investigated.\u003c/p\u003e\u003cp\u003eOverall, our clustering analysis revealed sex-specific functional clusters of NL CpGs. They hinted at sex-specific peaks of dysregulation by showing inflection points around the 75-year (NL3, NL11, NL12) and 50-year (NL2) marks in females, against the 50-year (NL2, NL4) and 60-year (NL1, NL30) marks in males. We used the previously described DEswan analysis\u003csup\u003e\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e\u003c/sup\u003e to formally identify peaks of dysregulation in a sex-specific manner. 
The DEswan results reinforced the findings of our first analysis by revealing peaks of dysregulation at 51 and 73 years old in females, and 47 and 63 years old in males (Figs.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eA \u0026amp; \u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eB), plus an earlier peak at 33 years old in females. Those results are strongly supported by the literature, with studies in proteomics and multi-omics identifying peaks at similar ages\u003csup\u003e\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e,\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e,\u003cspan citationid=\"CR63\" class=\"CitationRef\"\u003e63\u003c/span\u003e\u003c/sup\u003e. The sex-specificity of our findings indicates an earlier onset of dysregulation in males, with the last wave occurring a decade before the one in females. This also suggests a peak of dysregulation in mid-adulthood occurring in females but not males, although this last finding should be tempered by the minimum age of 25 in our cohort, probably missing prior peaks during adolescence\u003csup\u003e\u003cspan citationid=\"CR78\" class=\"CitationRef\"\u003e78\u003c/span\u003e\u003c/sup\u003e. In both males and females, CpGs identified at those peaks showed a partial overlap with the clusters (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eC). In addition, both males and females showed CpGs consistently dysregulated at all peaks, with some of those conserved across sexes. Motif enrichment for sex-specific and non-sex-specific conserved CpGs supported our previous findings, with NF1/CTF and REST amongst the most enriched binding sites. 
Consistent with other findings\u003csup\u003e\u003cspan citationid=\"CR65\" class=\"CitationRef\"\u003e65\u003c/span\u003e,\u003cspan citationid=\"CR66\" class=\"CitationRef\"\u003e66\u003c/span\u003e\u003c/sup\u003e but absent from our cluster analysis, the CTCF motif was also enriched. Our functional analysis for the waves of dysregulation pointed toward a larger and more coherent dysregulation in females, with previously reported pathways related to DNA binding and neurogenesis being affected.\u003c/p\u003e\u003cp\u003eOur study presents several limitations. First, the datasets lacked critical covariates such as BMI or smoking status, preventing us from fully disentangling lifestyle effects from intrinsic aging-related methylation changes. Additionally, both datasets were predominantly composed of individuals of European ancestry, which may limit the generalizability of our findings to more diverse populations. Despite covering a wide age range (25–90 years old), we lacked samples from the infancy and adolescent stages, known to be associated with wide changes in DNAm\u003csup\u003e\u003cspan additionalcitationids=\"CR79\" citationid=\"CR78\" class=\"CitationRef\"\u003e78\u003c/span\u003e–\u003cspan citationid=\"CR80\" class=\"CitationRef\"\u003e80\u003c/span\u003e\u003c/sup\u003e. As a result, our characterization of methylation dynamics during early development remains incomplete, potentially missing early-life inflection points critical to the trajectory of aging. Another limitation is the cross-sectional nature of our study. This is a typical limitation in the field due to the lack of large longitudinal datasets. Thus, our analysis relies on the hypothesis that DNAm changes are conserved to some degree across different individuals, a hypothesis supported by the large body of literature on DNAm in aging. Nevertheless, adopting a longitudinal approach is an important step towards personalised medicine and is a direction the field should embrace. 
In this work, we chose to work with a smaller, high-quality dataset to reduce noise and uncover true age-related changes. While this strategy facilitated the detection of robust changes validated in an independent cohort, our relatively small sample size (~ 240 individuals per sex) may have limited our ability to detect subtle but biologically relevant effects, reflecting the small number of age-related CpGs we found. Finally, DNAm is intrinsically difficult to functionally link to changes in the phenotype, in part due to the variety of effects that gain or loss of methylation can have depending on their location. Consequently, we emphasize that our findings are correlative and should be interpreted as surrogate markers rather than direct drivers of phenotypic changes.\u003c/p\u003e\u003cp\u003eIn the future, we aim to further explore non-linear patterns in other tissues\u003csup\u003e\u003cspan citationid=\"CR81\" class=\"CitationRef\"\u003e81\u003c/span\u003e\u003c/sup\u003e and extend our methodology to other omics, such as transcriptomics and proteomics, where the increase of variance is often omitted from clustering analyses. Our results, along with prior work, reveal enrichment of developmental transcription factor motifs (e.g., REST, NF1/CTF) at age-dysregulated CpGs\u003csup\u003e\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e,\u003cspan citationid=\"CR45\" class=\"CitationRef\"\u003e45\u003c/span\u003e\u003c/sup\u003e. This observation remains underexplored, particularly regarding how such TF-DNAm interactions shift with age and impact transcriptional regulation. Future studies integrating these findings with transcriptomic or chromatin accessibility data could elucidate the functional consequences of these methylation changes. 
Thus, we will focus on the female NL3 cluster that showed biomarker potential, working to characterise and validate it in other cancer cohorts.\u003c/p\u003e"},{"header":"Conclusion","content":"\u003cp\u003eAltogether, our findings support a paradigm shift in the way we conceptualize aging: not as a steady, linear decline, but as a non-linear and entropic process characterized by temporally confined windows of epigenetic instability. By uncovering sex-specific, wave-like patterns of DNA methylation changes that mirror inflection points reported in other omics layers, our study strengthens the case for integrated, temporal frameworks of aging biology. The critical decades we identified, particularly around the 30s, 50s, and 70s, may represent biologically vulnerable windows, where intervention could yield the most impact. Beyond these insights, our findings underscore the need to move beyond one-size-fits-all models and adopt analytical strategies that reflect the inherent heterogeneity, non-linearity, and sex-specific nature of biological aging. However, to fully harness the translational potential of these observations, a pressing need for molecular validation remains. Future work should prioritize characterising the regulatory mechanisms underpinning these methylation dynamics, including their interaction with chromatin architecture, transcription factor networks, and other epigenetic layers, to distinguish causality from correlation and to illuminate actionable pathways of aging and disease.\u003c/p\u003e"},{"header":"Methods","content":"\u003ch2\u003eCohort and DNA methylation preprocessing\u003c/h2\u003e\u003cp\u003eTwo separate cohorts were used in this study. GSE246337\u003csup\u003e73\u003c/sup\u003e was used in the identification of ageing methylation patterns, and GSE51032 was used to identify biomarkers of cancer. 
For both datasets, the raw IDAT files from the Illumina Infinium HumanMethylationEPIC v2 BeadChip array (EPICv2) or 450K array, respectively, were retrieved from GEO and processed using the \u003cem\u003esesame\u003c/em\u003e R/Bioconductor package (v1.18.1)\u003csup\u003e\u003cspan citationid=\"CR82\" class=\"CitationRef\"\u003e82\u003c/span\u003e\u003c/sup\u003e. To harmonize probe identifiers and accommodate the EPICv2 platform structure, prefix collapsing was enabled during data import. Initial preprocessing followed the “QCDPB” pipeline within \u003cem\u003esesame\u003c/em\u003e, incorporating: (i) probe quality filtering using the pOOBAH method to remove probes with poor detection p-values, (ii) dye-bias correction to normalize type I and II probe discrepancies, (iii) masking of probes known to be problematic due to non-specific hybridization or SNP interference as defined by Zhou et al.\u003csup\u003e\u003cspan citationid=\"CR83\" class=\"CitationRef\"\u003e83\u003c/span\u003e\u003c/sup\u003e, and (iv) exclusion of probes supported by fewer than four beads. The resulting beta value matrix was further filtered to improve data integrity. Probes with missing values in more than 1% of samples were discarded. Subsequently, samples with more than 1% missing beta values across retained CpGs were also removed, and we removed probes on the sex chromosomes. The remaining missing values were imputed by k-nearest neighbors with the impute.knn function from the \u003cem\u003eimpute\u003c/em\u003e package, keeping the default parameters. Beta values were rounded to three decimals for downstream analyses. For the GSE246337 dataset, we only kept probes present on the EPICv1 array, as most tools used in the downstream analysis are not yet compatible with the EPICv2 array. The final number of probes used in the downstream analyses was 556811, and final numbers of samples were 256 females and 238 males for GSE246337. 
For the GSE51032 cohort, the final number of probes was 382717, for 651 females and 186 males.\u003c/p\u003e\u003ch2\u003eDescription of the SNITCH analysis pipeline\u003c/h2\u003e\u003cp\u003eThe SNITCH pipeline involved three main steps: Heuristic-based classification, Functional Principal Components Analysis (FPCA), and unsupervised clustering.\u003c/p\u003e\u003ch2\u003eHeuristic-based classification\u003c/h2\u003e\u003cp\u003eFor each CpG site, methylation beta values are modeled as a function of chronological age. The core procedure included the following steps:\u003c/p\u003e\u003col\u003e\u003cspan\u003e\u003cli\u003e\u003cp\u003e\u003cb\u003eModel Construction\u003c/b\u003e: Linear models (LMs) are fitted using ordinary least squares regression with age as a continuous predictor (lm function from base R). In parallel, generalized additive models (GAMs) are fitted using restricted maximum likelihood (REML) and thin plate regression splines (\u003cem\u003es(Age, k = 5)\u003c/em\u003e), enabling the detection of non-linear trends (gam function - \u003cem\u003emgcv\u003c/em\u003e\u003csup\u003e\u003cem\u003e\u003cspan citationid=\"CR84\" class=\"CitationRef\"\u003e84\u003c/span\u003e\u003c/em\u003e\u003c/sup\u003e). When relevant, covariates are included consistently across both models to preserve interpretability and comparability.\u003c/p\u003e\u003c/li\u003e\u003c/span\u003e\u003cspan\u003e\u003cli\u003e\u003cp\u003e\u003cb\u003eModel Comparison and Heteroscedasticity Testing\u003c/b\u003e: The explanatory performance of GAMs relative to LMs is evaluated using the Bayesian Information Criterion (BIC) (BIC function from base R). CpGs with ΔBIC (BIC(LM) − BIC(GAM)) \u0026gt; 2 are considered to favor the non-linear model. 
To characterize potential violations of homoscedasticity that may underlie complex aging dynamics (VMP), White’s test is applied to each LM, with heteroscedastic CpGs flagged based on a 1% FDR-adjusted significance threshold.\u003c/p\u003e\u003c/li\u003e\u003c/span\u003e\u003cspan\u003e\u003cli\u003e\u003cp\u003e\u003cb\u003ePrediction and Effect Size Estimation\u003c/b\u003e: For CpGs favoring a GAM fit, DNAm values are predicted across a continuous age grid ranging from the minimum age of the cohort to the maximum with a one-year step increase, holding covariates constant at reference levels (medians for numeric or first level for categorical variables). This effectively smoothed the trajectories for subsequent steps. For linear trajectories, the direction and significance of age-associated change are inferred from the LM coefficient and p-value, respectively.\u003c/p\u003e\u003c/li\u003e\u003c/span\u003e\u003cspan\u003e\u003cli\u003e\u003cp\u003e\u003cb\u003eMultiple Testing and Classification\u003c/b\u003e: P-values from LM, GAM, and heteroscedasticity tests are corrected using the Benjamini-Hochberg method. CpGs are classified as:\u003c/p\u003e\u003c/li\u003e\u003c/span\u003e\u003c/ol\u003e\u003cul\u003e\u003cli\u003e\u003cp\u003e\u003cb\u003eLI\u003c/b\u003e (Linear Increase): Significant linear association (adj. p(LM) ≤ 0.01) with a positive slope.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003e\u003cb\u003eLD\u003c/b\u003e (Linear Decrease): Significant linear association with a negative slope.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003e\u003cb\u003eNL\u003c/b\u003e (Non-linear): ΔBIC \u0026gt; 2 and adj. p(GAM) ≤ 0.01.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003e\u003cb\u003eVI\u003c/b\u003e (Variance-Increasing): No linear association (adj. p(LM) \u0026gt; 0.01) but significant heteroscedasticity (adj. 
p(White) ≤ 0.01).\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003e\u003cb\u003eNC\u003c/b\u003e (Non-correlated): CpGs not meeting any of the above criteria.\u003c/p\u003e\u003c/li\u003e\u003c/ul\u003e\u003cp\u003eThis classification strategy allows SNITCH to robustly detect a spectrum of epigenetic aging signatures, from canonical linear changes to more complex non-linear patterns. The SNITCH method is available and can be installed as a user-friendly R package (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://github.com/fishrscale/SNITCH\u003c/span\u003e\u003cspan address=\"https://github.com/fishrscale/SNITCH\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003cspan type=\"Underline\" class=\"Underline\" name=\"Emphasis\"\u003e)\u003c/span\u003e, thus facilitating access to non-linear trajectory analyses.\u003c/p\u003e\u003ch2\u003eFunctional Principal Component Analysis (FPCA)\u003c/h2\u003e\u003cp\u003eTo further dissect heterogeneity within non-linear (NL) CpG methylation trajectories, we implemented a two-stage dimensionality reduction and clustering procedure based on functional principal component analysis (FPCA) followed by density-based unsupervised classification. This enabled the grouping of NL CpGs into discrete functional subclusters based on the shape of their smoothed age-associated methylation trajectories.\u003c/p\u003e\u003cp\u003eFor all CpG sites previously classified as NL by SNITCH, we use the smoothed beta values predicted by the gam model. FPCA is then applied using the fpca.face() function from the \u003cem\u003erefund\u003c/em\u003e R package, with age as the functional domain. The number of knots is set dynamically based on the number of time points (min(35, floor(0.8 * timepoints))), and the proportion of variance explained (PVE) threshold is conservatively fixed at 99.99% to retain fine-grained trajectory information. 
This process decomposes the complex, high-dimensional nonlinear patterns into a reduced set of orthogonal functional basis scores.\u003c/p\u003e\u003ch2\u003eUnsupervised Clustering\u003c/h2\u003e\u003cp\u003eThe resulting FPCA scores are used as input for unsupervised clustering using the HDBSCAN (\u003cem\u003edbscan\u003c/em\u003e), fuzzy c-means (\u003cem\u003eMfuzz\u003c/em\u003e), or k-means (base R) algorithm. This procedure allows data-driven identification of methylation trajectory subtypes without requiring prior knowledge of cluster number or shape.\u003c/p\u003e\u003ch2\u003eSimulated Data\u003c/h2\u003e\u003cp\u003eA total of 3,000 synthetic CpG sites were simulated across 300 individuals, each assigned a random age between 1 and 100 years. Fifteen trajectory archetypes were implemented to represent a range of biologically plausible methylation patterns. These included: non-correlated, linear trajectories (increasing, decreasing), quadratic trends (increasing, decreasing), logarithmic transitions (increasing, decreasing), sigmoidal dynamics (increasing, decreasing) with inflection points at ages 25, 40, and 80, variance-increasing profiles, and non-monotonic patterns. For each of the 15 trajectory classes, 200 CpGs were simulated. For all the functions except the variance-increasing one, the age-specific methylation expectation (µ) was passed to a Beta distribution (via rbeta) to introduce variability while maintaining biological constraints (bounded between 0 and 1). For the variance-increasing function, age-dependent Gaussian noise was added directly to a mean of 0.5, with standard deviation increasing linearly from 0.01 (age \u0026lt; 25) to 0.3 (age = 100), mimicking stochastic methylation drift. Details are available in Supp. File.\u003c/p\u003e\u003ch2\u003eBenchmarking SNITCH\u003c/h2\u003e\u003cp\u003eThe benchmarking was done using the simulated patterns and their associated labels as ground truth. 
Fuzzy c-means clustering was performed using the \u003cem\u003eMfuzz\u003c/em\u003e package. The fuzzification parameter \u003cem\u003em\u003c/em\u003e was estimated using the function mestimate, and the optimal number of clusters was set to 11 after determining the minimum centroid distance across 3 repeated runs. Final cluster assignments were defined by the highest membership value. K-means was run with 25 restarts and centers = 10, a value chosen with the elbow method based on the total within-cluster sum of squares (WCSS) with k.max = 20. HDBSCAN was applied using the \u003cem\u003edbscan\u003c/em\u003e package, with the minimum cluster size set to 5. The DICNAP pipeline was implemented as originally described\u003csup\u003e\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e\u003c/sup\u003e, except that the maximal number of clusters for K-means was set to 20. SNITCH was run as previously described. The FPCA scores were subsequently used for unsupervised classification by the same three algorithms. The ‘\u003cem\u003ec\u003c/em\u003e’ parameter was set to 11 for fuzzy clustering, and centers = 10 for K-means. Each method's cluster assignments were compared to ground-truth labels to compute ARI and AMI using the corresponding functions from \u003cem\u003earicode\u003c/em\u003e. The results from the two best-performing methods were evaluated by a confusion matrix.\u003c/p\u003e\u003ch2\u003eIdentification of sex-specific CpG Methylation Trajectories Using SNITCH\u003c/h2\u003e\u003cp\u003eTo evaluate aging-associated methylation trajectories independently of immune cell composition, we constructed three models differing in their treatment of cellular heterogeneity. In the Baseline (BL) model, we ran SNITCH with default parameters and without correcting for covariates. 
In the 7-cell and 12-cell models, we accounted for cell type proportions estimated using the \u003cem\u003eEpiDISH\u003c/em\u003e package with the centDHSbloodDMC.m and cent12CT.m reference matrices, respectively, using the Robust Partial Correlations (RPC) method\u003csup\u003e\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e\u003c/sup\u003e. The 7-cell matrix contained information on B-, NK-, CD4T-, and CD8T-cells, Monocytes, Neutrophils, and Eosinophils. Building on the 7-cell reference matrix, the 12-cell model discriminated between naive and mature B-, CD4T-, and CD8T-cells, and added T-regulatory cells and Basophils. To identify groups of CpGs with similar aging dynamics, we applied both \u003cem\u003eMfuzz\u003c/em\u003e and \u003cem\u003eHDBSCAN\u003c/em\u003e to the SNITCH-classified non-linear trajectories. Clustering results were assessed visually, and we retained only the \u003cem\u003eHDBSCAN\u003c/em\u003e clusters for downstream analyses due to their superior trajectory homogeneity. Similarly, the optimal minPts parameter for HDBSCAN was determined by visual inspection of cluster resolution. The selected minPts values for the female BL, 7-cell, and 12-cell models were 5, 5, and 7, respectively. For the male models, they were 5, 8, and 4. Notably, HDBSCAN designates sparse or noisy data points as cluster 0. Upon reviewing the trajectories assigned to this group in the female 12-cell model, we observed two distinct patterns within cluster NL0. We therefore re-ran HDBSCAN on this subset to refine the classification, resulting in three clusters: NL10, NL11, and NL12. Then, for each NL cluster, we performed principal component analysis (PCA) and extracted PC1 to represent the dominant methylation pattern. A correlation matrix was then computed across all cluster PC1s using Spearman’s rank correlation. 
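\u003c/p\u003e\u003cp\u003eThe correlation step can be sketched in plain Python as follows. This is a minimal illustration, assuming the PC1 scores of each cluster have already been computed and are stored as equal-length lists; the cluster names and values are hypothetical, and Spearman’s coefficient is obtained as the Pearson correlation of average ranks.\u003c/p\u003e

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    return pearson(average_ranks(x), average_ranks(y))


def spearman_matrix(pc1_by_cluster):
    """Nested dict of pairwise Spearman correlations between cluster PC1s."""
    names = list(pc1_by_cluster)
    return {a: {b: spearman(pc1_by_cluster[a], pc1_by_cluster[b]) for b in names}
            for a in names}


# Hypothetical PC1 scores for three clusters across five samples.
pc1 = {
    "NL1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "NL2": [2.0, 4.0, 8.0, 16.0, 32.0],
    "NL3": [5.0, 4.0, 3.0, 2.0, 1.0],
}
matrix = spearman_matrix(pc1)
print(matrix["NL1"]["NL2"], matrix["NL1"]["NL3"])
```

\u003cp\u003e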
Clusters with correlations exceeding 0.90 were merged to reduce redundancy and capture shared underlying dynamics.\u003c/p\u003e\u003ch2\u003eFunctional analysis\u003c/h2\u003e\u003ch2\u003eClassification of clock and immune cell CpGs\u003c/h2\u003e\u003cp\u003eCpGs were retrieved for the following clocks from the Biolearn resource\u003csup\u003e\u003cspan citationid=\"CR73\" class=\"CitationRef\"\u003e73\u003c/span\u003e\u003c/sup\u003e: Horvath v1, Hannum, PhenoAge, GrimAgeV2, DunedinPACE, YingCausAge, YingDamAge, YingAdaptAge, and Zhang_10. Only unique CpG identifiers were retained. We counted the occurrences of each clock's CpGs in the categories identified by SNITCH across our 3 models (BL, 7-cell, 12-cell) in both males and females. A similar analysis was performed with age-related CpGs (169 hypermethylated, 181 hypomethylated) shared across immune cells, retrieved from Roy et al.\u003csup\u003e\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003ch2\u003eChromatin state enrichment\u003c/h2\u003e\u003cp\u003eChromatin states profiled in peripheral blood mononuclear cells (PBMCs) were obtained from the Roadmap Epigenomics Project\u003csup\u003e\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e\u003c/sup\u003e and mapped to CpG probe IDs. Fisher’s exact tests were conducted to evaluate the enrichment of each chromatin state across aging trajectory classes against NC CpGs in both sexes. P-values were adjusted using the Benjamini-Hochberg method.\u003c/p\u003e\u003ch2\u003ePathway enrichment analysis\u003c/h2\u003e\u003cp\u003eBoth the gsameth and gometh functions from the \u003cem\u003emissMethyl\u003c/em\u003e\u003csup\u003e\u003cem\u003e\u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e43\u003c/span\u003e\u003c/em\u003e\u003c/sup\u003e R package were used to perform Over Representation Analysis (ORA). 
These functions account for the differing numbers of CpG probes per gene on the Illumina EPIC array, thereby reducing bias inherent to standard enrichment tools when applied to DNAm data. The gsameth function allows testing a list of CpGs against a custom gene set. We retrieved 2 custom gene sets from the Molecular Signatures Database (MSigDB v2024.1/v2025.1): Human Phenotype Ontology\u003csup\u003e\u003cspan citationid=\"CR85\" class=\"CitationRef\"\u003e85\u003c/span\u003e\u003c/sup\u003e (HPO - \u003cspan type=\"Underline\" class=\"Underline\" name=\"Emphasis\"\u003ec5.hpo.v2024.1.Hs\u003c/span\u003e.entrez.gmt) and Reactome Pathways\u003csup\u003e\u003cspan citationid=\"CR86\" class=\"CitationRef\"\u003e86\u003c/span\u003e\u003c/sup\u003e (c2.cp.reactome.v2025.1.Hs.entrez.gmt). The Gene Ontology\u003csup\u003e\u003cspan citationid=\"CR87\" class=\"CitationRef\"\u003e87\u003c/span\u003e\u003c/sup\u003e and KEGG\u003csup\u003e\u003cspan citationid=\"CR88\" class=\"CitationRef\"\u003e88\u003c/span\u003e\u003c/sup\u003e databases were tested with the gometh function. We used the set of CpG sites corresponding to each DNA methylation cluster, excluding the non-correlated \"NC\" sites, as the test set, and the full set of tested CpG sites across all clusters served as the background. Analyses were conducted independently for male and female clusters. The analysis was replicated for the CpGs of each peak identified during the DEswan analysis.\u003c/p\u003e\u003ch3\u003eMotif enrichment analysis\u003c/h3\u003e\u003cp\u003eThe motif enrichment tool from EWAS datahub\u003csup\u003e\u003cspan citationid=\"CR89\" class=\"CitationRef\"\u003e89\u003c/span\u003e\u003c/sup\u003e was used to perform motif enrichment analysis on each cluster identified in females and males. 
This tool uses a 500 bp window centered on the CpG of interest to perform Hypergeometric Optimization of Motif EnRichment (HOMER)\u003csup\u003e\u003cspan citationid=\"CR90\" class=\"CitationRef\"\u003e90\u003c/span\u003e\u003c/sup\u003e. Motifs with a q-value \u0026lt; 0.05 were considered significant. The analysis was replicated for the CpGs of each peak identified during the DEswan analysis.\u003c/p\u003e\u003ch2\u003eTrait association - Cancer\u003c/h2\u003e\u003cp\u003eThe cohort used to assess the sex-specific biomarker potential of cluster eigenvalues has been described elsewhere. Briefly, the EPIC-Italy\u003csup\u003e\u003cspan citationid=\"CR59\" class=\"CitationRef\"\u003e59\u003c/span\u003e\u003c/sup\u003e study includes DNA methylation data collected at baseline and linked to up to 14 years of prospective follow-up. Cancer cases were annotated with time to diagnosis and cancer type, enabling time-to-event analyses. We performed sex-stratified analyses using DNAm of patients preprocessed as described above. Individuals without a cancer event were treated as right-censored. For these, the time to diagnosis was imputed using the maximum observed follow-up time among cases. For each patient, we computed their NL cluster eigenvalues as described above. Importantly, this step was restricted to the CpGs common to both the EPIC and 450K arrays, resulting in smaller sets of CpGs per cluster (Supp. Table). Immune cell-type proportions were estimated from whole blood methylation profiles using the EpiDISH\u003csup\u003e\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e\u003c/sup\u003e algorithm (RPC method) with a 12-cell reference panel. The coxph function from the \u003cem\u003esurvival\u003c/em\u003e package was used to fit Cox proportional hazards models using “Surv(time_to_diagnosis, status)” as the outcome, with eigenvalues, age, and immune cell proportions as predictors. 
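\u003c/p\u003e\u003cp\u003eThe censoring convention described in this paragraph can be illustrated with a short Python sketch that builds the (time, status) pairs a survival model would consume. It is an assumption-laden illustration rather than the authors' code: the record fields (id, case, time_to_diagnosis) and the example values are hypothetical.\u003c/p\u003e

```python
def build_survival_records(samples):
    """Return (id, time, status) tuples.

    Cases keep their observed time to diagnosis with status 1. Individuals
    without a cancer event are right-censored (status 0) at the maximum
    follow-up time observed among cases, as stated in the text.
    """
    case_times = [s["time_to_diagnosis"] for s in samples if s["case"]]
    max_follow_up = max(case_times)
    records = []
    for s in samples:
        if s["case"]:
            records.append((s["id"], s["time_to_diagnosis"], 1))
        else:
            records.append((s["id"], max_follow_up, 0))
    return records


# Hypothetical samples: two cases and two cancer-free individuals.
samples = [
    {"id": "s1", "case": True, "time_to_diagnosis": 3.5},
    {"id": "s2", "case": False, "time_to_diagnosis": None},
    {"id": "s3", "case": True, "time_to_diagnosis": 12.0},
    {"id": "s4", "case": False, "time_to_diagnosis": None},
]
records = build_survival_records(samples)
print(records)
```

\u003cp\u003e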
Logistic regression models (glm(family = binomial)) using cancer status as the binary outcome were also fit for comparison. Model coefficients were exponentiated to yield hazard ratios (HR) and odds ratios (OR), with 95% confidence intervals. To assess survival differences, samples were stratified into tertiles based on the significant eigenvalues (Low, Mid, High). Kaplan–Meier curves were generated using the survfit function and plotted with ggsurvplot() (log-rank p-value and 95% CI shown) from the \u003cem\u003esurvminer\u003c/em\u003e package. Significance was defined as FDR \u0026lt; 0.05.\u003c/p\u003e\u003ch2\u003eTrait association - Inflammation\u003c/h2\u003e\u003cp\u003eSeparately in males and females, we computed the “eigenvalue” of each cluster, corresponding to that cluster’s first principal component (PC1), by performing principal component analysis (PCA) on centered and scaled methylation beta values across CpGs within that cluster. For each sample, we used the \u003cem\u003eComputeCRPscore\u003c/em\u003e\u003csup\u003e\u003cspan citationid=\"CR62\" class=\"CitationRef\"\u003e62\u003c/span\u003e\u003c/sup\u003e function based on a predefined set of weighted CpGs to generate an estimation of CRP protein levels as a surrogate for inflammation\u003csup\u003e\u003cspan citationid=\"CR91\" class=\"CitationRef\"\u003e91\u003c/span\u003e\u003c/sup\u003e. 
To test the independent contribution of age-associated methylation modules to CRP variation, we constructed a series of nested linear models:\u003c/p\u003e\u003cul\u003e\u003cli\u003e\u003cp\u003e\u003cb\u003eModel 1\u003c/b\u003e included \u003cb\u003eage\u003c/b\u003e as the only predictor;\u003c/p\u003e\u003c/li\u003e\u003c/ul\u003e\u003cp\u003eCRP ~ age\u003c/p\u003e\u003cul\u003e\u003cli\u003e\u003cp\u003e\u003cb\u003eModel 2\u003c/b\u003e included age and the eigenvalue of the NC (non-correlated) CpGs;\u003c/p\u003e\u003c/li\u003e\u003c/ul\u003e\u003cp\u003eCRP ~ age + NC\u003c/p\u003e\u003cul\u003e\u003cli\u003e\u003cp\u003e\u003cb\u003eModel 3 (full model)\u003c/b\u003e included age, the NC eigenvalue, and eigenvalues of the VI, LI, LD, and NL clusters.\u003c/p\u003e\u003c/li\u003e\u003c/ul\u003e\u003cp\u003eCRP ~ age + NC + VI + LI + LD + NLi + … + NLn\u003c/p\u003e\u003cp\u003eWe compared models using ANOVA to evaluate whether the addition of SNITCH-derived modules significantly improved the explanation of CRP variability beyond age and (non-correlated) methylation patterns. Model fit and the significance of individual predictors were assessed using standard linear modeling statistics and p-values \u0026lt; 0.05.\u003c/p\u003e\u003ch2\u003eWave of aging – DEswan analysis\u003c/h2\u003e\u003cp\u003eWe performed a DEswan analysis as previously described\u003csup\u003e\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e,\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e\u003c/sup\u003e. DEswan works as a sliding window, where at each specified time point centered at the middle of the window, CpGs at both ends are compared using a Wilcoxon test for differential methylation. We restricted analyses to age-associated CpGs previously identified (excluding NC CpGs). We used windows centered from 31 to 78 years in 2-year steps with a 15‑year bucket size. 
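\u003c/p\u003e\u003cp\u003eThe window layout can be sketched as below. This is only an illustration of the sliding-window bookkeeping, not the DEswan implementation: the function names and example ages are hypothetical, and it assumes that the 15-year bucket is the width of each side of the window.\u003c/p\u003e

```python
def window_centers(start=31, stop=78, step=2):
    """Window centers in years, from `start` up to at most `stop`."""
    return list(range(start, stop + 1, step))


def split_window(ages, center, bucket=15):
    """Indices of the samples on each side of `center`.

    Assumption: `bucket` is the width of each side, so the two groups are
    [center - bucket, center] and (center, center + bucket].
    """
    below = [i for i, a in enumerate(ages) if center - bucket <= a <= center]
    above = [i for i, a in enumerate(ages) if center < a <= center + bucket]
    return below, above


centers = window_centers()
ages = [16, 20, 30, 31, 32, 46, 47]  # hypothetical sample ages
below, above = split_window(ages, center=31)
print(len(centers), centers[0], centers[-1])
print(below, above)
```

\u003cp\u003eStarting at 31 with a step of 2, the last center generated by this sketch is 77, because 78 cannot be reached from an odd start. The two index lists are the groups that would then be compared with a Wilcoxon test at each center.\u003c/p\u003e\u003cp\u003e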
To account for cell composition, we estimated per‑sample blood cell‑type proportions with EpiDISH using a 12‑cell reference (cent12CT) and included these estimates as covariates in all models (as described above). P-values were adjusted for multiple testing by the Benjamini-Hochberg method, and significance was considered at FDR \u0026lt; 0.05. To ensure the robustness of the findings, we only considered peaks that were conserved at FDR \u0026lt; 0.01 and 0.001 (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eB). We adapted the initial R script to allow parallelization across CpGs using parLapply.\u003c/p\u003e"},{"header":"Abbreviations","content":"\u003cp\u003eDNA methylation \u0026ndash; DNAm\u003c/p\u003e\n\u003cp\u003eLinear Increasing \u0026ndash; LI\u003c/p\u003e\n\u003cp\u003eLinear Decreasing \u0026ndash; LD\u003c/p\u003e\n\u003cp\u003eVariance Increasing \u0026ndash; VI\u003c/p\u003e\n\u003cp\u003eNon-Correlated \u0026ndash; NC\u003c/p\u003e\n\u003cp\u003eNon-Linear \u0026ndash; NL\u003c/p\u003e\n\u003cp\u003eAdjusted Rand Index \u0026ndash; ARI\u003c/p\u003e\n\u003cp\u003eAdjusted Mutual Information \u0026ndash; AMI\u003c/p\u003e\n\u003cp\u003eGene Ontology \u0026ndash; GO\u003c/p\u003e\n\u003cp\u003eKyoto Encyclopedia of Genes and Genomes \u0026ndash; KEGG\u003c/p\u003e\n\u003cp\u003eTranscription Factor \u0026ndash; TF\u003c/p\u003e\n\u003cp\u003eTranscription Factor Binding site \u0026ndash; TFB\u003c/p\u003e\n\u003cp\u003eC-Reactive Protein \u0026ndash; CRP\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eEthics approval and consent to participate\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConsent for publication\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAvailability of data and materials\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAll data analysed during 
this study are publicly available in the GEO repository under accession numbers GSE246337 and GSE51032. The SNITCH method and scripts used in this study are accessible at: https://github.com/fishrscale/\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe Regents of the University of California are the sole owners of patents and patent applications directed at epigenetic biomarkers for which Steve Horvath is a named inventor; SH is a founder and paid consultant of the non-profit Epigenetic Clock Development Foundation that licenses these patents. SH is a Principal Investigator at Altos Labs, Cambridge Institute of Science. The other authors declare no conflict of interest.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthors\u0026apos; contributions\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eRG performed data analysis and interpretation. RG, AT, and NE conceived and designed the study. AT provided methodological and computational modelling expertise. AT and SH provided guidance on DNA methylation analysis and interpretation of the results. RG drafted the manuscript. MJ and BJF contributed to ideas and revisions on the manuscript. All authors approved the final version of the manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAcknowledgements\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eShen X, et al. Nonlinear dynamics of multi-omics profiles during human aging. Nat Aging. 2024;4:1619\u0026ndash;34.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eYe Q, et al. Telomere length and chronological age across the human lifespan: A systematic review and meta-analysis of 414 study samples including 743,019 individuals. Ageing Res Rev. 
2023;90:102031.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSchaum N, et al. Ageing hallmarks exhibit organ-specific temporal signatures. Nature. 2020;583:596\u0026ndash;602.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKang Y-K, Min B, Eom J, Park JS. Different phases of aging in mouse old skeletal muscle. Aging 14, 143\u0026ndash;60.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eFehlmann T, et al. Common diseases alter the physiological age-related blood microRNA profile. Nat Commun. 2020;11:5958.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLehallier B, et al. Undulating changes in human plasma proteome profiles across the lifespan. Nat Med. 2019;25:1843\u0026ndash;50.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eAramillo Irizar P, et al. Transcriptomic alterations during ageing reflect the shift from cancer to degenerative diseases in the elderly. Nat Commun. 2018;9:327.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHorvath S, Raj K. DNA methylation-based biomarkers and the epigenetic clock theory of ageing. Nat Rev Genet. 2018;19:371\u0026ndash;84.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSeale K, Horvath S, Teschendorff A, Eynon N, Voisin S. Making sense of the ageing methylome. Nat Rev Genet. 2022;23:585\u0026ndash;605.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLu AT, et al. Universal DNA methylation age across mammalian tissues. Nat Aging. 2023;3:1144\u0026ndash;66.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHorvath S. DNA methylation age of human tissues and cell types. Genome Biol. 2013;14:3156.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eJohnson ND, et al. Non-linear patterns in age-related DNA methylation may reflect CD4\u0026thinsp;+\u0026thinsp;T cell differentiation. Epigenetics. 
2017;12:492\u0026ndash;503.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eVershinina O, Bacalini MG, Zaikin A, Franceschi C, Ivanchenko M. Disentangling age-dependent DNA methylation: deterministic, stochastic, and nonlinear. Sci Rep. 2021;11:9201.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eOlecka M, et al. Nonlinear DNA methylation trajectories in aging male mice. Nat Commun. 2024;15:3074.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLuo Q, et al. A meta-analysis of immune-cell fractions at high resolution reveals novel associations with common phenotypes and health outcomes. Genome Med. 2023;15:59.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSlieker RC, et al. Age-related accrual of methylomic variability is linked to fundamental ageing mechanisms. Genome Biol. 2016;17:191.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSeale K, Teschendorff A, Reiner AP, Voisin S, Eynon N. A comprehensive map of the aging blood methylome in humans. Genome Biol. 2024;25:240.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eOkada D, Cheng JH, Zheng C, Kumaki T, Yamada R. Data-driven identification and classification of nonlinear aging patterns reveals the landscape of associations between DNA methylation and aging. Hum Genomics. 2023;17:8.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePhyo AZZ, et al. Sex differences in biological aging and the association with clinical measures in older adults. Geroscience. 2024;46:1775\u0026ndash;88.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSampathkumar NK, et al. Widespread sex dimorphism in aging and age-related diseases. Hum Genet. 2020;139:333\u0026ndash;56.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eReicher L, et al. Phenome-wide associations of human aging uncover sex-specific dynamics. Nat Aging. 
2024;4:1643\u0026ndash;55.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eYusipov I et al. Age-related DNA methylation changes are sex-specific: a comprehensive assessment. Aging 12, 24057\u0026ndash;80.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eQi L, Teschendorff AE. Cell-type heterogeneity: Why we should adjust for it in epigenome and biomarker studies. Clin Epigenetics. 2022;14:31.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eTeschendorff AE, Breeze CE, Zheng SC, Beck S. A comparison of reference-based algorithms for correcting cell-type heterogeneity in Epigenome-Wide Association Studies. BMC Bioinformatics. 2017;18:105.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eTeschendorff AE, Horvath S. Epigenetic ageing clocks: statistical methods and emerging computational challenges. Nat Rev Genet. 2025. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1038/s41576-024-00807-w\u003c/span\u003e\u003cspan address=\"10.1038/s41576-024-00807-w\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBelsky DW et al. DunedinPACE, a DNA methylation biomarker of the pace of aging. Elife 11, (2022).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHannum G, et al. Genome-wide Methylation Profiles Reveal Quantitative Views of Human Aging Rates. Mol Cell. 2013;49:359\u0026ndash;67.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLevine ME et al. An epigenetic biomarker of aging for lifespan and healthspan. Aging 10, 573\u0026ndash;91.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eZhang Y, et al. DNA methylation signatures in peripheral blood strongly predict all-cause mortality. Nat Commun. 2017;8:14617.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLu AT et al. DNA methylation GrimAge version 2. 
\u003cem\u003eAging\u003c/em\u003e 14, 9484\u0026ndash;9549.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eYing K, et al. Causality-enriched epigenetic age uncouples damage and adaptation. Nat Aging. 2024;4:231\u0026ndash;46.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLu AT et al. DNA methylation GrimAge strongly predicts lifespan and healthspan. Aging 11, 303\u0026ndash;27.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRoy R, et al. Epigenetic signature of human immune aging in the GESTALT study. Elife. 2023;12:e86136.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBernstein BE, et al. The NIH Roadmap Epigenomics Mapping Consortium. Nat Biotechnol. 2010;28:1045\u0026ndash;8.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMoqri M, et al. PRC2-AgeIndex as a universal biomarker of aging and rejuvenation. Nat Commun. 2024;15:5956.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eTeschendorff AE, et al. Age-dependent DNA methylation of genes that are suppressed in stem cells is a hallmark of cancer. Genome Res. 2010;20:440\u0026ndash;6.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRakyan VK, et al. Human aging-associated DNA hypermethylation occurs preferentially at bivalent chromatin domains. Genome Res. 2010;20:434\u0026ndash;9.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSlieker RC, Relton CL, Gaunt TR, Slagboom PE, Heijmans BT. Age-related DNA methylation changes are tissue-specific with ELOVL2 promoter methylation as exception. Epigenetics Chromatin. 2018;11:25.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eJain N, et al. DNA methylation correlates of chronological age in diverse human tissue types. Epigenetics Chromatin. 2024;17:25.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBartz J, Jung H, Wasiluk K, Zhang L, Dong X. Progress in Discovering Transcriptional Noise in Aging. 
Int J Mol Sci 24, (2023).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eZhao K, Rhee SY. Interpreting omics data with pathway enrichment analysis. Trends Genet. 2023;39:308\u0026ndash;19.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMaksimovic J, Oshlack A, Phipson B. Gene set enrichment analysis for genome-wide DNA methylation data. Genome Biol. 2021;22:173.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePhipson B, Maksimovic J, Oshlack A. missMethyl: an R package for analyzing data from Illumina\u0026rsquo;s HumanMethylation450 platform. Bioinformatics. 2016;32:286\u0026ndash;8.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRimoldi M, et al. DNA methylation patterns of transcription factor binding regions characterize their functional and evolutionary contexts. Genome Biol. 2024;25:146.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eYin Y, et al. Impact of cytosine methylation on DNA binding specificities of human transcription factors. Science. 2017;356:eaaj2239.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMalaymar Pinar D, et al. Nuclear Factor I Family Members are Key Transcription Factors Regulating Gene Expression. Mol Cell Proteom. 2025;24:100890.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eChen K-S, Lim JWC, Richards LJ, Bunt J. The convergent roles of the nuclear factor I transcription factors in development and cancer. Cancer Lett. 2017;410:124\u0026ndash;38.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMason S, Piper M, Gronostajski RM, Richards LJ. Nuclear Factor One Transcription Factors in CNS Development. Mol Neurobiol. 2009;39:10\u0026ndash;23.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLiu Y et al. PD-L1-mediated immune evasion in triple-negative breast cancer is linked to the loss of ZNF652. 
Cell Rep 42, (2023).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKumar R, et al. ZNF652, A Novel Zinc Finger Protein, Interacts with the Putative Breast Tumor Suppressor CBFA2T3 to Repress Transcription. Mol Cancer Res. 2006;4:655\u0026ndash;65.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eTiyaboonchai, A. et al. GATA6 Plays an Important Role in the Induction of Human Definitive Endoderm, Development of the Pancreas, and Functionality of Pancreatic β\u0026nbsp;Cells. \u003cem\u003eStem Cell Reports\u003c/em\u003e 8, 589\u0026ndash;604 (2017).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWang X, et al. HOXC9 directly regulates distinct sets of genes to coordinate diverse cellular processes during neuronal differentiation. BMC Genomics. 2013;14:830.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHur H, et al. HOXC9 Induces Phenotypic Switching between Proliferation and Invasion in Breast Cancer Cells. J Cancer. 2016;7:768\u0026ndash;73.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGhazimoradi MH, Babashah S. The transcriptional regulators GATA6 and TET1 regulate the TGF-β pathway in cancer-associated fibroblasts to promote breast cancer progression. Cell Death Discov. 2025;11:164.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGiuili E, et al. Comprehensive evaluation of the implementation of episignatures for diagnosis of neurodevelopmental disorders (NDDs). Hum Genet. 2023;142:1721\u0026ndash;35.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGrolaux R, et al. Identification of differentially methylated regions in rare diseases from a single-patient perspective. Clin Epigenetics. 2022;14:174.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDraškovič T, Hauptman N. Discovery of novel DNA methylation biomarker panels for the diagnosis and differentiation between common adenocarcinomas and their liver metastases. Sci Rep. 
2024;14:3095.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWang T, et al. A multiplex blood-based assay targeting DNA methylation in PBMCs enables early detection of breast cancer. Nat Commun. 2023;14:4724.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRiboli E. The European Prospective Investigation into Cancer and Nutrition (EPIC): Plans and Progress. J Nutr. 2001;131:S170\u0026ndash;5.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eL\u0026oacute;pez-Ot\u0026iacute;n C, Blasco MA, Partridge L, Serrano M, Kroemer G. Hallmarks of aging: An expanding universe. Cell. 2023;186:243\u0026ndash;78.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eCampisi J, et al. From discoveries in ageing research to therapeutics for healthy ageing. Nature. 2019;571:183\u0026ndash;92.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGuo X, Teschendorff AE. Epigenetic clocks and inflammaging: pitfalls caused by ignoring cell-type heterogeneity. Geroscience. 2025;47:2707\u0026ndash;19.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLi J, et al. Determining a multimodal aging clock in a cohort of Chinese women. Med. 2023;4:825\u0026ndash;e84813.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eReynolds LM, et al. Age-related variations in the methylome associated with gene expression in human monocytes and T cells. Nat Commun. 2014;5:5366.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWang Y, et al. Epigenetic influences on aging: a longitudinal genome-wide methylation study in old Swedish twins. Epigenetics. 2018;13:975\u0026ndash;87.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eYuan T, et al. An Integrative Multi-scale Analysis of the Dynamic DNA Methylation Landscape in Aging. PLoS Genet. 2015;11:e1004996.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLu T, et al. REST and stress resistance in ageing and Alzheimer\u0026rsquo;s disease. Nature. 
2014;507:448\u0026ndash;54.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWelsh H, et al. Age-related changes in DNA methylation in a sample of elderly Brazilians. Clin Epigenetics. 2025;17:17.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBallas N, Grunseich C, Lu DD, Speh JC, Mandel G. REST and Its Corepressors Mediate Plasticity of Neuronal Gene Chromatin throughout Neurogenesis. Cell. 2005;121:645\u0026ndash;57.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003edos Santos GA, Chatsirisupachai K, Avelar RA, de Magalh\u0026atilde;es JP. Transcriptomic analysis reveals a tissue-specific loss of identity during ageing and cancer. BMC Genomics. 2023;24:644.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMartinez-Jimenez CP, et al. Aging increases cell-to-cell transcriptional variability upon immune stimulation. Science. 2017;355:1433\u0026ndash;6.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMogilenko DA, Shchukina I, Artyomov MN. Immune ageing at single-cell resolution. Nat Rev Immunol. 2022;22:484\u0026ndash;98.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMoqri M, et al. Integrative epigenetics and transcriptomics identify aging genes in human blood. bioRxiv. 2024. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1101/2024.05.30.596713\u003c/span\u003e\u003cspan address=\"10.1101/2024.05.30.596713\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eAwasthi N, Liongue C, Ward AC. STAT proteins: a kaleidoscope of canonical and non-canonical functions in immunity and cancer. J Hematol Oncol. 2021;14:198.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eYang L, et al. Methylation of a CGATA element inhibits binding and regulation by GATA-1. Nat Commun. 2020;11:2560.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePerumal N, et al. 
Nuclear factor I/B: Duality in action in cancer pathophysiology. Cancer Lett. 2025;609:217349.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMa H-Y, et al. NFIX suppresses breast cancer cell proliferation by delaying mitosis through downregulation of CDK1 expression. Cell Death Discov. 2025;11:77.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHan L, et al. Changes in DNA methylation from pre- to post-adolescence are associated with pubertal exposures. Clin Epigenetics. 2019;11:176.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMartino DJ, et al. Evidence for age-related and individual-specific changes in DNA methylation profile of mononuclear cells during early immune development in humans. Epigenetics. 2011;6:1085\u0026ndash;94.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWikenius E, Moe V, Smith L, Heiervang ER, Berglund A. DNA methylation changes in infants between 6 and 52 weeks. Sci Rep. 2019;9:17587.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eJacques M et al. DNA Methylation Ageing Atlas Across 17 Human Tissues. \u003cem\u003ebioRxiv\u003c/em\u003e 2025.07.21.665830 (2025) \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1101/2025.07.21.665830\u003c/span\u003e\u003cspan address=\"10.1101/2025.07.21.665830\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eZhou W, Triche TJ Jr, Laird PW, Shen H. SeSAMe: reducing artifactual detection of DNA methylation by Infinium BeadChips in genomic deletions. Nucleic Acids Res. 2018;46:e123\u0026ndash;123.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eZhou W, Laird PW, Shen H. Comprehensive characterization, annotation and innovative use of Infinium DNA methylation BeadChip probes. Nucleic Acids Res. 2017;45:e22\u0026ndash;22.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWood SN. 
\u003cem\u003eGeneralized Additive Models: An Introduction with R\u003c/em\u003e. (2017).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLiberzon A, et al. The Molecular Signatures Database Hallmark Gene Set Collection. Cell Syst. 2015;1:417\u0026ndash;25.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMilacic M, et al. The Reactome Pathway Knowledgebase 2024. Nucleic Acids Res. 2024;52:D672\u0026ndash;8.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eConsortium TGO, et al. The Gene Ontology knowledgebase in 2023. Genetics. 2023;224:iyad031.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKanehisa M, Goto S. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 2000;28:27\u0026ndash;30.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eXiong Z, et al. EWAS Open Platform: integrated data, knowledge and toolkit for epigenome-wide association study. Nucleic Acids Res. 2022;50:D1004\u0026ndash;9.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHeinz S, et al. Simple Combinations of Lineage-Determining Transcription Factors Prime cis-Regulatory Elements Required for Macrophage and B Cell Identities. Mol Cell. 2010;38:576\u0026ndash;89.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWielscher M, et al. DNA methylation signature of chronic low-grade inflammation and its role in cardio-respiratory diseases. Nat Commun. 
2022;13:2408.\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"genome-biology","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"gbio","sideBox":"Learn more about [Genome Biology](https://genomebiology.biomedcentral.com/)","snPcode":"13059","submissionUrl":"https://submission.springernature.com/new-submission/13059/3","title":"Genome Biology","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"BMC/SO AJ","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"Aging, Non-linear, DNA methylation, Sex differences, Epigenetic, Biomarkers, Computational biology","lastPublishedDoi":"10.21203/rs.3.rs-7516867/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-7516867/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003ch2\u003eBackground:\u003c/h2\u003e\u003cp\u003eAging is a multi-modal process, leaving distinct molecular signatures across the epigenome. DNA methylation is among the most robust biomarkers of biological aging, yet most studies assume linear age relationships and analyze mixed-sex cohorts, overlooking known sex differences. 
Such approaches risk obscuring critical non-linear transitions and sex-specific trajectories.\u003c/p\u003e\u003ch2\u003eResults:\u003c/h2\u003e\u003cp\u003eWe developed SNITCH, a computational framework to detect complex non-linear methylation trajectories and disentangle shared from sex-divergent patterns. Applied to deconvoluted whole-blood methylomes from 252 females and 246 males (ages 19\u0026ndash;90 years), SNITCH revealed convergent and divergent epigenetic aging pathways independent of immune cell composition. Non-linear trajectories were enriched for developmental transcription factor motifs, including NF1/CTF and REST, with known oncogenic roles. Importantly, a female-specific non-linear cluster was prospectively associated with cancer onset and systemic inflammation in an independent cohort, nominating clinically relevant biomarkers.\u003c/p\u003e\u003ch2\u003eConclusion:\u003c/h2\u003e\u003cp\u003eOur results uncover sex-specific, non-linear aging programs that capture the dynamics of epigenetic change beyond linear models. 
These findings provide candidate biomarkers for early disease risk and advance understanding of how aging trajectories diverge between sexes, with potential applications across multi-omic studies of aging.\u003c/p\u003e","manuscriptTitle":"Sex-specific non-linear DNA methylation aging trajectories reveal biomarkers of cancer risk and inflammation","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-09-11 16:03:46","doi":"10.21203/rs.3.rs-7516867/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"Revision requested","date":"2025-10-14T02:18:22+00:00","index":"","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-10-06T16:10:20+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-09-25T03:29:51+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"80795564955572942727443523784993253415","date":"2025-09-17T20:28:51+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"68401801957244579467306767633370146135","date":"2025-09-17T06:55:02+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2025-09-05T03:58:24+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2025-09-04T10:55:49+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2025-09-03T05:59:08+00:00","index":"","fulltext":""},{"type":"submitted","content":"Genome Biology","date":"2025-09-02T10:45:35+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"genome-biology","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"gbio","sideBox":"Learn more about [Genome Biology](https://genomebiology.biomedcentral.com/)","snPcode":"13059","submissionUrl":"https://submission.springernature.com/new-submission/13059/3","title":"Genome 
Biology","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"BMC/SO AJ","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"f4cc3826-1d16-4a36-abcb-249ec0063ecb","owner":[],"postedDate":"September 11th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"published-in-journal","subjectAreas":[],"tags":[],"updatedAt":"2026-02-09T16:02:41+00:00","versionOfRecord":{"articleIdentity":"rs-7516867","link":"https://doi.org/10.1186/s13059-026-03952-z","journal":{"identity":"genome-biology","isVorOnly":false,"title":"Genome Biology"},"publishedOn":"2026-02-04 15:59:38","publishedOnDateReadable":"February 4th, 2026"},"versionCreatedAt":"2025-09-11 16:03:46","video":"","vorDoi":"10.1186/s13059-026-03952-z","vorDoiUrl":"https://doi.org/10.1186/s13059-026-03952-z","workflowStages":[]},"version":"v1","identity":"rs-7516867","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-7516867","identity":"rs-7516867","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
