Skin DNA Methylation Encodes Multidimensional Facial Aging Phenotypes with Distinct Biological Architectures

preprint OA: closed CC-BY-NC-4.0
📄 Open PDF Full text JSON View at publisher
Full text 95,144 characters · extracted from oa-pdf · 8 sections · click to expand

Abstract

21 Whether distinct visible aging traits, e.g., wrinkling, pigmentation, and inflammation, reflect 22 shared or independent epigenetic programs remains unknown; existing clocks compress aging 23 into a single chronological axis, leaving the phenotype -specific architecture of cutaneous aging 24 uncharacterized. Here, we integrate AI -derived facial phenotypes with skin DNA methylation 25 profiles from 706 individuals to develop EpiVision, a panel of 21 epigenetic predictors spanning 26 structural, pigmentary, inflammatory, and textural aging traits. Predictors reveal shared and trait-27 specific pathways, including developmental patterning, epithelial remodeling, hormonal signaling, 28 and UV damage responses, and capture environmentally induced acceleration in sun -exposed 29 skin alongside lifestyle and topical treatment–associated variation. These findings establish that 30 visible skin aging comprises molecularly distinct axes with shared regulatory substrates and trait-31 specific drivers, providing a scalable epigenetic framework for intervention evaluation and aging 32 biology research. 33 34 35 .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted March 16, 2026. ; https://doi.org/10.64898/2026.03.13.711642doi: bioRxiv preprint

Introduction

36 37 Human skin aging is a highly visible and clinically relevant phenotype shaped by intrinsic biology 38 and extrinsic exposures. Yet current approaches to measuring skin aging remain fragmented. 39 Imaging-based assessments capture visible structural features, whereas biophysical and 40 molecular assays often require invasive sampling and typically focus on a limited set of endpoints 41 such as elasticity, hydration, collagen content, or barrier function. As a result, existing measures 42 tend to quantify isolated aspects of skin aging rather than providing a comprehensive, 43 multidimensional representation of how skin ages across structural, pigmentary, inflammatory, 44 and textural domains. Moreover, few studies integrate visible phenotypes with molecular readouts 45 or link these measures to longitudinal outcomes. 46 47 Recent computer -vision systems trained on large dermatology -labeled image corpora enable 48 standardized, high-throughput extraction of continuous facial skin phenotypes from photographs, 49 offering an opportunity to study skin aging as a multidimensional, quantitative trait at scale 1. 50 However, these have not been linked to molecular biomarkers, leaving a necessary step for 51 translational research for mechanisms of skin aging. 52 DNA methylation (DNAm) has emerged as one of the most reproducible molecular readouts of 53 aging across human populations. Early epigenetic clock models demonstrated that DNAm 54 patterns can predict chronological age with high accuracy, establishing DNAm as an integrative 55 biomarker that captures cumulative cellular history across tissues and environmental contexts. 56 However, as DNAm clocks became widely used, it became clear that chronological -age training 57 alone does not guarantee optimal sensitivity to physiology, disease risk, or intervention response. 58 Because first-generation clocks are optimized to minimize error against calendar age, then came 59 the development of second-generation epigenetic clocks trained not on chronological age, but on 60 phenotypes intended to reflect morbidity and mortality risk. DNAm PhenoAge was trained on a 61 composite “phenotypic age” derived from clinical biomarkers and chronological age, and DNAm 62 GrimAge incorporated DNAm surrogates for plasma proteins and smoking exposure to infer 63 biologically meaningful variation that is independent of chronological time, such as heterogeneity 64 in healthspan, tissue -specific decline, or responsiveness to therapeutic intervention. Together, 65 these advances underscored that phenotype -relevant training, rather than chronological age 66 minimization, is necessary to capture biologically meaningful epigenetic variation; a principle that 67 remains largely unapplied in the context of skin aging 2–5. 68 Despite rapid progress in blood-based epigenetic predictors, analogous phenotype-trained DNAm 69 clocks for skin aging remain comparatively underdeveloped. Most widely used skin DNAm clocks 70 remain anchored to chronological time rather than to skin -relevant functional phenotypes. In 71 parallel, the field has begun to develop skin -specific and skin-compatible approaches that lower 72 barriers to sampling and increase sensitivity to cutaneous biology. “MolClock” introduced a skin-73 focused methylation approach aimed at quantifying skin aging dynamics and treatment effects in 74 a skin-relevant context 6. More recently, non-invasive epidermal DNAm profiling from facial tape 75 strips coupled with enzymatic methyl -seq enabled highly accurate chronological age prediction 76 directly from epidermal DNA and demonstrated sensitivity to experimental rejuvenation signals, 77 highlighting the feasibility and translational promise of non -invasive skin epigenomics 7,8. 78 Complementing these approaches, newer work has also explored DNAm clocks tied to skin -79 relevant phenotypes (for example, visual -age progression and wrinkle -grade modeling), 80 underscoring growing interest in phenotype-forward skin aging biomarkers 9. 81 Existing epigenetic clocks vary widely in tissue specificity, training strategy, sample size, and CpG 82 coverage. Early first -generation clocks such as Horvath’s multi -tissue clock (~8,000 samples; 83 .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted March 16, 2026. ; https://doi.org/10.64898/2026.03.13.711642doi: bioRxiv preprint 450K array) 10 and the Horvath Skin and Blood clock (~896 samples) 11 were trained primarily on 84 chronological age. Subsequent multi-tissue models, including AltumAge (~16,000 samples) 12 and 85 RetroClock (~18,000 samples), 13 expanded training cohorts and probe coverage but remained 86 largely time-based. Skin-focused approaches such as MolClock (~508 samples) 14, Mitra et al. 87 (epidermal EM -seq; ~4 million CpGs) 15, and commercial second -generation clocks (e.g., 88 Beiersdorf AG) differ substantially in platform depth (450K to >1M CpGs), training size (~300–800 89 samples), Cohort demographics (only women) and phenotype focus. Collectively, these models 90 highlight rapid technical progress but also underscore the absence of large -scale, phenotype-91 trained, imaging-linked skin methylation frameworks grounded explicitly in multidimensional facial 92 aging traits and capable of supporting evidence -based, personalized approaches to long -term 93 skin health. 94 A key opportunity, and persistent gap, is to connect visible facial aging phenotypes to molecular 95 aging signatures in skin using scalable, non -invasive sampling. “Perceived age” from facial 96 images is not merely cosmetic: human -rated perceived age predicts short -term mortality in 97 longitudinal cohorts, indicating that facial appearance can encode health -relevant information 98 beyond chronological age 16,17. Consistent with this, blood DNAm has been associated with 99 perceived facial aging, supporting the hypothesis that methylation markers can reflect processes 100 linked to facial aging phenotypes. Advances in deep learning further suggest that facial 101 photographs can support automated estimation of age- and outcome-relevant signals, motivating 102 rigorous efforts to anchor image-derived phenotypes to biological substrates. 103 A central unresolved question in aging biology is whether the distinct visible features of facial 104 aging (e.g., wrinkling, pigmentation change, inflammation, loss of uniformity) reflect a single 105 shared biological process or independent molecular programs with trait -specific regulation. We 106 address this question directly by integrating AI -derived facial aging phenotypes from HAUT 107 imaging with matched DNA methylation profiles from minimally invasive curette skin samples 108 across 706 individuals. Using this framework, we developed EpiVision, a panel of 21 epigenetic 109 predictors spanning perceived age, structural features, pigmentation, inflammation, and texture, 110 trained on quantitative skin -relevant outcomes rather than chronological time. Alongside these 111 predictors, we performed independent epigenome-wide association analyses for each phenotype 112 to characterize the biological architecture connecting methylation variation to visible aging traits 113 and to quantify the degree of shared versus distinct epigenetic structure across aging axes. 114 Together, these analyses ask not only whether methylation encodes phenotypic information, but 115 which molecular pathways mediate that encoding, and whether different faces of aging are written 116 in the same or different parts of the epigenome. 117

Results

118 119 Study collection and design 120 Facial skin exhibits diverse aging phenotypes that reflect both intrinsic biology and cumulative 121 environmental exposures. To capture DNA methylation signatures underlying these visible 122 features, we developed EpiVision, a methylation-based framework that estimates facial skin aging 123 phenotypes from dermal samples paired with standardized facial imaging. We profiled 744 dermal 124 curette samples collected from the cheekbone region using the Illumina EPICv2 BeadChip, 125 yielding 911,580 CpG sites (97.4% autosomal, 2.6% sex chromosomes) per sample after quality 126 control. In parallel, we analyzed 1,463 standardized facial image submissions and quantified 19 127 facial aging-related phenotypes using the HAUT artificial intelligence platform. After linkage of 128 methylation and imaging phenotypes, 706 individuals had matched data and were randomly 129 divided into a training set (N=406–564, 80%) and an independent testing set (N=102–142, 20%) 130 with comparable age and Fitzpatrick skin type distributions ( Fig. 1, Supplementary File 1 ). 131 .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted March 16, 2026. ; https://doi.org/10.64898/2026.03.13.711642doi: bioRxiv preprint EpiVision generated a panel of epigenetic predictors that were evaluated in the held-out test set 132 using mean absolute error and Spearman correlation. To examine lifestyle and medical 133 correlates, we linked survey data from 282 participants (33 exposures) and tested exposure –134 predictor associations using age/sex -adjusted epigenetic age acceleration where applicable. 135 Separately, we performed epigenome -wide association analyses within the matched cohort 136 (N=706) to identify CpG sites associated with specific facial aging features, adjusting for cell-type 137 heterogeneity estimated using EpiSCORE. 138 139 HAUT Imaging provides baseline for skin health and aging 140 To establish the phenotypic baseline for downstream methylation analyses, we characterized 19 141 AI-derived facial aging and skin -quality traits across 1,413 individuals from 1,463 imaging 142 submissions using the HAUT platform. The subset with paired methylation data (n = 706) showed 143 highly similar phenotype distributions to the full imaging cohort, indicating representativeness 144 (Fig. 2A). Among participants with available demographic data, mean chronological age was 40.3 145 ± 16.3 years (n = 247), and Fitzpatrick skin types among 481 participants with available data were 146 most commonly Type II (n = 154) and Type III (n = 143) ( Fig. 2B–C). As expected, structural 147 aging features correlated with chronological age (n = 191 –246), with perceived age (r = 0.866) 148 and perceived eye age (r = 0.826) showing the strongest positive associations, while overall 149 uniformness (r = −0.529), lines (r = −0.409), and sagging (r = −0.322) declined with age (Fig. 2D). 150 Associations with Fitzpatrick type were also observed (n = 364 –468), with irritation (ρ = 0.488) 151 and erythema (HAUT redness score; ρ = 0.470) positively correlated with higher phototypes, and 152 sun spots (ρ = −0.221) and pigmentation score (dyschromia and uneven skin tone; ρ = −0.175) 153 inversely correlated (Fig. 2E). 154 155 Skin Cell Deconvolution of methylation data 156 Cell-type fractions were estimated for 744 samples using EpiSCORE (Figure S1). Differentiated 157 keratinocytes represented the largest component (median ≈ 0.32), followed by endothelial cells 158 (median ≈ 0.25) and undifferentiated keratinocytes (median ≈ 0.21). Macrophage and T -cell 159 fractions were present with substantial inter -individual variability, whereas fibroblast and 160 melanocyte fractions were generally low. Overall, the inferred composition, dominated by 161 keratinocytes with a notable endothelial contribution, is consistent with superficial sampling that 162 captures predominantly epidermal cell populations. These cell-type estimates were incorporated 163 as covariates in downstream methylation analyses to account for cellular heterogeneity across 164 samples. 165 166 Development and Performance of Skin Trait Predictors 167 Skin DNA methylation encodes multidimensional facial aging phenotypes with strong predictive 168 fidelity, demonstrating that the epigenome captures meaningful visible aging information beyond 169 chronological time. To characterize this systematically, we developed epigenetic predictors for 21 170 skin-related phenotypes using genome-wide DNA methylation profiles from the Illumina EPICv2 171 platform. Predictors were trained on unadjusted phenotype scores using the full CpG array 172 including sex-linked probes (median 4.8% of selected features per model, range 0.5–33.0%), with 173 .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted March 16, 2026. ; https://doi.org/10.64898/2026.03.13.711642doi: bioRxiv preprint sex and age adjustment reserved for downstream association analyses (see Methods). Models 174 for inflammatory traits such as irritation showed the greatest enrichment for sex -linked features, 175 consistent with known sex differences in cutaneous immune regulation. Models were trained in 176 an 80/20 split framework and evaluated in held -out test sets using Spearman correlation and 177 mean absolute error (Table 1). Multiple machine learning approaches were assessed, and tree-178 based ensemble methods consistently outperformed linear models, with LightGBM selected for 179 17 of 21 predictors (81%) and Random Forest for the remaining four. Age -related predictors 180 showed the strongest performance, including the sun-exposure–informed age model (ρ = 0.836; 181 MAE = 5.30 years), perceived chronological age (ρ = 0.819), and perceived eye age (ρ = 0.765). 182 Several non-age phenotypes also achieved strong correlations, including irritation (ρ = 0.738), 183 uniformness of the left cheek (ρ = 0.709), and redness (ρ = 0.705), demonstrating that methylation 184 profiles capture multidimensional aspects of visible skin biology beyond chronological time. 185 To evaluate the contribution of expanded probe coverage, we compared EPICv2-specific models 186 with models restricted to CpGs shared between EPICv1 and EPICv2 (Fig. 3A–D). EPICv2 models 187 demonstrated superior overall performance, with an average 10.0% increase in Spearman 188 correlation (0.546 vs. 0.601) and 4.6% reduction in MAE (9.83 vs. 9.38) ( Fig. 3A,B). Applying a 189 ≥5% threshold for meaningful change, EPICv2 models showed meaningful MAE improvement in 190 45% of targets with no meaningful degradation in any target, and meaningful Spearman 191 improvement in 40% of targets with meaningful degradation in 20% ( Fig. 3C,D). Improvements 192 were most pronounced for texture-related traits, including lacrimal grooves (ρ = 0.344 to 0.625), 193 overall uniformness, and pores. Although a small subset of predictors (4/20) showed ≥5% 194 decreases in correlation, no models exhibited meaningful increases in MAE. Complete 195 performance metrics for all EPICv2 and V1 –V2 overlap models are provided in Supplementary 196 File 2. 197 Exposure and Clinical Correlates of EpiVision Predictors 198 Building on the performance and biological validity of the EpiVision predictors, we examined 199 whether methylation-derived facial aging measures captured expected associations with self -200 reported lifestyle and clinical factors ( Fig. 4A –B). Across 282 participants with survey data, 201 exposures were evaluated against 21 methylation -based predictors using age/sex -adjusted 202 regression models to isolate exposure -specific effects from demographic variation captured by 203 the unadjusted predictors, with age -related outcomes analyzed as epigenetic age acceleration 204 (EAA). 205 Lifestyle exposures showed phenotype-specific rather than global aging associations, suggesting 206 that behavioral factors influence distinct dimensions of visible skin biology. Greater water intake 207 was associated with reduced dark circles (β = −0.206, p = 0.013), and higher self-reported diet 208 quality was associated with fewer acneiform breakouts (HAUT breakouts score) in the unadjusted 209 model (β = −0.204, p = 0.001), with a consistent direction after age/sex adjustment (β = −0.152, 210 p = 0.065). Sleep quality was significantly associated with fine lines after adjustment (β = 0.194, 211 p = 0.011), with differing effect patterns across line subtypes, indicating that even a single 212 .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted March 16, 2026. ; https://doi.org/10.64898/2026.03.13.711642doi: bioRxiv preprint behavioral exposure maps onto structurally distinct aspects of facial aging. Tobacco use 213 demonstrated nominal correlations with pigmentation and line traits in smaller subgroups but did 214 not survive covariate adjustment. The specificity of these associations across phenotypically 215 distinct predictors supports the discriminant validity of the EpiVision framework and highlights the 216 value of decomposing skin aging into trait -level components rather than relying solely on 217 composite aging measures. 218 Among clinical and medication variables, hyaluronic acid use was the only exposure significantly 219 associated with epigenetic age acceleration, showing slower perceived -age acceleration across 220 both the standard (β = −0.198, p = 0.015) and sun -exposure-informed (β = −0.161, p = 0.048) 221 perceived-age models (Fig. 4B). At the phenotype level, retinoid use was associated with reduced 222 pore size appearance (β = −0.184, p = 0.022), spironolactone use with reduced deep line 223 appearance (β = −0.174, p = 0.027), consistent with its known anti-androgenic effects on dermal 224 remodeling, and testosterone use with increased sagging skin (β = 0.119, p = 0.028), an 225 unexpected finding that may reflect residual confounding by age or hormonal context and 226 warrants prospective replication . Notably, unadjusted age analyses revealed numerous 227 significant associations with age -related predictors, including testosterone, hormone therapy, 228 diabetes, and cancer history (all p < 0.005), but these attenuated after EAA adjustment, consistent 229 with confounding by chronological age and reinforcing the necessity of acceleration -based 230 modeling to isolate exposure -specific effects. Many nominal associations attenuated after 231 covariate adjustment, underscoring the influence of age and exposure imbalance, and several 232 large apparent effects were driven by single exposed individuals and are therefore considered 233 exploratory. All association results can be found in Supplementary File 3. 234 To assess the impact of chronic sun exposure on biological aging, we analyzed paired sun -235 exposed and sun -protected skin samples from 20 participants. The EpiVision perceived -age 236 predictor showed a mean 1.30-year acceleration in sun-exposed tissue relative to sun-protected 237 tissue (paired t -test; Cohen's d_z = 0.50, p = 0.042; Fig. 4D ), providing molecular evidence 238 consistent with chronic UV exposure advancing the biological age of skin. The eyes-age predictor 239 showed a directionally consistent and nominally larger acceleration (+1.61 years, p = 0.013; Fig. 240 4C); although trained on periorbital aging phenotypes, the methylation features underlying this 241 predictor are derived from skin broadly and likely capture UV -responsive epigenetic programs 242 shared across cutaneous sites rather than periorbital-specific anatomy. 243 Beyond epigenetic acceleration, sun exposure broadly altered visible skin phenotypes 244 (Supplementary File 4). Nine of 18 non-age traits (50%) differed significantly between sites, with 245 the largest effects observed for erythema, sagging skin, pigmentary change, and lines, all 246 demonstrating large effect sizes (|d| > 0.8). Acneiform breakouts were more prevalent in sun -247 exposed skin, consistent with UV -induced disruption of barrier integrity, increased sebaceous 248 activity, and follicular occlusion in photodamaged tissue. The convergence of structural, 249 pigmentary, and inflammatory changes across sites indicates that chronic UV exposure perturbs 250 multiple biological axes simultaneously rather than isolated features. Together, these paired -251 sample analyses demonstrate that chronic sun exposure accelerates both molecular epigenetic 252 .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted March 16, 2026. ; https://doi.org/10.64898/2026.03.13.711642doi: bioRxiv preprint aging and a broad spectrum of visible skin phenotypes, and establish the EpiVision perceived -253 age predictor as a biologically grounded, skin -appropriate molecular readout of environmentally 254 induced aging. 255 Identification of differentially methylated positions associated with facial skin features 256 Having established that skin methylation encodes phenotype -level aging information, we next 257 asked what biological processes underlie this connection and whether distinct visible aging traits 258 share a common epigenetic substrate or arise from independent molecular programs. To address 259 this, we conducted a parallel, independently designed epigenome-wide association study (EWAS) 260 for each of the 19 HAUT phenotypes. Critically, this analysis was structured to adjust for 261 confounders that predictive models absorb implicitly but do not decompose: technical batch 262 effects (Illumina BeadChip), inter -individual variation in skin cell -type composition (EpiSCORE; 263 six fractions), and biological sex. This covariate-adjusted design isolates trait-specific epigenetic 264 signals from artifacts of cellular heterogeneity and demographic imbalance. The number of 265 significant DMPs varied substantially across traits, ranging from 3,050 for lacrimal grooves (0.33% 266 of tested sites) to 367,506 for acneiform breakouts (40.3%), reflecting marked differences in 267 epigenetic architecture across phenotypes. 268 To characterize the shared epigenetic architecture underlying these phenotypes, we performed a 269 cross-trait pleiotropy analysis across all 19 EWAS traits (Supplementary File 5; two traits from the 270 original 21-trait panel, perceived age standard error and quality front, were excluded from the 271 EWAS owing to insufficient phenotypic variance after quality filtering). Of 586,859 unique CpG 272 sites reaching significance (FDR < 0.05) in at least one trait, 43.5% were trait -specific, 48.4% 273 were significant in two to four traits, and 7.8% in five to nine traits. Only 1,483 CpGs (0.25%) were 274 significant in ten or more traits, a distribution that departs dramatically from the expectation under 275 independence (χ² = 2,118,347, df = 3, P < 10 ⁻³⁰⁰). This small set of highly pleiotropic loci, 276 significant across more than half the phenotype panel, represents a shared epigenetic substrate 277 of facial aging. Of these, 55.0% mapped to intergenic regions, suggesting that long -range 278 regulatory elements play a substantial role in coordinating multi -trait aging biology. Pairwise 279 Jaccard similarity of significant CpG sets revealed biologically coherent clustering: perceived age 280 and perceived eye age shared the greatest overlap (J = 0.78), the three uniformness sub-regions 281 formed a tight cluster (J = 0.41 –0.62), pigmentation and sun spots (solar hyperpigmentation) 282 grouped together (J = 0.47), lines and sagging co-clustered (J = 0.37), and irritation and erythema 283 formed a distinct inflammatory module (J = 0.31). Nearly all highly pleiotropic CpGs (99.8%) 284 showed mixed directionality, with hypermethylation in some traits and hypomethylation in others, 285 rather than uniform effects. Because all models adjusted for six EpiSCORE cell-type fractions and 286 technical batch, this mixed directionality is unlikely to reflect compositional confounding and 287 instead suggests that pleiotropic loci participate in genuinely distinct, trait -specific regulatory 288 programs. 289 These multi-trait loci represent candidate targets for future functional and intervention studies: 290 modulating a pleiotropic locus could, in principle, exert coordinated effects across structural, 291 pigmentary, and inflammatory aging axes simultaneously. The most broadly associated CpG was 292 cg16867657 at the ELOVL2 locus, significant in 15 of 19 traits spanning deep lines, fine lines, 293 perceived age, pigmentary change, erythema, sagging, and all four uniformness measures, 294 among others. Other highly pleiotropic loci included cg08870112 (RANBP17; 14 traits), 295 cg07553761 (TRIM59; 14 traits), and cg23571812 (FOXG1; 13 traits), all established epigenetic 296 aging biomarkers. Recurrent gene annotations among the 1,483 highly pleiotropic CpGs included 297 ZIC1 (13 CpGs), PRRT1 (11 CpGs), ZNF154 (7 CpGs), and FOXG1 (6 CpGs), implicating 298 developmental transcription factors and zinc -finger proteins in the shared architecture of skin 299 .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted March 16, 2026. ; https://doi.org/10.64898/2026.03.13.711642doi: bioRxiv preprint aging. The direction of effect also differed systematically by phenotype category: structural aging 300 features such as lines and sagging showed predominantly hypermethylated DMPs, consistent 301 with progressive locus-specific gain of methylation with aging, while inflammatory traits including 302 irritation and acneiform breakouts showed predominantly hypomethylated positions, suggesting 303 loss of epigenetic regulation at immune-relevant loci. Full DMP results for all 19 traits are provided 304 in Supplementary File 5. 305 Perceived age and perceived eye age. 306 For perceived age (ρ = 0.82), enrichment was dominated by neuronal development and 307 morphogenesis terms including axonogenesis (FDR < 10⁻²²), alongside epithelial–mesenchymal 308 transition (EMT; FDR = 3.43×10 ⁻⁴), apical junction, NOTCH signaling, and estrogen response 309 programs; this implicates coordinated tissue -architectural remodeling rather than isolated 310 molecular damage. Many enriched neuronal guidance genes (SLIT2, ROBO1/2, SEMA3A/F, 311 NCAM1) also have established roles in cutaneous innervation and epithelial organization, 312 underscoring shared developmental programs between neural and skin tissues. Perceived eye 313 age (ρ = 0.765) showed a largely overlapping architecture, with developmental patterning and 314 axonogenesis (FDR < 10⁻²⁴), EMT, and estrogen response as top enriched terms, with additional 315 myogenesis enrichment consistent with periorbital muscle –skin structural interactions. The 316 substantial overlap between these two aging phenotypes, corroborated by their exceptionally high 317 Jaccard similarity (J = 0.78), indicates that shared developmental and structural epigenetic 318 programs underlie aging across anatomically distinct facial regions. 319 Additional traits (ρ > 0.7). Irritation (ρ = 0.74) highlighted innate immune activation, 320 cytokine production, and neurogenesis regulation pathways, pointing to epigenetic encoding of 321 inflammatory skin states. Redness (ρ = 0.70) showed enrichment for neurogenesis regulation and 322 vascular processes including angiogenesis and vascular permeability, reflecting the 323 neurovascular biology underlying cutaneous erythema. Uniformness (left cheek; ρ = 0.71) 324 implicated epithelial organization and keratinocyte turnover programs. Traits with weaker 325 epigenetic–phenotypic correlations (ρ < 0.70), including uniformness overall (ρ = 0.68), pores (ρ 326 = 0.68), lacrimal grooves (tear trough areas; ρ = 0.63), and pigmentation (ρ = 0.55), showed 327 analogous but attenuated enrichment themes involving developmental signaling and epithelial 328 architecture. For pigmentation, despite the more modest predictive correlation, pathway 329 enrichment revealed UV -response and melanocyte morphogenesis programs consistent with 330 photoaging-driven dysregulation of pigment biology, including both hyper- and hypopigmentation. 331 Across all traits, developmental signaling, epithelial junction biology, estrogen response, 332 and UV-associated programs recurred as shared substrates of facial aging, while trait -specific 333 signals, such as UV response in pigmentary change, metabolic stress in irritation, and vascular 334 programs in erythema, indicate layered biological specialization atop this common aging 335 architecture. CpG loci and pathways enriched across multiple phenotypes, including the 336 developmental and epithelial -junction programs shared by perceived age, uniformness, and 337 erythema, may represent particularly actionable intervention targets, as modulating them would 338 be expected to influence multiple aging dimensions simultaneously. Complete pathway 339 enrichment results across GO, KEGG, and Hallmark databases (28,583 enriched terms across 340 19 traits) are provided in Supplementary File 6. 341

Discussion

342 Visible human facial aging is not a single biological process. The findings presented here 343 demonstrate that the skin epigenome encodes distinct molecular programs underlying structural 344 .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted March 16, 2026. ; https://doi.org/10.64898/2026.03.13.711642doi: bioRxiv preprint aging, pigmentary change, inflammation, and textural change, programs that are partially shared 345 across traits but that also carry trait-specific biological signatures. By integrating AI-derived facial 346 phenotypes with matched skin DNA methylation profiles across 706 individuals, EpiVision 347 establishes that methylation variation in superficial skin samples captures multidimensional aging 348 information with biological coherence, environmental sensitivity, and responsiveness to clinical 349 exposures. Three primary advances emerge: skin methylation accurately predicts specific facial 350 aging dimensions well beyond what chronological age alone explains; trait -specific epigenome-351 wide analyses reveal biologically coherent and partially independent molecular architectures 352 underlying each visible aging axis; and EpiVision predictors capture expected relationships with 353 UV exposure, lifestyle factors, and dermatologic interventions, validating both their biological 354 grounding and practical utility. 355 A particularly compelling observation concerns the relationship between EpiVision's trait 356 architecture and clinically recognized photoaging subtypes. Sachs et al. characterized two 357 divergent forms of UV-induced skin aging: hypertrophic photoaging, marked by coarse wrinkling, 358 skin thickening, and sallowness, and atrophic photoaging, characterized by fine wrinkling, 359 erythema, telangiectasia, and elevated keratinocyte cancer risk 18. Subsequent histological work 360 confirmed distinct structural features underlying these phenotypes, including differential solar 361 elastosis and collagen VII distribution 19. EpiVision's EWAS enrichment signatures map onto this 362 clinical axis with notable specificity. The erythema and irritation predictors showed neurogenesis 363 regulation among their most significantly enriched processes, alongside innate immune 364 activation, vascular processes, and angiogenesis. This profile mirrors the neurogenic and 365 vascular inflammatory biology described as hallmarks of atrophic photoaging. Conversely, the 366 lines, sagging, and deep lines predictors were enriched for extracellular matrix organization, 367 collagen fibril organization, and fibroblast migration, reflecting the structural ECM remodeling and 368 elastotic thickening that define the hypertrophic phenotype. Although the current cohort lacks 369 formal AP/HP clinical classification, the convergence of these orthogonal enrichment signatures 370 suggests that EpiVision's trait architecture may already capture the molecular underpinnings of 371 this clinical dichotomy. Prospective integration of AP/HP classification with EpiVision outputs 372 would provide a framework for testing whether distinct epigenetic signatures predict photoaging 373 subtype or modulate responsiveness to subtype-specific interventions. 374 A key insight from this work is that skin DNA methylation encodes biologically meaningful 375 information about visible aging phenotypes beyond what is predicted by chronological age. This 376 connection has not previously been demonstrated at this phenotypic resolution. Age -related 377 predictors achieved robust test -set correlations (sun -exposure–informed model, ρ = 0.836; 378 perceived age, ρ = 0.819; eye age, ρ = 0.765), while non -age phenotypes including irritation, 379 erythema, and regional uniformness exceeded ρ = 0.70, indicating that methylation captures 380 multidimensional cutaneous biology beyond chronological time. 381 The parallel EWAS analyses, performed independently of model training and adjusted for batch 382 effects, cell -type composition, and sex, provide biological grounding for the predictive 383 performance. For phenotypes with strong epigenetic coupling, enrichment analyses converged 384 on developmental and structural pathways. Perceived age, for instance, was enriched for 385 neuronal development alongside epithelial–mesenchymal transition, apical junction, and estrogen 386 response programs, implicating coordinated tissue -architectural remodeling rather than passive 387 damage accumulation. Several enriched neuronal guidance genes, including SLIT2, ROBO1/2, 388 SEMA3A/F, and NCAM1, also play established roles in cutaneous innervation and epithelial 389 organization, underscoring shared developmental programs between neural and skin tissues. 390 Trait-specific signals aligned with known dermatologic drivers, including UV -response pathways 391 for pigmentation and innate immune activation for irritation. Cross -trait pleiotropy analysis 392 .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted March 16, 2026. ; https://doi.org/10.64898/2026.03.13.711642doi: bioRxiv preprint identified shared "core" aging loci: cg16867657 (ELOVL2) was significant in 15 of 19 traits, 393 consistent with the established role of ELOVL2 methylation as a pan-tissue aging marker 20. The 394 direction of effect also differed systematically. Structural features showed predominantly 395 hypermethylated DMPs consistent with progressive locus -specific methylation gain, while 396 inflammatory traits showed predominantly hypomethylated positions suggesting loss of epigenetic 397 regulation at immune-relevant loci. 398 We validated EpiVision's biological sensitivity through paired sun-exposure analysis and survey-399 linked lifestyle associations. In 20 participants with matched sun-exposed and non–sun-exposed 400 samples, sun-exposed skin showed accelerated perceived age (+1.30 years, p = 0.042) alongside 401 deterioration in 9 of 18 phenotypic traits. In the lifestyle analyses, exposures showed phenotype-402 specific rather than global aging associations: water intake was linked to reduced dark circles, 403 sleep quality to fine line measures, and self reported topical hyaluronic acid use was the only 404 exposure significantly associated with epigenetic age acceleration across both perceived -age 405 models. This pattern, where lifestyle factors influence individual trait predictors rather than shifting 406 a composite clock, reinforces the value of decomposing skin aging into trait -level components 407 and positions EpiVision as a more informative endpoint than single -score approaches. Notably, 408 numerous raw associations with age -related predictors (testosterone, HRT, diabetes, cancer 409 history) attenuated after EAA adjustment, reinforcing the necessity of acceleration -based 410 modeling to isolate exposure-specific effects from chronological age confounding. 411 EpiVision extends the phenotype -forward direction introduced by Bienkowska et al., who 412 developed VisAgeX, a skin -specific second -generation epigenetic clock predicting visual age 413 progression from epidermal methylation profiles of female volunteers 21. While VisAgeX targeted 414 wrinkle grade and visual facial age using 450K -array data from a single -sex cohort, EpiVision 415 broadens both the phenotypic scope (21 predictors spanning structural, pigmentary, 416 inflammatory, and textural axes) and population diversity (both sexes, multiple Fitzpatrick types) 417 while leveraging EPICv2 coverage for enhanced regulatory CpG representation. EpiVision also 418 complements earlier skin -specific chronological -age clocks 22, senotherapeutic validation 419 platforms 23, and emerging non-invasive epidermal methylomics approaches, including tape-strip 420 whole-genome profiling 24 and sequencing-based epigenetic age predictors optimized for non -421 invasively collected epidermis 25. Together, these advances position skin methylation as an 422 increasingly tractable molecular layer for both aging research and intervention evaluation. 423 Several limitations warrant consideration. The cross-sectional design precludes causal inference 424 and assessment of within -person trajectories; longitudinal tracking will be essential to quantify 425 intervention responsiveness and determine whether EpiVision captures true biological change 426 over time. The cohort's ancestry and Fitzpatrick diversity is limited (Types V–VI comprised 6.5% 427 of the methylation cohort), and expansion to more diverse populations is needed to establish 428 generalizability. Integration with transcriptomic and proteomic layers would connect methylation 429 shifts to functional outputs and strengthen mechanistic interpretation. Continued refinement of 430 deconvolution and sampling protocols will help reduce residual confounding by cellular 431 heterogeneity. Biophysical measures such as transepidermal water loss (TEWL) and skin 432 hydration were not captured in the current study and would complement methylation -derived 433 readouts in future work. Future work should establish whether EpiVision predictors capture true 434 biological trajectories across diverse populations and serve as scalable endpoints for 435 mechanistically informed, data-driven skin-aging models. More broadly, the demonstration that 436 skin methylation encodes biologically distinct aging axes, rather than a single chronological signal, 437 opens a path toward understanding facial aging as a decomposable biological phenotype, with 438 implications for how epigenetic tools are designed, validated, and applied across aging research, 439 clinical dermatology, and regenerative dermatologic practice. 440 .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted March 16, 2026. ; https://doi.org/10.64898/2026.03.13.711642doi: bioRxiv preprint

Methods

441 442 Study Population and Sample Collection 443 Facial skin samples were collected from participants enrolled in a skin aging cohort, 444 including the TruDiagnostic cohort, Integrative Skin Research (led by Dr. Raja Sivamani), and 445 Boynton Beach Dermatology / Beauty Within (led by Dr. Joseph C. Gretzula), using sterile dermal 446 curettes (Integra Miltex, Integra LifeSciences), an instrument designation reflecting the tool rather 447 than the tissue depth, to obtain predominantly epidermal skin cells, as confirmed by EpiSCORE 448 deconvolution. Skin samples were collected from the cheekbone region and subsequently 449 transferred to collection vials containing DNA/RNA Shield stabilization buffer (Zymo Research, 450 Irvine, CA) and shipped to the laboratory for processing. DNA was extracted from collected skin 451 samples using standard protocols and quantified prior to bisulfite conversion. 452 Participants completed questionnaires, facial images, and provided informed consent prior 453 to participation. The questionnaires captured demographic information as well as relevant skin 454 history, health history, and potential environmental exposures. All study procedures were 455 approved by the relevant institutional review boards. The final analytic sample comprised 508 to 456 706 individuals per phenotype depending on data availability (see Results for phenotype-specific 457 sample sizes). 458 459 Facial Phenotyping Using HAUT Artificial Intelligence Platform 460 461 Image Acquisition. Participants submitted standardized facial photographs (selfies) using 462 their personal mobile devices. Images were captured in three standard orientations: frontal view, 463 left profile, and right profile. Participants were instructed to photograph themselves in well -lit 464 environments without makeup, with hair pulled back to expose the full face, and with neutral facial 465 expressions. Multiple submissions from the same participant were allowed over time to 466 accommodate technical failures or quality issues. 467 468 Automated Phenotype Extraction. All facial images were processed using HAUT Face 469 Analysis 3.0 software (Haut.AI, Tallinn, Estonia), an artificial intelligence-powered computer vision 470 platform designed for automated, objective assessment of skin aging and appearance 471 characteristics from digital photographs. The HAUT platform employs deep convolutional neural 472 networks trained on large -scale annotated datasets of facial images labeled by expert 473 dermatologists for a comprehensive array of skin features. 474 For each submitted image, the HAUT pipeline performs the following automated steps: (1) 475 facial landmark detection to identify anatomically defined regions of interest (forehead, periorbital 476 area, cheeks, perioral region, nose); (2) image quality assessment across multiple technical 477 dimensions including resolution, blur, exposure, illumination uniformity, proper facial positioning, 478 absence of occlusions (eyeglasses, hair, hands), and sufficient visible skin area; (3) feature -479 specific detection using specialized deep learning models for each phenotype (e.g., wrinkle 480 detection models identify line segments and quantify depth, length, and spatial distribution; 481 pigmentation models segment hyperpigmented lesions and compute density and color metrics; 482 inflammatory lesion models identify and count pimples, papules, and pustules); and (4) scoring 483 on continuous scales (0 -100 for most traits, with higher scores generally indicating better skin 484 quality or less severe manifestation; or estimated age in years for perceived aging phenotypes). 485 The HAUT platform generates the following phenotypic outputs that were used in the 486 current epigenome -wide association studies: (1) aging -related phenotypes including 487 chronological age (self-reported age in years) and perceived eye age (AI-estimated apparent age 488 of the periorbital region based on comparison to age -labeled reference datasets); (2) structural 489 aging phenotypes including general lines score (overall facial wrinkle burden), fine lines score 490 (shallow wrinkles 1mm depth), sagging 491 .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted March 16, 2026. ; https://doi.org/10.64898/2026.03.13.711642doi: bioRxiv preprint score (gravitational descent of facial soft tissues and jowl formation), and lacrimal grooves score 492 (infraorbital hollowing and tear trough deformity); (3) pigmentation phenotypes including overall 493 pigmentation score (dyschromia and uneven skin tone) and sun spots score (discrete solar 494 lentigines and hyperpigmented macules); (4) inflammatory and skin quality phenotypes including 495 breakouts score (acneiform lesions, pustules, papules), irritation score (erythema, scaling, 496 roughness), and redness score (vascular prominence and telangiectasia); (5) textural phenotypes 497 including pores score (visible pore size and density) and puffiness score (periorbital edema); (6) 498 additional localized features including dark circles score (infraorbital hyperpigmentation and 499 shadowing); and (7) skin tone uniformness assessed globally (overall uniformness score) and 500 regionally (forehead, left cheek, right cheek uniformness scores measuring color heterogeneity 501 and blotchiness within each facial zone). 502 503 Quality Control and Best-Row Selection 504 Embedded quality control algorithms within the HAUT platform assigned composite quality scores 505 to each image based on the aforementioned technical criteria. For participants with multiple image 506 submissions, we selected a single "best-row" image per individual defined as the submission with 507 the highest composite quality score, thereby maximizing measurement reliability while avoiding 508 pseudoreplication from repeated measures. Images failing to meet minimum quality thresholds 509 were excluded from analysis. This quality control process resulted in the retention of 1,413 to 510 1,463 individuals with usable HAUT phenotype data depending on the specific trait (see Results 511 for details). 512 513 DNA Methylation Profiling and Preprocessing 514 515 Array-Based Methylation Measurement. Genomic DNA extracted from skin curette 516 samples was bisulfite-converted using the EZ DNA Methylation Kit (Zymo Research, Irvine, CA, 517 USA) according to the manufacturer's protocol. Bisulfite -converted DNA was hybridized to the 518 Illumina Infinium MethylationEPIC v2 BeadChip (Illumina, San Diego, CA, USA), which 519 interrogates DNA methylation at over 935,000 CpG sites across the human genome with 520 enhanced coverage of regulatory regions including gene promoters, CpG islands, enhancers, and 521 CTCF binding sites. Arrays were processed following the Illumina standard protocol, with samples 522 randomly distributed across multiple BeadChips (recorded as the "Beadchip" batch variable) to 523 mitigate batch effects. Fluorescence intensity data (IDAT files) were generated using the Illumina 524 iScan System. 525 Quality Control. Raw IDAT files were imported into R version 4.4.0 (R Core Team, 2024) 526 and processed using the minfi Bioconductor package (Aryee et al., 2014). Quality control was 527 performed using detection p -value filtering with the following criteria: (1) samples with mean 528 detection p-value >0.15 across all probes were excluded; (2) probes with mean detection p-value 529 >0.10 across all samples were removed (n = 19,078 probes excluded). Sex chromosome probes 530 were retained in the methylation feature matrix, as X - and Y-linked CpG methylation captures 531 biologically relevant sex-differential variation in skin biology (15,722 chrX probes, 46 chrY probes 532 retained). 533 Data Normalization. Normalization was performed using the ssNoob (single -sample 534 normal-exponential out -of-band) method implemented in the preprocessNoob function with 535 dyeMethod = "single" (Triche et al., 2013), which models background correction and dye -bias 536 normalization in a single unified framework using out -of-band probe intensities. The ssNoob 537

Method

performs within-array normalization, making it suitable for datasets where samples are 538 processed in multiple batches and providing robustness against batch effects. For epigenetic 539 clock prediction, missing values at required clock CpG sites were imputed using reference-based 540 imputation implemented in the SeSAMe package (Zhou et al., 2018). 541 .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted March 16, 2026. ; https://doi.org/10.64898/2026.03.13.711642doi: bioRxiv preprint Beta extraction. Following quality control and normalization, 911,580 CpG sites remained 542 for downstream epigenome -wide association analyses, including both autosomal and sex 543 chromosome probes. Methylation levels were represented as beta values (β = M / [M + U + 100], 544 where M and U represent methylated and unmethylated signal intensities, respectively), ranging 545 from 0 (completely unmethylated) to 1 (completely methylated). Genomic coordinates are 546 reported relative to the hg38 human genome reference build (EPICv2 custom array annotation 547 version 20a1). 548 CpG Annotation. CpG sites were annotated using the 549 IlluminaHumanMethylationEPICv2anno.20a1.hg38 R package, which provides gene annotations 550 (mapping to UCSC gene symbols via proximity to transcription start sites), genomic locations 551 (chromosome and position), and regulatory context (CpG island, shore, shelf, or open sea 552 classification). Intergenic CpG sites (those not mapping to annotated gene bodies or promoters 553 within 1.5 kb of transcription start sites) were labeled as "intergenic" throughout the Results. 554 Cell Type Deconvolution for Adjustment of Cellular Heterogeneity. Skin is a 555 heterogeneous tissue comprising multiple cell types with distinct DNA methylation profiles, 556 including keratinocytes, fibroblasts, melanocytes, endothelial cells, and resident immune cells. To 557 distinguish true within -cell methylation changes from artifacts arising due to inter -individual 558 differences in skin cell -type composition, we estimated cell -type proportions from the DNA 559 methylation data using a reference-based deconvolution approach. 560 Cell type fractions were estimated using the EpiSCORE algorithm (Teschendorff et al., 561 2017) with the skin -specific methylation reference matrix mrefSkin.m, as implemented in the 562 EpiSCORE R package version 1.0. Although EpiSCORE was originally developed for blood, we 563 applied the skin-specific reference matrix mrefSkin.m because our samples are derived from skin 564 tissue and the skin-specific reference captures the relevant cell-type heterogeneity in our dataset. 565 The mrefSkin.m reference was derived from purified cell populations isolated from human skin 566 tissue and includes methylation signatures for six cell types: fibroblasts (Fib, dermal cells 567 responsible for extracellular matrix production), differentiated keratinocytes (Kera_diff, terminally 568 differentiated epidermal barrier cells), undifferentiated keratinocytes (Kera_undiff, progenitor 569 keratinocytes in the basal layer), macrophages (Macro, tissue -resident immune cells), 570 melanocytes (Mela, pigment-producing cells), and T cells (Tcell, adaptive immune cells). 571 The EpiSCORE deconvolution procedure consists of two steps. First, CpG -level 572 methylation data (beta values) are aggregated to gene-level methylation summaries by averaging 573 CpG sites within 200 base pairs of annotated transcription start sites (TSS) using the 574 constAvBetaTSS function with the "850k" array type specification (compatible with EPICv2 575 annotation). This gene-level aggregation reduces technical noise and increases the robustness 576 of deconvolution by focusing on regulatory regions most likely to differ across cell types. Second, 577 the gene -level methylation matrix is input to the weighted robust partial correlation (wRPC) 578 function, which estimates cell type proportions by comparing the sample methylation profile to the 579

Reference

cell type signatures while down -weighting unreliable genes (useW = TRUE, weight 580 threshold wth = 0.4). This approach is robust to batch effects and outliers. The resulting cell type 581 fraction estimates (proportion of each cell type per sample, summing to 1 across the six cell types) 582 were merged with phenotypic data and included as covariates in all epigenome-wide association 583 models. 584 585 Development of skin predictors 586 We developed a two -stage framework to identify robust and biologically interpretable DNA 587 methylation features and to train and validate predictive models for multiple biomarker targets 588 using complementary machine learning architectures with systematic hyperparameter 589 optimization. The workflow was designed to capture complex, potentially non -linear and non -590 monotonic relationships between CpG methylation and disease -relevant phenotypes while 591 preserving directional interpretability for selected features. 592 .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted March 16, 2026. ; https://doi.org/10.64898/2026.03.13.711642doi: bioRxiv preprint To quantify feature importance with a conservative selection threshold, we implemented 593 a hybrid feature -importance procedure that integrates a noise -based baseline, repeated 594 ensemble-derived importance estimation, and post hoc directionality inference. A synthetic 595 Gaussian noise feature was generated using the empirical distribution of the input data and 596 appended to the feature matrix to establish a minimum importance “floor.” During model fitting, 597 this noise feature was treated identically to biological features, enabling prioritization of CpG sites 598 whose importance exceeded that of the noise feature and reducing retention of spurious 599 predictors. Feature importance was estimated using random forest regression under repeated 600 instantiation to reduce variance due to stochastic training effects and mitigate overfitting during 601 feature selection. Specifically, the random forest model was refitted 100 times using independent 602 random seeds and aggregated importance values were used to identify features that were 603 consistently informative across runs. Because tree-based importance measures do not preserve 604 the direction of feature effects, directional relationships for retained features were inferred using 605 multiple linear regression fit to the selected feature set, and the sign of each regression coefficient 606 was used to assign positive or negative directionality relative to the modeled target. This 607 combined approach supports global feature evaluation rather than pairwise screening, captures 608 non-linear and non-monotonic relationships through ensemble modeling, and improves biological 609 interpretability by providing signed effect estimates that clarify how each feature contributes to 610 age estimation, disease prediction, or biomarker quantification. 611 To identify robust predictive signatures in DNA methylation data, we implemented a 612 comprehensive model development and validation framework that evaluates multiple regression 613 and classification algorithms under rigorous hyperparameter optimization. For each target, four 614 model families were trained and compared: ElasticNet, random forest regression, LightGBM 615 gradient boosting, and a multi -layer perceptron. This set was chosen to span regularized linear 616 models, bagged tree ensembles, gradient -boosted trees, and neural networks, reflecting prior 617 observations that optimal modeling architectures can vary across biomarkers as a function of 618 underlying data dispersion and distributional characteristics. 619 All models underwent systematic hyperparameter tuning using the Optuna framework to 620 efficiently navigate complex search spaces while limiting overfitting. ElasticNet models were 621 optimized across the regularization strength and L1 ratio to identify sparse yet stable methylation 622 signatures. Random forest regressors were tuned across the number of estimators, maximum 623 depth, minimum samples required to split, and feature selection settings. LightGBM models were 624 tuned across learning rate, number of leaves, tree depth, feature and bagging fractions, and 625 regularization parameters. The neural network architecture was optimized over the number of 626 hidden layers, nodes per layer, activation function, learning rate, and batch size. 627 628 Model performance was evaluated using a standard hold -out validation strategy with a random 629 80/20 train–test split. Models were trained on unadjusted HAUT phenotype scores to preserve 630 the full range of biologically meaningful variation, including sex -dependent and age -dependent 631 differences in skin aging; biological sex and chronological age were instead included as 632 covariates in all downstream association and EWAS analyses to isolate exposure - and trait -633 specific effects. Predictive performance was quantified using mean absolute error to assess 634 absolute accuracy and Spearman correlation to assess rank preservation between predicted and 635 observed values. Final model selection was based on minimizing a composite objective in which 636 absolute error was penalized based on Spearman’s rho, thereby balancing accuracy with 637 correlation strength and prioritizing models that were both accurate and rank -consistent for the 638 outcome of interest. 639 640 Epigenome-Wide Association Studies: Statistical Framework 641 642 .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted March 16, 2026. ; https://doi.org/10.64898/2026.03.13.711642doi: bioRxiv preprint Data Preparation and Transformation. For each of the 19 HAUT -derived phenotypes, samples 643 with missing phenotype values were excluded from the analysis. The remaining samples were 644 subset from the full methylation matrix, and beta values were logit-transformed to M-values using 645 the formula M = log₂(β / [1 - β]) to better satisfy the assumptions of linear regression, as M-values 646 are more homoscedastic (constant variance) across the methylation range compared to beta 647 values (Du et al., 2010). CpG sites with any infinite or missing M-values (which can occur if beta 648 values are exactly 0 or 1 due to technical artifacts) were excluded from the analysis. 649 650 Differential Methylation Analysis. Differential methylation analysis was performed using the limma 651 package version 3.62.1 (Ritchie et al., 2015), which implements linear models for each CpG site 652 with empirical Bayes moderation of standard errors to improve statistical power, particularly for 653 small to moderate sample sizes. For each phenotype, we fit the following linear model at each 654 CpG site: 655 656 M value ~ phenotype + Beadchip + Fib + Kera_diff + Kera_undiff + Macro + Mela 657 658 where M represents the M-value (logit-transformed methylation: log₂[β/(1-β)]) at a given CpG site; 659 phenotype is the continuous HAUT -derived score (0 –100 scale), representing the primary 660 independent variable of interest; Beadchip is a categorical variable (factor) representing the 661 Illumina BeadChip identifier on which the sample was processed, included to adjust for technical 662 batch effects; and EC, Fib, Kera_diff, Kera_undiff, Macro, and Mela are continuous numeric 663 variables representing the estimated cell type fractions (proportions) for endothelial cells, 664 fibroblasts, differentiated keratinocytes, undifferentiated keratinocytes, macrophages, and 665 melanocytes, respectively, included to adjust for confounding due to inter -individual variation in 666 cellular composition. All models used treatment (dummy) coding for categorical variables, as 667 specified by options(contrasts = c("contr.treatment", "contr.treatment")) in R. Note: Tcell (T cells) 668 was initially included but was non-estimable across all phenotypes due to perfect collinearity with 669 other cell type fractions and was automatically dropped by the model fitting algorithm. 670 The linear model was fit using the `lmFit` function, which efficiently estimates regression 671 coefficients across all CpG sites simultaneously via matrix operations. Empirical Bayes shrinkage 672 of residual variances was then applied using the `eBayes` function, which borrows information 673 across CpG sites to stabilize variance estimates and compute moderated t-statistics. Differential 674 methylation statistics for the phenotype coefficient (logFC, t -statistic, unadjusted P-value) were 675 extracted using the `topTable` function with `coef = "phenotype"`. 676 677 Functional Enrichment analysis. To identify biological pathways associated with differentially 678 methylated positions, we performed functional enrichment analysis using three complementary 679 approaches. Gene Ontology (GO) enrichment analysis (Biological Process, Molecular Function, 680 and Cellular Component) was conducted using clusterProfiler (v4.0) with gene -centric mapping 681 of CpG sites to their nearest genes. KEGG pathway enrichment was performed using 682 clusterProfiler with the same gene mapping strategy. MSigDB Hallmark gene set enrichment was 683 conducted using missMethyl (v1.26.0), which accounts for the varying number of CpG probes per 684 gene on the Illumina EPIC array. All enrichment analyses were performed independently for each 685 of the 19 facial aging traits using differentially methylated positions at FDR < 0.05 as input. 686 Enriched pathways were considered significant at FDR < 0.05 after Benjamini -Hochberg 687 correction for multiple testing. 688 689 Interpretation of Effect Sizes. The logFC (log fold -change) coefficient from the limma model 690 represents the estimated change in M -value per unit increase in the phenotype. For continuous 691 HAUT scores (0-100 scale), logFC represents the M-value change per 1-point increase in score 692 (e.g., from score 50 to score 51). For age (years), logFC represents the M -value change per 1-693 .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted March 16, 2026. ; https://doi.org/10.64898/2026.03.13.711642doi: bioRxiv preprint year increase in age. Positive logFC values indicate hypermethylation (increased methylation 694 associated with higher phenotype values), while negative logFC values indicate hypomethylation 695 (decreased methylation associated with higher phenotype values). Because HAUT scores use an 696 inverse convention for many traits (higher scores = better skin quality = less severe phenotype), 697 the biological interpretation of effect direction must account for this scoring system (see Results 698 for detailed interpretation). 699 700 Multiple Testing Correction. To account for multiple hypothesis testing across 911,580 CpG sites, 701 we applied the Benjamini -Hochberg false discovery rate (FDR) correction using the 702 `adjust.method = "fdr"` option in limma's `topTable` function. The FDR -adjusted P -values 703 (adj.P.Val) control the expected proportion of false positives among all CpG sites declared 704 significant. We used FDR < 0.05 as the primary significance threshold throughout this study, 705 meaning that among all CpG sites called significant, we expect fewer than 5% to be false positives 706 on average. Secondary significance thresholds of FDR < 0.10 and FDR < 0.01 were also 707 computed for sensitivity analyses. 708 709 Genomic Inflation Assessment. For each phenotype, we computed the genomic inflation factor 710 (lambda, λ) as the ratio of the median observed chi-squared test statistic to the expected median 711 under the null hypothesis (0.456), based on the conversion of P-values to chi-squared statistics. 712 Lambda values substantially greater than 1.0 can indicate either genuine epigenome-wide signal 713 (many true associations distributed across the genome) or systematic bias (e.g., residual 714 population stratification, uncorrected batch effects, or incomplete adjustment for confounders). 715 We examined quantile -quantile (QQ) plots of observed versus expected P -values for each 716 phenotype to distinguish between these scenarios: genuine polygenic signal manifests as 717 deviation from the null distribution primarily in the upper tail (most significant CpGs), while 718 systematic bias produces uniform inflation across the entire P-value distribution. 719 720 Cross-Trait Pleiotropy Analysis. To assess the extent of shared epigenetic architecture across 721 skin aging phenotypes, we identified all unique CpG sites reaching significance (FDR < 0.05) in 722 at least one phenotype. For each such CpG site, we counted the number of traits in which it was 723 significant, yielding a "trait count" ranging from 1 (trait -specific) up to 19 (significant in all 724 phenotypes). CpG sites significant in 10 or more traits were classified as "highly pleiotropic" and 725 were examined for evidence of directional consistency (i.e., whether the sign of the effect was 726 consistent across the traits in which the CpG was significant). Pairwise overlap between 727 phenotypes was quantified using the Jaccard similarity index, defined as the size of the 728 intersection (CpGs significant in both traits) divided by the size of the union (CpGs significant in 729 either trait), ranging from 0 (no overlap) to 1 (perfect overlap). 730 731 Visualization. QQ plots, Manhattan plots, and volcano plots were generated for each phenotype 732 using the ggplot2 package version 3.5.1. Manhattan plots display -log₁₀(P-value) for each CpG 733 site on the y -axis against genomic position (chromosome and position) on the x -axis, with 734 horizontal lines indicating genome -wide significance thresholds (FDR < 0.05). Volcano plots 735 display -log₁₀(P-value) on the y -axis against logFC on the x -axis, highlighting both statistical 736 significance and magnitude of effect. QQ plots display observed -log₁₀(P-values) against 737 expected values under the null hypothesis to assess calibration and identify inflation. All plots 738 were saved as high-resolution JPG images. 739 740 Software and Package Versions 741 All analyses were conducted in R version 4.4.0 (2024-04-24, "Puppy Cup"). Key R packages and 742 versions used include: 743 744 .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted March 16, 2026. ; https://doi.org/10.64898/2026.03.13.711642doi: bioRxiv preprint 1. Methylation data processing: 745 a. SeSAMe (version 1.20.0): Preprocessing, quality control, normalization (ssNoob 746 method) 747 b. IlluminaHumanMethylationEPICv2manifest (version 0.99.0): Array manifest 748 c. IlluminaHumanMethylationEPICv2anno.20a1.hg38 (version 0.99.0): CpG 749 annotation 750 2. Cell type deconvolution: 751 a. EpiSCORE (version 1.0.0): Reference-based deconvolution with mrefSkin.m 752 3. Differential methylation analysis: 753 a. limma (version 3.58.1): Linear models and empirical Bayes moderation 754 b. DMRcate (version 3.0.0): Differentially methylated region identification (planned 755 for future analysis) 756 4. Data manipulation and visualization: 757 a. data.table (version 1.15.4): Efficient data import and manipulation 758 b. dplyr (version 1.1.4): Data wrangling 759 c. tidyr (version 1.3.1): Data tidying 760 d. ggplot2 (version 3.5.1): Publication -quality graphicsggrepel (version 0.9.5): Non -761 overlapping text labels in plots 762 763 Scripts implementing these analyses are available from the authors upon request. 764 .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted March 16, 2026. ; https://doi.org/10.64898/2026.03.13.711642doi: bioRxiv preprint Acknowledgements. We would like to thank Chatlin Helm and Razelle Ann Piculados for their 765 valuable help with data management. We would also like to thank the many volunteers who 766 supported this study by providing skin samples which were used for the development of 767 predictors. 768 Data Availability. Due to privacy considerations and institutional review board restrictions, 769 individual-level methylation and phenotype data cannot be publicly shared. The HAUT Face 770 Analysis 3.0 software is proprietary and available under commercial license from Haut.AI 771 (https://haut.ai). Raw methylation array data (Illumina EPIC v2) will be available upon reasonable 772 request. The crosswalk linking anonymous subject identifiers to internal sample IDs is available 773 from the corresponding author under a data use agreement (DUA) to protect participant privacy. 774 Additional data supporting the findings of this study are available from the corresponding author 775 upon reasonable request. Full supplementary data and information is available on Zenodo at 776 https://zenodo.org/records/19006544. 777 Code Availability. Analysis code for the EWAS, cross -trait pleiotropy analysis, lifestyle 778 association analyses, and figure generation will be deposited on Zenodo (779 https://zenodo.org/records/19006544) upon acceptance and made publicly available at that time. 780 Author Contributions. Design and conceptualization: VBD, RS [Ryan Smith], LP. Methylation 781 data generation and processing: VBD. Statistical analyses: VBD, SH. Predictive model 782 development: SH, VBD. Clinical data generation and sample provision: RS [Sivamani], JG. Facial 783 imaging data generation and processing: AG, KK. Sample collection and processing: DS, JW. 784 Scientific discussion and interpretation: GMF, YC, KS. Manuscript drafting: VBD, SH. All authors 785 reviewed, edited, and approved the final manuscript. 786 Competing Interests. VBD, SH, KS, DS, JW, and RMS are employees of TruDiagnostic Inc. AG 787 and KK are employees of Haut.AI. LP is an employee of Olaplex Inc. GMF, YC, JG, and RS 788 declare no competing interests. 789 Consent for Publication. Facial images were collected for automated phenotyping via HAUT.ai 790 but are not reproduced in this article. All participants provided informed consent for the use of 791 their biological samples and imaging data in research. No identifiable participant data appear in 792 this publication. 793 794 795 .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted March 16, 2026. ; https://doi.org/10.64898/2026.03.13.711642doi: bioRxiv preprint TABLE 796 797 Table 1 | Performance of V2 Models for Skin Trait Prediction. Summary of the final selected 798 model for each of the 21 predictors, sorted by descending Spearman R correlation. MAE (Mean 799 Absolute Error), best Model Type (LightGBM or RandomForest), and sample sizes (total, train, 800 and test) are shown for each predictor. 801 802 803 Predictor Model Type MAE Spearman rho) Total Samples Train Samples Test Samples age_age_se LightGBM 5.3 0.836 706 564 142 age_age LightGBM 5.14 0.819 706 564 142 eyes_age_eyes_age LightGBM 5.37 0.765 706 564 142 irritation_score LightGBM 6.8 0.738 584 467 117 uniformness_left_cheek_score LightGBM 10.31 0.709 679 543 136 redness_score LightGBM 9.55 0.705 584 467 117 uniformness_score LightGBM 8.18 0.682 706 564 142 pores_score LightGBM 5.7 0.681 706 564 142 uniformness_right_cheek_score LightGBM 9.81 0.645 678 542 136 lacrimal_grooves_score LightGBM 12.72 0.625 706 564 142 uniformness_forehead_score LightGBM 10.27 0.582 691 552 139 sun_spots_score LightGBM 6.09 0.566 627 501 126 puffiness_score RandomForest 8.65 0.563 706 564 142 pigmentation_score RandomForest 7.87 0.551 627 501 126 fine_lines_score LightGBM 11.52 0.545 508 406 102 sagging_score RandomForest 17.42 0.537 508 406 102 lines_score LightGBM 12.07 0.494 508 406 102 quality_front_score RandomForest 11.6 0.478 706 564 142 breakouts_score LightGBM 5.19 0.471 706 564 142 deep_lines_score LightGBM 13.69 0.44 508 406 102 dark_circles_score LightGBM 9.58 0.416 706 564 142 .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted March 16, 2026. ; https://doi.org/10.64898/2026.03.13.711642doi: bioRxiv preprint Figures 804 805 806 807 Figure 1 | Study design and analytic workflow. Facial photographs were processed with HAUT 808 AI to derive 19 phenotypes (1,463 submissions; n = 1,413 individuals after QC). Facial skin 809 methylation was profiled on the Illumina EPICv2 array (n = 744; 911,580 CpGs after QC). A 810 matched subset (n = 706) was used for two parallel analyses: (i) cell -type-adjusted EWAS of 811 CpG–phenotype associations (EpiSCORE; 6 cell types) and (ii) EpiVision model development 812 using an 80:20 train –test split, evaluated by MAE and Spearman ρ. EpiVision comprises 21 813 predictors (3 age-related, 18 phenotype-specific). A survey-linked subset (n = 282; 33 exposures) 814 was used for association testing with both unadjusted and age/sex -adjusted models. Sample 815 sizes, algorithm selection, and model details are provided in Supplementary Files 1 and 2. 816 817 818 819 QC/Normalization .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted March 16, 2026. ; https://doi.org/10.64898/2026.03.13.711642doi: bioRxiv preprint 820 821 Figure 2 | HAUT imaging phenotypes in the full cohort and demographic correlations. (A) 822 Distributions of HAUT-derived phenotype scores across the full imaging cohort (blue; n = 1,413) 823 and the subset with paired DNA methylation data (orange; n = 706). Violin plots show score 824 distributions; boxplots indicate Q1 –Q3 with median, whiskers denote min –max, and diamonds 825 denote the mean. (B) Chronological age distribution for participants with age metadata (n = 247; 826 mean 40.3 ± 16.3 years). (C) Fitzpatrick skin type composition for participants with available data 827 (n = 481), with counts shown above bars. ( D) Pearson correlations between chronological age 828 and HAUT phenotypes (n = 191 –246 depending on phenotype). ( E) Spearman correlations 829 between Fitzpatrick skin type and HAUT phenotypes (n = 364 –468). Bar direction and color 830 indicate positive and negative correlations; values are labeled on each bar. 831 832 833 .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted March 16, 2026. ; https://doi.org/10.64898/2026.03.13.711642doi: bioRxiv preprint 834 835 Figure 3 | Model performance using EPICv2 -only features versus the V1 –V2 overlapping 836 CpG set. Performance is shown for 20 targets (chronological age and 19 HAUT phenotypes) 837 trained with identical modeling procedures but different CpG feature spaces. (A) Mean absolute 838 error (MAE; lower is better) for each target. Dashed lines indicate across -target means (V1–V2 839 overlap: 9.83; V2 -only: 9.38). (B) Spearman correlation (ρ; higher is better) for each target. 840 Dashed lines indicate across-target means (V1–V2 overlap: 0.546; V2-only: 0.601). (C) Percent 841 change in performance for V2-only relative to V1–V2 overlap (negative MAE change and positive 842 Spearman change indicate improvement). (D) Summary of meaningful change (≥5% threshold): 843 9/20 targets (45%) showed meaningful MAE improvement with no meaningful degradation in any 844 target; 8/20 targets (40%) showed meaningful Spearman improvement, with 4/20 (20%) showing 845 meaningful degradation. 846 847 .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted March 16, 2026. ; https://doi.org/10.64898/2026.03.13.711642doi: bioRxiv preprint 848 849 Figure 4 | Lifestyle, clinical, and sun -exposure correlates of EpiVision predictors. (A,B) 850 Forest plots showing standardized effect sizes (β) and 95% confidence intervals for associations 851 between self-reported exposures and EpiVision predictors in the survey-linked subset (n = 282). 852 Blue circles denote unadjusted estimates; orange squares denote covariate -adjusted estimates 853 (age/sex-adjusted for phenotype predictors; epigenetic age acceleration for age -related 854 predictors). Sample sizes for each test are shown as n (raw → adjusted), reflecting differences 855 due to covariate availability. (A) Lifestyle correlates. (B) Clinical and medication correlates. 856 Significance asterisks are based on the adjusted model (*P < 0.05, **P < 0.01, ***P < 0.001). (C–857 E) Paired comparisons of EpiVision age predictions between sun-protected and sun-exposed skin 858 from the same individuals (n = 20). Grey lines connect within-participant measurements. Boxplots 859 show median, interquartile range, and 1.5×IQR whiskers. (C) Eyes Age. (D) Perceived Age. (E) 860 Sun-exposure-informed Perceived Age (SE). Cohen's d_z and P values from paired t -tests are 861 shown above each panel. 862 .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted March 16, 2026. ; https://doi.org/10.64898/2026.03.13.711642doi: bioRxiv preprint Supplementary Figure 863 864 865 866 Supplementary FIGURE S1 | Skin cell -type composition estimates across samples. 867 Boxplots summarize EpiSCORE-inferred skin cell-type fractions across 744 facial skin samples, 868 including endothelial cells (EC), fibroblasts (Fib), differentiated keratinocytes (Kera_diff), 869 undifferentiated keratinocytes (Kera_undiff), macrophages (Macro), melanocytes (Mela), and T 870 cells (Tcell). Fractions are expressed as percentages summing to ~100% per sample. 871 Differentiated keratinocytes comprised the largest component (median 32.82%), followed by 872 endothelial cells (26.01%) and undifferentiated keratinocytes (21.39%). Macrophage fractions 873 were variable (median 13.53%), whereas T cells were lower on average (median 0.83%). 874 Fibroblast (median 0.00%) and melanocyte (median 0.00%) fractions were generally minimal 875 across samples. Center lines indicate the median, boxes denote the interquartile range (Q1–Q3), 876 whiskers extend to 1.5×IQR, and points represent outliers. The predominance of keratinocytes 877 with a notable endothelial contribution is consistent with superficial, epidermis-enriched sampling, 878 and these estimates were included as covariates in downstream methylation analyses to account 879 for cellular heterogeneity. 880 881 .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted March 16, 2026. ; https://doi.org/10.64898/2026.03.13.711642doi: bioRxiv preprint Supplementary File Legends 882 883 SUPPLEMENTARY FILE 1 | Cohort characteristics and demographic composition. 884 Summary of sample collection, data availability, and key demographics for the HAUT imaging 885 cohort, the Illumina EPIC v2 methylation cohort, the survey -linked subset, and the matched 886 HAUT–methylation subset used for model development and downstream analyses. Age statistics 887 are reported for participants with available metadata. Fitzpatrick skin type counts are shown for 888 the methylation cohort; “not reported” indicates missing Fitzpatrick data. ᵃSample size varies 889 across the 21 EpiVision predictors due to phenotype -specific completeness and covariate 890 availability. 891 892 SUPPLEMENTARY FILE 2 | Complete performance metrics for all EPICv2 (V2 -Only) and 893 V1–V2 overlap predictive models across skin trait targets. All metrics are reported on the 894 held-out test set (20% of total samples). ΔMAE and ΔSpearman R reflect the percent change of 895 the EPICv2 model relative to the V1–V2 overlap model (negative ΔMAE indicates improvement; 896 positive ΔSpearman R indicates improvement). Targets are sorted alphabetically. Probe counts: 897 EPICv2 probe set uses all stable EPICv2 probes; V1–V2 overlap probe set is restricted to probes 898 present on both EPICv1 and EPICv2 arrays. 899 900 SUPPLEMENTARY FILE 3 | Association results between self-reported lifestyle, clinical, and 901 medication exposures and EpiVision methylation-based facial aging predictors. Results are 902 reported for 33 exposures tested against 21 EpiVision predictors in the survey-linked subset (n = 903 282) across four models: unadjusted raw predicted age (Age – Raw); epigenetic age acceleration 904 (Age – EAA), in which predicted age is residualized on chronological age to isolate biological 905 aging effects independent of time; unadjusted phenotype scores (Phenotype – Raw); and 906 age/sex-adjusted phenotype scores (Phenotype – Model 2), restricted to participants ≥16 years. 907 Exposures include binary clinical and medication variables (e.g., retinoid use, hormone therapy, 908 spironolactone, testosterone) and ordinal lifestyle variables (e.g., sleep quality, diet quality, water 909 intake, tobacco status). Standardized beta coefficients are reported throughout to enable cross -910 exposure comparability. FDR correction applied within adjusted models using the Benjamini –911 Hochberg method. Associations highlighted in the main text reach p 1; associations with n_exposed = 1 are considered exploratory. 913 914 SUPPLEMENTARY FILE 4 | Paired comparison of HAUT -derived skin phenotype scores 915 between sun-protected and sun-exposed skin sites. Results from paired t-tests comparing 21 916 facial aging and skin quality phenotypes between sun -protected buttock skin and sun -exposed 917 forearm skin collected from the same individuals (n = 20 per trait). Traits are sorted by ascending 918 p-value. Mean Difference reflects forearm minus buttock values; negative values indicate lower 919 scores in sun-exposed skin (i.e., worse phenotype given HAUT's scoring convention where higher 920 = better quality), and positive values indicate higher scores in sun -exposed skin. Cohen's d_z is 921 the within-subject standardized mean difference, calculated as the mean paired difference divided 922 by the standard deviation of the paired differences, and reflects effect size for paired designs. The 923 Panel column corresponds to figure panel labels in the associated supplementary figure. 924 Statistical significance threshold: p < 0.05 (two -sided paired t -test, no correction for multiple 925 comparisons across traits). 926 927 SUPPLEMENTARY FILE 5 | Epigenome-Wide Association Study Results for 19 Skin Aging 928 and Appearance Traits. N, sample size; Lambda, genomic inflation factor; DMPs, differentially 929 methylated positions at FDR < 0.05; % Sig, percentage of tested CpGs (911,580 total) reaching 930 significance; % Hyper, percentage of significant DMPs showing hypermethylation (positive effect); 931 Top CpG, most significant DMP by P-value; logFC, log fold-change in M-values per unit increase 932 .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted March 16, 2026. ; https://doi.org/10.64898/2026.03.13.711642doi: bioRxiv preprint in phenotype. All models adjusted for technical batch (Illumina BeadChip) and six skin cell type 933 fractions (fibroblasts, differentiated keratinocytes, undifferentiated keratinocytes, macrophages, 934 melanocytes, T cells) estimated using EpiSCORE with skin-specific reference. 935 936 SUPPLEMENTARY FILE 6 | Functional enrichment analyses for differentially methylated 937 positions associated with 19 facial aging and skin quality traits. Pathway enrichment was 938 performed independently for each trait using genes mapped to significant DMPs (FDR < 0.05) 939 identified in the covariate-adjusted EWAS, across four complementary databases: Gene Ontology 940 Biological Process (GO_BP), Molecular Function (GO_MF), and Cellular Component (GO_CC); 941 KEGG Pathways; and MSigDB Hallmark gene sets. In total, 28,583 enriched pathways were 942 identified across all 19 traits, sorted by trait and ascending p-value. Significance threshold: FDR 943 < 0.05. 944 .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted March 16, 2026. ; https://doi.org/10.64898/2026.03.13.711642doi: bioRxiv preprint

References

945 1. Bobrov, E. et al. PhotoAgeClock: deep learning algorithms for development of non -946 invasive visual biomarkers of aging. Aging 10, 3249–3259 (2018). 947 2. Levine, M. E. et al. An epigenetic biomarker of aging for lifespan and healthspan. Aging 948 10, 573–591 (2018). 949 3. Lu, A. T. et al. DNA methylation GrimAge strongly predicts lifespan and healthspan. Aging 950 11, 303–327 (2019). 951 4. Lu, A. T. et al. DNA methylation GrimAge version 2. Aging 14, 9484–9549 (2022). 952 5. Chen, Q. et al. OMICmAge: An integrative multi-omics approach to quantify biological age 953 with electronic medical records. bioRxiv 2023.10.16.562114 (2023) 954 doi:10.1101/2023.10.16.562114. 955 6. Boroni, M. et al. Highly accurate skin-specific methylome analysis algorithm as a platform 956 to screen and validate therapeutics for healthy aging. Clin. Epigenetics 12, 105 (2020). 957 7. Bienkowska, A. et al. Development of an epigenetic clock to predict visual age progression 958 of human skin. Front. Aging 4, 1258183 (2024). 959 8. Menendez Vazquez, A. et al. Epigenetic age predictors for non -invasive assessment of 960 human skin. Npj Aging 12, 11 (2025). 961 9. Zonari, A. et al. Senotherapeutic peptide treatment reduces biological age and 962 senescence burden in human skin models. NPJ Aging 9, 10 (2023). 963 10. Horvath, S. DNA methylation age of human tissues and cell types. Genome Biol. 14, 3156 964 (2013). 965 11. Horvath, S. et al. Epigenetic clock for skin and blood cells applied to Hutchinson Gilford 966 Progeria Syndrome and ex vivo studies. Aging 10, 1758–1775 (2018). 967 12. de Lima Camillo, L. P., Lapierre, L. R. & Singh, R. A pan -tissue DNA -methylation 968 epigenetic clock based on deep learning. Npj Aging 8, 4 (2022). 969 13. Ndhlovu, L. C. et al. Retro-age: A unique epigenetic biomarker of aging captured by DNA 970 methylation states of retroelements. Aging Cell 23, e14288 (2024). 971 14. Boroni, M. et al. Highly accurate skin-specific methylome analysis algorithm as a platform 972 to screen and validate therapeutics for healthy aging. Clin. Epigenetics 12, 105 (2020). 973 15. Menendez Vazquez, A. et al. Epigenetic age predictors for non -invasive assessment of 974 human skin. Npj Aging 12, 11 (2025). 975 16. Christensen, K. et al. Perceived age as clinically useful biomarker of ageing: cohort study. 976 BMJ 339, b5262 (2009). 977 17. Gunn, D. A., Larsen, L. A., Lall, J. S., Rexbye, H. & Christensen, K. Mortality is Written on 978 the Face. J. Gerontol. A. Biol. Sci. Med. Sci. 71, 72–77 (2016). 979 18. Sachs, D. L. et al. Atrophic and hypertrophic photoaging: Clinical, histologic, and 980 molecular features of 2 distinct phenotypes of photoaged skin. J. Am. Acad. Dermatol. 81, 480–981 488 (2019). 982 19. Langton, A. K. et al. Distinctive clinical and histological characteristics of atrophic and 983 hypertrophic facial photoageing. J. Eur. Acad. Dermatol. Venereol. 35, 762–768 (2021). 984 20. Garagnani, P. et al. Methylation of ELOVL2 gene as a new epigenetic marker of age. 985 Aging Cell 11, 1132–1134 (2012). 986 21. Bienkowska, A. et al. Development of an epigenetic clock to predict visual age progression 987 of human skin. Front. Aging 4, 1258183 (2024). 988 22. Boroni, M. et al. Highly accurate skin-specific methylome analysis algorithm as a platform 989 to screen and validate therapeutics for healthy aging. Clin. Epigenetics 12, 105 (2020). 990 23. Zonari, A. et al. Senotherapeutic peptide treatment reduces biological age and 991 senescence burden in human skin models. NPJ Aging 9, 10 (2023). 992 24. Banila, C. et al. A noninvasive method for whole-genome skin methylome profiling. Br. J. 993 Dermatol. 189, 750–759 (2023). 994 .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted March 16, 2026. ; https://doi.org/10.64898/2026.03.13.711642doi: bioRxiv preprint 25. Menendez Vazquez, A. et al. Epigenetic age predictors for non -invasive assessment of 995 human skin. Npj Aging 12, 11 (2025). 996 .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted March 16, 2026. ; https://doi.org/10.64898/2026.03.13.711642doi: bioRxiv preprint

Text is read by the "Ask this paper" AI Q&A widget below. Extraction quality varies by source — PMC NXML preserves structure cleanly, OA-HTML may include some navigation residue, and OA-PDF can have broken hyphenation. The publisher copy (via DOI) is the canonical version.

My notes (saved in your browser only)

Ask this paper AI returns verbatim quotes from the full text · source: oa-pdf

Answers must be backed by verbatim quotes from this paper's full text. Hallucinated quotes are dropped automatically; if no verbatim passage answers the question, we say so. How this works

Citation neighborhood (no data yet)

We don't have any in-corpus citations linked to this paper yet. This is a recent paper (2026) — citers typically take a year or two to land, and the OpenAlex reference graph may still be filling in.

Source provenance

europepmc
last seen: 2026-05-20T01:45:00.602351+00:00
unpaywall
last seen: 2026-05-23T02:00:01.238055+00:00
License: CC-BY-NC-4.0