From Polygenic Scores to Phenotypic Screening: A Multi-Trait Framework for Cost-Free Risk Stratification in Endometriosis

In: Epidemiology, Biostatistics, and Public Health · 2025 · doi:10.54103/2282-0930/29553 · W4414148610
article · OA: diamond · CC0

Abstract

Introduction: Endometriosis is a chronic inflammatory condition affecting approximately 10% of women of reproductive age. It is often diagnosed late because of nonspecific symptoms, overlap with common conditions such as primary dysmenorrhea, and reliance on invasive laparoscopy [1,2]. Early detection could reduce patient burden and long-term complications, but current diagnostic tools remain limited. Genome-wide association studies (GWAS) have identified several genetic risk variants [3], yet their individual effects are modest. Polygenic risk scores (PRSs), which aggregate the effects of multiple variants, show promise but still lack the accuracy required for clinical application because of limited replication, small effect sizes, and population-specific variability [4,5]. Recent findings suggest that endometriosis is linked to a range of genetically influenced traits, such as immune, metabolic, and psychiatric characteristics, pointing toward the potential of multi-trait approaches to improve early, non-invasive risk stratification.

Objectives: This study aims to develop a non-invasive, cost-effective strategy for endometriosis risk stratification using a genetics-informed, two-phase approach. First, we evaluated whether PRSs for a broad spectrum of complex traits could predict disease risk and reveal genetically defined subgroups among patients. Next, we identified the most informative traits associated with these genetic risk profiles and translated them into a targeted phenotypic questionnaire. We then assessed whether this phenotype-only model could accurately classify endometriosis cases, offering a feasible alternative to genetic testing for early detection in real-world settings.

Methods: We analyzed 1,996 genotyped women (862 cases, 1,134 controls) and computed 4,490 PRSs across complex traits. After filtering and trait mapping, 645 scores were retained; one score per trait was then selected via bootstrap logistic regression (218 scores), and this set was reduced to 40 via LASSO. Supervised machine learning models (logistic regression, random forest, XGBoost, neural networks) [6,7] were trained to evaluate the predictive performance of the PRS-based model. Top-ranking PRSs from the best-performing model were used to cluster endometriosis cases, identifying genetically defined subgroups. Traits linked to these PRSs were used to design a targeted phenotypic questionnaire. The questionnaire was tested in an independent cohort (n = 506), where curated phenotypic features were used to train classification models. The best model was then used to generate a non-invasive, phenotype-only risk score for endometriosis stratification.

Results: The multi-PRS model significantly outperformed the endometriosis-specific PRS (AUC = 0.636 vs. 0.546, p < 0.001), with key contributions from PRSs for traits related to height, early menarche, schizophrenia, and autoimmune disorders. Clustering based on the most informative PRSs identified two genetically defined subgroups with distinct clinical characteristics, including differences in endometrioma prevalence, gastrointestinal symptoms, and disease stage. A phenotype-only model trained on questionnaire data demonstrated high discriminative ability (AUC = 0.904), with CA125, fatigue, gynecological symptoms, and muscle pain emerging as the most informative features, supporting its potential as a cost-effective and non-invasive tool for early risk stratification.

Conclusions: Our results demonstrate that leveraging polygenic information to identify trait-level predictors enables the development of accurate, phenotype-based models for endometriosis risk stratification. The use of AI-driven approaches allows robust prediction from a minimal set of non-invasive, low-cost clinical features, reducing reliance on genetic testing and supporting more accessible, early diagnostic strategies within precision gynecology.
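
The AUC values reported in the Results can be read as the probability that a randomly chosen case receives a higher risk score than a randomly chosen control, with ties counted as one half. The sketch below illustrates that definition only. The function name `auc_from_scores` and the example scores are hypothetical and invented for illustration; they are not study data, and the abstract does not state which software the authors used to compute AUC.

```python
from typing import Sequence


def auc_from_scores(case_scores: Sequence[float],
                    control_scores: Sequence[float]) -> float:
    """Return the AUC as the fraction of (case, control) pairs in which
    the case has the higher score; tied pairs count as 0.5."""
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical risk scores, invented for illustration (not study data).
cases = [0.9, 0.7, 0.6, 0.4]
controls = [0.8, 0.5, 0.3, 0.2]

example_auc = auc_from_scores(cases, controls)
print(example_auc)  # 12 of 16 pairs favor the case -> 0.75
```

Under this reading, an AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls. The pairwise loop is quadratic in sample size and is written for clarity; a rank-based computation would likely be preferred for cohorts of the size described above.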

Condition tags

endometriosis · endometrioma · dysmenorrhea

License: CC0 · commercial use OK